Right now the Syntax format is <id>/<version>, and you can call Syntax#getType()#getId() and Syntax#getVersion().
Note that we also have a qualifier, but it’s only used when serializing the syntax to a string; we introduced it to mark a syntax as experimental.
What we are missing is the concept of syntax variants. This can be useful for example if you wish to list all syntaxes that exist for a base syntax (e.g. all Markdown syntaxes or all XWiki syntaxes).
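To make the "list all syntaxes for a base syntax" use case concrete, here is a minimal sketch working on plain id strings. The class and method names (`BaseSyntaxFilter`, `baseTypeOf`, `withBaseType`) are hypothetical and not part of the XWiki Rendering API; it only assumes that `+` separates the base type from whatever follows it.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for illustration only; not part of the XWiki Rendering API. */
final class BaseSyntaxFilter
{
    private BaseSyntaxFilter()
    {
    }

    /** Extracts the base type from an id, e.g. "markdown+github/1.0" gives "markdown". */
    static String baseTypeOf(String syntaxId)
    {
        int slash = syntaxId.indexOf('/');
        String typePart = slash < 0 ? syntaxId : syntaxId.substring(0, slash);
        int plus = typePart.indexOf('+');
        return plus < 0 ? typePart : typePart.substring(0, plus);
    }

    /** Keeps only the ids whose base type matches, preserving input order. */
    static List<String> withBaseType(List<String> syntaxIds, String baseType)
    {
        List<String> result = new ArrayList<>();
        for (String id : syntaxIds) {
            if (baseTypeOf(id).equals(baseType)) {
                result.add(id);
            }
        }
        return result;
    }
}
```

Without a standardized variant concept, every caller has to repeat this kind of string slicing, which is the duplication the proposal tries to remove.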
Not sure what you mean by several variants. Either you’re talking about a specific Syntax (such as Markdown) which is allowed to have several variants (but not all at the same time), or you’re talking about a Syntax instance which can have 0 or 1 variant (e.g. XWiki 2.1 with no variant, and Markdown Github with only the github variant).
2 variants: markdown+commonmark+0.28/1.0 (in this case 0.28 is the commonmark spec version, 1.0 is the implementation version of that syntax in XWiki).
hmm, shouldn’t we consider this as a unique variant of the same syntax then?
You’d have:
Markdown+github/1.0
Markdown+github-commonmark-0.xx/1.0
Markdown+github-commonmark-0.28/1.0
All those would be syntaxes with a unique variant, wouldn’t that work for you? I’m not sure I see the need for having several variants; it sounds like it would be pretty rare to reuse a variant across several syntaxes.
No, it wouldn’t work. I really don’t see why there would be only a single variant per syntax. This would negate the rationale for materializing variants in the first place (since you’d need to implement custom parsing to extract the various variant parts yourself, as we have now - and since it wouldn’t be standardized it would be impossible to do!)
Note that if I follow what you said above, it would mean that 0.28 would be parsed as a variant of its own, completely decorrelated from the commonmark variant, so I find this example a bit odd.
For me the idea would be to define a syntax as the combination of:
a syntax type
a variant
a version
Semantically I find it weird to have different variants in the same syntax. I now understand your need to parse the different components of a variant, but then I would distinguish between the global unique variant (possibly made of several components) and the individual variant components.
A variant is not something that exists outside of the syntax. It’s a variant of the syntax and the list matters. If you take github or commonmark outside of any context it doesn’t mean anything. What makes sense is [commonmark, 0.28] in this example.
I think you’re overdoing it and you’re trying to read too much meaning into variants. They’re just variations of the syntax, and the idea of supporting more than 1 is to allow for expansion. I really don’t see the problem with returning List<String> instead of String and considering that + is a variant delimiter. If you don’t need more than 1 variant then don’t use it. If your variant is named commonmark-0.28 then it’s also fine. But the idea is to support several variants here because I’m pretty sure that we need more than one. I already proved it with the example markdown+commonmark+0.28. Note that there can also be markdown+commonmark+0.27.
I really don’t see why. Why would a syntax only have a single variant? Maybe you don’t like the name Variant? We can pick something else although I like it. It could be “specifier” too if you prefer that.
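To illustrate the "List<String> with + as delimiter" idea, here is a minimal sketch. `VariantSplitter` and `variantsOf` are hypothetical names, and the input is assumed to be only the part of the id before the `/`.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Hypothetical helper for illustration only; not part of the XWiki Rendering API. */
final class VariantSplitter
{
    private VariantSplitter()
    {
    }

    /**
     * Splits a type string such as "markdown+commonmark+0.28" on '+' and returns
     * everything after the first segment (the base type) as an ordered list.
     */
    static List<String> variantsOf(String typeString)
    {
        String[] parts = typeString.split("\\+");
        // parts[0] is the base type; the remaining segments are the variants, in order.
        return Collections.unmodifiableList(Arrays.asList(parts).subList(1, parts.length));
    }
}
```

With this, `variantsOf("markdown+commonmark+0.28")` yields `[commonmark, 0.28]`, `variantsOf("markdown+commonmark-0.28")` yields the single variant `[commonmark-0.28]`, and `variantsOf("xwiki")` yields an empty list, so a caller that only wants zero or one variant is not affected.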
Again I’m putting examples of syntaxes with different variants, differing only on the last variant:
Ok, let’s cut it there, it’s not a big deal. I agree in general with the idea, and indeed it’s not mandatory to use a list of variants if I don’t want to.
Thanks @surli for the feedback. I’d also like to have @tmortagne’s opinion on this proposal, especially as he’s the one who started the concept of using + in syntax types.
We need to decide where we put the variants. In the local proof-of-concept code I’ve started, I’ve put them in the Syntax and not in the SyntaxType, and considered the SyntaxType to represent the base syntax:
/**
* Represents a wiki syntax that the user can use to enter wiki content. A syntax is made of four parts:
* <ul>
* <li>a base syntax type (e.g. {@code xwiki}, {@code confluence}, {@code mediawiki}, etc).</li>
* <li>zero or more variants, which represent Syntax variations. For example the {@code markdown} syntax has
* the {@code commonmark} variant and the {@code github} variant.</li>
* <li>a version ({@code 1.0}, {@code 2.0}, etc).</li>
* <li>an optional qualifier which is a free-form string adding some additional information about the Syntax when
* serialized as a String. Can be used for example to mark a Syntax as experimental.</li>
* </ul>
* The syntax id string format is: <code><type>[+<variant>]*/<version></code>.
* Examples:
* <ul>
* <li>{@code xwiki/2.1}</li>
* <li>{@code markdown+commonmark/1.2}</li>
* <li>{@code sometype+variant1+...+variantN/1.0}</li>
* </ul>
*
* @version $Id: 89454c157252098191b80c6bcc94a07c0d6d2c2d $
* @since 2.0RC1
*/
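As a rough sketch of the id format described in that Javadoc, the following parses and re-serializes `<type>[+<variant>]*/<version>`. `ParsedSyntaxId` is a hypothetical value class, not the actual Syntax class, and the qualifier is deliberately left out.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical value class for illustration; not the actual XWiki Syntax class. */
final class ParsedSyntaxId
{
    final String type;

    final List<String> variants;

    final String version;

    private ParsedSyntaxId(String type, List<String> variants, String version)
    {
        this.type = type;
        this.variants = Collections.unmodifiableList(variants);
        this.version = version;
    }

    /** Parses ids of the form type[+variant]*&#47;version, rejecting empty segments. */
    static ParsedSyntaxId parse(String id)
    {
        int slash = id.indexOf('/');
        if (slash <= 0 || slash == id.length() - 1) {
            throw new IllegalArgumentException("Missing type or version in: " + id);
        }
        // A limit of -1 keeps empty segments so that "markdown+/1.0" is rejected below.
        String[] parts = id.substring(0, slash).split("\\+", -1);
        List<String> variants = new ArrayList<>();
        for (int i = 0; i < parts.length; i++) {
            if (parts[i].isEmpty()) {
                throw new IllegalArgumentException("Empty type or variant in: " + id);
            }
            if (i > 0) {
                variants.add(parts[i]);
            }
        }
        return new ParsedSyntaxId(parts[0], variants, id.substring(slash + 1));
    }

    /** Serializes back to the same id string format. */
    String toIdString()
    {
        StringBuilder sb = new StringBuilder(type);
        for (String variant : variants) {
            sb.append('+').append(variant);
        }
        return sb.append('/').append(version).toString();
    }
}
```

Parsing `markdown+commonmark+0.28/1.0` this way gives type `markdown`, variants `[commonmark, 0.28]` and version `1.0`, and `toIdString()` returns the original string.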
Now this will have consequences for the following, for example:
/**
* Confluence wiki syntax.
*/
public static final SyntaxType CONFLUENCE = register("confluence", "Confluence");
/**
* Confluence XHTML based syntax.
*
* @since 5.3M1
*/
public static final SyntaxType CONFLUENCEXHTML = register("confluence+xhtml", CONFLUENCE.getName());
Existing SyntaxType ids such as confluence+xhtml already embed the variant, so I think this would break backward compatibility, even though I find it more logical to have the variants in the Syntax than in the SyntaxType.
So I’m going to refactor my code and go in the direction of moving the variants concept to SyntaxType instead.
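For discussion, here is one way the SyntaxType-level approach could look. This is only a sketch under assumptions: `VariantAwareSyntaxType`, `getBaseId()` and `getVariants()` are hypothetical names, and only `getId()` and `getName()` mirror accessors mentioned in this thread. The point is that the id keeps its current value while the base type and variants are derived from it.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Hypothetical stand-in for SyntaxType, for illustration only. */
final class VariantAwareSyntaxType
{
    private final String id;

    private final String name;

    VariantAwareSyntaxType(String id, String name)
    {
        this.id = id;
        this.name = name;
    }

    /** Unchanged full id, e.g. "confluence+xhtml", so existing callers see the same value. */
    String getId()
    {
        return this.id;
    }

    String getName()
    {
        return this.name;
    }

    /** Assumed accessor: the part of the id before the first '+'. */
    String getBaseId()
    {
        int plus = this.id.indexOf('+');
        return plus < 0 ? this.id : this.id.substring(0, plus);
    }

    /** Assumed accessor: the '+'-separated segments after the base id, in order. */
    List<String> getVariants()
    {
        String[] parts = this.id.split("\\+");
        return Collections.unmodifiableList(Arrays.asList(parts).subList(1, parts.length));
    }
}
```

A type created with the id `confluence+xhtml` would still return `confluence+xhtml` from `getId()`, while `getBaseId()` gives `confluence` and `getVariants()` gives `[xhtml]`.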