Syntax support in translations/translation parameters

At the moment, translations are supposed to be plain text. However, in practice we frequently need to put a link in a translation or highlight some part of it as bold text. The documented best practice is to inject XWiki syntax or HTML as parameters into the message, i.e., to have a placeholder where the syntax would be. However, this means we need to interpret the result as XWiki syntax or HTML, and this can have unintended side effects like XWIKI-19149. There are workarounds we are currently using, but they are usually quite ugly.

I propose introducing support for a small subset of XWiki syntax in translations. My proposal would be to support the following features in addition to the existing support for Message Format like placeholders and choices:

  • Links, but without support for formatting in link labels (link labels would be parsed as plain text and therefore not need double escaping). Possibly, support for parameters could also be limited to those directly relevant for links like anchor and queryString and not generic parameters.
  • Text formatting, in particular bold and italics, possibly also underline, strikethrough, monospace, superscript, subscript.
  • Escaping using ~

The main question is where to support this syntax. There are two options that could also be combined:

  1. In the translation message itself. This means translators need to understand the syntax but compared to placeholders translators know what the syntax is about (vs. having an opaque placeholder).
  2. In parameters as currently documented. This means translators only need to know about placeholders but they might not understand what they mean and might, e.g., unknowingly reverse link start and end syntax.

We have existing translations for both variants, so unless we implement both we need to adapt existing translations.

As we probably need to develop a new parser for translations anyway, we could also support other features. In particular, I want to propose the following feature as I believe it could be useful:

  1. Named placeholders. Make it possible to use names instead of numbers for placeholders, for example, {first_name} instead of {0}.
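To make the named placeholder idea concrete, here is a minimal sketch of how numbered and named placeholders could be substituted side by side. The function `format_message` is a hypothetical helper written only for illustration, not an existing XWiki API, and it deliberately ignores MessageFormat quoting and choice formats.

```python
import re

# Matches either a numbered placeholder like {0} or a named one like {first_name}.
PLACEHOLDER = re.compile(r"\{(?:(\d+)|([A-Za-z_]\w*))\}")


def format_message(message, *args, **named):
    """Hypothetical helper: fill numbered and named placeholders.

    Placeholders without a matching value are left untouched so that a
    missing parameter stays visible in the output.
    """

    def replace(match):
        number, name = match.group(1), match.group(2)
        if number is not None:
            index = int(number)
            return str(args[index]) if index < len(args) else match.group(0)
        return str(named[name]) if name in named else match.group(0)

    return PLACEHOLDER.sub(replace, message)


print(format_message("Hello {0}!", "Alice"))
print(format_message("Hello {first_name}!", first_name="Alice"))
```

With a named placeholder, a translator sees `{first_name}` instead of `{0}` and can reorder the sentence without guessing what the number stands for.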

I have also created a design page with some more details about the proposed changes.

I’m open to input and I would in particular also appreciate input by translators like @Simpel, @shirahara, @xrichard or @safronovyua. I’m mentioning you because you’ve been active in forum posts related to translations, but I’m of course also looking for input from all other translators. Also, if you have different ideas how to improve translations/translation syntax, please let us know.

I also suggest organizing a meeting to discuss these options to hopefully already reach some consensus. Please indicate in this poll when you would be available. I’ll close the poll tomorrow, January 11th at 14:00 CET and announce the meeting time here. You can of course also still join the meeting if you haven’t filled in the poll.

Independently of the meeting, please also express your opinions here, in particular if you (possibly) won’t participate in the meeting.

You mean full support for Message Format as we have now, right?

Or do you mean to drop this full support and only support something partial?

Thanks

Yes, I mean full support for message format, I’ve clarified the initial post, though I believe at least in XWiki itself we’re only using choices and placeholders (but I haven’t checked).

Do you mind giving me a hint to understand what would change in the localization UI if the syntax support is implemented?

This would make the localization experience much better. It is not really possible to understand what the variables mean if they only consist of numbers.

Probably not much. I haven’t investigated this yet, but once we’ve settled on some syntax, we might be able to provide a better editing experience for this syntax in Weblate, e.g., by providing syntax highlighting; Weblate seems to support using CodeMirror for editing translations.

Basically, with official syntax support in translations, you would probably see more translations like XWiki Platform/Help.Translations — Japanese @ Weblate XWiki.org: Did you know that you can improve XWiki? Take 5 … or XWiki Platform/Help.Translations — Japanese @ Weblate XWiki.org: When editing in WYSIWYG, type ##[## and then start typing … that contain syntax besides just placeholders (at the moment, according to our best practices, such translations shouldn’t exist). If syntax support is just added for parameters, there will be placeholders instead of that syntax.

Ah I see. If syntax support can avoid strings like this with number placeholders, it should be really helpful for translators to decide how to order them if needed.

According to the poll, 10:00 CET seems to be the winning time for both days, so I suggest, tomorrow, Thursday at 10:00 CET for the meeting. To join the meeting, go to translationsyntax

I’ve just updated the meeting URL to translationsyntax (based on BigBlueButton) as it should provide a better (more stable) experience.

Thank you all for attending the meeting! Together I think we found some very interesting new solutions. I’ve tried moving the contents of the meeting notes to the design page. Feel free to add anything I’ve missed.

The main outcome which is also my new proposal is the following idea:

  • Instead of supporting syntax in parameters or the translation message we instead support passing arbitrary (XDOM) blocks as parameters. Such a block could be a link or a text with italic/bold format.
  • When some block (like a link) requires a translated text as content, we put that text in a separate translation but mention the connection for both translations. In many cases of bold text or italic text, the content is actually a parameter, anyways, so no additional translation will be required.

This is basically the same as what vue-i18n suggests and thus this would be compatible with vue-i18n without any further changes.

To make creating the blocks easier, we would provide a script API that allows creating common blocks (like links, emphasis, bold, italics) easily without writing syntax (that would need escaping again).

For this proposal, we would need to implement our own parser for MessageFormat to be able to cleanly insert blocks at the place of parameters.
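As a rough illustration of what "inserting blocks at the place of parameters" could look like, here is a minimal sketch that turns a message into a list of blocks instead of a string. The classes `Text` and `Link` and the function `to_blocks` are made-up stand-ins, not the XWiki rendering (XDOM) classes, and only plain `{n}` placeholders are handled (no quoting, no choices).

```python
import re
from dataclasses import dataclass


@dataclass
class Text:
    """Stand-in for a plain text block (hypothetical, not an XDOM class)."""
    content: str


@dataclass
class Link:
    """Stand-in for a link block whose label is plain text."""
    url: str
    label: str


def to_blocks(message, params):
    """Split the message at {n} placeholders and insert params as blocks.

    String parameters become Text blocks, so they are never re-parsed as
    syntax; any other parameter is inserted unchanged as a block.
    Raises IndexError if a placeholder has no matching parameter.
    """
    blocks = []
    position = 0
    for match in re.finditer(r"\{(\d+)\}", message):
        if match.start() > position:
            blocks.append(Text(message[position:match.start()]))
        param = params[int(match.group(1))]
        blocks.append(Text(param) if isinstance(param, str) else param)
        position = match.end()
    if position < len(message):
        blocks.append(Text(message[position:]))
    return blocks


for block in to_blocks("See {0} for details.", [Link("https://example.org", "the guide")]):
    print(block)
```

Because the result is a list of blocks and not a string, nothing in the translation or in the parameters has to be interpreted as XWiki syntax or HTML afterwards, which is what should make escaping workarounds unnecessary.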

We also discussed possible improvements to translation syntax like named parameters and easier choice syntax. While we agree that both are necessary, they are independent of the chosen solution with block parameters as they only affect the translations themselves and not the API for accessing them (okay, apart from parameter names). When developing the new parameter support, these improvements should be kept in mind such that they can be easily added later, but the focus would at first be on implementing the parameter support for the current translation format.

Anything I’ve missed and in particular any further suggestions?

Thanks. I’ve looked quickly at Translation Syntax Support (Proposal.TranslationSyntaxSupport) - XWiki to look for an example of the proposed syntax but couldn’t find one.

Could you provide some examples of a translation with a link and a translation with some bold content for ex?

This looks inventive. However, at first sight, this seems quite complex to me and not so great for translators as it seems to hide the fact that a translation is used in the context of a link or a bold text (for ex), which can be useful IMO for translators to know (to decide what words to use).

It also seems to make it harder for translators to reconcile that several translations are used to translate a single text (how would these translations be shown as one in the weblate UI for ex)?

PS: You’ve probably raised these points during the meeting and I apologize if that’s the case. I’d have liked to attend the meeting but couldn’t.

PPS: Does it mean we’re going to get references for translations (ie the ability to reuse a translation in another translation?) :wink: We discussed that in the past.

Thanks!

Sorry for taking so long to reply, I postponed replying as I was not working on this (and I’m actually still not working on it).

At the moment, nothing would change related to translation syntax. An example:

At the moment, core.viewers.history.extension.label is

{0}Version{1} coming from extension {2}{3} {4}{5}

where the parameters are set as "<a href='$_versionURL'>", '</a>', "<strong>", $escapetool.xml($documentExtension.name), $escapetool.xml($documentExtension.id.version) and '</strong>'. The new proposal would be to have

{0} coming from extension {1}

and then a second string like core.viewers.history.extension.label.versionLinkText with content Version and maybe a third string core.viewers.history.extension.label.versionExtensionText with content {0} {1} in case translators need to add or customize something in the text of the extension text.

The new translation would then be used with a LinkBlock with the label set to the value of core.viewers.history.extension.label.versionLinkText and a FormatBlock with the content set to the value of core.viewers.history.extension.label.versionExtensionText (or directly the extension name and version).
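To show how the three strings would fit together, here is a small sketch of the example above. The classes `LinkBlock` and `FormatBlock` below are simplified stand-ins that only borrow the names, the helper functions are hypothetical, and the URL and the extension name and version are invented sample values.

```python
import re
from dataclasses import dataclass
from html import escape


@dataclass
class LinkBlock:
    """Simplified stand-in for a link block with a plain text label."""
    url: str
    label: str

    def to_html(self):
        return '<a href="{}">{}</a>'.format(escape(self.url), escape(self.label))


@dataclass
class FormatBlock:
    """Simplified stand-in for a bold format block with plain text content."""
    content: str

    def to_html(self):
        return "<strong>{}</strong>".format(escape(self.content))


TRANSLATIONS = {
    "core.viewers.history.extension.label": "{0} coming from extension {1}",
    "core.viewers.history.extension.label.versionLinkText": "Version",
    "core.viewers.history.extension.label.versionExtensionText": "{0} {1}",
}


def render_text(key, *params):
    """Fill {n} placeholders with plain strings."""
    return re.sub(r"\{(\d+)\}", lambda m: str(params[int(m.group(1))]), TRANSLATIONS[key])


def render_html(key, *blocks):
    """Escape the literal text and insert each block at its placeholder."""
    # re.split with one capture group alternates literal text and placeholder numbers.
    parts = re.split(r"\{(\d+)\}", TRANSLATIONS[key])
    return "".join(
        escape(part) if i % 2 == 0 else blocks[int(part)].to_html()
        for i, part in enumerate(parts)
    )


prefix = "core.viewers.history.extension.label"
link = LinkBlock("/history?rev=2.1", render_text(prefix + ".versionLinkText"))
extension = FormatBlock(render_text(prefix + ".versionExtensionText", "Example <App>", "1.2"))
print(render_html(prefix, link, extension))
```

Note that the calling code never passes `<a>` or `<strong>` fragments and never escapes the extension name by hand; escaping happens once, where the blocks are rendered.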

Part of the proposal is to make it mandatory to add comments to all of these translations to describe their connection (that’s what I meant with “mention the connection for both translations”). Also, the rules would make clear that such fragments must not be re-used in other places. Further, we should also add a rule that in the case there are any numbers involved, the numbers must be provided as parameters to all parts and the whole even when they aren’t used in English just in case any articles or adjectives must be changed based on the number.

In the future, we definitely want to support named parameters and then we could have something like {versionLink} coming from extension {extensionNameAndVersion}.

As I’ve said above, the idea would be to insert a comment that mentions their connection. Comments are visible in the Weblate UI. We should use comments a lot more than they are used today, as translators are frequently not aware of the context in which a string is used, e.g., whether it is the label of a button to execute an action or a heading. For example, these refactoring strings like Move log or Copy log could both describe a button to move the log or a label for the log of the move job (the latter is correct but the German translation was actually the former until very recently).

The fact that translations are used as link label is imho not very different from having several translations for a dialog with a text and several buttons - they belong together and should refer to each other but the connection needs to be made explicitly by adding comments.

The arguments for this solution vs. extending translation syntax were the following:

  1. There seem to be countless needs for extra content, this is not limited to simple formatting and links. We also need CSS classes, icons, select elements - and probably more. However, we also don’t want to support macros in translations as it would make the whole thing a lot more complex and create security problems again.
  2. Frequently, the content of a link or formatting will be a parameter, anyways. So we hope that at least in many cases, no extra translation key will be necessary.
  3. It is easy to implement with our current translation syntax and thus we don’t need to create new translation bundles with new syntax to support this solution. Further, supporting parameters in arbitrary places like parameters (for injecting a dynamic CSS class) makes it hard to implement this cleanly as we would need to do much more than just insert some blocks at the place of a placeholder.
  4. It is the same solution that vue-i18n implements, so it will be the same regardless of whether a string is used in Vue.js or on the server side.

No. As I’ve said, there is no change to translation syntax, but as it is the case already you can insert one translation into another using parameters - just make sure that the connection is obvious and the inserted string isn’t used in other contexts.
