Support for figure captions in XWiki syntax and the WYSIWYG editor

As part of the improvements for images in XWiki, support for captions in the WYSIWYG editor shall be implemented. The current caption support is based on the figure and figure caption macros. These macros also support tables. In HTML5 output, these macros are rendered as <figure>/<figcaption> tags. CKEditor supports these HTML5 tags and integrates an option to enable captions in the image dialog; the caption can then be edited inline below the image. This is currently disabled in XWiki as we use XHTML 1.0 syntax for CKEditor.

In different discussions on the chat, the idea has been developed to extend the existing XWiki syntax to support image captions natively without a macro. In particular, the idea is to use the syntax [[Figure Caption>>image:some_image.jpg]]. This is already valid syntax, but the part before the >> is currently just lost during parsing.

I’m trying to summarize the pro/contra arguments here so that we can reach a decision:

Advantages of the new caption syntax:

  • Clean wiki syntax
  • Clear difference to the existing figure macro such that we can be sure there won’t be something inside that is not an image and thus cannot be handled by CKEditor.
  • Would correspond to <figure><img ... /><figcaption>Caption text</figcaption></figure> in HTML5 using the existing events for figures and captions (but requires changes in the parser at least).
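
To make the intended correspondence concrete, here is a minimal sketch that turns the simplest form of the proposed syntax into the HTML5 structure named above. It is not the XWiki parser: the function names are hypothetical, the caption is treated as plain text, and the image reference is used directly as the src value, whereas a real renderer would resolve it to a URL.

```javascript
// Hypothetical helper, not part of XWiki: escapes text for use in HTML.
function escapeHtml(text) {
  return text
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Hypothetical helper: handles only [[caption>>image:reference]] with a
// plain-text caption and returns null for anything else.
function captionedImageToHtml(wikiSyntax) {
  const match = /^\[\[(.+?)>>image:(.+?)\]\]$/.exec(wikiSyntax.trim());
  if (!match) {
    return null;
  }
  const [, caption, reference] = match;
  return '<figure><img src="' + escapeHtml(reference) + '" />' +
    '<figcaption>' + escapeHtml(caption) + '</figcaption></figure>';
}

console.log(captionedImageToHtml('[[Figure Caption>>image:some_image.jpg]]'));
// → <figure><img src="some_image.jpg" /><figcaption>Figure Caption</figcaption></figure>
```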

Disadvantages of the new caption syntax:

  • No equivalent in other supported syntaxes including XHTML 1.0
  • Existing users need to migrate (manually?)
  • The figure macro would still be needed unless we also implement something similar for tables and thus we would now have two ways to express the same thing (+ it should be kept for backwards compatibility I assume)
  • The image syntax is no longer always an inline element which might complicate rendering (?)

To support the WYSIWYG editor, we need to change CKEditor to use HTML5, which is partially supported by CKEditor, as the figure tag doesn’t exist in XHTML 1.0. From my understanding, this requires in particular a new HTML5 parser based on the current XHTML parser. As CKEditor basically produces extended XHTML, it is not strictly required to be able to parse HTML that is not valid XML. We already have an HTML cleaner that could also be extended for HTML5 if this is desired, though.

An alternative proposal would be to keep the existing macros and just modify the WYSIWYG editor experience by hiding the figure and figureCaption macros from the editor and instead relying on the built-in support of <figure>/<figcaption> HTML5 tags.

Advantages of keeping the figure macro:

  • Not two ways to achieve the same
  • Easy migration for existing users
  • Less adaptation needed for existing parsers
  • Supports all syntaxes that support macros

Disadvantages of keeping the figure macro:

  • Less clean syntax
  • Requires “hiding” the macro in the HTML for the WYSIWYG editor, I’m not sure if this is easily possible

For this alternative we would still need HTML 5 output for CKEditor as described above.

For tables, the story is unfortunately a bit more complicated: while CKEditor 4 supports table captions, it does so using the caption tag inside the table. Quite annoyingly, CKEditor 5 instead uses the <figure> and <figcaption> tags, which matches the HTML5 output the macro currently generates, see here for their editor recommendation. In XWiki syntax, we could use a caption attribute to store the content of the caption element inside the table. This wouldn’t support wiki syntax inside the caption, though, while HTML5 supports other elements inside the <caption>-tag. Similar to images, we could also keep the existing macro and maybe transform it into the HTML CKEditor 4 expects, but this would then be different from the current HTML output, which is unfortunate as, should we migrate to CKEditor 5 one day, we would need to change it back.

If you have more arguments for this decision or also just an opinion, I would be happy to hear this. Thank you!

No, that’s not the reason, at least not from CKEditor’s point of view. The editor should have no problem handling image captions using <figure>/<figcaption> tags. The problem is that these tags have no counterpart in wiki syntax, so they are simply lost when converting the HTML to wiki syntax. Thus we need two things:

  1. make sure the caption information ends up in the XDOM when the HTML is parsed (which probably means extending the existing HTML parser or using another HTML5 one)
  2. make sure the caption information ends up in the wiki syntax when the XDOM is rendered (which probably means modifying the XWiki 2.1 syntax renderer)

Which is what you explained below.
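
The two steps listed above can be sketched as a round trip through a tiny tree. This is only an illustration under strong assumptions: the object shapes are a stand-in for the XDOM, the function names are hypothetical, the regular expression accepts just one fixed figure layout with a plain-text caption, and the output uses the proposed caption syntax rather than anything the current XWiki 2.1 renderer produces.

```javascript
// Step 1 (hypothetical): parse a figure into a tiny XDOM-like tree so that
// the caption is not dropped. Returns null for any other markup.
function parseFigure(html) {
  const pattern = /^<figure>\s*<img src="([^"]*)"\s*\/?>\s*(?:<figcaption>([\s\S]*?)<\/figcaption>)?\s*<\/figure>$/;
  const match = pattern.exec(html.trim());
  if (!match) {
    return null;
  }
  const children = [{type: 'image', reference: match[1]}];
  if (match[2] !== undefined) {
    children.push({type: 'figureCaption', text: match[2]});
  }
  return {type: 'figure', children: children};
}

// Step 2 (hypothetical): render the tree using the proposed caption syntax.
function renderFigure(figure) {
  const image = figure.children.find(function(block) {
    return block.type === 'image';
  });
  const caption = figure.children.find(function(block) {
    return block.type === 'figureCaption';
  });
  const reference = 'image:' + image.reference;
  return caption ? '[[' + caption.text + '>>' + reference + ']]' : '[[' + reference + ']]';
}

const tree = parseFigure('<figure><img src="image.png" /><figcaption>Caption text</figcaption></figure>');
console.log(renderFigure(tree));
// → [[Caption text>>image:image.png]]
```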

I don’t see how we could do this without hard-coding the figure macro into the CKEditor image feature, which I don’t like. Basically the image CKEditor plugin would have to:

  • on load: look for figure macro calls and replace them with images with captions
  • on save: replace images with captions with figure macro calls

But what happens with the other content inside the figure macro that is not an image in this case?.. I think I prefer the native syntax solution.
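
The "on load" half of that replacement, and the problem with other content, can be sketched as follows. This is a hypothetical helper, not CKEditor or XWiki code, and it assumes the macro content is written as an [[image:...]] reference optionally followed by a {{figureCaption}} call. Anything else, for example a table, yields null and would have to stay a regular figure macro.

```javascript
// Hypothetical "on load" check: returns the image reference and caption when
// the figure macro content is exactly one image plus an optional caption,
// and null when the macro contains anything else.
function figureMacroToCaptionedImage(macroContent) {
  const pattern = /^\s*\[\[image:([^\]]+)\]\]\s*(?:\{\{figureCaption\}\}([\s\S]*?)\{\{\/figureCaption\}\})?\s*$/;
  const match = pattern.exec(macroContent);
  if (!match) {
    return null;
  }
  return {
    reference: match[1],
    caption: match[2] === undefined ? null : match[2].trim()
  };
}

const imageOnly = '[[image:image.png]]\n\n{{figureCaption}}Caption{{/figureCaption}}';
console.log(figureMacroToCaptionedImage(imageOnly));

const withTable = '|=A|=B\n|1|2\n\n{{figureCaption}}Table caption{{/figureCaption}}';
console.log(figureMacroToCaptionedImage(withTable)); // null: not just an image
```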

Okay, but XWiki also shouldn’t produce invalid XHTML 1.0, which it would if it produced <figure>/<figcaption> tags in XHTML 1.0 (regardless of what the input syntax is), and it also seems strange to expect these tags in an XHTML 1.0 parser. So I hope we agree that we need to switch to HTML5 for CKEditor, or do you really suggest letting the XHTML renderer produce invalid <figure>/<figcaption> tags?

My idea was to do this transformation on the server side, not in CKEditor. The HTML5 renderer already produces figure/figcaption-tags for the figure macro, so we would just need to modify the output such that it is not recognized as a macro, and the parser would in turn need to translate the figure/figcaption-tags into the macro. As I have already said, I am not sure if this is possible/a good idea as I don’t know the parser/renderer code well enough.

The idea for me was to keep the figure macro a macro in this case, i.e., remove the macro information only if the content consists only of a figure and (possibly) a caption. In CKEditor 5, there is support for tables inside figure so we could then also upgrade tables with captions to native CKEditor support.

Not just that but any wiki content supported by the HTML figure tag.

We wouldn’t drop the figure/figureCaption macros so there’s an equivalent (the macros) and no need to migrate if users don’t need/want to. It’s likely that we would just move these 2 macros outside of platform and into a contrib extension.

It should always be inline AFAIK.

I agree. Now CKEditor 4 doesn’t fully support HTML5 but it should support well the HTML5 generated from XWiki syntax.

But then the figure macro will be hard-coded in the HTML5 parser and renderer. I mean, they will be aware of the existence of a figure macro. It’s not a binary dependency, but still it doesn’t feel right.

In the meantime, I created a draft implementation of most parts for supporting the native syntax locally and started opening pull requests for the first parts of that implementation. While testing with the id macro for Pull Request #191 in xwiki/xwiki-rendering (XRENDERING-629: The figure and figure caption should not be part of the editable content), I found that it is important to support the id macro inside the caption (otherwise you cannot reference the figure using the reference macro). However, the native image widget in CKEditor does not support <span>-tags (see here for the allowed tags), and from testing with my actual implementation I can confirm that it does not work: there is no way to insert the id macro from the toolbar when inside the caption, and the macro widget is also not displayed when the id macro is present in the native syntax. I experimented with changing the CKEditor source to allow <span> (and <p> and <div>), but the widget is still not displayed, the macro metadata seems to be lost and the content macro:id is editable (which shouldn’t be the case). While I still think it is a good idea to switch to HTML5, I’m getting the impression that using the native caption support in CKEditor doesn’t really help us.

@mflorea As you have more insights into CKEditor, maybe you can provide some guidance on how to make it possible to use (inline) macros inside the caption element of the image2 plugin, if you still think this is the solution we should choose. Alternatively, I guess, we could adjust the image dialog to instead trigger the insertion/removal of the figure/figcaption macros.

AFAIK we’re going to drop using this CKEditor plugin and have our own one to implement Improvements for images in XWiki

But maybe we should first integrate the caption with it and then move to the new plugin later on, as they’re not on the exact same timeframe.

@MichaelHamann I managed to make it work with:

require(['deferred!ckeditor'], function(ckeditorPromise) {
  ckeditorPromise.done(function(ckeditor) {
    ckeditor.on('instanceCreated', function(event) {
      event.editor.once('configLoaded', function(event) {
        // Both configurations should go in the CKEditor.EditSheet page (JSX).
        // 1. Allow figure and figcaption in order to enable the image caption feature.
        // (The beginning of this statement is missing; only its tail survives.)
        //   ….$1.elements, {
        //     figure: true,
        //     figcaption: true
        //   }, true);
        // 2. Configure the content allowed inside an image caption.
        // The allowed content depends on the source content syntax. See allowedContentBySyntax.
        // Assuming for now that the XWiki 2.1 syntax allows only inline content inside the image caption.
        event.editor.config['xwiki-image'].captionAllowedContent = {
          '$1': {
            elements: {
              // Formatting
              span: true, strong: true, em: true, ins: true, del: true, sub: true, sup: true, tt: true, pre: true,
              // Others
              a: true, img: true,
              // The requiredContent configuration for the xwiki-macro widget specifies both span (for inline macros)
              // and div (for block macros). We need to allow the div in order to be able to insert inline macros.
              div: true
            },
            // The elements above can have any attribute, through the parameter (%%) syntax.
            attributes: '*',
            styles: '*',
            classes: '*'
          },
          '$2': {
            // The XWiki syntax doesn't support parameters for the following inline elements.
            elements: {br: true}
          },
          '$3': {
            // Wiki syntax macros can output any HTML.
            match: CKEDITOR.plugins.xwikiMacro.isMacroOutput,
            attributes: '*',
            styles: '*',
            classes: '*'
          }
        };
      });

      // This should go in xwiki-image CKEditor plugin.
      event.editor.on('widgetDefinition', function(event) {
        // event.data is the widget definition being registered.
        if (event.data.name === 'image') {
          // Adds support for configuring the allowed content for image caption (currently hard-coded).
          event.data.editables.caption.allowedContent = event.editor.config['xwiki-image'].captionAllowedContent;
        }
      });
    });
  });
});

I don’t think we’re going to drop the image plugin entirely. We’re going to implement our own dialog to insert and edit images, but we probably want to reuse some of the code from the image plugin, especially the image widget (what you see and interact with inside the edit area), which @MichaelHamann refers to in his comment.

Thank you very much, I can confirm that this seems to work, though I had to add the <p>-tag to the allowed tags to prevent CKEditor from always wrapping the content inside a <div> (it then always wraps it inside a <p>-tag, but that is probably okay, see also the related discussion in PR#191 in xwiki-rendering).
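
As a rough illustration of that adjustment, the sketch below adds p to the allowed elements. The object is only a reduced stand-in for event.editor.config['xwiki-image'].captionAllowedContent from the configuration snippet earlier in the thread, so it can be run outside of CKEditor.

```javascript
// Reduced stand-in for the captionAllowedContent configuration (assumption:
// same shape as in the snippet earlier in the thread, fewer elements).
const captionAllowedContent = {
  '$1': {
    elements: {span: true, strong: true, em: true, a: true, img: true, div: true},
    attributes: '*',
    styles: '*',
    classes: '*'
  }
};

// Allow paragraphs as well, so that the editor can wrap the caption content
// in a <p> instead of a <div>.
captionAllowedContent.$1.elements.p = true;

console.log(Object.keys(captionAllowedContent.$1.elements).join(' '));
// → span strong em a img div p
```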

I have found another issue with the proposed syntax: As far as I can see, it doesn’t work for links inside captions. While we can forbid them in CKEditor, this seems like a significant restriction. What I’m wondering is if we could alternatively render something like the following syntax into an HTML5 figure:

(% class="figure image" %)


(% class="figcaption" %)
Caption content

Note that the image class here is only required for CKEditor to recognize this as an image. We could of course still also use the syntax [[Caption content>>image:image.png]] whenever possible, but I have the impression that checking whether this is possible will be quite complex and possibly error-prone (I already have some of these checks but they are definitely not complete yet). Alternatively, this syntax could also just be supported for parsing such that users can write it, but it would be converted to the above-mentioned syntax when you use the WYSIWYG editor.

If you’re suggesting to officially include the following in the syntax documentation then I’m not in favor.

For several reasons:

  • complex
  • introduces new class parameters that suddenly become APIs

For users, it seems much nicer to be able to write:

[[Caption content>>image:image.png]] 

where caption content could include links as in:

[[Hello [[world>>]]]]>>image:image.png]] 

This probably requires some wikimodel parsing changes but it should be doable.

Somewhat related JIRA: [XRENDERING-2] XHTML renderer should protect itself from link inside link from the XDOM - JIRA

I assume you mean [[Hello [[world>>]]>>image:image.png]] ?

This syntax will be difficult to parse as currently, a prefix of the proposed syntax - [[Hello [[world>>]] - is parsed as a link with label Hello [[world from what my tests show. In rendering, we then escape this to [[Hello ~~[~~[world>>]], though.

Regarding XRENDERING-2: Nested links are still not allowed in HTML5, see the documentation of the a-element so I do not think it makes sense to add better support for them.

We have an explicit test that the syntax [[[[>>reference]] is a link to reference with label [[. I do not see how we can implement nested link support without breaking the use of [[ in link labels, so this is definitely a breaking change. To make the parsing less error-prone, I would suggest removing support for [[ in link labels. Is that what you had in mind?
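
The conflict can be illustrated with a small nesting-aware scanner. This is a hypothetical sketch, not the WikiModel parser: it treats every [[ inside the label as the start of a nested link and splits at the first >> found outside of any nested link. Under that rule the nested-link example parses, while the currently tested input [[[[>>reference]] no longer does.

```javascript
// Hypothetical scanner: splits "[[label>>reference]]" at the top-level ">>",
// counting "[[" and "]]" inside the label as nested link delimiters.
function splitNestedLink(source) {
  if (!source.startsWith('[[') || !source.endsWith(']]')) {
    return null;
  }
  const inner = source.slice(2, -2);
  let depth = 0;
  for (let i = 0; i < inner.length - 1; i++) {
    const pair = inner.slice(i, i + 2);
    if (pair === '[[') {
      depth++;
      i++; // Skip the second character of the delimiter.
    } else if (pair === ']]') {
      depth--;
      i++;
      if (depth < 0) {
        return null; // Closing delimiter without a matching opening one.
      }
    } else if (pair === '>>' && depth === 0) {
      return {label: inner.slice(0, i), reference: inner.slice(i + 2)};
    }
  }
  return null; // No top-level separator, e.g. an unclosed nested "[[".
}

console.log(splitNestedLink('[[Hello [[world>>]]>>image:image.png]]'));
// → { label: 'Hello [[world>>]]', reference: 'image:image.png' }
console.log(splitNestedLink('[[[[>>reference]]'));
// → null
```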

With the solution by @mflorea, when we allow macros in figure captions (to allow the id macro), we also allow non-inline content (in particular standalone macros). From an HTML5 point of view this is okay, but if we want to convert this to an image reference syntax, this is a problem as we would suddenly have non-inline content in it. I can see several solutions of different complexity:

  • Create a way to allow just inline content in CKEditor (i.e., just allow inline macros, not standalone macros).
  • Strip the non-inline content during the conversion to XWiki syntax (not very user-friendly).
  • Extend the syntax even more to allow non-inline content in image references (or even references in general, this is valid HTML5). My proposal here would be to then always require a wrapping group syntax directly inside the reference syntax, like [[((( .... )))>>image:reference.png]].
  • Use a different syntax for storing captioned images either just when there is non-inline content or in general.
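
A minimal sketch of how a serializer could choose between the plain form and the group-wrapped form from the third option. Everything here is hypothetical: the function names, the list of elements treated as block-level, and the wrapped output format, which simply follows the [[((( .... )))>>image:reference.png]] shape proposed above.

```javascript
// Assumption: a rough list of block-level elements, only for this illustration.
const BLOCK_ELEMENTS = new Set(['div', 'p', 'table', 'ul', 'ol', 'dl', 'blockquote', 'hr',
  'h1', 'h2', 'h3', 'h4', 'h5', 'h6']);

// Hypothetical check: does the caption HTML contain any block-level element?
function hasBlockContent(captionHtml) {
  const tagPattern = /<([a-zA-Z][a-zA-Z0-9]*)\b/g;
  let match;
  while ((match = tagPattern.exec(captionHtml)) !== null) {
    if (BLOCK_ELEMENTS.has(match[1].toLowerCase())) {
      return true;
    }
  }
  return false;
}

// Hypothetical serializer: wraps the caption in a group only when needed.
function serializeCaptionedImage(captionWikiSyntax, captionHtml, reference) {
  const label = hasBlockContent(captionHtml)
    ? '((( ' + captionWikiSyntax + ' )))'
    : captionWikiSyntax;
  return '[[' + label + '>>image:' + reference + ']]';
}

console.log(serializeCaptionedImage('**Caption**', '<strong>Caption</strong>', 'reference.png'));
// → [[**Caption**>>image:reference.png]]
console.log(serializeCaptionedImage('{{info}}Note{{/info}}', '<div class="box">Note</div>', 'reference.png'));
// → [[((( {{info}}Note{{/info}} )))>>image:reference.png]]
```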

Not sure I understand. Are you saying that the following is valid in HTML5:

<p><a href=""><div><p>test</p></div></a></p>


No, this is not valid, but

<a href=""><div><p>test</p></div></a>

is valid. The content model of the <a>-element is transparent. In the examples at the end of the documentation of the <a>-element there are two further examples that demonstrate this.

ok I see thanks, I understand now (the <a> can be “standalone” in HTML5).

I think I like the new syntax for standalone image/links.