Using German umlauts in a text anchor

Hi,

First of all, I am running XWiki version 12.10.6.
I write my documentation in German, which is why I need umlauts in certain cases.
I want to add a text anchor so that I can link to it from other places in the document or even from completely different pages. I am using headings for this because they automatically create an anchor.
When the heading / anchor contains a German umlaut, the anchor does not seem to work properly.
In this case I wanted to write the word “ausgewählt” and then add a link to the anchor, but this didn’t work.
When I replaced “ä” with “ae”, it worked.
When I tried to replace the “ä” with an HTML-encoded string, it didn’t work either.

Am I doing something wrong with umlauts, or is there a problem with anchors in this version?

MaDo, although I’m a German speaker, we (as an IT company) write all our documentation in English. Anyway, one option that comes to mind: if you add a TOC (table of contents) to such a document, I assume clicking the link in the TOC will work (automatic anchor). If so, you can inspect the URL to see what the anchor is. If the TOC does not work, I suggest creating a Jira issue.

If you wish to use non-ASCII characters in your URL, you normally need to URL-encode it so that those characters are properly encoded.

I take that back. I tested it, and it worked just fine. I guess modern browsers have caught up and now allow UTF-8 URLs. Sweet.

I went to my XWiki sandbox page and added this code in the XWiki source editor:

= Anchors - ascii =

{{id name="ae"/}}ae

= Anchors - non ascii =

{{id name="ä"/}}ä

And then I accessed the page via different URLs (I wanted a long page just to make sure that the URL jumped to the right marker):

https://example.com/xwiki/bin/view/Sandbox/#ae
https://example.com/xwiki/bin/view/Sandbox/#ä

and it worked fine.

If you don’t like using the code editor, you can use the ID macro to add your anchor.

Lastly, if you want to add the anchor to the heading directly, just place the Id macro at the beginning of the header you wish to change.

It even works with Chinese. In my test, the Id macro was added just before the S in “Style”:
https://example.com/xwiki/bin/view/Sandbox/#这个

I hope that helps.


Final observation:

When I cut and pasted the URL into another window, the UTF-8 characters were automatically URL-encoded.
E.g.
https://example.com/xwiki/bin/view/Sandbox/#这个
became
https://example.com/xwiki/bin/view/Sandbox/#%E8%BF%99%E4%B8%AA
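The encoded form above is ordinary percent-encoding of the UTF-8 bytes of the anchor. As a quick check outside XWiki, here is a minimal Python sketch using only the standard library; the `anchor_url` helper name is made up for illustration.

```python
from urllib.parse import quote, unquote

BASE = "https://example.com/xwiki/bin/view/Sandbox/"

def anchor_url(base: str, anchor: str) -> str:
    """Return base + '#' + the anchor, percent-encoded from its UTF-8 bytes."""
    return base + "#" + quote(anchor, safe="")

encoded = anchor_url(BASE, "这个")
print(encoded)           # the percent-encoded URL
print(unquote(encoded))  # decoding gives back the readable form
```

Both forms point at the same anchor; the browser just shows the decoded one in the address bar.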

Thanks for your input @Beat_Burgener and @pdwalker.
Unfortunately I am not that familiar with XWiki yet, and I don’t even know how to access the XWiki source editor or how to use an Id macro in XWiki.

But that’s of course another problem. :stuck_out_tongue:

I will try to figure it out.

The visual editor makes it a piece of cake!

Edit your page and move your cursor to where you want your anchor to be (in my example, just before the H in “Headings”).

Next, look for the + in your edit toolbar and select it. From the menu, select “Other Macros”.

Type in ‘id’ to find the Id macro, select it, and click “Select”.

Enter the text for your anchor in the Name field and then click “Submit”.

When you return to the editing page, you will see the Id macro displayed as a little box at that position (assuming you’re using the visual editor).

If you need to edit the properties of the Id macro, just double click on that little box.

Now you can use that Id to jump to a particular spot in the page, for example by appending # and the Id name to the page URL, as in the Sandbox URLs earlier in this thread.

As for specifying the targets when creating links inside XWiki, that’s an exercise left to the reader. :wink:

I think the creator of this topic means the anchors that XWiki creates automatically for all headings, not the ones created with the macro.

For example, the heading “Überschrift 2” has this automatically generated anchor:

<h1 id="HDCberschrift2" class="wikigeneratedid"><span>Überschrift 2</span></h1>

I get the same problem with all German umlauts and French accented characters.

<h1 id="HE4C4F6D6FCDCE9E8" class="wikigeneratedid"><span>äÄöÖüÜéè</span></h1>

I just gave it a test. What XWiki does with these characters is convert them into the hexadecimal form of their Unicode code point. In this case Ü is Unicode U+00DC, or “DC”, which you can see in the id string.

So if you have a header with Überschrift and try to use the anchor of #HÜberschrift, it will fail, but #HDCberschrift will work.

If you specifically want the Ü character as part of the anchor, then you have to add the anchor by hand using the Id macro.
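Going by the two generated ids quoted in this thread, the rule can be approximated as: prefix `H`, drop whitespace, and replace each non-ASCII character with its code point in uppercase hex. The Python sketch below is only that approximation, not XWiki's actual generator. The function name is made up, and how ASCII punctuation or characters above U+00FF are treated is not shown in the thread, so that part is an assumption.

```python
def approximate_heading_id(heading: str, prefix: str = "H") -> str:
    """Approximate the automatic heading ids shown in this thread.

    Assumption: ASCII characters are kept unchanged; the thread only
    demonstrates the behaviour for letters, digits and Latin-1 accents.
    """
    parts = []
    for ch in heading:
        if ch.isspace():
            continue                              # whitespace is dropped
        if ch.isascii():
            parts.append(ch)                      # ASCII kept (assumption for punctuation)
        else:
            parts.append(format(ord(ch), "X"))    # e.g. "Ü" (U+00DC) becomes "DC"
    return prefix + "".join(parts)

print(approximate_heading_id("Überschrift 2"))
print(approximate_heading_id("äÄöÖüÜéè"))
```

For the two headings from this thread it gives the same ids as the generated HTML, which makes it handy for guessing the anchor to link to. When in doubt, inspect the page source as shown above.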

You’re right, using the anchor macro I can create anchors with those special chars.

But is the different behavior working as expected, or is it just a bug?

This is working as expected, as the feature was designed for HTML 4/XHTML 1.0, where ids must not contain any non-ASCII characters (see the specification for the details). In HTML 5, no such restriction exists; the only requirements there are that the id contains at least one character and no ASCII whitespace.
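To make the difference concrete, here is a small Python sketch of the two rules. The HTML 4 pattern follows the ID token definition (a letter followed by letters, digits, hyphens, underscores, colons or periods). The function names are made up for illustration.

```python
import re

# HTML 4 ID token: a letter, then letters, digits, '-', '_', ':' or '.'.
HTML4_ID = re.compile(r"[A-Za-z][A-Za-z0-9\-_:.]*")
# HTML 5 "ASCII whitespace": tab, line feed, form feed, carriage return, space.
ASCII_WHITESPACE = set("\t\n\f\r ")

def is_valid_html4_id(value: str) -> bool:
    return HTML4_ID.fullmatch(value) is not None

def is_valid_html5_id(value: str) -> bool:
    return len(value) > 0 and not any(ch in ASCII_WHITESPACE for ch in value)

for candidate in ["HDCberschrift2", "HÜberschrift2", "ä", "two words"]:
    print(candidate, is_valid_html4_id(candidate), is_valid_html5_id(candidate))
```

The generated id `HDCberschrift2` passes both checks, while an id containing Ü or ä only passes the HTML 5 one.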

We could discuss extending the allowed characters in the generator of automatic ids to in particular include all non-ASCII characters (other ASCII characters are problematic as most of them have some special meaning and would require escaping in many contexts). However, there is a huge problem with backwards-compatibility: If you have any links to the old ids, these links will break when the generator is updated. And there is no possibility to assign two ids to an element. It might be possible to design some automatic migration that tries updating links based on a mapping from old to new ids (by generating ids with both generators and comparing them) but this is far from trivial. It is certainly also possible to make this configurable and to default to the new implementation for new installations. Feel free to open a Jira issue for this improvement but unless somebody sponsors the development there is no guarantee this will be implemented.
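The migration idea (generate ids with both generators and compare them) could look roughly like the Python sketch below. Everything in it is hypothetical: `old_id` only approximates the ids seen in this thread, `new_id` is an invented generator that keeps non-ASCII characters, and real-world complications such as duplicate headings are ignored.

```python
def old_id(heading: str) -> str:
    """Approximation of the current ids seen in this thread (non-ASCII -> hex)."""
    return "H" + "".join(
        ch if ch.isascii() else format(ord(ch), "X")
        for ch in heading
        if not ch.isspace()
    )

def new_id(heading: str) -> str:
    """Hypothetical new generator: keep non-ASCII characters as they are."""
    return "H" + "".join(ch for ch in heading if not ch.isspace())

def build_id_mapping(headings):
    """Map each old id to its new id, skipping headings whose id is unchanged."""
    mapping = {}
    for heading in headings:
        before, after = old_id(heading), new_id(heading)
        if before != after:
            mapping[before] = after
    return mapping

def migrate_anchor(anchor: str, mapping: dict) -> str:
    """Rewrite an anchor used in an existing link; unknown anchors stay untouched."""
    return mapping.get(anchor, anchor)

mapping = build_id_mapping(["Überschrift 2", "Introduction"])
print(mapping)
print(migrate_anchor("HDCberschrift2", mapping))
```

Even in this toy form, the mapping only helps for links whose target page and headings are known, which hints at why a real migration is far from trivial.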

From this point of view, you’re totally right. :wink: