UTF-8 issues when using the Confluence XML tool

Hi Team,

I’m getting an encoding error while using the Confluence XML filter stream.
I was able to import some spaces from Confluence, but this one in particular has attachments with documents. Any clue about this issue?

Configuration:
(screenshot of the configuration)

The error:

Invalid UTF-8 middle byte 0xdd (at char #61, byte #-1)
class org.xwiki.filter.FilterException: Faild to parse XML source
    at org.xwiki.filter.xml.internal.input.AbstractXMLInputFilterStream.read(AbstractXMLInputFilterStream.java:64)
    at org.xwiki.filter.internal.job.FilterStreamConverterJob.runInternal(FilterStreamConverterJob.java:97)
    at org.xwiki.job.AbstractJob.runInContext(AbstractJob.java:246)
    at org.xwiki.job.AbstractJob.run(AbstractJob.java:223)
    at org.xwiki.filter.script.internal.ScriptFilterStreamConverterJob.run(ScriptFilterStreamConverterJob.java:75)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    at java.base/java.lang.Thread.run(Unknown Source)
Caused by: class com.ctc.wstx.exc.WstxIOException: Invalid UTF-8 middle byte 0xdd (at char #61, byte #-1)
    at com.ctc.wstx.sr.StreamScanner.constructFromIOE(StreamScanner.java:653)
    at com.ctc.wstx.sr.StreamScanner.loadMore(StreamScanner.java:1017)
    at com.ctc.wstx.sr.StreamScanner.getNext(StreamScanner.java:770)
    at com.ctc.wstx.sr.BasicStreamReader.nextFromTree(BasicStreamReader.java:2804)
    at com.ctc.wstx.sr.BasicStreamReader.next(BasicStreamReader.java:1122)
    at com.ctc.wstx.evt.WstxEventReader.nextEvent(WstxEventReader.java:283)
    at javanet.staxutils.BaseXMLEventWriter.add(BaseXMLEventWriter.java:527)
    at javanet.staxutils.XMLStreamUtils.copy(XMLStreamUtils.java:383)
    at org.xwiki.filter.xml.internal.input.AbstractXMLInputFilterStream.read(AbstractXMLInputFilterStream.java:62)
    at org.xwiki.filter.internal.job.FilterStreamConverterJob.runInternal(FilterStreamConverterJob.java:97)
    at org.xwiki.job.AbstractJob.runInContext(AbstractJob.java:246)
    at org.xwiki.job.AbstractJob.run(AbstractJob.java:223)
    at org.xwiki.filter.script.internal.ScriptFilterStreamConverterJob.run(ScriptFilterStreamConverterJob.java:75)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    at java.base/java.lang.Thread.run(Unknown Source)
Caused by: class java.io.CharConversionException: Invalid UTF-8 middle byte 0xdd (at char #61, byte #-1)
    at com.ctc.wstx.io.UTF8Reader.reportInvalidOther(UTF8Reader.java:318)
    at com.ctc.wstx.io.UTF8Reader.read(UTF8Reader.java:215)
    at com.ctc.wstx.io.ReaderSource.readInto(ReaderSource.java:88)
    at com.ctc.wstx.io.BranchingReaderSource.readInto(BranchingReaderSource.java:57)
    at com.ctc.wstx.sr.StreamScanner.loadMore(StreamScanner.java:1011)
    at com.ctc.wstx.sr.StreamScanner.getNext(StreamScanner.java:770)
    at com.ctc.wstx.sr.BasicStreamReader.nextFromTree(BasicStreamReader.java:2804)
    at com.ctc.wstx.sr.BasicStreamReader.next(BasicStreamReader.java:1122)
    at com.ctc.wstx.evt.WstxEventReader.nextEvent(WstxEventReader.java:283)
    at javanet.staxutils.BaseXMLEventWriter.add(BaseXMLEventWriter.java:527)
    at javanet.staxutils.XMLStreamUtils.copy(XMLStreamUtils.java:383)
    at org.xwiki.filter.xml.internal.input.AbstractXMLInputFilterStream.read(AbstractXMLInputFilterStream.java:62)
    at org.xwiki.filter.internal.job.FilterStreamConverterJob.runInternal(FilterStreamConverterJob.java:97)
    at org.xwiki.job.AbstractJob.runInContext(AbstractJob.java:246)
    at org.xwiki.job.AbstractJob.run(AbstractJob.java:223)
    at org.xwiki.filter.script.internal.ScriptFilterStreamConverterJob.run(ScriptFilterStreamConverterJob.java:75)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    at java.base/java.lang.Thread.run(Unknown Source)

Thanks in advance

According to the error, it seems the Confluence entities.xml file is invalid. Either that or there is a bug in the XML parser we are using (Woodstox), but I doubt it. The error seems to refer to character 61, but I’m not sure how accurate that is (maybe it’s character 61 of some element and not of the whole file).
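
Since the position reported by the parser may not be an offset into the whole file, one way to find where the bad bytes actually are is to scan the raw bytes of entities.xml directly. Below is a minimal Python sketch of that idea; `find_invalid_utf8` is a hypothetical helper written for illustration, not part of XWiki or Woodstox, and it assumes the file is small enough to read into memory.

```python
def find_invalid_utf8(data: bytes, limit: int = 10) -> list[tuple[int, bytes]]:
    """Return (byte offset, offending bytes) for up to `limit` invalid UTF-8 sequences."""
    problems = []
    pos = 0
    while pos < len(data) and len(problems) < limit:
        try:
            data[pos:].decode("utf-8")
            break  # everything from pos onwards decodes cleanly
        except UnicodeDecodeError as err:
            # err.start / err.end are relative to the slice we decoded
            start = pos + err.start
            end = pos + err.end
            problems.append((start, data[start:end]))
            pos = end  # resume scanning right after the bad bytes
    return problems
```

It could be called as `find_invalid_utf8(open("entities.xml", "rb").read())`; the returned offsets are byte positions in the whole file, which can then be inspected with a hex viewer.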

Hi @tmortagne,
Thanks.
I had to do some cleanup in the entities.xml file to handle the encoding issues.
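
For anyone hitting the same error, here is a rough Python sketch of one possible cleanup, not necessarily what was done here: it copies the file and replaces every invalid UTF-8 sequence with the U+FFFD replacement character, so the XML parser can at least read it. The function name and file paths are assumptions for illustration. Note that this does not recover the original characters; if the bad bytes come from another encoding, decoding those parts with that encoding would be a better fix.

```python
import codecs


def rewrite_as_valid_utf8(src_path: str, dst_path: str, chunk_size: int = 1 << 20) -> None:
    """Copy src_path to dst_path, replacing invalid UTF-8 sequences with U+FFFD."""
    # An incremental decoder copes with multi-byte characters split across chunks.
    decoder = codecs.getincrementaldecoder("utf-8")(errors="replace")
    with open(src_path, "rb") as src, \
            open(dst_path, "w", encoding="utf-8", newline="") as dst:
        while chunk := src.read(chunk_size):
            dst.write(decoder.decode(chunk))
        # Flush anything left over, e.g. a truncated sequence at the end of the file.
        dst.write(decoder.decode(b"", final=True))
```

Writing to a separate file keeps the original export untouched, so the replaced positions can still be compared against the source afterwards.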

One more question.

I have a working instance of XWiki in production, but I’m doing this on a fresh installation for test purposes.

When I try to configure the navigation panel, it goes into an infinite loop and shows a blank page.

(screenshot)

Any clue?
Thanks in advance

Not without more information (like what exactly your browser receives in the network tab).

But it would probably be better to create a new thread for this problem, as people might miss it because it doesn’t have much to do with the current one.