We are currently preparing a system migration and trying to clean up our existing XWiki (v10.10) content before moving to a new platform.
Over the years, many attachments (especially images) have been left orphaned because they were deleted in the editor but never removed from the page’s attachments. Our goal is to:
Export a complete page/module (with all its children)
Include only valid (linked) attachments
Exclude unused/orphaned files
Get this as a clean .xar file that can be reimported
We’ve tried multiple approaches:
Manual .xar export via UI → doesn’t seem to include attachments (at least in v10.10)
REST API access → works partially, but doesn’t return child pages reliably or lists empty <pages/>
Script-based HTML crawling + attachment download → possible, but not ideal for clean reimport
Main question: Is there a reliable way (via UI, script, extension, or newer version) to export a .xar file that includes all children plus their used attachments, without pulling in orphaned files? Or, orphaned files aside, is there a way to export with attachments at all in an older version like ours?
Any advice from the community would be highly appreciated – especially if there’s an extension or export method we’ve missed.
Just to be clear: the old version runs on v10.10. That's a very old version. Are you sure exporting attachments was already included in that era? We're talking at least 5 years ago.
Ah… we're onto something here… that looks drastically different than what I am using. So please explain it to me, because I'm dumb: is that an extra admin extension which I can download, or should it be included in the standard 10.10?
Yes, that's the Admin UI export. If you use that, you're sure to get the attachments in the XAR, since that feature has existed since the beginning of XWiki.
Export everything and then reimport only what you want. Or unzip the XAR, remove the pages you don’t want (and update the package.xml file), and re-zip.
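The unzip, remove, and re-zip route can also be scripted. Below is a minimal Python sketch, not an official XWiki tool. It assumes that each page XML carries its reference either as a `reference` attribute on the root element or as `<web>` and `<name>` child elements, and that `package.xml` lists pages as `<file>` elements inside `<files>`; check both against your own export first. The names `page_reference`, `filter_xar`, and `drop_prefixes` are made up for this example.

```python
import io
import zipfile
import xml.etree.ElementTree as ET


def page_reference(xml_bytes):
    """Return the page reference stored in an exported page XML (assumed layout)."""
    root = ET.fromstring(xml_bytes)
    ref = root.get("reference")  # assumption: attribute on the root element
    if ref:
        return ref
    # Assumed fallback: <web> and <name> child elements.
    return f"{root.findtext('web')}.{root.findtext('name')}"


def filter_xar(xar_bytes, drop_prefixes):
    """Copy a XAR, dropping entries under drop_prefixes and their package.xml rows."""
    src = zipfile.ZipFile(io.BytesIO(xar_bytes))
    out_buf = io.BytesIO()
    dropped_refs = set()
    with zipfile.ZipFile(out_buf, "w", zipfile.ZIP_DEFLATED) as dst:
        for info in src.infolist():
            if info.filename == "package.xml":
                continue  # rewritten at the end
            data = src.read(info.filename)
            if info.filename.startswith(tuple(drop_prefixes)):
                if info.filename.endswith(".xml"):
                    dropped_refs.add(page_reference(data))
                continue  # skip the unwanted page
            dst.writestr(info, data)
        # Remove the dropped pages from the package listing.
        pkg = ET.fromstring(src.read("package.xml"))
        files = pkg.find("files")
        for entry in list(files):
            if entry.text in dropped_refs:
                files.remove(entry)
        dst.writestr(
            "package.xml",
            ET.tostring(pkg, encoding="utf-8", xml_declaration=True),
        )
    return out_buf.getvalue()
```

For example, `filter_xar(open("backup.xar", "rb").read(), ["Sandbox/"])` would return the bytes of a new XAR without the entries under `Sandbox/` (file name and prefix are placeholders). Try the result on a test instance before importing it anywhere important.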
OK, gotcha, that's what I did now. The whole zip is just 2 GB, while the application on the server is way above 25 GB. Also, I can't find anything in the folders. Is there a button I have to click? The "history" and "backup" options don't indicate that attachments are included. Also, what is the WebHome.xml, and why is it so big in every folder?
I renamed the whole backup file from .xar to .zip and tried to find the attachments somewhere, especially pictures. I can't find them, and given the overall backup size I assumed they were not included.
The original plan was to take the modules one by one, have ChatGPT clean them by comparing the links to the attachments, and then reimport them. ChatGPT was not able to find any attachments, so it assumed this is not a feature of this version.
Attachments are stored in the XAR; they are just a bit hidden, since you would probably expect them to be in a separate folder. To find them, open the page's XML file in any IDE and look for the attachment tag. Another hint that an exported page contains attachments is its size (e.g. my exported Sandbox page WebHome.xml is almost 5 MB).
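Instead of opening each file in an IDE, the check can be scripted. Here is a rough Python sketch that lists the attachments embedded in one exported page XML and flags those whose file name never appears in the page content. It assumes each attachment is an `<attachment>` element directly under the root, with a `<filename>` child and base64 data in a `<content>` child, and that the page text sits in the root's own `<content>` element; verify that against one of your files. The function names are made up for this example.

```python
import base64
import xml.etree.ElementTree as ET


def list_attachments(page_xml):
    """Return (filename, decoded size in bytes) for each embedded attachment."""
    root = ET.fromstring(page_xml)
    result = []
    for att in root.findall("attachment"):  # assumed element name
        name = att.findtext("filename")
        payload = att.findtext("content") or ""
        # b64decode ignores line breaks inside the encoded data by default.
        result.append((name, len(base64.b64decode(payload))))
    return result


def unreferenced_attachments(page_xml):
    """Heuristic: attachments whose file name is not mentioned in the page text."""
    root = ET.fromstring(page_xml)
    text = root.findtext("content") or ""
    return [name for name, _ in list_attachments(page_xml) if name not in text]
```

The second function is only a hint, not proof that a file is orphaned: an attachment can still be referenced from another page, from an object property, or with a URL-encoded name, so review the list before deleting anything.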
In an exported XAR file from XWiki (with history and attachments), you can verify that attachments are included by checking both the XML and the file structure.
What to check in the XML
Open the document.xml file inside the page’s directory and look for:
Hello again. Thanks for your answers so far. I was able to find out how and in which XML files all the data is stored. Together with my new friend Claude, I wrote a program to find and delete orphaned data.
So far so good. Where we still struggle a little is the included history data. Is there any way to export a XAR of a single page (not the whole wiki) without history data? ChatGPT told me about an extension that can delete history. Is that otherwise my only option?