Creating an offline HTML backup

Hey!

I’m currently looking at backing up all of our pages as HTML, but it appears I can only use the export URL whilst logged in. Is there any way I can automate this process?

I’m currently looking at this guide: http://www.xwiki.org/xwiki/bin/view/Documentation/UserGuide/Features/Exports#HHTMLExport

Any help would be really appreciated.

Thanks.

Hi, the HTML export is not a backup solution; there’s plenty of data loss with it. You should use the XAR export for backups, but read https://www.xwiki.org/xwiki/bin/view/Documentation/AdminGuide/Backup#HUsingtheXWikiExportfeature first, since even with that you’ll lose some data.

The best solution is to back up at the DB level and copy the permanent directory. See https://www.xwiki.org/xwiki/bin/view/Documentation/AdminGuide/Backup
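For example, something along these lines (a sketch only: the database name, credentials and permanent directory location are placeholders, and you’d ideally stop the wiki first so the dump and the directory copy stay consistent):

mysqldump -u xwiki -p xwiki > xwiki-backup.sql
tar czf xwiki-permdir.tar.gz /var/lib/xwiki/data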

Thanks

Thanks Vincent,

We currently back up our wiki using VMware snapshots, but we have a requirement to browse our documentation in a disaster recovery situation, ideally avoiding a restore process and getting direct access to the raw content.

Is this not possible with XWiki, then?

Regards,
Danny.

The short answer is no, since I don’t think we have a REST endpoint for doing an HTML export.

One idea might be to add a scheduler job that executes regularly and performs the export by calling the export API. The problem is that it won’t work, since the scheduler’s context will be missing things from context.request.

For fun, here’s an example of calling the export API from a wiki page:

{{groovy}}
import com.xpn.xwiki.export.html.*
import org.xwiki.model.reference.*

// Package a single page (Sandbox.WebHome) as an HTML export
HtmlPackager packager = new HtmlPackager()
packager.setName("TestVMA")
packager.setDescription("Test VMA")
packager.addPageReferences([new DocumentReference("xwiki", "Sandbox", "WebHome")])
// Writes the resulting zip to the HTTP response
packager.export(xcontext.getContext())
{{/groovy}}

Slightly related: See https://jira.xwiki.org/browse/XWIKI-9123 and the comments there.

Thank you! :smiley:
I think this is what we need. If I were to use your example script, where would the export be saved?

The zip is returned in the response. FYI the backend code does this:

        context.getResponse().setContentType("application/zip");
        context.getResponse().addHeader("Content-disposition",
            "attachment; filename=" + Util.encodeURI(this.name, context) + ".zip");

Sorry, this is the first time I’ve ever used Groovy scripts, so please bear with me. If I use the Scheduler application, can I specify where the file is output?

No, you can’t :slight_smile:

But here’s how to make it work:

  • Step 1: Create a wiki page (e.g. Main.Export, to match the URL in Step 2) containing:

{{groovy}}
import com.xpn.xwiki.export.html.*
import org.xwiki.model.reference.*

// Only run the export when explicitly requested via ?confirm=1
if (request.confirm == '1') {
    HtmlPackager packager = new HtmlPackager()
    packager.setName("TestVMA")
    packager.setDescription("Test VMA")
    packager.addPageReferences([new DocumentReference("xwiki", "Sandbox", "WebHome")])
    // Streams the resulting zip into the HTTP response
    packager.export(xcontext.getContext())
}
{{/groovy}}
  • Step 2: Use curl or wget to call it, for example:

curl http://localhost:8080/xwiki/bin/get/Main/Export/?confirm=1 -o VMA.zip

Thanks Vincent for all your help, but unfortunately that doesn’t seem to work with our setup; I just get a broken 1 KB zip file from the curl command.

Curl does let me use the export URLs I originally mentioned, though, along with username and password parameters. I have managed to download some of our pages, but I can’t download them all. I just need a better understanding of the parameters and I will figure out some way of doing it.
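For reference, here’s the sort of command I’ve been trying (a sketch only: the host, credentials and space name are placeholders, and I’m assuming the export action accepts basic auth and the pages wildcard described in the export guide — %25 is a URL-encoded ‘%’, matching every page in the space):

curl -u Admin:admin "http://localhost:8080/xwiki/bin/export/Sandbox/WebHome?format=html&name=SandboxExport&pages=Sandbox.%25&basicauth=1" -o SandboxExport.zip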

Hello @vmassol,

Does it export all the pages of Sandbox or only “WebHome”?

I think it is only WebHome, isn’t it?

If yes, how to export all the pages?

Best regards,

My example was only for Sandbox.WebHome. You’d need to adapt the script to your needs.
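For example, something like this should export the whole space (an untested sketch; it assumes the query and model script services are available to the script, and the XWQL space name is a placeholder):

{{groovy}}
import com.xpn.xwiki.export.html.*

HtmlPackager packager = new HtmlPackager()
packager.setName("SandboxExport")
packager.setDescription("All pages in the Sandbox space")
// Find every document in the Sandbox space via XWQL and resolve each
// full name (e.g. "Sandbox.SomePage") into a document reference
def references = services.query.xwql("where doc.space = 'Sandbox'").execute().collect {
    services.model.resolveDocument(it)
}
packager.addPageReferences(references)
// Streams the resulting zip into the HTTP response, as before
packager.export(xcontext.getContext())
{{/groovy}}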