Force the usage of safe XML reader

Hi devs,

In order to ensure that we avoid XXE attacks, I’d like to propose that we set some rules (enforced at the build level), so that we fail the build when one of the following construct is used:

DocumentBuilderFactory.newInstance();
SAXParserFactory.newInstance();
XMLInputFactory.newInstance();
javax.xml.transform.TransformerFactory.newInstance()
SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
new SAXReader();
new SAXBuilder();

(this list comes from Sonarqube: Java static code analysis )

Instead, the idea would be to promote using a helper method from the XWiki Commons XML module (XMLUtils for example). The idea is that these helper methods will be configured to be safe against XXE attacks and thus developers won’t be able to forget to protect against XXE attacks. Similar to what we already have at xwiki-commons/xwiki-commons-core/xwiki-commons-xml/src/main/java/org/xwiki/xml/XMLUtils.java at a818817f33240c77ab07c5fcedd8b46e38305b09 · xwiki/xwiki-commons · GitHub

BTW I’m not very familiar with LSParser and I don’t know if we can use this for all our XML to parse or if there are some constraints to consider.

In any case, at this point, this post is about whether we’re ok with the idea?

Next step would be to propose an API in the XML module and a rule implementation (decide if it’s a maven enforcer, a checkstyle rule, a Spoon rule, etc).

WDYT?

Thanks

Another possibility is simply to make the sonarqube rule Java static code analysis a blocker rule that fails our build since we already have that concept.

This is much simpler to implement for the checking part. However, from a dev POV it would still mean duplicated code and it might still be good to offer some helper methods in the XWiki XML module.

+1

Sounds like a job for Spoon as it’s very similar to our current rule about forbidding to call File#deleteOnExit().

Note that I’ve been wanting to send a post about deciding which of the security rules from Sonarqube we should consider as blocker rules failing our build. This thread can be the start of this with rule Java static code analysis being the first such security rule.

See the following for other security-related rules:

hmm the good news is that the XML parser XXE attacks rule is already a blocker issue in our sonarcloud rules and thus should already fail our build: SonarCloud