Enabling XML parser validation

By default, the parser (that is, the DocumentBuilder instance) returned by the factory is non-validating. (Well-formedness errors show up as a parse exception either way.) This means a careless or malicious user could insert unexpected elements into the XML document and cause you all manner of grief. Of course, this is what DTDs and schemas are for: they define the legal structure of a given XML document. Turning on validation is simply a matter of calling setValidating(true) on the DocumentBuilderFactory instance before asking it for a new DocumentBuilder. Note that setValidating(true) requests DTD validation, so the input XML document must declare a DTD to validate against; validating against a W3C XML Schema takes additional factory configuration, such as a namespace-aware factory and a call to setSchema(...). With validation on, validity errors are reported to the parser's ErrorHandler when the DocumentBuilder parses the file into a Document.
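As a rough illustration of the factory call described above, here is a minimal sketch. The class name ValidatingParserSketch, the method parseWithValidation, and the sample "note" document with its inline DTD are all made up for this example; only the javax.xml.parsers calls come from the JDK.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

class ValidatingParserSketch {

    // Hypothetical sample document; the DTD is declared inline so the
    // parser has something to validate against.
    static final String SAMPLE_XML =
          "<?xml version=\"1.0\"?>\n"
        + "<!DOCTYPE note [\n"
        + "  <!ELEMENT note (to, body)>\n"
        + "  <!ELEMENT to (#PCDATA)>\n"
        + "  <!ELEMENT body (#PCDATA)>\n"
        + "]>\n"
        + "<note><to>Reader</to><body>Hello</body></note>";

    // Parses the given XML text with DTD validation switched on.
    static Document parseWithValidation(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            // Must be set before newDocumentBuilder() is called.
            factory.setValidating(true);
            DocumentBuilder builder = factory.newDocumentBuilder();
            return builder.parse(new InputSource(new StringReader(xml)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            // Wrapped in an unchecked exception only to keep the sketch short.
            throw new IllegalStateException("Could not parse document", e);
        }
    }

    public static void main(String[] args) {
        Document doc = parseWithValidation(SAMPLE_XML);
        System.out.println(doc.getDocumentElement().getTagName());
    }
}
```

The important detail is the ordering: the factory is configured first, and every DocumentBuilder it hands out afterwards inherits the validating setting.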

Note: You can control how these errors are handled by creating a custom org.xml.sax.ErrorHandler implementation and registering it with your DocumentBuilder instance (via setErrorHandler) before it parses your file. The handler receives each validation problem as a SAXParseException, and you can do whatever you want with it, although most likely you will wrap it in a custom exception and re-throw it, to be processed elsewhere in your parsing logic. If you omit this step you are given a default ErrorHandler implementation, but it adds an extra line of output warning you that no handler was set.
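One way the wrap-and-re-throw approach from the note might look is sketched below. StrictHandlerSketch, StrictHandler, InvalidDocumentException, isValid, and both sample documents are hypothetical names invented for illustration. The choice to ignore warnings and to treat every error as fatal is an assumption, not something the note prescribes.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

class StrictHandlerSketch {

    // Inline DTD shared by both hypothetical sample documents.
    static final String DTD =
          "<?xml version=\"1.0\"?>\n"
        + "<!DOCTYPE note [\n"
        + "  <!ELEMENT note (to, body)>\n"
        + "  <!ELEMENT to (#PCDATA)>\n"
        + "  <!ELEMENT body (#PCDATA)>\n"
        + "]>\n";

    static final String VALID_XML =
        DTD + "<note><to>Reader</to><body>Hello</body></note>";

    // Well-formed, but <extra> is not declared in the DTD.
    static final String INVALID_XML =
        DTD + "<note><to>Reader</to><extra>oops</extra></note>";

    // Hypothetical application exception wrapping the parser's report.
    static class InvalidDocumentException extends SAXException {
        InvalidDocumentException(SAXParseException cause) {
            super("Line " + cause.getLineNumber() + ": " + cause.getMessage(), cause);
        }
    }

    // Turns every validation error into an exception (assumed policy).
    static class StrictHandler implements ErrorHandler {
        @Override
        public void warning(SAXParseException e) {
            // Warnings are ignored in this sketch.
        }

        @Override
        public void error(SAXParseException e) throws SAXException {
            throw new InvalidDocumentException(e);
        }

        @Override
        public void fatalError(SAXParseException e) throws SAXException {
            throw new InvalidDocumentException(e);
        }
    }

    // Returns true if the document parses cleanly under DTD validation.
    static boolean isValid(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setValidating(true);
            DocumentBuilder builder = factory.newDocumentBuilder();
            // Register the handler before parsing.
            builder.setErrorHandler(new StrictHandler());
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            // Raised by StrictHandler or by the parser itself.
            return false;
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException("Parser setup or I/O failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println("valid document accepted: " + isValid(VALID_XML));
        System.out.println("invalid document accepted: " + isValid(INVALID_XML));
    }
}
```

Registering your own handler also avoids the extra warning line that the default handler emits when validation is on.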


This entry was posted on Sunday, May 11th, 2008 at 4:31 am and is filed under xml.