Use the xni.XMLGrammarBuilder class from Xerces to do some extra checking on your schemas.
Whether you create a W3C XSD Schema
with the same tools you use to create other XML documents or use a
specialized schema-generation tool to create one, parsing it against
the schema in the Schema for Schemas appendix of the W3C Schema
recommendation (http://www.w3.org/TR/xmlschema-1/#normative-schemaSchema)
may alert you to some problems, such as whether you mistyped the name
of a schema definition element or put one schema definition element
inside of another where it doesn’t belong. There are other potential
errors that this won’t catch, though; for example, what if your
maxOccurs value for one element is less than the minOccurs value for
the same element?
The xni.XMLGrammarBuilder class handles validation of multiple
documents against the same schema by creating a compiled version of
that schema in memory and then re-using that compiled version for each
instance document passed to it. Like any compiler, it makes various
integrity checks as it compiles. If you’re developing an XSD schema,
this round of checks can help you before you’ve created your first
document that conforms to that schema.
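If you would rather run this kind of compile-time check from your own code, here is a minimal sketch. It does not use xni.XMLGrammarBuilder; it swaps in the standard JAXP validation API (javax.xml.validation) that ships with the JDK, on the assumption that compiling a schema there triggers comparable checks. The names SchemaCheck and collectErrors are made up for this illustration, and the wording of the messages depends on the schema processor bundled with your JDK.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.transform.Source;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

// Hypothetical helper (not part of Xerces): compiles a schema with JAXP
// and returns whatever errors the schema processor reports.
class SchemaCheck {

    static List<String> collectErrors(Source schemaSource) {
        final List<String> problems = new ArrayList<String>();
        SchemaFactory factory =
            SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);

        // Record errors instead of stopping at the first one.
        factory.setErrorHandler(new ErrorHandler() {
            public void warning(SAXParseException e) {
                // Warnings are ignored in this sketch.
            }
            public void error(SAXParseException e) {
                problems.add("[Error] " + e.getLineNumber() + ":"
                    + e.getColumnNumber() + ": " + e.getMessage());
            }
            public void fatalError(SAXParseException e) throws SAXException {
                throw e; // not recoverable; recorded in the catch block below
            }
        });

        try {
            // Compiling the schema is what triggers the integrity checks;
            // no instance document is involved.
            factory.newSchema(schemaSource);
        } catch (SAXException e) {
            problems.add("[Fatal Error] " + e.getMessage());
        }
        return problems;
    }

    public static void main(String[] args) {
        for (String problem : collectErrors(new StreamSource(new File(args[0])))) {
            System.out.println(problem);
        }
    }
}
```

Compile it and pass the name of a schema file as the only argument. Treat the output as a second opinion rather than a replacement for the Xerces sample class.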
Imagine that you just drafted a schema, badschema.xsd, which has the following problems:
In the content model for the order element, the itemNum element has a maxOccurs value of 1 and a minOccurs value of 4. If the number of occurrences must be greater than or equal to 4 and less than or equal to 1, that doesn’t leave any valid values!
It declares the itemNum element to be of type itemTypist. While the schema does declare a type called itemType, it has no type called itemTypist, and this certainly isn’t one of the primitive or derived datatypes listed in the XML Schema datatypes recommendation (http://www.w3.org/TR/xmlschema-2/).
It declares the monthType type as being an integer with values between 1 and 12, inclusive. The orderMonthType type is declared as a restriction of monthType, but with a greater range of values allowed (0 to 12). This is illegal because a restricted type definition must restrict the allowable values, not expand them.
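To make the first of these problems concrete, here is a rough sketch of the kind of occurrence-bounds comparison a schema compiler has to make. It is not how Xerces implements the check. OccursCheck and findBadOccurs are hypothetical names, the sketch looks only at xs:element particles, and it assumes a missing minOccurs or maxOccurs attribute means the default of 1.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical illustration: scans a schema's xs:element declarations
// and reports any whose minOccurs exceeds its maxOccurs.
class OccursCheck {
    static final String XSD_NS = "http://www.w3.org/2001/XMLSchema";

    static List<String> findBadOccurs(String xsdText) {
        Document doc;
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true); // needed to match the xs: namespace
            doc = dbf.newDocumentBuilder()
                     .parse(new InputSource(new StringReader(xsdText)));
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse schema text", e);
        }

        List<String> bad = new ArrayList<String>();
        NodeList elements = doc.getElementsByTagNameNS(XSD_NS, "element");
        for (int i = 0; i < elements.getLength(); i++) {
            Element el = (Element) elements.item(i);
            String max = attrOrDefault(el, "maxOccurs");
            if (max.equals("unbounded")) {
                continue; // no upper limit, so any minOccurs fits
            }
            int min = Integer.parseInt(attrOrDefault(el, "minOccurs"));
            if (min > Integer.parseInt(max)) {
                String label = el.hasAttribute("name")
                    ? el.getAttribute("name") : el.getAttribute("ref");
                bad.add(label + ": minOccurs=" + min + " > maxOccurs=" + max);
            }
        }
        return bad;
    }

    // Assumption: an absent occurrence attribute defaults to 1.
    static String attrOrDefault(Element el, String name) {
        return el.hasAttribute(name) ? el.getAttribute(name).trim() : "1";
    }
}
```

A real schema processor applies the same comparison to other particles as well and checks many more constraints, which is why handing the schema to a compiler beats writing checks like this by hand.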
Here is the schema:
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <xs:element name="orders">
    <xs:complexType>
      <xs:sequence>
        <xs:element ref="order" maxOccurs="unbounded"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>

  <xs:element name="order">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="itemNum"
                    type="itemTypist"
                    maxOccurs="1" minOccurs="4"/> <!-- line 16 -->
        <xs:element name="orderMonth" type="orderMonthType"
                    maxOccurs="1"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>

  <xs:simpleType name="itemType">
    <xs:restriction base="xs:string">
      <xs:pattern value="\d{3}-\d{4}"/>
    </xs:restriction>
  </xs:simpleType>

  <xs:simpleType name="monthType">
    <xs:restriction base="xs:integer">
      <xs:minInclusive value="1"/>
      <xs:maxInclusive value="12"/>
    </xs:restriction>
  </xs:simpleType>

  <xs:simpleType name="orderMonthType">
    <xs:restriction base="monthType"> <!-- line 37 -->
      <xs:minInclusive value="0"/>
      <xs:maxInclusive value="12"/>
    </xs:restriction>
  </xs:simpleType>

</xs:schema>
Before trying the following command, make sure that your classpath includes both the xercesImpl.jar and the xercesSamples.jar files that come with the Java Xerces distribution (Version 2.6.2 or later). You can download the Xerces distribution from http://xml.apache.org/xerces2-j/download.cgi. While in the working directory, enter this command:
java -cp xercesImpl.jar;xercesSamples.jar xni.XMLGrammarBuilder -a badschema.xsd
Use a colon (:) between JAR filenames if you are working in a Unix environment. The xni.XMLGrammarBuilder’s -a switch names the schema to parse. The error messages it outputs list the problems with the schema:
[Error] badschema.xsd:37:38: FacetValueFromBase: Value '0' of facet 'minInclusive' must be from the value space of the base type.
[Error] badschema.xsd:16:46: p-props-correct.2.1: {min occurs} = '4' must not be greater than {max occurs} = '1' for 'element'.
[Error] badschema.xsd:16:46: src-resolve: Cannot resolve the name 'itemTypist' to a(n) type definition component.
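If you want to feed messages like these into an editor or a build script, a small parser can pull out the file name, line, and column. The sketch below assumes each message has the shape "[Severity] file:line:column: text", which is inferred only from the three sample lines above and is not a documented contract. The class name SchemaMessage is made up for this illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical holder for one parsed message line.
class SchemaMessage {
    // Assumed shape: [Severity] file:line:column: text
    private static final Pattern LINE =
        Pattern.compile("\\[([^\\]]+)\\]\\s+(.+?):(\\d+):(\\d+):\\s+(.*)");

    final String severity;
    final String file;
    final int line;
    final int column;
    final String text;

    private SchemaMessage(String severity, String file, int line, int column, String text) {
        this.severity = severity;
        this.file = file;
        this.line = line;
        this.column = column;
        this.text = text;
    }

    // Returns null when the input does not match the assumed shape.
    static SchemaMessage parse(String raw) {
        Matcher m = LINE.matcher(raw.trim());
        if (!m.matches()) {
            return null;
        }
        return new SchemaMessage(m.group(1), m.group(2),
            Integer.parseInt(m.group(3)), Integer.parseInt(m.group(4)), m.group(5));
    }
}
```

Parsing the first message above this way yields the file badschema.xsd, line 37, and column 38, with the rest of the line (starting at FacetValueFromBase) kept as the text.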
I know of no other utility that lets you check XML 1.0 DTDs for
correctness in this way. To check a particular DTD I’d created, I
used to throw together a simple document that conformed to it and then
validate that document to see if a parser would find any problem with
the DTD itself on the way to parsing the document. Having done this
many times, I particularly appreciate xni.XMLGrammarBuilder’s ability
to check schema integrity with no need for any sample documents.
—Bob DuCharme