Chapter 14. XML Schema

As a final look at XML, and at one topic that is receiving particular attention right now, we’ll spend some time discussing XML Schema. Although the last several chapters have focused specifically on using Java to manipulate XML, in this chapter we look at XML Schema as a whole. XML Schema is still relatively new, and Java classes and interfaces for manipulating it directly have been slow in surfacing.

Despite the difficulty in using XML Schema directly through Java, the specification is important enough to warrant further discussion. In this chapter, we first discuss whether XML Schema is a stable and sensible choice, particularly compared to continuing to use DTDs. We then look at how closely XML Schema maps to Java, and how that relationship may significantly change the way XML content is stored.

To DTD or Not To DTD

Although nearly every XML content author and developer has been hearing about XML Schema for almost a year, there is still considerable uncertainty as to whether XML Schema is ready for “prime time.” While some of this concern stems from the changes to, and immaturity of, the XML Schema specification, the majority seems to stem from familiarity with DTDs. Many XML developers still use only DTDs for document constraints, despite the wave of publicity that XML Schema has received. There are quite a few reasons for this resistance to change, and all are worth weighing when you decide whether you need XML Schema.

Stability of the XML Schema Specification

One of the largest problems that XML Schema is still attempting to overcome is the rapid change in its own specification (which can be read online at http://www.w3.org/TR/xmlschema-1/ and http://www.w3.org/TR/xmlschema-2/). Within six months (from August of 1999 to March of 2000), three revisions of this specification were released; while this in itself is neither unusual nor problematic, the significant changes introduced by those revisions are. Each revision made schemas written against previous revisions obsolete and therefore useless. This generated a lot of frustration and discontent in the XML community. In addition, the complexity of the specification has only seemed to increase over the lifecycle of XML Schema, and this complexity has compounded the community’s uncertainty about it.

Despite all this “negative press,” XML Schema still promises to be at least as significant as the XML namespaces specification, and arguably as important as the original XML 1.0 specification itself. While the unhappiness at writing schemas that later become useless is understandable, the XML Schema working group has always maintained that until the specification is complete and final at the W3C, changes are unavoidable. In fact, many of the authors frustrated by the changes are the same voices that suggested and criticized items that should be changed; in other words, rejecting XML Schema solely because it has changed a great deal is a poor idea. Almost every change by the XML Schema working group, including the minor ones, has improved the clarity and usability of the specification.

Enhanced Document Constraints

It would be almost impossible to discuss schemas, even briefly, without emphasizing (not for the first time in this book) how easy they make it to constrain data. You would be hard-pressed to find anyone, even those dead-set on continuing to use DTDs, who would deny the flexibility and ease of setting data constraints with XML Schema. In fact, the arguably more important uses of XML Schema that we discuss later in this chapter have been overshadowed by this fact! Any application that seeks to enforce strict data type and range constraints on an XML-based medium is best served by XML Schema, because DTDs cannot express such constraints. Days, if not weeks, of time and effort can be saved.
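To make the idea of a type and range constraint concrete, here is a minimal sketch. It rests on several assumptions that are not part of the discussion above: it uses the `javax.xml.validation` API that ships with current JDKs rather than any parser-specific class, it uses the namespace URI of the final XML Schema Recommendation rather than an earlier draft, and the `quantity` element with its 1 to 100 range is purely hypothetical.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.SAXException;

public class RangeConstraintSketch {

    // Hypothetical schema: a <quantity> element that must hold an
    // integer between 1 and 100, inclusive.
    static final String XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
      + "  <xs:element name='quantity'>"
      + "    <xs:simpleType>"
      + "      <xs:restriction base='xs:integer'>"
      + "        <xs:minInclusive value='1'/>"
      + "        <xs:maxInclusive value='100'/>"
      + "      </xs:restriction>"
      + "    </xs:simpleType>"
      + "  </xs:element>"
      + "</xs:schema>";

    // Returns true if the document satisfies the schema, false otherwise.
    static boolean isValid(String xml) {
        try {
            SchemaFactory factory =
                SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            Schema schema =
                factory.newSchema(new StreamSource(new StringReader(XSD)));
            Validator validator = schema.newValidator();
            // With no ErrorHandler set, validation errors are thrown
            // as SAXException.
            validator.validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false;
        } catch (IOException e) {
            throw new IllegalStateException("Unexpected I/O failure", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isValid("<quantity>42</quantity>"));   // within range
        System.out.println(isValid("<quantity>250</quantity>"));  // above the maximum
        System.out.println(isValid("<quantity>many</quantity>")); // not an integer
    }
}
```

A DTD could only declare `quantity` as character data; the type and range checks would have to be written by hand in application code.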

In addition to traditional constraints, XML Schema allows content model constraints for generic data formats to be built. These constraints can then be shared and referenced from other schemas by using XLink and XPointer. DTDs, not being XML themselves, are extremely limited in this respect. It would not be unusual to see large applications using DTDs that are thousands of lines long. This is hardly an object-oriented approach to data, let alone a maintainable approach to data validation.

Namespace Issues with DTDs

We’ve already looked at how parsing an XML document that uses namespaces and needs to be validated can cause significant problems for DTDs. Remember this code:

import org.apache.xerces.parsers.DOMParser;
import org.w3c.dom.Document;

DOMParser parser = new DOMParser();

// Turn on namespace support
parser.setFeature("http://xml.org/sax/features/namespaces", true);

// Turn on validation
parser.setFeature("http://xml.org/sax/features/validation", true);

// Parse; uri stands for the location of the XML document
parser.parse(uri);

// Get results
Document doc = parser.getDocument();

When this code is compiled within an application, running the application generates the following fatal error (this example is the specific verbiage from Apache Xerces, but your results should be similar):

org.xml.sax.SAXParseException: Document root element "JavaXML:Book", must
 match DOCTYPE root "JavaXML:Book".
        at org.apache.xerces.framework.XMLParser.reportError
            (XMLParser.java:1318)
        at org.apache.xerces.validators.dtd.DTDValidator
            .reportRecoverableXMLError(DTDValidator.java:1602)
        at org.apache.xerces.validators.dtd
            .DTDValidator.rootElementSpecified(DTDValidator.java:576)
        at org.apache.xerces.framework.XMLParser
            .scanAttributeName(XMLParser.java:2076)
        at org.apache.xerces.framework.XMLDocumentScanner
            .scanElement(XMLDocumentScanner.java, Compiled Code)
        at org.apache.xerces.framework
            .XMLDocumentScanner$ContentDispatcher.dispatch
            (XMLDocumentScanner.java, Compiled Code)
        at org.apache.xerces.framework.XMLDocumentScanner
            .parseSome(XMLDocumentScanner.java, Compiled Code)
        at org.apache.xerces.framework
            .XMLParser.parse(XMLParser.java:1208)
        at org.apache.xerces.framework
            .XMLParser.parse(XMLParser.java:1247)

This happens because DTDs are ignorant of namespaces, while the mechanism that handles the root element and the constructs nested within it is namespace-aware. This difference in functionality causes conflicts between validation and namespace processing. XML documents often require both, making XML Schema an even more attractive solution for document constraints. Additionally, XML Schema’s close parallels to Java and the possibility of future integration are extremely promising.
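As an illustration of validation and namespace processing working together, the following sketch validates a namespace-qualified document against a schema. It is not the Xerces code shown earlier: it assumes the standard JAXP classes in `javax.xml.parsers` and `javax.xml.validation`, and the namespace URI, the schema, and the `Title` child element are hypothetical stand-ins.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.w3c.dom.Document;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

public class NamespaceValidationSketch {

    // Hypothetical namespace URI for the Book vocabulary.
    static final String NS = "http://example.com/javaxml";

    // Hypothetical schema: the constraint is tied to the namespace URI,
    // not to whatever prefix a document happens to use.
    static final String XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'"
      + "    targetNamespace='" + NS + "' elementFormDefault='qualified'>"
      + "  <xs:element name='Book'>"
      + "    <xs:complexType>"
      + "      <xs:sequence>"
      + "        <xs:element name='Title' type='xs:string'/>"
      + "      </xs:sequence>"
      + "    </xs:complexType>"
      + "  </xs:element>"
      + "</xs:schema>";

    // Parses with namespace support and schema validation both enabled.
    // Throws IllegalArgumentException if the document is rejected.
    static Document parseValidated(String xml) {
        try {
            Schema schema = SchemaFactory
                .newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                .newSchema(new StreamSource(new StringReader(XSD)));
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // namespace processing
            factory.setSchema(schema);       // schema validation
            DocumentBuilder builder = factory.newDocumentBuilder();
            // Turn validation errors into exceptions instead of log output.
            builder.setErrorHandler(new ErrorHandler() {
                public void warning(SAXParseException e) { }
                public void error(SAXParseException e) throws SAXParseException { throw e; }
                public void fatalError(SAXParseException e) throws SAXParseException { throw e; }
            });
            return builder.parse(new InputSource(new StringReader(xml)));
        } catch (SAXException | IOException | ParserConfigurationException e) {
            throw new IllegalArgumentException("Document rejected: " + e.getMessage(), e);
        }
    }

    public static void main(String[] args) {
        // The same content under two different prefixes; both are accepted.
        String[] docs = {
            "<JavaXML:Book xmlns:JavaXML='" + NS + "'>"
                + "<JavaXML:Title>Java and XML</JavaXML:Title></JavaXML:Book>",
            "<b:Book xmlns:b='" + NS + "'><b:Title>Java and XML</b:Title></b:Book>"
        };
        for (String xml : docs) {
            Document doc = parseValidated(xml);
            System.out.println(doc.getDocumentElement().getPrefix() + " -> {"
                + doc.getDocumentElement().getNamespaceURI() + "}"
                + doc.getDocumentElement().getLocalName());
        }
    }
}
```

Because the schema identifies elements by namespace URI and local name, the choice of prefix is irrelevant to validation, which is the behavior a DTD cannot offer.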
