Unlike the DOM, SAX is not an object model. A SAX parser represents a document as a sequence of events that it fires, in document order, as it reads the markup. For example, the portion of the XML document shown in Listing 15.1 translates into the following sequence of events:
startElement
characters
endElement
... <my-element> this is sample text </my-element> ...
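The event sequence above can be observed with a small handler that records each callback it receives. The following is a minimal sketch, not a definitive implementation. It assumes a Java runtime that bundles a JAXP-compliant SAX parser, and the class name SaxEventTrace and the method trace are hypothetical names chosen for illustration. A parser is allowed to deliver one run of text in several characters calls, so the sketch records consecutive calls only once. The startDocument and endDocument events, which every parser also fires, are deliberately not recorded.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class SaxEventTrace {

    // Hypothetical helper: parses the XML text and returns the names of the
    // content callbacks the parser invoked, in the order they were invoked.
    public static List<String> trace(String xml) {
        List<String> events = new ArrayList<>();

        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                events.add("startElement");
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                // Text may arrive in more than one chunk, so record a run
                // of consecutive characters calls as a single event.
                if (events.isEmpty()
                        || !events.get(events.size() - 1).equals("characters")) {
                    events.add("characters");
                }
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                events.add("endElement");
            }
        };

        try {
            SAXParserFactory.newInstance()
                    .newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            // Wrap checked parser exceptions to keep the sketch short.
            throw new IllegalStateException("Parsing failed", e);
        }
        return events;
    }

    public static void main(String[] args) {
        // Prints [startElement, characters, endElement]
        System.out.println(trace("<my-element> this is sample text </my-element>"));
    }
}
```

Note that nothing is kept in memory except what the handler chooses to store. That is the practical difference from the DOM, where the whole document is available as a tree after parsing.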
SAX is designed so that the API is easy to use with different parser implementations. The only requirement is that a parser implement a set of standard interfaces; beyond that, implementers have considerable freedom in how they build a SAX-compliant parser.
Several SAX parser implementations are currently available for Java and other languages. Some of the best known include
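As an illustration of this parser independence, the sketch below writes application code against the standard org.xml.sax interfaces only. It assumes a SAX 2 parser is available through JAXP, and the names ParserIndependence, countElements, and defaultReader are hypothetical. Any XMLReader could be passed to countElements, whichever implementation it comes from.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

public class ParserIndependence {

    // Depends only on the standard SAX interfaces, so it works with any
    // SAX 2 compliant XMLReader, regardless of who implemented it.
    public static int countElements(XMLReader reader, String xml) {
        int[] count = {0};
        reader.setContentHandler(new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                count[0]++;
            }
        });
        try {
            reader.parse(new InputSource(new StringReader(xml)));
        } catch (IOException | SAXException e) {
            throw new IllegalStateException("Parsing failed", e);
        }
        return count[0];
    }

    // Obtains whichever parser the JAXP factory lookup selects at run time.
    public static XMLReader defaultReader() {
        try {
            return SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        } catch (ParserConfigurationException | SAXException e) {
            throw new IllegalStateException("No SAX parser available", e);
        }
    }

    public static void main(String[] args) {
        XMLReader reader = defaultReader();
        // The concrete class name differs from one implementation to another.
        System.out.println("Using parser: " + reader.getClass().getName());
        // Prints 3: one <a> element and two <b> elements.
        System.out.println(countElements(reader, "<a><b/><b/></a>"));
    }
}
```

Only defaultReader knows how the parser was obtained. Replacing its body with code that instantiates a different SAX 2 implementation would leave countElements unchanged.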
Crimson
Xerces
Oracle XML Parser for Java 2
Microsoft MSXML
Let's have a quick look at each of them.
Crimson is part of the Apache XML Project. It supports XML 1.0 through multiple APIs, such as
SAX 2.0
SAX 2 Extensions 1.0
DOM Level 2 Core
JAXP 1.1
Crimson ships with Sun products. It is available in both binary and source form. At the time of this writing, there are plans to move the Crimson codebase into the Xerces Java 2 project. Crimson can be found at http://xml.apache.org/crimson.
Note
In Chapter 16, we will discuss the Java API for XML Processing, JAXP, in detail. Crimson is one of the JAXP-compliant parsers.
Xerces is also a child of the Apache XML Project. By the time this book reaches its readers, Xerces Java Parser 2, which will include the codebase from Crimson and many new features, will probably be available. Version 1.4.4, the latest at the time of writing, supports the following:
SAX 1.0
SAX 2.0
SAX 2 Extensions 1.0
DOM Level 1
DOM Level 2 (Core + Events + Traversal and Ranges)
JAXP 1.1
XML Schema Recommendation Version 1.0
Note
Xerces is also a JAXP-compliant parser. JAXP is discussed in Chapter 16, “Working with XML and Java.”
Xerces can be found at http://xml.apache.org/xerces-j/. Xerces2 may be found at http://xml.apache.org/xerces2-j/.
Xerces is available for several programming languages.
XML Parser for Java 2 is part of the Oracle XDK, a family of XML libraries that developers can use to integrate XML with Oracle 8i databases. Support is provided for
SAX 2.0
DOM Level 2 (Core + Mutation Event + Traversal)
XML Namespaces Recommendation
JAXP 1.1 (available only in the XDK beta release at the time of writing)
SAX 2 Extensions 1.0 (available only in the XDK beta release at the time of writing)
Oracle's XML Developer's Kit can be obtained by visiting http://www.oracle.com/xml/ and registering as a member of the Oracle Technology Network. Registration is free.
Oracle also provides implementations for several programming languages.
Microsoft MSXML is a COM-based implementation of various APIs for accessing XML 1.0. The APIs are exposed through an object model and are best used from non-Java languages such as C++ and Visual Basic. The supported features are
SAX
DOM
SAX-DOM integration
XML Schema Recommendation Version 1.0
For additional information about MSXML, visit its Web site at http://www.microsoft.com/xml.
Now that we know when to choose SAX and which parsers can be used with it, it's time to talk about implementing SAX-driven applications.