Implementations of the DOM

Several implementations of the DOM API are available. Most are free of charge, and some are distributed under open-source licenses. Let's look at a few of them.

Crimson/Xerces

Crimson is a subproject of the Apache XML project, derived from the Sun Project X parser. It is implemented in Java and ships as part of Sun's products. It is currently one of the most common XML parsers in the Java world; however, Sun plans to fold it into another Apache project, Xerces Java 2, which is discussed below.

Crimson supports the following APIs:

  • JAXP: Java API for XML Parsing, an API that is independent of the underlying XML processor. It is discussed in Chapter 16, “Working with XML and Java.”

  • SAX: The Simple API for XML. For details on this event-based API, refer to Chapter 15, “Parsing XML Based on Events.”

  • SAX 2.0: Version 2 of SAX, which is also discussed in Chapter 15.

  • DOM Level 2 Core.
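To make the JAXP and DOM entries in this list more concrete, here is a minimal sketch of parsing a document into a DOM tree through the JAXP factory classes. It assumes some JAXP-compliant parser is available on the platform; the class name JaxpDomSketch and the sample XML are made up for illustration, and nothing in the code is specific to Crimson.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical class name; any JAXP-compliant parser can sit behind the factory.
class JaxpDomSketch {

    // Parses XML text into a DOM Document using whichever parser
    // DocumentBuilderFactory locates at run time.
    static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            DocumentBuilder builder = factory.newDocumentBuilder();
            return builder.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            // Covers parser configuration, I/O, and well-formedness errors.
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        Document doc = parse("<book><title>XML</title></book>");
        // Print the name of the root element.
        System.out.println(doc.getDocumentElement().getTagName());
    }
}
```

Because the application code talks only to the javax.xml.parsers and org.w3c.dom interfaces, the underlying parser can, in principle, be swapped without changing this code.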

Xerces, another subproject of the Apache XML project, is a new-generation XML processor. In addition to the features and APIs supported by Crimson, it includes:

  • Implementation of DOM Level 2 Core and Events

  • Partial implementation of the DOM Level 3 Core, Abstract Schemas, and Load and Save

Currently, implementations of Xerces are available in Java, C++, and Perl.

Note

Crimson, Xerces, GNUJAXP, and other JAXP-compliant XML processors will be discussed in detail in Chapter 16.


MSXML

Microsoft's XML parser, MSXML, fully supports the DOM Level 1 API and partially supports Level 2. It also offers an alternative approach to parsing XML documents: event-based parsing with a SAX parser, which we will discuss in Chapter 15.

The product can be downloaded from the Microsoft Web site.

http://www.microsoft.com/xml

DOM Parsing: Pros and Cons

The DOM API is simple and straightforward to use, and this is its strongest advantage. The tree structure of an XML document maps directly onto the DOM tree, and the API gives you a lot of freedom in accessing tree elements.
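As a rough illustration of that freedom, the sketch below jumps straight to an element by name and then walks back up through its ancestors. The class name DomNavigationSketch, the helper names, and the sample catalog document are assumptions made for this example; getTextContent is a DOM Level 3 convenience method provided by the standard Java platform.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class for illustration only.
class DomNavigationSketch {

    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    // Random access: find the first <title> anywhere in the tree.
    // Returns null when the document has no <title> element.
    static String firstTitle(Document doc) {
        NodeList titles = doc.getElementsByTagName("title");
        return titles.getLength() == 0 ? null : titles.item(0).getTextContent();
    }

    // Upward navigation: follow parent links from an element to the root
    // and build a path such as /catalog/book/title.
    static String pathOf(Node node) {
        StringBuilder path = new StringBuilder();
        for (Node n = node;
             n != null && n.getNodeType() == Node.ELEMENT_NODE;
             n = n.getParentNode()) {
            path.insert(0, "/" + n.getNodeName());
        }
        return path.toString();
    }

    public static void main(String[] args) {
        Document doc = parse("<catalog><book><title>XML</title></book></catalog>");
        System.out.println(firstTitle(doc));
        System.out.println(pathOf(doc.getElementsByTagName("title").item(0)));
    }
}
```

Both operations work at any time after parsing and in any order, because the whole tree stays available in memory.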

The price for that is performance. Try loading a five-megabyte document with a DOM parser and you'll run out of patience; try a bigger document and you'll run out of memory.

Unfortunately, the DOM requires the whole document to be loaded into memory before the DOM tree is made available to a developer. That is why the DOM is ideal for applications that work with small XML documents, and why its benefits quickly fade as documents grow to hundreds or thousands of megabytes.
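One way to get a feel for this cost is to count how many node objects the parser builds for a document. The sketch below is only an illustration under stated assumptions: the class name DomFootprintSketch and its methods are hypothetical, and a node count is a crude stand-in for real memory use, which depends on the parser implementation.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper class for illustration only.
class DomFootprintSketch {

    // Counts the given node plus all of its descendants.
    // Attributes are not child nodes in the DOM, so they are not counted.
    static int countNodes(Node node) {
        int count = 1;
        for (Node child = node.getFirstChild();
             child != null;
             child = child.getNextSibling()) {
            count += countNodes(child);
        }
        return count;
    }

    // Parses the whole document first, then counts the nodes in the tree.
    // Nothing can be inspected until parse() has returned the full tree.
    static int nodesInDocument(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return countNodes(doc);
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        // Document node + <a> + two <b> elements + two text nodes.
        System.out.println(nodesInDocument("<a><b>x</b><b>y</b></a>"));
    }
}
```

Every element and every run of text becomes its own object, so the number of nodes held in memory grows with the size of the document.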

In the next chapter, we will discuss an alternative approach to XML parsing that avoids these problems: event-based parsing with the SAX parser. But before that, we'll give you some useful links to DOM-related resources.
