4. Parsing and Creating XML Documents with StAX

Jeff Friesen, Dauphin, MB, Canada

Java also includes the StAX API for parsing and creating XML documents. Chapter 4 introduces you to StAX.

What Is StAX?

Streaming API for XML (StAX) is a Java API for parsing an XML document sequentially from start to finish and also for creating XML documents. StAX was introduced by Java 6 as an alternative to SAX and DOM and is located midway between these “polar opposites.”

StAX Versus SAX and DOM

Because Java already supports SAX and DOM for document parsing and DOM for document creation, you might be wondering why another XML API is needed. The following points justify StAX’s presence in core Java:

StAX (like SAX) can be used to parse documents of arbitrary sizes. In contrast, the maximum size of documents parsed by DOM is limited by the available memory, which makes DOM unsuitable for mobile devices with limited amounts of memory.
StAX (like DOM) can be used to create documents. In contrast to DOM, which can create documents whose maximum size is constrained by available memory, StAX can create documents of arbitrary sizes. SAX cannot be used to create documents.
StAX (like SAX) makes infoset items available to applications almost immediately. In contrast, these items are not made available by DOM until after it finishes building the tree of nodes.
StAX (like DOM) adopts the pull model, in which the application tells the parser when it’s ready to receive the next infoset item. This model is based on the iterator design pattern (see http://sourcemaking.com/design_patterns/iterator ), which results in an application that’s easier to write and debug. In contrast, SAX adopts the push model, in which the parser passes infoset items via events to the application, whether or not the application is ready to receive them. This model is based on the observer design pattern (see http://sourcemaking.com/design_patterns/observer ), which results in an application that’s often harder to write and debug.

Summing up, StAX can parse or create documents of arbitrary size, makes infoset items available to applications almost immediately, and uses the pull model to put the application in charge. Neither SAX nor DOM offers all of these advantages.
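
The pull model described above can be illustrated with a minimal sketch in which the application stops asking for infoset items as soon as it has what it needs. The PullSketch class, its firstText() helper, and the sample XML are hypothetical names and data introduced only for this illustration.

```java
import java.io.StringReader;

import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical illustration class; not part of the chapter's listings.
final class PullSketch
{
   // Returns the text of the first element with the given local name,
   // or null when no such element exists.
   static String firstText(String xml, String localName)
      throws XMLStreamException
   {
      XMLStreamReader xmlsr = XMLInputFactory.newFactory()
         .createXMLStreamReader(new StringReader(xml));
      try
      {
         // The application decides when to pull the next item.
         while (xmlsr.hasNext())
         {
            if (xmlsr.next() == XMLStreamReader.START_ELEMENT &&
                xmlsr.getLocalName().equals(localName))
            {
               // Stop pulling; the rest of the document is never parsed.
               return xmlsr.getElementText();
            }
         }
         return null;
      }
      finally
      {
         xmlsr.close();
      }
   }

   public static void main(String[] args) throws XMLStreamException
   {
      // Made-up sample document.
      String xml = "<movie><name>Amelie</name>" +
                   "<language>French</language></movie>";
      System.out.println(firstText(xml, "name"));
   }
}
```

With a push-model parser, the application would instead have to keep receiving callbacks (or abort by throwing an exception) after the interesting element had been seen.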

Exploring StAX

Java implements StAX through types stored in the javax.xml.stream, javax.xml.stream.events, and javax.xml.stream.util packages. This section introduces you to various types from the first two packages while showing you how to use StAX to parse and create XML documents.

Stream-Based Versus Event-Based Readers and Writers

StAX parsers are known as document readers, and StAX document creators are known as document writers. StAX classifies document readers and document writers as stream-based or event-based.

A stream-based reader extracts the next infoset item from an input stream via a cursor (infoset item pointer). Similarly, a stream-based writer writes the next infoset item to an output stream at the cursor position. The cursor can point to only one item at a time, and always moves forward, typically by one infoset item.

Stream-based readers and writers are appropriate when writing code for memory-constrained environments such as Java ME Embedded, because you can use them to create smaller and more efficient code. They also offer better performance for low-level libraries, where performance is important.

An event-based reader extracts the next infoset item from an input stream by obtaining an event. Similarly, an event-based writer writes the next infoset item to the stream by adding an event to the output stream. In contrast to stream-based readers and writers, event-based readers and writers have no concept of a cursor.

Event-based readers and writers are appropriate for creating XML processing pipelines (sequences of components in which each component transforms the output of the previous component and passes the result to the next component), for modifying an event sequence, and more.

Parsing XML Documents

Document readers are obtained by calling the various “create” methods that are declared in the javax.xml.stream.XMLInputFactory class. These creational methods are organized into two categories: methods for creating stream-based readers and methods for creating event-based readers.

Before you can obtain a stream-based or an event-based reader, you need to obtain an instance of the factory by calling one of the newFactory() static methods, such as XMLInputFactory newFactory():

XMLInputFactory xmlif = XMLInputFactory.newFactory();

Note

You can also call the XMLInputFactory newInstance() static method but might not want to do so because its same-named but parameterized companion method has been deprecated to maintain API consistency, and it’s possible that newInstance() will be deprecated as well.

The newFactory() methods follow an ordered lookup procedure to locate the XMLInputFactory implementation class. This procedure first examines the javax.xml.stream.XMLInputFactory system property and lastly returns the system-default implementation (returned from XMLInputFactory newDefaultFactory()). If there is a service configuration error, or if the implementation is not available or cannot be instantiated, the method throws an instance of the javax.xml.stream.FactoryConfigurationError class.

After creating the factory, call XMLInputFactory’s void setProperty(String name, Object value) method to set various features and properties as necessary. For example, you might execute xmlif.setProperty(XMLInputFactory.IS_VALIDATING, true); (true is passed as a java.lang.Boolean object via autoboxing—see http://docs.oracle.com/javase/tutorial/java/data/autoboxing.html ) to request a DTD-validating stream-based reader. However, the default StAX factory implementation throws java.lang.IllegalArgumentException because it doesn’t support DTD validation. Similarly, you might execute xmlif.setProperty(XMLInputFactory.IS_NAMESPACE_AWARE, true); to request a namespace-aware event-based reader, which is supported.
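
The following sketch shows one way to configure a factory defensively. FactoryConfigSketch and configuredFactory() are hypothetical names, and the use of XMLInputFactory.IS_COALESCING (a standard property that isn't discussed in this chapter) is my own addition. The sketch assumes only that an implementation may reject IS_VALIDATING with IllegalArgumentException, as described above.

```java
import javax.xml.stream.XMLInputFactory;

// Hypothetical illustration class; not part of the chapter's listings.
final class FactoryConfigSketch
{
   static XMLInputFactory configuredFactory()
   {
      XMLInputFactory xmlif = XMLInputFactory.newFactory();

      // Properties that every implementation is required to support.
      xmlif.setProperty(XMLInputFactory.IS_NAMESPACE_AWARE, true);
      xmlif.setProperty(XMLInputFactory.IS_COALESCING, true);

      // DTD validation is optional, so be prepared for a refusal.
      try
      {
         xmlif.setProperty(XMLInputFactory.IS_VALIDATING, true);
      }
      catch (IllegalArgumentException iae)
      {
         System.err.println("DTD validation unavailable: " +
                            iae.getMessage());
      }
      return xmlif;
   }

   public static void main(String[] args)
   {
      XMLInputFactory xmlif = configuredFactory();
      System.out.println(
         xmlif.getProperty(XMLInputFactory.IS_NAMESPACE_AWARE));
   }
}
```

Readers created from the returned factory inherit whatever properties were accepted.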

Parsing Documents with Stream-Based Readers

A stream-based reader is created by calling one of XMLInputFactory’s createXMLStreamReader() methods, such as XMLStreamReader createXMLStreamReader(Reader reader). These methods throw javax.xml.stream.XMLStreamException when the stream-based reader cannot be created.

The following code fragment creates a stream-based reader whose source is a file named recipe.xml:

Reader reader = new FileReader("recipe.xml");

XMLStreamReader xmlsr = xmlif.createXMLStreamReader(reader);

The low-level javax.xml.stream.XMLStreamReader interface offers the most efficient way to read XML data with StAX. This interface’s boolean hasNext() method returns true when there is a next infoset item to obtain; otherwise, it returns false. The int next() method advances the cursor by one infoset item and returns an integer code that identifies this item’s type.

Instead of comparing next()’s return value with an integer value, you would compare this value against a javax.xml.stream.XMLStreamConstants infoset constant, such as START_ELEMENT or DTD—XMLStreamReader extends the XMLStreamConstants interface.

Note

You can also obtain the type of the infoset item that the cursor is pointing to by calling XMLStreamReader’s int getEventType() method. Specifying “Event” in the name of this method is unfortunate because it confuses stream-based readers with event-based readers.

The following code fragment uses the hasNext() and next() methods to codify a parsing loop that detects the start and end of each element:

while (xmlsr.hasNext())
{
   switch (xmlsr.next())
   {
      case XMLStreamReader.START_ELEMENT:
         // Do something at element start.
         break;

      case XMLStreamReader.END_ELEMENT:
         // Do something at element end.
   }
}

XMLStreamReader also declares various methods for extracting infoset information. For example, QName getName() returns the qualified name (as a javax.xml.namespace.QName instance) of the element at the cursor position when next() returns XMLStreamReader.START_ELEMENT or XMLStreamReader.END_ELEMENT.
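
As a rough sketch of a few more extraction methods, the following code collects element names, attributes, and non-whitespace text from a small document. InfosetSketch, describe(), and the sample XML are hypothetical; the XMLStreamReader methods used (getAttributeCount(), getAttributeLocalName(), getAttributeValue(), isWhiteSpace(), and getText()) are part of the standard interface.

```java
import java.io.StringReader;

import java.util.ArrayList;
import java.util.List;

import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical illustration class; not part of the chapter's listings.
final class InfosetSketch
{
   static List<String> describe(String xml) throws XMLStreamException
   {
      List<String> items = new ArrayList<>();
      XMLStreamReader xmlsr = XMLInputFactory.newFactory()
         .createXMLStreamReader(new StringReader(xml));
      while (xmlsr.hasNext())
      {
         switch (xmlsr.next())
         {
            case XMLStreamReader.START_ELEMENT:
               items.add("start " + xmlsr.getName().getLocalPart());
               // Attributes are only available at START_ELEMENT.
               for (int i = 0; i < xmlsr.getAttributeCount(); i++)
                  items.add("attr " + xmlsr.getAttributeLocalName(i) +
                            "=" + xmlsr.getAttributeValue(i));
               break;

            case XMLStreamReader.CHARACTERS:
               // Skip text that consists only of whitespace.
               if (!xmlsr.isWhiteSpace())
                  items.add("text " + xmlsr.getText());
               break;

            case XMLStreamReader.END_ELEMENT:
               items.add("end " + xmlsr.getName().getLocalPart());
               break;

            default:
               break;
         }
      }
      xmlsr.close();
      return items;
   }

   public static void main(String[] args) throws XMLStreamException
   {
      for (String item:
           describe("<ingredient qty=\"2\">bread slice</ingredient>"))
         System.out.println(item);
   }
}
```

Calling an extraction method when the cursor is at the wrong kind of item (for example, getText() at START_ELEMENT) results in java.lang.IllegalStateException, which is why each call is guarded by the switch statement.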

Note

QName describes a qualified name as a combination of namespace URI, local part, and prefix components. After instantiating this immutable class (via a constructor such as QName(String namespaceURI, String localPart, String prefix)), you can return these components by calling QName’s String getNamespaceURI(), String getLocalPart(), and String getPrefix() methods.
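
A short sketch of QName in isolation follows. QNameSketch and describe() are hypothetical names; the constructor and accessor methods are the ones named in the note.

```java
import javax.xml.namespace.QName;

// Hypothetical illustration class; not part of the chapter's listings.
final class QNameSketch
{
   // Formats a qualified name as "prefix:localPart in namespaceURI".
   static String describe(QName qn)
   {
      return qn.getPrefix() + ":" + qn.getLocalPart() + " in " +
             qn.getNamespaceURI();
   }

   public static void main(String[] args)
   {
      QName qn = new QName("http://www.w3.org/1999/xhtml", "html", "h");
      System.out.println(describe(qn));

      // Equality considers only the namespace URI and local part,
      // so the prefix doesn't matter here.
      QName noPrefix = new QName("http://www.w3.org/1999/xhtml", "html");
      System.out.println(qn.equals(noPrefix));
   }
}
```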

Listing 4-1 presents the source code to a StAXDemo application that reports an XML document’s start and end elements via a stream-based reader.

import java.io.FileNotFoundException;
import java.io.FileReader;

import javax.xml.stream.FactoryConfigurationError;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

import static java.lang.System.*;

public class StAXDemo
{
   public static void main(String[] args)
   {
      if (args.length != 1)
      {
         err.println("usage: java StAXDemo xmlfile");
         return;
      }
      try
      {
         XMLInputFactory xmlif = XMLInputFactory.newFactory();
         FileReader fr = new FileReader(args[0]);
         XMLStreamReader xmlsr = xmlif.createXMLStreamReader(fr);
         // Pull one infoset item per iteration and report element boundaries.
         while (xmlsr.hasNext())
         {
            switch (xmlsr.next())
            {
               case XMLStreamReader.START_ELEMENT:
                  out.println("START_ELEMENT");
                  out.printf(" Qname = %s%n", xmlsr.getName());
                  break;

               case XMLStreamReader.END_ELEMENT:
                  out.println("END_ELEMENT");
                  out.printf(" Qname = %s%n", xmlsr.getName());
            }
         }
      }
      catch (FactoryConfigurationError fce)
      {
         err.printf("FCE: %s%n", fce.toString());
      }
      catch (FileNotFoundException fnfe)
      {
         err.printf("FNFE: %s%n", fnfe.toString());
      }
      catch (XMLStreamException xmlse)
      {
         err.printf("XMLSE: %s%n", xmlse.toString());
      }
   }
}

Listing 4-1

StAXDemo (Version 1)

After verifying the number of command-line arguments, Listing 4-1’s main() method creates a factory, uses the factory to create a stream-based reader that obtains its XML data from the file identified by the solitary command-line argument, and enters a parsing loop. Whenever next() returns XMLStreamReader.START_ELEMENT or XMLStreamReader.END_ELEMENT, XMLStreamReader’s getName() method is called to return the element’s qualified name.

Compile Listing 4-1 as follows:

javac StAXDemo.java

Run the resulting application to dump Listing 1-2’s movie XML content, as follows:

java StAXDemo movie.xml

You should observe the following output:

START_ELEMENT
 Qname = movie
START_ELEMENT
 Qname = name
END_ELEMENT
 Qname = name
START_ELEMENT
 Qname = language
END_ELEMENT
 Qname = language
END_ELEMENT
 Qname = movie

Note

XMLStreamReader declares a void close() method that you will want to call to free any resources associated with this stream-based reader when your application is designed to run for an extended period of time. Calling this method doesn’t close the underlying input source.
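
Because close() leaves the underlying input source open, and because XMLStreamReader doesn't implement java.lang.AutoCloseable, one possible arrangement is to let a try-with-resources statement close the java.io.Reader while a finally block closes the stream-based reader. CloseSketch and countElements() are hypothetical names, and the temporary file exists only to keep the sketch self-contained.

```java
import java.io.IOException;
import java.io.Reader;

import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical illustration class; not part of the chapter's listings.
final class CloseSketch
{
   static int countElements(Path xmlFile)
      throws IOException, XMLStreamException
   {
      XMLInputFactory xmlif = XMLInputFactory.newFactory();
      // try-with-resources closes the underlying reader.
      try (Reader reader =
              Files.newBufferedReader(xmlFile, StandardCharsets.UTF_8))
      {
         XMLStreamReader xmlsr = xmlif.createXMLStreamReader(reader);
         try
         {
            int count = 0;
            while (xmlsr.hasNext())
               if (xmlsr.next() == XMLStreamReader.START_ELEMENT)
                  count++;
            return count;
         }
         finally
         {
            // Frees the parser's resources only; the reader stays open
            // until the enclosing try-with-resources statement ends.
            xmlsr.close();
         }
      }
   }

   public static void main(String[] args) throws Exception
   {
      Path tmp = Files.createTempFile("stax", ".xml");
      Files.write(tmp, "<movie><name/><language/></movie>"
                          .getBytes(StandardCharsets.UTF_8));
      System.out.println(countElements(tmp));
      Files.delete(tmp);
   }
}
```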

Parsing Documents with Event-Based Readers

An event-based reader is created by calling one of XMLInputFactory’s createXMLEventReader() methods, such as XMLEventReader createXMLEventReader(Reader reader). These methods throw XMLStreamException when the event-based reader cannot be created.

The following code fragment creates an event-based reader whose source is a file named recipe.xml:

Reader reader = new FileReader("recipe.xml");

XMLEventReader xmler = xmlif.createXMLEventReader(reader);

The high-level javax.xml.stream.XMLEventReader interface offers a somewhat less efficient but more object-oriented way to read XML data with StAX. This interface’s boolean hasNext() method returns true when there is an event to obtain; otherwise, it returns false. The XMLEvent nextEvent() method returns the next event as an object whose class implements a subinterface of the javax.xml.stream.events.XMLEvent interface.

Note

XMLEvent is the base interface for handling markup events. It declares methods that apply to all subinterfaces; for example, Location getLocation() (return a javax.xml.stream.Location object whose int getCharacterOffset() and other methods return location information about the event) and int getEventType() (return the event type as an XMLStreamConstants infoset constant, such as START_ELEMENT and PROCESSING_INSTRUCTION—XMLEvent extends XMLStreamConstants). XMLEvent is subtyped by other javax.xml.stream.events interfaces that describe different kinds of events (such as Attribute) in terms of methods that return infoset item-specific information (such as Attribute’s QName getName() and String getValue() methods).
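
As an alternative to switching on getEventType() and casting, XMLEvent also declares query and conversion methods such as isStartElement(), asStartElement(), isCharacters(), and asCharacters(). The following sketch uses them to summarize a small document; EventSketch, summarize(), and the sample XML are hypothetical.

```java
import java.io.StringReader;

import java.util.Iterator;

import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;

import javax.xml.stream.events.Attribute;
import javax.xml.stream.events.StartElement;
import javax.xml.stream.events.XMLEvent;

// Hypothetical illustration class; not part of the chapter's listings.
final class EventSketch
{
   static String summarize(String xml) throws XMLStreamException
   {
      StringBuilder sb = new StringBuilder();
      XMLEventReader xmler = XMLInputFactory.newFactory()
         .createXMLEventReader(new StringReader(xml));
      while (xmler.hasNext())
      {
         XMLEvent xmle = xmler.nextEvent();
         if (xmle.isStartElement())
         {
            StartElement se = xmle.asStartElement();
            sb.append('<').append(se.getName().getLocalPart());
            // Attributes travel with the StartElement event.
            Iterator<?> attrs = se.getAttributes();
            while (attrs.hasNext())
            {
               Attribute attr = (Attribute) attrs.next();
               sb.append(' ').append(attr.getName().getLocalPart())
                 .append('=').append(attr.getValue());
            }
            sb.append('>');
         }
         else
         if (xmle.isCharacters() && !xmle.asCharacters().isWhiteSpace())
            sb.append(xmle.asCharacters().getData());
      }
      xmler.close();
      return sb.toString();
   }

   public static void main(String[] args) throws XMLStreamException
   {
      System.out.println(
         summarize("<ingredient qty=\"2\">bread slice</ingredient>"));
   }
}
```

The wildcard Iterator<?> and the cast to Attribute keep the sketch compilable whether getAttributes() is declared to return a raw or a parameterized iterator.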

The following code fragment uses the hasNext() and nextEvent() methods to codify a parsing loop that detects the start and end of an element:

while (xmler.hasNext())
{
   switch (xmler.nextEvent().getEventType())
   {
      case XMLEvent.START_ELEMENT:
         // Do something at element start.
         break;

      case XMLEvent.END_ELEMENT:
         // Do something at element end.
   }
}

Listing 4-2 presents the source code to a StAXDemo application that reports an XML document’s start and end elements via an event-based reader.

import java.io.FileNotFoundException;
import java.io.FileReader;

import javax.xml.stream.FactoryConfigurationError;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;

import javax.xml.stream.events.EndElement;
import javax.xml.stream.events.StartElement;
import javax.xml.stream.events.XMLEvent;

import static java.lang.System.*;

public class StAXDemo
{
   public static void main(String[] args)
   {
      if (args.length != 1)
      {
         err.println("usage: java StAXDemo xmlfile");
         return;
      }
      try
      {
         XMLInputFactory xmlif = XMLInputFactory.newFactory();
         FileReader fr = new FileReader(args[0]);
         XMLEventReader xmler = xmlif.createXMLEventReader(fr);
         // Obtain one event per iteration and report element boundaries.
         while (xmler.hasNext())
         {
            XMLEvent xmle = xmler.nextEvent();
            switch (xmle.getEventType())
            {
               case XMLEvent.START_ELEMENT:
                  out.println("START_ELEMENT");
                  out.printf(" Qname = %s%n",
                             ((StartElement) xmle).getName());
                  break;

               case XMLEvent.END_ELEMENT:
                  out.println("END_ELEMENT");
                  out.printf(" Qname = %s%n",
                             ((EndElement) xmle).getName());
            }
         }
      }
      catch (FactoryConfigurationError fce)
      {
         err.printf("FCE: %s%n", fce.toString());
      }
      catch (FileNotFoundException fnfe)
      {
         err.printf("FNFE: %s%n", fnfe.toString());
      }
      catch (XMLStreamException xmlse)
      {
         err.printf("XMLSE: %s%n", xmlse.toString());
      }
   }
}

Listing 4-2

StAXDemo (Version 2)

After verifying the number of command-line arguments, Listing 4-2’s main() method creates a factory, uses the factory to create an event-based reader that obtains its XML data from the file identified by the solitary command-line argument, and enters a parsing loop. Whenever the event returned by nextEvent() reports an event type of XMLEvent.START_ELEMENT or XMLEvent.END_ELEMENT, the event is cast to StartElement or EndElement and its getName() method is called to return the element’s qualified name.

After compiling Listing 4-2, run the resulting application to dump Listing 1-3’s article XML content, as follows:

java StAXDemo article.xml

You should observe the following output:

START_ELEMENT
 Qname = article
START_ELEMENT
 Qname = abstract
START_ELEMENT
 Qname = code
END_ELEMENT
 Qname = code
END_ELEMENT
 Qname = abstract
START_ELEMENT
 Qname = body
END_ELEMENT
 Qname = body
END_ELEMENT
 Qname = article

Note

You can also create a filtered event-based reader to accept or reject various events by calling one of XMLInputFactory’s createFilteredReader() methods, such as XMLEventReader createFilteredReader(XMLEventReader reader, EventFilter filter). The javax.xml.stream.EventFilter interface declares a boolean accept(XMLEvent event) method that returns true when the specified event is part of the event sequence; otherwise, it returns false.
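
A minimal sketch of a filtered event-based reader follows. It accepts only start-element events, so the parsing loop no longer needs a switch statement. FilterSketch, startElementNames(), and the sample XML are hypothetical; because EventFilter declares a single method, a lambda is used to implement it.

```java
import java.io.StringReader;

import java.util.ArrayList;
import java.util.List;

import javax.xml.stream.EventFilter;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;

// Hypothetical illustration class; not part of the chapter's listings.
final class FilterSketch
{
   static List<String> startElementNames(String xml)
      throws XMLStreamException
   {
      XMLInputFactory xmlif = XMLInputFactory.newFactory();
      XMLEventReader raw =
         xmlif.createXMLEventReader(new StringReader(xml));

      // Accept start-element events and reject everything else.
      EventFilter startsOnly = event -> event.isStartElement();
      XMLEventReader filtered =
         xmlif.createFilteredReader(raw, startsOnly);

      List<String> names = new ArrayList<>();
      while (filtered.hasNext())
         names.add(filtered.nextEvent().asStartElement()
                           .getName().getLocalPart());
      filtered.close();
      return names;
   }

   public static void main(String[] args) throws XMLStreamException
   {
      String xml = "<article><abstract><code/></abstract>" +
                   "<body/></article>";
      System.out.println(startElementNames(xml));
   }
}
```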

Creating XML Documents

Document writers are obtained by calling the various “create” methods that are declared in the javax.xml.stream.XMLOutputFactory class. These creational methods are organized into two categories: methods for creating stream-based writers and methods for creating event-based writers.

Before you can obtain a stream-based or an event-based writer, you need to obtain an instance of the factory by calling one of the newFactory() static methods, such as XMLOutputFactory newFactory():

XMLOutputFactory xmlof = XMLOutputFactory.newFactory();

Note

You can also call the XMLOutputFactory newInstance() static method but might not want to do so because its same-named but parameterized companion method has been deprecated to maintain API consistency, and it’s possible that newInstance() will be deprecated as well.

The newFactory() methods follow an ordered lookup procedure to locate the XMLOutputFactory implementation class. This procedure first examines the javax.xml.stream.XMLOutputFactory system property and lastly returns the system-default implementation (returned from XMLOutputFactory newDefaultFactory()). If there is a service configuration error, or if the implementation is not available or cannot be instantiated, the method throws an instance of the FactoryConfigurationError class.

After creating the factory, call XMLOutputFactory’s void setProperty(String name, Object value) method to set various features and properties as necessary. The only property currently supported by all writers is XMLOutputFactory.IS_REPAIRING_NAMESPACES. When enabled (by passing true or a Boolean object, such as Boolean.TRUE, to value), the document writer takes care of all namespace bindings and declarations, with minimal help from the application. The output is always well formed with respect to namespaces. However, enabling this property adds some overhead to the job of writing the XML.
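
The following sketch suggests what namespace repairing looks like in practice: the application writes a prefixed element without calling setPrefix() or writeNamespace(), and the writer is expected to supply the missing declaration. RepairingSketch and its methods are hypothetical names, and the exact text that is generated may vary between StAX implementations, so the sketch verifies the result by parsing it back rather than by inspecting the string.

```java
import java.io.StringReader;
import java.io.StringWriter;

import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;
import javax.xml.stream.XMLStreamWriter;

// Hypothetical illustration class; not part of the chapter's listings.
final class RepairingSketch
{
   static final String NS = "http://www.javajeff.ca/";

   static String writeRecipe() throws XMLStreamException
   {
      XMLOutputFactory xmlof = XMLOutputFactory.newFactory();
      xmlof.setProperty(XMLOutputFactory.IS_REPAIRING_NAMESPACES,
                        Boolean.TRUE);
      StringWriter sw = new StringWriter();
      XMLStreamWriter xmlsw = xmlof.createXMLStreamWriter(sw);
      xmlsw.writeStartDocument();
      // No setPrefix() or writeNamespace() call: the repairing writer
      // is responsible for declaring the r prefix.
      xmlsw.writeStartElement("r", "recipe", NS);
      xmlsw.writeCharacters("Grilled Cheese Sandwich");
      xmlsw.writeEndElement();
      xmlsw.writeEndDocument();
      xmlsw.flush();
      xmlsw.close();
      return sw.toString();
   }

   // Parses the document and returns the root element's namespace URI.
   static String rootNamespace(String xml) throws XMLStreamException
   {
      XMLStreamReader xmlsr = XMLInputFactory.newFactory()
         .createXMLStreamReader(new StringReader(xml));
      xmlsr.nextTag(); // Advance to the root element's start tag.
      String uri = xmlsr.getNamespaceURI();
      xmlsr.close();
      return uri;
   }

   public static void main(String[] args) throws XMLStreamException
   {
      String xml = writeRecipe();
      System.out.println(xml);
      System.out.println(rootNamespace(xml));
   }
}
```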

Creating Documents with Stream-Based Writers

A stream-based writer is created by calling one of XMLOutputFactory’s createXMLStreamWriter() methods, such as XMLStreamWriter createXMLStreamWriter(Writer writer). These methods throw XMLStreamException when the stream-based writer cannot be created.

The following code fragment creates a stream-based writer whose destination is a file named recipe.xml:

Writer writer = new FileWriter("recipe.xml");

XMLStreamWriter xmlsw = xmlof.createXMLStreamWriter(writer);

The low-level XMLStreamWriter interface declares several methods for writing infoset items to the destination. The following list describes a few of these methods:

void close() closes this stream-based writer and frees any associated resources. The underlying writer is not closed.
void flush() writes any cached data to the underlying writer.
void setPrefix(String prefix, String uri) identifies the namespace prefix to which the uri value is bound. This prefix is used by variants of the writeStartElement(), writeAttribute(), and writeEmptyElement() methods that take namespace arguments but not prefixes. Also, it remains valid until the writeEndElement() invocation that corresponds to the last writeStartElement() invocation. This method doesn’t create any output.
void writeAttribute(String localName, String value) writes the attribute identified by localName and having the specified value to the underlying writer. A namespace prefix isn’t included. This method escapes the &, <, >, and " characters.
void writeCharacters(String text) writes text’s characters to the underlying writer. This method escapes the &, <, and > characters.
void writeEndDocument() closes any start tags and writes corresponding end tags to the underlying writer.
void writeEndElement() writes an end tag to the underlying writer, relying on the internal state of the stream-based writer to determine the tag’s prefix and local name.
void writeNamespace(String prefix, String namespaceURI) writes a namespace to the underlying writer. This method must be called to ensure that the namespace specified by setPrefix() and duplicated in this method call is written; otherwise, the resulting document will not be well formed from a namespace perspective.
void writeStartDocument() writes the XML declaration to the underlying writer.
void writeStartElement(String namespaceURI, String localName) writes a start tag with the arguments passed to namespaceURI and localName to the underlying writer.
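
Several of the methods listed above can be tried out with a small sketch that writes to a java.io.StringWriter instead of a file. WriterSketch and write() are hypothetical names, and the writeStartElement(String localName) overload used here isn't in the preceding list, although it is part of the XMLStreamWriter interface.

```java
import java.io.StringWriter;

import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

// Hypothetical illustration class; not part of the chapter's listings.
final class WriterSketch
{
   static String write() throws XMLStreamException
   {
      StringWriter sw = new StringWriter();
      XMLStreamWriter xmlsw =
         XMLOutputFactory.newFactory().createXMLStreamWriter(sw);
      xmlsw.writeStartDocument();
      xmlsw.writeStartElement("ingredient"); // No namespace involved.
      xmlsw.writeAttribute("qty", "2");
      // The &, <, and > characters are escaped automatically.
      xmlsw.writeCharacters("bread & <butter>");
      // Closes the still-open ingredient element.
      xmlsw.writeEndDocument();
      xmlsw.flush();
      xmlsw.close(); // The StringWriter itself is not closed.
      return sw.toString();
   }

   public static void main(String[] args) throws XMLStreamException
   {
      System.out.println(write());
   }
}
```

Note that writeAttribute() must be called after writeStartElement() and before any content is written, because attributes belong to the start tag.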

Listing 4-3 presents the source code to a StAXDemo application that creates a recipe.xml file with many of Listing 1-5’s infoset items via a stream-based writer.

import java.io.FileWriter;
import java.io.IOException;

import javax.xml.stream.FactoryConfigurationError;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

import static java.lang.System.*;

public class StAXDemo
{
   final static String NS1 = "http://www.w3.org/1999/xhtml";
   final static String NS2 = "http://www.javajeff.ca/";

   public static void main(String[] args)
   {
      try
      {
         XMLOutputFactory xmlof = XMLOutputFactory.newFactory();
         FileWriter fw = new FileWriter("recipe.xml");
         XMLStreamWriter xmlsw = xmlof.createXMLStreamWriter(fw);
         xmlsw.writeStartDocument();
         xmlsw.setPrefix("h", NS1);
         xmlsw.writeStartElement(NS1, "html");
         xmlsw.writeNamespace("h", NS1);
         xmlsw.writeNamespace("r", NS2);
         xmlsw.writeStartElement(NS1, "head");
         xmlsw.writeStartElement(NS1, "title");
         xmlsw.writeCharacters("Recipe");
         xmlsw.writeEndElement(); // End h:title.
         xmlsw.writeEndElement(); // End h:head.
         xmlsw.writeStartElement(NS1, "body");
         xmlsw.setPrefix("r", NS2);
         xmlsw.writeStartElement(NS2, "recipe");
         xmlsw.writeStartElement(NS2, "title");
         xmlsw.writeCharacters("Grilled Cheese Sandwich");
         xmlsw.writeEndElement(); // End r:title.
         xmlsw.writeStartElement(NS2, "ingredients");
         xmlsw.setPrefix("h", NS1);
         xmlsw.writeStartElement(NS1, "ul");
         xmlsw.writeStartElement(NS1, "li");
         xmlsw.setPrefix("r", NS2);
         xmlsw.writeStartElement(NS2, "ingredient");
         xmlsw.writeAttribute("qty", "2");
         xmlsw.writeCharacters("bread slice");
         xmlsw.writeEndElement(); // End r:ingredient.
         xmlsw.setPrefix("h", NS1);
         xmlsw.writeEndElement(); // End h:li.
         xmlsw.setPrefix("r", NS2);
         xmlsw.writeEndElement(); // End h:ul.
         xmlsw.writeEndDocument(); // End all elements that are still open.
         xmlsw.flush();
         xmlsw.close();
      }
      catch (FactoryConfigurationError fce)
      {
         err.printf("FCE: %s%n", fce.toString());
      }
      catch (IOException ioe)
      {
         err.printf("IOE: %s%n", ioe.toString());
      }
      catch (XMLStreamException xmlse)
      {
         err.printf("XMLSE: %s%n", xmlse.toString());
      }
   }
}

Listing 4-3

StAXDemo (Version 3)

Although Listing 4-3 is fairly easy to follow, you might be somewhat confused by the duplication of namespace URIs in the setPrefix() and writeStartElement() method calls. For example, you might be wondering about the duplicate URIs in xmlsw.setPrefix("h", NS1); and its xmlsw.writeStartElement(NS1, "html"); successor.

The setPrefix() method call creates a mapping between a namespace prefix (the value) and a URI (the key) without generating any output. The writeStartElement() method call specifies the URI key, which this method uses to access the prefix value, which it then prepends (with a colon character) to the html start tag’s name before writing this tag to the underlying writer.
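
The prefix lookup just described can be isolated in a few lines. This is only a sketch: PrefixSketch and write() are hypothetical names, and the output is captured in a java.io.StringWriter so that it can be inspected.

```java
import java.io.StringWriter;

import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

// Hypothetical illustration class; not part of the chapter's listings.
final class PrefixSketch
{
   static final String NS1 = "http://www.w3.org/1999/xhtml";

   static String write() throws XMLStreamException
   {
      StringWriter sw = new StringWriter();
      XMLStreamWriter xmlsw =
         XMLOutputFactory.newFactory().createXMLStreamWriter(sw);
      // Record the URI-to-prefix mapping; nothing is written yet.
      xmlsw.setPrefix("h", NS1);
      // Look up the prefix bound to NS1 and write the h:html start tag.
      xmlsw.writeStartElement(NS1, "html");
      // Write the xmlns:h declaration so that the prefix is declared
      // in the document itself.
      xmlsw.writeNamespace("h", NS1);
      xmlsw.writeEndElement();
      xmlsw.flush();
      xmlsw.close();
      return sw.toString();
   }

   public static void main(String[] args) throws XMLStreamException
   {
      System.out.println(write());
   }
}
```

Omitting the writeNamespace() call would leave the h prefix undeclared in the output, which a namespace-aware parser would reject.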

Compile Listing 4-3 and run the resulting application. You should discover a recipe.xml file in the current directory.

Creating Documents with Event-Based Writers

An event-based writer is created by calling one of XMLOutputFactory’s createXMLEventWriter() methods, such as XMLEventWriter createXMLEventWriter(Writer writer). These methods throw XMLStreamException when the event-based writer cannot be created.

The following code fragment creates an event-based writer whose destination is a file named recipe.xml:

Writer writer = new FileWriter("recipe.xml");

XMLEventWriter xmlew = xmlof.createXMLEventWriter(writer);

The high-level XMLEventWriter interface declares the void add(XMLEvent event) method for adding events that describe infoset items to the output stream implemented by the underlying writer. Each argument passed to event is an instance of a class that implements a subinterface of XMLEvent (such as Attribute and StartElement).

To save you the trouble of implementing these interfaces, StAX provides javax.xml.stream.XMLEventFactory. This utility class declares various factory methods for creating XMLEvent subinterface implementations. For example, Comment createComment(String text) returns an object whose class implements the javax.xml.stream.events.Comment subinterface of XMLEvent.

Because these factory methods are declared abstract, you must first obtain an instance of the XMLEventFactory class. You can easily accomplish this task by invoking XMLEventFactory’s XMLEventFactory newFactory() static method, as follows:

XMLEventFactory xmlef = XMLEventFactory.newFactory();

You can then obtain an XMLEvent subinterface implementation, as follows:

XMLEvent comment = xmlef.createComment("ToDo");
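
To see the factory and an event-based writer working together on a small scale, consider the following sketch, which writes a comment and a single element to a java.io.StringWriter. EventWriterSketch, write(), and the note element are hypothetical; empty strings are passed for the prefix and namespace URI because the element isn't in a namespace.

```java
import java.io.StringWriter;

import javax.xml.stream.XMLEventFactory;
import javax.xml.stream.XMLEventWriter;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;

// Hypothetical illustration class; not part of the chapter's listings.
final class EventWriterSketch
{
   static String write() throws XMLStreamException
   {
      XMLEventFactory xmlef = XMLEventFactory.newFactory();
      StringWriter sw = new StringWriter();
      XMLEventWriter xmlew =
         XMLOutputFactory.newFactory().createXMLEventWriter(sw);
      // Each add() call appends one infoset item to the output.
      xmlew.add(xmlef.createStartDocument());
      xmlew.add(xmlef.createComment("ToDo"));
      xmlew.add(xmlef.createStartElement("", "", "note"));
      xmlew.add(xmlef.createCharacters("buy bread"));
      xmlew.add(xmlef.createEndElement("", "", "note"));
      xmlew.add(xmlef.createEndDocument());
      xmlew.flush();
      xmlew.close();
      return sw.toString();
   }

   public static void main(String[] args) throws XMLStreamException
   {
      System.out.println(write());
   }
}
```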

Listing 4-4 presents the source code to a StAXDemo application that creates a recipe.xml file with many of Listing 1-5’s infoset items via an event-based writer.

import java.io.FileWriter;
import java.io.IOException;

import java.util.Iterator;

import javax.xml.stream.FactoryConfigurationError;
import javax.xml.stream.XMLEventFactory;
import javax.xml.stream.XMLEventWriter;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;

import javax.xml.stream.events.Attribute;
import javax.xml.stream.events.Namespace;
import javax.xml.stream.events.XMLEvent;

import static java.lang.System.*;

public class StAXDemo
{
   final static String NS1 = "http://www.w3.org/1999/xhtml";
   final static String NS2 = "http://www.javajeff.ca/";

   public static void main(String[] args)
   {
      try
      {
         XMLOutputFactory xmlof = XMLOutputFactory.newFactory();
         FileWriter fw = new FileWriter("recipe.xml");
         XMLEventWriter xmlew;
         xmlew = xmlof.createXMLEventWriter(fw);
         final XMLEventFactory xmlef = XMLEventFactory.newFactory();
         XMLEvent event = xmlef.createStartDocument();
         xmlew.add(event);
         // Iterator that supplies the root element's namespace declarations.
         Iterator<Namespace> nsIter;
         nsIter = new Iterator<Namespace>()
         {
            int index = 0;
            Namespace[] ns;
            {
               ns = new Namespace[2];
               ns[0] = xmlef.createNamespace("h", NS1);
               ns[1] = xmlef.createNamespace("r", NS2);
            }

            @Override
            public boolean hasNext()
            {
               return index != 2;
            }

            @Override
            public Namespace next()
            {
               return ns[index++];
            }

            @Override
            public void remove()
            {
               throw new UnsupportedOperationException();
            }
         };
         event = xmlef.createStartElement("h", NS1, "html", null, nsIter);
         xmlew.add(event);
         event = xmlef.createStartElement("h", NS1, "head");
         xmlew.add(event);
         event = xmlef.createStartElement("h", NS1, "title");
         xmlew.add(event);
         event = xmlef.createCharacters("Recipe");
         xmlew.add(event);
         event = xmlef.createEndElement("h", NS1, "title");
         xmlew.add(event);
         event = xmlef.createEndElement("h", NS1, "head");
         xmlew.add(event);
         event = xmlef.createStartElement("h", NS1, "body");
         xmlew.add(event);
         event = xmlef.createStartElement("r", NS2, "recipe");
         xmlew.add(event);
         event = xmlef.createStartElement("r", NS2, "title");
         xmlew.add(event);
         event = xmlef.createCharacters("Grilled Cheese Sandwich");
         xmlew.add(event);
         event = xmlef.createEndElement("r", NS2, "title");
         xmlew.add(event);
         event = xmlef.createStartElement("r", NS2, "ingredients");
         xmlew.add(event);
         event = xmlef.createStartElement("h", NS1, "ul");
         xmlew.add(event);
         event = xmlef.createStartElement("h", NS1, "li");
         xmlew.add(event);
         // Iterator that supplies the ingredient element's attributes.
         Iterator<Attribute> attrIter;
         attrIter = new Iterator<Attribute>()
         {
            int index = 0;
            Attribute[] attrs;
            {
               attrs = new Attribute[1];
               attrs[0] = xmlef.createAttribute("qty", "2");
            }

            @Override
            public boolean hasNext()
            {
               return index != 1;
            }

            @Override
            public Attribute next()
            {
               return attrs[index++];
            }

            @Override
            public void remove()
            {
               throw new UnsupportedOperationException();
            }
         };
         event = xmlef.createStartElement("r", NS2, "ingredient", attrIter,
                                          null);
         xmlew.add(event);
         event = xmlef.createCharacters("bread slice");
         xmlew.add(event);
         event = xmlef.createEndElement("r", NS2, "ingredient");
         xmlew.add(event);
         event = xmlef.createEndElement("h", NS1, "li");
         xmlew.add(event);
         event = xmlef.createEndElement("h", NS1, "ul");
         xmlew.add(event);
         event = xmlef.createEndElement("r", NS2, "ingredients");
         xmlew.add(event);
         event = xmlef.createEndElement("r", NS2, "recipe");
         xmlew.add(event);
         event = xmlef.createEndElement("h", NS1, "body");
         xmlew.add(event);
         event = xmlef.createEndElement("h", NS1, "html");
         xmlew.add(event);
         xmlew.flush();
         xmlew.close();
      }
      catch (FactoryConfigurationError fce)
      {
         err.printf("FCE: %s%n", fce.toString());
      }
      catch (IOException ioe)
      {
         err.printf("IOE: %s%n", ioe.toString());
      }
      catch (XMLStreamException xmlse)
      {
         err.printf("XMLSE: %s%n", xmlse.toString());
      }
   }
}

Listing 4-4

StAXDemo (Version 4)

Listing 4-4 should be fairly easy to follow; it’s the event-based equivalent of Listing 4-3. Notice that this listing includes the creation of java.util.Iterator instances from anonymous classes that implement this interface. These iterators are created to pass namespaces or attributes to XMLEventFactory’s StartElement createStartElement(String prefix, String namespaceUri, String localName, Iterator<? extends Attribute> attributes, Iterator<? extends Namespace> namespaces) method. (You can pass null to either iterator parameter when it isn’t applicable; for example, to attributes when the start tag has no attributes.)
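
As a possible shortcut, and not the approach taken in Listing 4-4, the iterators can be obtained from the collections framework instead of from anonymous classes. The following sketch builds them with java.util.Arrays.asList(); IteratorSketch and its methods are hypothetical names, and the result is checked by parsing the generated text back.

```java
import java.io.StringReader;
import java.io.StringWriter;

import java.util.Arrays;
import java.util.Iterator;

import javax.xml.stream.XMLEventFactory;
import javax.xml.stream.XMLEventWriter;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

import javax.xml.stream.events.Attribute;
import javax.xml.stream.events.Namespace;

// Hypothetical illustration class; not part of the chapter's listings.
final class IteratorSketch
{
   static final String NS2 = "http://www.javajeff.ca/";

   static String write() throws XMLStreamException
   {
      XMLEventFactory xmlef = XMLEventFactory.newFactory();
      // A list's iterator replaces each hand-written anonymous class.
      Iterator<Attribute> attrs =
         Arrays.asList(xmlef.createAttribute("qty", "2")).iterator();
      Iterator<Namespace> nss =
         Arrays.asList(xmlef.createNamespace("r", NS2)).iterator();

      StringWriter sw = new StringWriter();
      XMLEventWriter xmlew =
         XMLOutputFactory.newFactory().createXMLEventWriter(sw);
      xmlew.add(xmlef.createStartElement("r", NS2, "ingredient",
                                         attrs, nss));
      xmlew.add(xmlef.createCharacters("bread slice"));
      xmlew.add(xmlef.createEndElement("r", NS2, "ingredient"));
      xmlew.flush();
      xmlew.close();
      return sw.toString();
   }

   // Parses the text back and returns the root element's qty attribute.
   static String qty(String xml) throws XMLStreamException
   {
      XMLStreamReader xmlsr = XMLInputFactory.newFactory()
         .createXMLStreamReader(new StringReader(xml));
      xmlsr.nextTag(); // Advance to the root element's start tag.
      String value = xmlsr.getAttributeValue(null, "qty");
      xmlsr.close();
      return value;
   }

   public static void main(String[] args) throws XMLStreamException
   {
      String xml = write();
      System.out.println(xml);
      System.out.println(qty(xml));
   }
}
```

Keep in mind that an iterator can be traversed only once, so a fresh iterator is needed for each createStartElement() call.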

Compile Listing 4-4 and run the resulting application. You should discover a recipe.xml file in the current directory.

XMLEventWriter declares a void add(XMLEventReader reader) convenience method for adding a stream of input events to an output stream in one method call. Listing 4-5 presents the source code to a Copy application that uses this method to copy an XML file to another XML file.

import java.io.FileReader;
import java.io.FileWriter;

import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLEventWriter;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLOutputFactory;

import static java.lang.System.*;

public class Copy
{
   public static void main(String[] args) throws Exception
   {
      if (args.length != 2)
      {
         err.println("usage: java Copy xmlfile1 xmlfile2");
         return;
      }
      XMLInputFactory xmlif = XMLInputFactory.newFactory();
      FileReader fr = new FileReader(args[0]);
      XMLEventReader xmler = xmlif.createXMLEventReader(fr);
      XMLOutputFactory xmlof = XMLOutputFactory.newFactory();
      FileWriter fw = new FileWriter(args[1]);
      XMLEventWriter xmlew;
      xmlew = xmlof.createXMLEventWriter(fw);
      // Transfer every event from the reader to the writer in one call.
      xmlew.add(xmler);
      xmlew.flush();
      xmlew.close();
   }
}

Listing 4-5

Copy

For brevity, I added a throws Exception clause to main()’s header.

Compile Listing 4-5 as follows:

javac Copy.java

Run the resulting application to copy a recipe.xml file to a _recipe.xml backup file:

java Copy recipe.xml _recipe.xml

If the source XML file doesn’t have an XML declaration, it will be added to the destination XML file.

Exercises

The following exercises are designed to test your understanding of Chapter 4’s content:

1.
Define StAX.
2.
What packages make up the StAX API?
3.
True or false: A stream-based reader extracts the next infoset item from an input stream by obtaining an event.
4.
How do you obtain a document reader? How do you obtain a document writer?
5.
What does a document writer do when you call XMLOutputFactory’s void setProperty(String name, Object value) method with XMLOutputFactory.IS_REPAIRING_NAMESPACES as the property name and true as the value?
6.
Create a ParseXMLDoc application that uses a StAX stream-based reader to parse its single command-line argument, an XML document. After creating this reader, the application should verify that a START_DOCUMENT infoset item has been detected, and then enter a loop that reads the next item and uses a switch statement to output a message corresponding to the item that has been read: ATTRIBUTE, CDATA, CHARACTERS, COMMENT, DTD, END_ELEMENT, ENTITY_DECLARATION, ENTITY_REFERENCE, NAMESPACE, NOTATION_DECLARATION, PROCESSING_INSTRUCTION, SPACE, or START_ELEMENT. When START_ELEMENT is detected, output this element’s name and local name, and output the local names and values of all attributes. The loop ends when the END_DOCUMENT infoset item has been detected. Explicitly close the stream reader followed by the file reader upon which it’s based. Test this application with Exercise 1-21’s books.xml file.

Summary

StAX is a Java API for parsing an XML document sequentially from start to finish and also for creating XML documents. Java implements StAX through types stored in the javax.xml.stream, javax.xml.stream.events, and javax.xml.stream.util packages.

StAX parsers are known as document readers, and StAX document creators are known as document writers. StAX classifies document readers and document writers as stream-based or event-based.

Document readers are obtained by calling the various “create” methods that are declared in the XMLInputFactory class. Document writers are obtained by calling the various “create” methods that are declared in the XMLOutputFactory class.

The low-level XMLStreamReader interface offers the most efficient way to read XML data with StAX. This interface’s boolean hasNext() method returns true when there is a next infoset item to obtain; otherwise, it returns false. The int next() method advances the cursor by one infoset item and returns an integer code that identifies this item’s type.

The low-level XMLStreamWriter interface declares several methods for writing infoset items to the destination. Examples include void writeAttribute(String localName, String value) and void writeCharacters(String text).

The high-level XMLEventReader interface offers a somewhat less efficient but more object-oriented way to read XML data with StAX. This interface’s boolean hasNext() method returns true when there is an event to obtain; otherwise, it returns false. The XMLEvent nextEvent() method returns the next event as an object whose class implements a subinterface of the XMLEvent interface.

Chapter 5 introduces Java’s XPath API for simplifying DOM node access.
