You need to parse an XML document into an object graph, and you would like to avoid using either the DOM or SAX APIs directly.
Use the Commons Digester to transform an XML document into an object graph. The Digester lets you map an XML document structure to an object model by way of an external XML file containing a set of rules that tell the Digester what to do when specific elements are encountered. In this recipe, the following XML document describing a play will be parsed into an object graph:
    <?xml version="1.0"?>
    <plays>
      <play genre="tragedy" year="1603" language="english">
        <name>Hamlet</name>
        <author>William Shakespeare</author>
        <summary>
          Prince of Denmark freaks out, talks to ghost, gets into a
          crazy nihilistic funk, and dies in a duel.
        </summary>
        <characters>
          <character protagonist="false">
            <name>Claudius</name>
            <descr>King of Denmark</descr>
          </character>
          <character protagonist="true">
            <name>Hamlet</name>
            <descr>
              Son to the late, and nephew of the present king
            </descr>
          </character>
          <character protagonist="false">
            <name>Horatio</name>
            <descr>
              friend to Hamlet
            </descr>
          </character>
        </characters>
      </play>
    </plays>
This XML document contains a list of play elements describing plays by William Shakespeare. One play element describes “Hamlet”; it includes a name, author, and summary element as well as a characters element containing character elements describing characters in the play. After parsing a document with Digester, each play element will be represented by a Play object with a set of properties and a List of Character objects:
    import java.util.ArrayList;
    import java.util.List;

    public class Play {
        private String genre;
        private String year;
        private String language;
        private String name;
        private String author;
        private String summary;
        private List characters = new ArrayList( );

        // accessors omitted for brevity

        // Add method to support adding elements to characters.
        public void addCharacter(Character character) {
            characters.add( character );
        }
    }

    public class Character {
        private String name;
        private String description;
        private boolean protagonist;

        // accessors omitted for brevity
    }
The Digester maps XML to objects using a set of rules, which can be defined in an XML file or constructed programmatically by creating instances of Rule and adding them to an instance of Digester. This recipe uses an XML file to create a set of rules that tell the Digester how to translate an XML document to a List of Play objects:
    <?xml version="1.0"?>
    <digester-rules>
      <pattern value="plays/play">
        <object-create-rule classname="xml.digester.Play"/>
        <set-next-rule methodname="add" paramtype="java.lang.Object"/>
        <set-properties-rule/>
        <bean-property-setter-rule pattern="name"/>
        <bean-property-setter-rule pattern="summary"/>
        <bean-property-setter-rule pattern="author"/>

        <!-- Nested Pattern for Characters -->
        <pattern value="characters/character">
          <object-create-rule classname="xml.digester.Character"/>
          <set-next-rule methodname="addCharacter"
                         paramtype="xml.digester.Character"/>
          <set-properties-rule/>
          <bean-property-setter-rule pattern="name"/>
          <bean-property-setter-rule pattern="descr"
                                     propertyname="description"/>
        </pattern>
      </pattern>
    </digester-rules>
This mapping document (or rule set) can be explained in very straightforward language. It tells Digester how to deal with the document: “When you see an element matching the pattern plays/play, create an instance of xml.digester.Play, set some properties, and push it onto a Stack (object-create-rule). If you encounter an element within a play element that matches characters/character, create an instance of xml.digester.Character, set some properties, and add it to the Play object.” The following code creates an instance of Digester from the XML rule set shown previously and uses it to fill a plays List, which ends up containing one Play object:
    import java.io.InputStream;
    import java.net.URL;
    import java.util.ArrayList;
    import java.util.List;

    import org.apache.commons.digester.Digester;
    import org.apache.commons.digester.xmlrules.DigesterLoader;

    List plays = new ArrayList( );

    // Create an instance of the Digester from the XML rule set
    URL rules = getClass( ).getResource("./play-rules.xml");
    Digester digester = DigesterLoader.createDigester(rules);

    // Push a reference to the plays List on to the Stack
    digester.push(plays);

    // Parse the XML document
    InputStream input = getClass( ).getResourceAsStream("./plays.xml");
    Object root = digester.parse(input);

    // The XML document contained one play "Hamlet"
    Play hamlet = (Play) plays.get(0);
    List characters = (List) hamlet.getCharacters( );
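The “set some properties” step depends on the JavaBean setters that were omitted from Play and Character. As a rough illustration of the idea only, the following sketch copies attribute values onto same-named bean properties using the JDK's java.beans introspection. The SetPropertiesSketch class, its Role bean, and the setProperties( ) helper are hypothetical names invented for this example; this is not how Digester is implemented, and the sketch handles only String and boolean properties.

```java
import java.beans.IntrospectionException;
import java.beans.Introspector;
import java.beans.PropertyDescriptor;
import java.util.Map;

// Hypothetical illustration of attribute-to-property mapping; not Digester code.
class SetPropertiesSketch {

    // Hypothetical bean standing in for xml.digester.Character.
    public static class Role {
        private String name;
        private boolean protagonist;

        public String getName() { return name; }
        public void setName(String name) { this.name = name; }
        public boolean isProtagonist() { return protagonist; }
        public void setProtagonist(boolean protagonist) { this.protagonist = protagonist; }
    }

    // Copy each attribute onto the writable bean property with the same name.
    // Attributes without a matching property are skipped in this sketch.
    static void setProperties(Object bean, Map<String, String> attributes) {
        try {
            for (PropertyDescriptor pd
                    : Introspector.getBeanInfo(bean.getClass()).getPropertyDescriptors()) {
                String raw = attributes.get(pd.getName());
                if (raw == null || pd.getWriteMethod() == null) {
                    continue;
                }
                // Only boolean and String targets are converted here (an assumption
                // made to keep the example short).
                Object value = pd.getPropertyType() == boolean.class
                        ? (Object) Boolean.valueOf(raw)
                        : raw;
                pd.getWriteMethod().invoke(bean, value);
            }
        } catch (ReflectiveOperationException | IntrospectionException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Without a public setProtagonist( ) method on the bean, there would be nothing for the protagonist attribute to be written to, which is why the accessors matter even though the listings omit them.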
Digester is simple, but there is one concept you need to understand: Digester uses a Stack to relate objects to one another. In the previous example, set-next-rule tells the Digester to relate the top of the Stack to the next-to-top of the Stack. Before the XML document is parsed, a List is pushed onto the Stack. Every time the Digester encounters a play element, it will create an instance of Play, push it onto the top of the Stack, and call add( ) with Play as an argument on the object next to the top of the Stack. Since the List is next to the top of the Stack, the Digester is simply adding the Play to the plays List. Within the pattern element matching plays/play, there is another pattern element matching characters/character. When an element matching characters/character is encountered, a Character object is created, pushed onto the top of the Stack, and the addCharacter( ) method is called on the next to top of the Stack. When the Character object is pushed onto the top of the Stack, the Play object is next to the top of the Stack; therefore, the call to addCharacter( ) adds a Character to the List of Character objects in the Play object.
Digester can be summed up as follows: define patterns to be matched and a sequence of actions (rules) to take when these patterns are encountered. Digester is essentially shorthand for your own SAX parser, letting you accomplish the same task without having to deal with the complexity of the SAX API. If you look at the source for the org.apache.commons.digester.Digester class, you see that it extends org.xml.sax.helpers.DefaultHandler and that a call to parse( ) causes Digester to register itself as a content handler on an instance of org.xml.sax.XMLReader. Digester is simply a lightweight shell around SAX, and, because of this, you can parse XML with the Digester nearly as fast as with a system written directly to the SAX API.
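To get a feel for what Digester saves you from writing, here is a rough sketch of a hand-rolled SAX handler that extracts only the genre, the play name, and the character names from a document shaped like the one in this recipe. It uses only the JDK's SAX classes; SaxPlaySketch, its simplified Play class, and parse( ) are hypothetical names chosen for this illustration, and the path matching is a deliberately naive imitation of Digester patterns.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical hand-written handler; it covers only a small part of the rule set.
class SaxPlaySketch extends DefaultHandler {

    static class Play {
        String genre;
        String name;
        final List<String> characters = new ArrayList<>();
    }

    private final List<Play> plays = new ArrayList<>();
    private final StringBuilder text = new StringBuilder();
    private String path = "";   // path of open elements, e.g. "plays/play/name"
    private Play current;

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attrs) {
        path = path.isEmpty() ? qName : path + "/" + qName;
        text.setLength(0);
        if (path.equals("plays/play")) {
            current = new Play();
            current.genre = attrs.getValue("genre");
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        text.append(ch, start, length);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        String value = text.toString().trim();
        if (path.equals("plays/play/name")) {
            current.name = value;
        } else if (path.equals("plays/play/characters/character/name")) {
            current.characters.add(value);
        } else if (path.equals("plays/play")) {
            plays.add(current);
        }
        // Drop the last path segment now that the element is closed.
        int slash = path.lastIndexOf('/');
        path = slash < 0 ? "" : path.substring(0, slash);
    }

    // Parse an XML string and return the plays found in it.
    static List<Play> parse(String xml) {
        try {
            SaxPlaySketch handler = new SaxPlaySketch();
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
            return handler.plays;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }
}
```

Every new element or property means another branch in the handler, whereas with Digester the same change is usually one more rule in the rule set.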
Digester rule sets can be defined in an external XML document or programmatically in compiled Java code; the rules themselves are the same either way. The following code recreates the previous XML rule set programmatically:
    import org.apache.commons.digester.BeanPropertySetterRule;
    import org.apache.commons.digester.Digester;
    import org.apache.commons.digester.ObjectCreateRule;
    import org.apache.commons.digester.Rules;
    import org.apache.commons.digester.SetNextRule;
    import org.apache.commons.digester.SetPropertiesRule;

    Digester digester = new Digester( );
    Rules rules = digester.getRules( );

    // Add Rules to parse a play element
    rules.add( "plays/play", new ObjectCreateRule("xml.digester.Play"));
    rules.add( "plays/play", new SetNextRule("add", "java.lang.Object") );
    rules.add( "plays/play", new SetPropertiesRule( ) );
    rules.add( "plays/play/name", new BeanPropertySetterRule("name") );
    rules.add( "plays/play/summary", new BeanPropertySetterRule("summary") );
    rules.add( "plays/play/author", new BeanPropertySetterRule("author") );

    // Add Rules to parse a character element
    rules.add( "plays/play/characters/character",
               new ObjectCreateRule("xml.digester.Character"));
    rules.add( "plays/play/characters/character",
               new SetNextRule("addCharacter", "xml.digester.Character"));
    rules.add( "plays/play/characters/character", new SetPropertiesRule( ) );
    rules.add( "plays/play/characters/character/name",
               new BeanPropertySetterRule("name") );
    // The descr element maps to the description property, as in the XML rule set
    rules.add( "plays/play/characters/character/descr",
               new BeanPropertySetterRule("description") );
While this is perfectly acceptable, think twice about defining Digester rule sets programmatically. Defining rule sets in an XML document provides a very clear separation between the framework used to parse XML and the configuration of the Digester. When your rule sets are separate from compiled code, it will be easier to update and maintain logic involved in parsing; a change in the XML document structure would not involve changing code that deals with parsing. Instead, you would change the model and the mapping document. Defining Digester rule sets in an XML document is a relatively new Digester feature, and, because of this, you may find that some of the more advanced capabilities of Digester demonstrated later in this chapter are not available when defining rule sets in XML.
More information about Digester XML rule sets can be found in the package documentation for org.apache.commons.digester.xmlrules (http://jakarta.apache.org/commons/digester/apidocs/index.html).