6.2. Turning XML Documents into Objects

Problem

You need to parse an XML document into an object graph, and you would like to avoid using either the DOM or SAX APIs directly.

Solution

Use the Commons Digester to transform an XML document into an object graph. Digester lets you map an XML document structure to an object model by listing, in an external XML file, a set of rules that tell it what to do when specific elements are encountered. In this recipe, the following XML document containing a description of a play will be parsed into an object graph:

<?xml version="1.0"?>

<plays>
  <play genre="tragedy" year="1603" language="english">
    <name>Hamlet</name>
    <author>William Shakespeare</author>
    <summary>
      Prince of Denmark freaks out, talks to ghost, gets into a
      crazy nihilistic funk, and dies in a duel.
    </summary>
    <characters>
      <character protagonist="false">
        <name>Claudius</name>
        <descr>King of Denmark</descr>
      </character>
      <character protagonist="true">
        <name>Hamlet</name>
        <descr>
          Son to the late, and nephew of the present king
        </descr>
      </character>
      <character protagonist="false">
        <name>Horatio</name>
        <descr>
          friend to Hamlet
        </descr>
      </character>
    </characters>
  </play>
</plays>

This XML document contains a list of play elements describing plays by William Shakespeare. One play element describes “Hamlet”; it includes a name, author, and summary element as well as a characters element containing character elements describing characters in the play. After parsing a document with Digester, each play element will be represented by a Play object with a set of properties and a List of Character objects:

import java.util.ArrayList;
import java.util.List;

public class Play {
    private String genre;
    private String year;
    private String language;
    private String name;
    private String author;
    private String summary;
    private List characters = new ArrayList( );

    // accessors omitted for brevity

    // Add method to support adding elements to characters.
    public void addCharacter(Character character) {
        characters.add( character );
    }
}

public class Character {
    private String name;
    private String description;
    private boolean protagonist;

    // accessors omitted for brevity
}
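
The accessors omitted above are what allow property values to reach these beans: Digester's property rules populate an object through its JavaBean-style setters. The following is a minimal sketch of that idea using only plain reflection. It is not how Digester is implemented internally, and the PropertySketch class and its setProperty( ) helper are hypothetical names introduced for illustration. It shows a string taken from the XML being handed to a setter, with a simple conversion for the boolean protagonist property:

```java
import java.lang.reflect.Method;
import java.util.Locale;

// Hypothetical illustration; not part of Commons Digester or BeanUtils.
class PropertySketch {

    // Stand-in for the recipe's Character bean, with its accessors spelled out.
    public static class Character {
        private String name;
        private String description;
        private boolean protagonist;

        public String getName() { return name; }
        public void setName(String name) { this.name = name; }
        public String getDescription() { return description; }
        public void setDescription(String description) { this.description = description; }
        public boolean isProtagonist() { return protagonist; }
        public void setProtagonist(boolean protagonist) { this.protagonist = protagonist; }
    }

    // Finds a public one-argument setter for the property and invokes it,
    // converting the raw XML string to a boolean when the setter expects one.
    // Returns false when the bean has no such setter.
    static boolean setProperty(Object bean, String property, String raw) {
        String setter = "set" + property.substring(0, 1).toUpperCase(Locale.ROOT)
                + property.substring(1);
        for (Method m : bean.getClass().getMethods()) {
            if (m.getName().equals(setter) && m.getParameterCount() == 1) {
                Class<?> type = m.getParameterTypes()[0];
                Object value = (type == boolean.class) ? Boolean.valueOf(raw) : raw;
                try {
                    m.invoke(bean, value);
                } catch (ReflectiveOperationException e) {
                    throw new IllegalStateException(e);
                }
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Character hamlet = new Character();
        setProperty(hamlet, "protagonist", "true"); // value from an attribute
        setProperty(hamlet, "name", "Hamlet");      // value from an element body
        System.out.println(hamlet.getName() + " protagonist=" + hamlet.isProtagonist());
    }
}
```

The practical consequence for this recipe is that Play and Character each need a public setter for every property the rules are expected to fill in.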

The Digester maps XML to objects using a set of rules. You can define these rules in an XML file, or you can construct them programmatically by creating instances of Rule and adding them to an instance of Digester. This recipe uses an XML file to create a set of rules that tell the Digester how to translate an XML document to a List of Play objects:

<?xml version="1.0"?>

<digester-rules>
  <pattern value="plays/play">
    <object-create-rule classname="xml.digester.Play"/>
    <set-next-rule methodname="add" paramtype="java.lang.Object"/>
    <set-properties-rule/>
    <bean-property-setter-rule pattern="name"/>
    <bean-property-setter-rule pattern="summary"/>
    <bean-property-setter-rule pattern="author"/>
    
    <!-- Nested Pattern for Characters -->
    <pattern value="characters/character">
      <object-create-rule classname="xml.digester.Character"/>
      <set-next-rule methodname="addCharacter" 
                     paramtype="xml.digester.Character"/>
      <set-properties-rule/>
      <bean-property-setter-rule pattern="name"/>
      <bean-property-setter-rule pattern="descr" 
                                 propertyname="description"/>
    </pattern>
  
  </pattern>
</digester-rules>

This mapping document, or rule set, can be read in plain language. It tells Digester how to deal with the document: “When you see an element matching the pattern plays/play, create an instance of xml.digester.Play (object-create-rule), push it onto a Stack, and set some properties. If you encounter an element within a play element that matches characters/character, create an instance of xml.digester.Character, set some properties, and add it to the Play object.” The following code creates an instance of Digester from the XML rule set shown previously, producing a plays List, which contains one Play object:

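import java.io.InputStream;
import java.net.URL;
import java.util.ArrayList;
import java.util.List;

import org.apache.commons.digester.Digester;
import org.apache.commons.digester.xmlrules.DigesterLoader;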

List plays = new ArrayList( );

// Create an instance of the Digester from the XML rule set
URL rules = getClass( ).getResource("./play-rules.xml");
Digester digester = DigesterLoader.createDigester(rules);

// Push a reference to the plays List on to the Stack
digester.push(plays);

// Parse the XML document
InputStream input = getClass( ).getResourceAsStream("./plays.xml");
Object root = digester.parse(input);

// The XML document contained one play "Hamlet"
Play hamlet = (Play) plays.get(0);
List characters = (List) hamlet.getCharacters( );

Discussion

Digester is simple, but there is one concept you need to understand: Digester uses a Stack to relate objects to one another. In the previous example, set-next-rule tells the Digester to relate the top of the Stack to the next-to-top of the Stack. Before the XML document is parsed, a List is pushed onto the Stack. Every time the Digester encounters a play element, it creates an instance of Play, pushes it onto the top of the Stack, and calls add( ) on the object next to the top of the Stack, passing the Play as the argument. Since the List is next to the top of the Stack, the Digester is simply adding the Play to the plays List.

Within the pattern element matching plays/play, there is another pattern element matching characters/character. When an element matching characters/character is encountered, a Character object is created and pushed onto the top of the Stack, and the addCharacter( ) method is called on the object next to the top of the Stack. When the Character object is on top of the Stack, the Play object is next to the top; therefore, the call to addCharacter( ) adds the Character to the List of Character objects in the Play object.
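
The stack mechanics described above can be sketched without Digester at all. The following is an illustration under stated assumptions, not Digester source code: SetNextSketch, setNext( ), and run( ) are hypothetical names, and the nested Play and Character classes are cut-down stand-ins for the recipe's beans. In this sketch the parent method is invoked just before the child is popped, when the element closes:

```java
import java.lang.reflect.Method;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical illustration of the stack mechanics; not Digester source code.
class SetNextSketch {

    public static class Character {
        private final String name;
        public Character(String name) { this.name = name; }
        public String getName() { return name; }
    }

    public static class Play {
        private final List<Character> characters = new ArrayList<>();
        public void addCharacter(Character character) { characters.add(character); }
        public List<Character> getCharacters() { return characters; }
    }

    // What set-next-rule amounts to: invoke methodName on the next-to-top
    // object, passing the top object as the single argument.
    static void setNext(Deque<Object> stack, String methodName, Class<?> paramType) {
        Object child = stack.pop();    // top of the stack
        Object parent = stack.peek();  // next-to-top
        stack.push(child);             // leave the stack as we found it
        try {
            Method m = parent.getClass().getMethod(methodName, paramType);
            m.invoke(parent, child);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    // Replays the events for one play element containing one character element.
    static List<Object> run() {
        List<Object> plays = new ArrayList<>();
        Deque<Object> stack = new ArrayDeque<>();
        stack.push(plays);                                // like digester.push(plays)

        stack.push(new Play());                           // <play>
        stack.push(new Character("Claudius"));            // <character>
        setNext(stack, "addCharacter", Character.class);  // parent is the Play
        stack.pop();                                      // </character>
        setNext(stack, "add", Object.class);              // parent is the List
        stack.pop();                                      // </play>
        return plays;
    }

    public static void main(String[] args) {
        Play play = (Play) run().get(0);
        System.out.println(play.getCharacters().get(0).getName());
    }
}
```

The same setNext( ) call does different work depending only on what sits beneath the new object, which is why the nesting of patterns in the rule set determines how the objects end up related.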

Digester can be summed up as follows: define patterns to be matched and a sequence of actions (rules) to take when these patterns are encountered. Digester is essentially shorthand for your own SAX parser, letting you accomplish the same task without having to deal with the complexity of the SAX API. If you look at the source for the org.apache.commons.digester.Digester class, you see that it extends org.xml.sax.helpers.DefaultHandler and that a call to parse( ) causes Digester to register itself as a content handler on an instance of org.xml.sax.XMLReader. Digester is simply a lightweight shell around SAX, and, because of this, you can parse XML about as fast with the Digester as with a system written to the SAX API.
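
To see what Digester saves you from writing, here is a rough sketch of a hand-coded SAX handler that covers only a fraction of this recipe's mapping (the genre attribute, the play's name and author, and the character names). HandWrittenPlayParser, PlayHandler, and the simplified Play class are hypothetical names for illustration, and only JDK classes are used:

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical illustration; a hand-written alternative to the rule set.
class HandWrittenPlayParser {

    // Simplified stand-in for the recipe's Play bean.
    public static class Play {
        public String genre;
        public String name;
        public String author;
        public final List<String> characterNames = new ArrayList<>();
    }

    static class PlayHandler extends DefaultHandler {
        final List<Play> plays = new ArrayList<>();
        private final StringBuilder text = new StringBuilder();
        private Play current;
        private boolean inCharacter;

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            text.setLength(0); // start collecting this element's body text
            if ("play".equals(qName)) {
                current = new Play();
                current.genre = atts.getValue("genre");
            } else if ("character".equals(qName)) {
                inCharacter = true;
            }
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            text.append(ch, start, length);
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            String value = text.toString().trim();
            if ("name".equals(qName)) {
                // The same element name means different things at different depths.
                if (inCharacter) current.characterNames.add(value);
                else current.name = value;
            } else if ("author".equals(qName)) {
                current.author = value;
            } else if ("character".equals(qName)) {
                inCharacter = false;
            } else if ("play".equals(qName)) {
                plays.add(current);
            }
        }
    }

    static List<Play> parse(String xml) {
        PlayHandler handler = new PlayHandler();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return handler.plays;
    }

    public static void main(String[] args) {
        String xml = "<plays><play genre=\"tragedy\"><name>Hamlet</name>"
                + "<characters><character><name>Claudius</name></character>"
                + "</characters></play></plays>";
        Play play = parse(xml).get(0);
        System.out.println(play.name + " " + play.characterNames);
    }
}
```

Even in this reduced form, the handler has to track its own position in the document (the inCharacter flag) to tell the two name elements apart. The path-based patterns in a Digester rule set express the same distinction declaratively.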

Digester rule sets can be defined in an external XML document or programmatically in compiled Java code, but the general rules are the same. The following code recreates the XML rule set shown earlier:

import org.apache.commons.digester.BeanPropertySetterRule;
import org.apache.commons.digester.Digester;
import org.apache.commons.digester.ObjectCreateRule;
import org.apache.commons.digester.Rules;
import org.apache.commons.digester.SetNextRule;
import org.apache.commons.digester.SetPropertiesRule;

Digester digester = new Digester( );

Rules rules = digester.getRules( );

// Add Rules to parse a play element
rules.add( "plays/play", new ObjectCreateRule("xml.digester.Play"));
rules.add( "plays/play", new SetNextRule("add", "java.lang.Object") );
rules.add( "plays/play", new SetPropertiesRule( ) );
rules.add( "plays/play/name", new BeanPropertySetterRule("name") );
rules.add( "plays/play/summary", new BeanPropertySetterRule("summary") );
rules.add( "plays/play/author", new BeanPropertySetterRule("author") );

// Add Rules to parse a character element
rules.add( "plays/play/characters/character", 
           new ObjectCreateRule("xml.digester.Character"));
rules.add( "plays/play/characters/character", 
           new SetNextRule("addCharacter", "xml.digester.Character"));
rules.add( "plays/play/characters/character", new SetPropertiesRule( ) );
rules.add( "plays/play/characters/character/name", 
           new BeanPropertySetterRule("name") );
rules.add( "plays/play/characters/character/descr", 
           new BeanPropertySetterRule("description") );

While this is perfectly acceptable, think twice about defining Digester rule sets programmatically. Defining rule sets in an XML document provides a very clear separation between the framework used to parse XML and the configuration of the Digester. When your rule sets are separate from compiled code, it will be easier to update and maintain logic involved in parsing; a change in the XML document structure would not involve changing code that deals with parsing. Instead, you would change the model and the mapping document. Defining Digester rule sets in an XML document is a relatively new Digester feature, and, because of this, you may find that some of the more advanced capabilities of Digester demonstrated later in this chapter are not available when defining rule sets in XML.

See Also

More information about Digester XML rule sets can be found in the package document for org.apache.commons.digester.xmlrules (http://jakarta.apache.org/commons/digester/apidocs/index.html).
