6.3. Namespace-Aware Parsing

Problem

You need to parse an XML document with multiple namespaces.

Solution

Use Digester to parse XML with multiple namespaces: call digester.setNamespaceAware(true), and supply one RuleSet object for each namespace. Consider the following document, which contains elements from two namespaces, http://discursive.com/page and http://discursive.com/person:

<?xml version="1.0"?>

<pages xmlns="http://discursive.com/page"
       xmlns:person="http://discursive.com/person">
  <page type="standard">
    <person:person firstName="Al" lastName="Gore">
      <person:role>Co-author</person:role> 
    </person:person>
    <person:person firstName="George" lastName="Bush">
      <person:role>Co-author</person:role> 
    </person:person>
  </page>
</pages>
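
Digester is built on SAX, so it can help to see what a namespace-aware SAX parser reports for this document. The following is a minimal sketch that uses only JDK classes, not Digester. The class name NamespaceDemo and the helper elementNames( ) are made up for this illustration, and the document is abbreviated to one person:

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Illustrative only: NamespaceDemo is not part of Digester or of the recipe.
class NamespaceDemo {
    // Abbreviated copy of the recipe's document
    static final String XML =
        "<pages xmlns=\"http://discursive.com/page\"\n" +
        "       xmlns:person=\"http://discursive.com/person\">\n" +
        "  <page type=\"standard\">\n" +
        "    <person:person firstName=\"Al\" lastName=\"Gore\">\n" +
        "      <person:role>Co-author</person:role>\n" +
        "    </person:person>\n" +
        "  </page>\n" +
        "</pages>";

    // Returns one "{namespaceURI}localName" entry per start tag, in document order
    static List<String> elementNames(String xml) {
        final List<String> names = new ArrayList<String>();
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true); // without this, uri and localName may be empty
            SAXParser parser = factory.newSAXParser();
            parser.parse(new InputSource(new StringReader(xml)), new DefaultHandler() {
                @Override
                public void startElement(String uri, String localName,
                                         String qName, Attributes attrs) {
                    names.add("{" + uri + "}" + localName);
                }
            });
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse document", e);
        }
        return names;
    }

    public static void main(String[] args) {
        for (String name : elementNames(XML)) {
            System.out.println(name);
        }
    }
}
```

The pages and page elements are reported in the http://discursive.com/page namespace, and the person and role elements in http://discursive.com/person, each with an unprefixed local name. This pairing of namespace URI and local name is what lets a set of rules be tied to a single namespace.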

To parse this XML document with the Digester, you need to create a separate set of rules for each namespace, adding each RuleSet object to Digester with addRuleSet( ). A RuleSet adds Rule objects to an instance of Digester. By extending the RuleSetBase class, and setting the namespaceURI in the default constructor, the following class, PersonRuleSet, defines rules to parse the http://discursive.com/person namespace:

import org.apache.commons.digester.Digester;
import org.apache.commons.digester.RuleSetBase;

public class PersonRuleSet extends RuleSetBase {
    public PersonRuleSet( ) {
        this.namespaceURI = "http://discursive.com/person";
    }

    public void addRuleInstances(Digester digester) {
        digester.addObjectCreate("*/person", Person.class);
        digester.addSetNext("*/person", "addPerson");
        digester.addSetProperties("*/person");
        digester.addBeanPropertySetter("*/person/role", "role");
    }
}

PersonRuleSet extends RuleSetBase, which is an implementation of the RuleSet interface. RuleSetBase adds support for namespaces with a protected field, namespaceURI. The constructor of PersonRuleSet sets the namespaceURI field to http://discursive.com/person, which tells the Digester to apply these rules only to elements in the http://discursive.com/person namespace. PageRuleSet also extends RuleSetBase and provides a set of rules for the http://discursive.com/page namespace:

import org.apache.commons.digester.Digester;
import org.apache.commons.digester.RuleSetBase;

public class PageRuleSet extends RuleSetBase {
    public PageRuleSet( ) {
        this.namespaceURI = "http://discursive.com/page";
    }

    public void addRuleInstances(Digester digester) {
        digester.addObjectCreate("*/page", Page.class);
        digester.addSetNext("*/page", "addPage");
        digester.addSetProperties("*/page");
        digester.addBeanPropertySetter("*/page/summary", "summary");
    }
}
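
To make the effect of namespaceURI concrete, the following sketch models the idea in plain Java. It is a conceptual model only and not Digester's implementation: the classes NamespacedRule and RuleMatchSketch, and the assumption that a rule with no namespace applies to any namespace, are introduced here for illustration. A rule fires only when both its pattern and its namespace agree with the element:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical model of a rule bound to a namespace; not a Digester class.
class NamespacedRule {
    final String namespaceUri; // null is treated here as "any namespace" (assumption)
    final String pattern;      // either an exact path or a "*/suffix" wildcard
    final String action;       // label describing what the rule would do

    NamespacedRule(String namespaceUri, String pattern, String action) {
        this.namespaceUri = namespaceUri;
        this.pattern = pattern;
        this.action = action;
    }

    boolean matches(String elementNamespace, String elementPath) {
        boolean namespaceOk =
            namespaceUri == null || namespaceUri.equals(elementNamespace);
        boolean pathOk;
        if (pattern.startsWith("*/")) {
            // Wildcard: the path must be the suffix itself or end with "/suffix"
            String suffix = pattern.substring(2);
            pathOk = elementPath.equals(suffix) || elementPath.endsWith("/" + suffix);
        } else {
            pathOk = elementPath.equals(pattern);
        }
        return namespaceOk && pathOk;
    }
}

class RuleMatchSketch {
    static final String PAGE_NS = "http://discursive.com/page";
    static final String PERSON_NS = "http://discursive.com/person";

    static final List<NamespacedRule> RULES = Arrays.asList(
        new NamespacedRule(PAGE_NS, "*/page", "create Page"),
        new NamespacedRule(PERSON_NS, "*/person", "create Person"));

    // Collects the actions of every rule that matches the element
    static List<String> actionsFor(String elementNamespace, String elementPath) {
        List<String> actions = new ArrayList<String>();
        for (NamespacedRule rule : RULES) {
            if (rule.matches(elementNamespace, elementPath)) {
                actions.add(rule.action);
            }
        }
        return actions;
    }

    public static void main(String[] args) {
        // A person element in the person namespace matches the person rule
        System.out.println(actionsFor(PERSON_NS, "pages/page/person"));
        // The same local name in the page namespace matches nothing
        System.out.println(actionsFor(PAGE_NS, "pages/page/person"));
    }
}
```

In this model, an element named person that happened to live in the page namespace would be ignored by the person rules, which is the behavior the namespaceURI field is meant to provide.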

Each RuleSet implementation instructs the Digester to create an object whenever its element is encountered: PageRuleSet creates a Page for every page element, and PersonRuleSet creates a Person for every person element. Both use a wildcard pattern (*/page and */person), so the element matches at any depth in the document. Both PageRuleSet and PersonRuleSet use digester.addSetNext( ) to pass each newly created object to the next object on the Stack. In the following code, an instance of Pages is pushed onto the Digester Stack, and both RuleSet implementations are added to a Digester using addRuleSet( ):

import java.io.InputStream;

import org.apache.commons.digester.Digester;

Pages pages = new Pages( );
        
Digester digester = new Digester( );
digester.setNamespaceAware(true);
digester.addRuleSet( new PageRuleSet( ) );
digester.addRuleSet( new PersonRuleSet( ) );
        
digester.push(pages);

InputStream input = getClass( ).getResourceAsStream("./content.xml");
digester.parse(input);

Page page = (Page) pages.getPages( ).get(0);
System.out.println(page);

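Because the PageRuleSet adds each Page object to the next object on the Stack, the Pages object must have an addPage( ) method that accepts a Page object. Likewise, PersonRuleSet calls addPerson( ) on the enclosing Page, so Page needs an addPerson( ) method that accepts a Person object.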
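
The recipe does not show the Pages, Page, and Person beans. The following is a minimal sketch of what they might look like, inferred from the rules: the property names come from the XML attributes and elements (type, summary, firstName, lastName, role), while getPeople( ) and the BeanSketch driver are assumptions added for illustration. Digester is not used here; the main method simply makes the same calls the rules would trigger:

```java
import java.util.ArrayList;
import java.util.List;

// Assumed shape of the beans; the original classes are not shown in the recipe.
class Person {
    private String firstName;
    private String lastName;
    private String role;

    public String getFirstName() { return firstName; }
    public void setFirstName(String firstName) { this.firstName = firstName; }
    public String getLastName() { return lastName; }
    public void setLastName(String lastName) { this.lastName = lastName; }
    public String getRole() { return role; }
    public void setRole(String role) { this.role = role; }
}

class Page {
    private String type;
    private String summary;
    private final List<Person> people = new ArrayList<Person>();

    public String getType() { return type; }
    public void setType(String type) { this.type = type; }
    public String getSummary() { return summary; }
    public void setSummary(String summary) { this.summary = summary; }

    // Target of addSetNext("*/person", "addPerson")
    public void addPerson(Person person) { people.add(person); }
    public List<Person> getPeople() { return people; }

    @Override
    public String toString() {
        return "Page[type=" + type + ", people=" + people.size() + "]";
    }
}

class Pages {
    private final List<Page> pages = new ArrayList<Page>();

    // Target of addSetNext("*/page", "addPage")
    public void addPage(Page page) { pages.add(page); }
    public List<Page> getPages() { return pages; }
}

class BeanSketch {
    public static void main(String[] args) {
        // Mimic by hand what the rules do for one page with one person
        Pages pages = new Pages();
        Page page = new Page();
        page.setType("standard");
        Person person = new Person();
        person.setFirstName("Al");
        person.setLastName("Gore");
        person.setRole("Co-author");
        page.addPerson(person);
        pages.addPage(page);
        System.out.println(pages.getPages().get(0));
    }
}
```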

Discussion

Each of the RuleSet implementations defined a set of rules in compiled Java code. If you prefer to define each set of rules in an XML file, you may use the FromXmlRuleSet instead of the RuleSetBase, as follows:

import java.io.InputStream;
import java.net.URL;

import org.apache.commons.digester.Digester;
import org.apache.commons.digester.xmlrules.FromXmlRuleSet;

Pages pages = new Pages( );
        
Digester digester = new Digester( );
digester.setNamespaceAware(true);

// Add page namespace
digester.setRuleNamespaceURI("http://discursive.com/page");
URL pageRules = getClass( ).getResource("./page-rules.xml");
digester.addRuleSet( new FromXmlRuleSet( pageRules ) );
    
// Add person namespace
digester.setRuleNamespaceURI("http://discursive.com/person");
URL personRules = getClass( ).getResource("./person-rules.xml");
digester.addRuleSet( new FromXmlRuleSet( personRules ) );
        
digester.push(pages);

InputStream input = getClass( ).getResourceAsStream("./content.xml");
digester.parse(input);

Page page = (Page) pages.getPages( ).get(0);
System.out.println(page);

Calling digester.setRuleNamespaceURI( ) associates the Rules contained in each FromXmlRuleSet with a specific namespace. In the Solution, each RuleSet was associated with its namespace through the protected field namespaceURI of RuleSetBase. Here, the namespace is specified by calling setRuleNamespaceURI( ) before each FromXmlRuleSet is added to the digester, because the calling code has no access to the protected member variable, namespaceURI, which FromXmlRuleSet inherits from RuleSetBase. page-rules.xml contains an XML rule set for parsing the http://discursive.com/page namespace:

<?xml version="1.0"?>

<!DOCTYPE digester-rules PUBLIC 
        "-//Jakarta Apache //DTD digester-rules XML V1.0//EN" 
        "http://jakarta.apache.org/commons/digester/dtds/digester-rules.dtd">

<digester-rules>
  <pattern value="*/page">
    <object-create-rule classname="com.discursive.jccook.xml.bean.Page"/>
    <set-next-rule methodname="addPage"/>
    <set-properties-rule/>
    <bean-property-setter-rule pattern="summary" propertyname="summary"/>
  </pattern>
</digester-rules>

person-rules.xml contains an XML rule set for parsing the http://discursive.com/person namespace:

<?xml version="1.0"?>

<!DOCTYPE digester-rules PUBLIC 
        "-//Jakarta Apache //DTD digester-rules XML V1.0//EN" 
        "http://jakarta.apache.org/commons/digester/dtds/digester-rules.dtd">

<digester-rules>
  <pattern value="*/person">
    <object-create-rule classname="com.discursive.jccook.xml.bean.Person"/>
    <set-next-rule methodname="addPerson"/>
    <set-properties-rule/>
    <bean-property-setter-rule pattern="role"/>
  </pattern>
</digester-rules>

See Also

For more information relating to the use of namespaces in the Digester, refer to the Javadoc for the org.apache.commons.digester package at http://jakarta.apache.org/commons/digester/apidocs.
