Using a Document

Once we have our initial Document object (either from instantiating one directly or building one using the JDOM input classes), we can act on the Document independently of any particular format or API. There are no ties to SAX, DOM, or the original format of the data. There is also no coupling to the output format, as we will see in the next section. Any JDOM Document object can be output to any format desired!

The Document object itself has methods that deal with the four components it can have: a DocType (referencing an external DTD, or providing internal definitions), ProcessingInstructions, a root Element, and Comments. Each of these objects maps to an XML equivalent, and provides a Java representation of those constructs in XML.

The Document DocType

The JDOM DocType object is a simple representation of a DOCTYPE declaration in an XML document. Assume we have the following XHTML file:

<!DOCTYPE html 
      PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" 
             "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"
>

<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en">
  <!-- etc -->
</html>

This code will print out the element, public ID, and system ID from the JDOM DocType object that maps to the declaration:

DocType docType = doc.getDocType(  );
System.out.println("Element: " + docType.getElementName(  ));
System.out.println("Public ID: " + docType.getPublicID(  ));
System.out.println("System ID: " + docType.getSystemID(  ));

Its output is:

Element: html
Public ID: -//W3C//DTD XHTML 1.0 Transitional//EN
System ID: http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd

JDOM 1.0 supports referencing external DTDs, but does not yet allow inline definition of constraints.[7] A DocType object can be created with the name of the element being constrained (typically the root element of the document), and a system and public ID may be supplied to specify the location of an external DTD to reference. We can add a reference to the Document object with the following code:

Document doc = new Document(new Element("html"));
doc.setDocType(new DocType(
    "html",
    "-//W3C//DTD XHTML 1.0 Transitional//EN",
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"));

The DocType object is automatically created by the selected Builder implementation if the JDOM Document is constructed from existing XML data.

Processing Instructions

The ProcessingInstruction class provides a Java representation of an XML PI, with simple accessor and mutator methods. You can get a list of all PIs[8] from a Document using the following code:

// Get all PIs
List pis = doc.getProcessingInstructions(  );

// Iterate through them, printing out target and data
for (int i=0, size=pis.size(  ); i<size; i++) {
    ProcessingInstruction pi = (ProcessingInstruction)pis.get(i);
    String target = pi.getTarget(  );
    String data = pi.getData(  );
}

You can also retrieve a list of all PIs with a specific target name using getProcessingInstructions(String target).

A PI can be constructed by providing the target and data to the ProcessingInstruction constructor:

ProcessingInstruction pi = 
  new ProcessingInstruction("cocoon-process", "type=\"xslt\"");

This would result in the following PI representation:

<?cocoon-process type="xslt"?>

There are several additional helper methods added to the class. It is common to supply the data for a PI in name/value pairs, as in the following example:

<?xml-stylesheet href="XSL\JavaXML.wml.xsl" type="text/xsl" media="wap"?>

To accommodate this, the ProcessingInstruction class provides a constructor that accepts a Map of values:

Map map = new HashMap(  );
map.put("href", "XSL\\JavaXML.wml.xsl");  // escape the '\'
map.put("type", "text/xsl");
map.put("media", "wap");
ProcessingInstruction pi = 
  new ProcessingInstruction("xml-stylesheet", map);

The ProcessingInstruction class also has convenience methods to retrieve the data of the PI in name/value pair format. The most basic of these is the getValue( ) method. This method takes the name of the name/value pair to look for in the PI’s data and returns its value if the pair is found, or an empty String if it is not. For example, the following code would determine the media type for the xml-stylesheet PI shown earlier:

String mediaType = pi.getValue("media");

The resulting value would be the String “wap”, which can then be used throughout the application. Since the data of a PI is not required to be in name/value pair form, getData( ) is also provided, which returns the raw String data for the ProcessingInstruction object.

Adding ProcessingInstructions to a JDOM Document object can be done in either of the following ways:

Document doc = new Document(new Element("root"))
    .addProcessingInstruction(
        new ProcessingInstruction("instruction-1", "one way"))
    .addProcessingInstruction("instruction-2", "convenient way");

Here, a PI is added through:

addProcessingInstruction(ProcessingInstruction pi)

by supplying a created ProcessingInstruction object, and through the convenience method:

addProcessingInstruction(String target, String data)

which performs the same task using the supplied data.
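The name/value lookup described for getValue( ) can be sketched in plain Java, without JDOM on the classpath. This is a minimal illustration under stated assumptions, not JDOM's implementation: the class PiDataSketch, its regular expression, and its method names are all hypothetical, and the only behavior borrowed from the text is that a missing name yields an empty String.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; it only imitates the lookup behavior described above.
public class PiDataSketch {

    // Matches name="value" or name='value' pairs inside raw PI data.
    private static final Pattern PAIR =
        Pattern.compile("([\\w:.-]+)\\s*=\\s*(\"([^\"]*)\"|'([^']*)')");

    // Parse raw PI data into an ordered map of name/value pairs.
    public static Map<String, String> parse(String data) {
        Map<String, String> pairs = new LinkedHashMap<>();
        Matcher m = PAIR.matcher(data);
        while (m.find()) {
            // Group 3 holds a double-quoted value, group 4 a single-quoted one.
            String value = (m.group(3) != null) ? m.group(3) : m.group(4);
            pairs.put(m.group(1), value);
        }
        return pairs;
    }

    // Return the value for a name, or an empty String if it is absent.
    public static String getValue(String data, String name) {
        String value = parse(data).get(name);
        return (value != null) ? value : "";
    }

    public static void main(String[] args) {
        String data =
            "href=\"XSL\\JavaXML.wml.xsl\" type=\"text/xsl\" media=\"wap\"";
        System.out.println(getValue(data, "media"));
        System.out.println(getValue(data, "missing").isEmpty());
    }
}
```

Data that is not in name/value form simply produces an empty map here, which is why a raw accessor such as getData( ) is still needed alongside the pair-oriented lookup.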

Elements

The core of any Document is the data within it, which is enclosed within that Document’s elements. The JDOM Element class is the Java representation of one of those elements, and provides access to all the data for the element it represents. A JDOM Element instance is namespace-aware, and all methods that operate upon the Element class and its Attributes can be invoked with a single String name, or the String local name of the Element and a Namespace reference (which we look at next). In other words, the following methods are all available to an Element instance:

// Create Element
Element element = new Element("elementName");

// Create Element with namespace
Element nsElement = new Element("elementName", Namespace.getNamespace(
                  "JavaXML", "http://www.oreilly.com/catalog/javaxml/"));

// Add an attribute
element.addAttribute("attributeName");
element.addAttribute("attributeName", Namespace.getNamespace(
                    "JavaXML", "http://www.oreilly.com/catalog/javaxml/"));

// Search for attributes with a specific name
List attributes = element.getAttributes("searchName");

The root element for a document is retrieved from the JDOM Document using doc.getRootElement( ). Each Element then provides the getChildren( ) method to retrieve its children. For convenience, the Element class offers several variations: you can retrieve a single Element by name (optionally qualified by a namespace), retrieve all Elements with a specific name, or retrieve all nested Elements regardless of name:

public class Element {

    // Retrieve all nested Elements for this Element
    public List getChildren(  );

    // Retrieve all nested Elements with the specified name 
    //  (in the default namespace)
    public List getChildren(String name);

    // Retrieve all nested Elements with the specified name
    //   and namespace
    public List getChildren(String name, Namespace ns);

    // Retrieve the Element with the specified name - if multiple
    //   Elements exist with this name, return the first
    public Element getChild(String name) throws NoSuchElementException;

    // Retrieve the Element with the specified name and namespace -
    //   if multiple Elements exist with this name, return the first
    public Element getChild(String name, Namespace ns) 
        throws NoSuchElementException;

    // Other methods

}

The versions that retrieve a single Element throw a NoSuchElementException when no matching child exists; the versions that return a List return an empty List instead. To retrieve one child by name (with or without a namespace), use getChild( ); to retrieve all children, or all children with a given name, use getChildren( ). Consider the following XML document:

<?xml version="1.0"?>

<linux-config>
  <gui>
    <window-manager>
      <name>Enlightenment</name>
      <version>0.16.2</version>
    </window-manager>

    <window-manager>
      <name>KWM for KDE</name>
      <version>1.1.2</version>
    </window-manager>
  </gui>
  <sound>
    <card>
      <name>Sound Blaster Platinum</name>
      <irq>7</irq>
      <dma>0</dma>
      <io start="D800" stop="D81F" />
    </card>
  </sound>
</linux-config>

When the document structure is known ahead of time, as in this example, a specific Element and its value can be retrieved from the JDOM Document object easily:

Element root = doc.getRootElement(  );

String windowManager = root.getChild("gui")
                           .getChild("window-manager")
                           .getChild("name")
                           .getContent(  );

String soundCardIRQ = root.getChild("sound")
                          .getChild("card")
                          .getChild("irq")
                          .getContent(  );

Note that here, only the first element named window-manager will be returned, which is the defined behavior of getChild(String name). To get all elements with a name, getChildren(String name) should be used:

List windowManagers = root.getChild("gui")
                          .getChildren("window-manager");

When an Element has pure textual data, it can be retrieved through the getContent( ) method as demonstrated in the previous example. When an Element has only Element children, they can be retrieved using getChildren( ). In the fairly rare case that an Element has a combination of text content, child elements, and comments, it is said to have mixed content. The mixed content of an Element can be obtained through the getMixedContent( ) method, which returns a List containing String, Element, ProcessingInstruction, and Comment objects.

Note

Technically, getContent( ) returns only the String data held within an Element, which is not the same thing as the full content of the Element. Likewise, getChildren( ) returns only the nested Elements, not all the child objects of an Element. The task of retrieving all content of an Element is left to the more complicated getMixedContent( ) method. This simplification eases the task of manipulating XML files for Java developers, removing the need to perform instanceof operations on all method call results. The method names, then, while not technically accurate, are modeled after developer and user patterns.
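To see why the note mentions instanceof, consider a list that mixes text with other node types. The sketch below uses plain Java stand-ins rather than JDOM classes: MixedContentSketch, Child, and Comment are hypothetical names, and the list is built by hand instead of coming from getMixedContent( ).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Stand-in types only; these are not the JDOM Element and Comment classes.
public class MixedContentSketch {

    public static class Child {
        public final String name;
        public Child(String name) { this.name = name; }
    }

    public static class Comment {
        public final String text;
        public Comment(String text) { this.text = text; }
    }

    // Concatenate only the String pieces of a mixed-content list.
    public static String textOf(List<Object> mixed) {
        StringBuilder sb = new StringBuilder();
        for (Object o : mixed) {
            if (o instanceof String) {
                sb.append((String) o);
            }
        }
        return sb.toString();
    }

    // Collect the names of the nested child nodes, skipping everything else.
    public static List<String> childNames(List<Object> mixed) {
        List<String> names = new ArrayList<>();
        for (Object o : mixed) {
            if (o instanceof Child) {
                names.add(((Child) o).name);
            }
        }
        return names;
    }

    public static void main(String[] args) {
        List<Object> mixed = Arrays.<Object>asList(
            "Hello, ", new Child("b"), new Comment("note"), "world");
        System.out.println(textOf(mixed));
        System.out.println(childNames(mixed));
    }
}
```

Every caller of a mixed list has to repeat this kind of type test, which is the burden the simpler text-only and element-only accessors are meant to remove.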

Elements are commonly added to other Elements through the addChild(Element) method. You can add several elements to a JDOM Document at once:

element
    .addChild(new Element("son").setContent("snips and snails"))
    .addChild(new Element("daughter").setContent("sugar and spice")
        .addChild(new Element("grandchild"))
    );

This example chains together the adding of elements for convenience. This shorthand is possible because addChild( ) returns the Element to which it was added. You must be very careful when placing parentheses so this technique will work correctly. With one mismatched parenthesis, what were supposed to be siblings may become parent and child! Child elements can be removed using the methods removeChild( ) and removeChildren( ). They take the same parameters as getChild( ) and getChildren( ).
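The parenthesis pitfall in chained addChild( ) calls is easy to reproduce with a small stand-in class. The sketch below is an illustration only: ChainSketch and Node are hypothetical names, and Node copies just one trait from the text, namely that addChild( ) returns the node the child was added to.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for an element tree; not the JDOM Element class.
public class ChainSketch {

    public static class Node {
        public final String name;
        public final List<Node> children = new ArrayList<>();

        public Node(String name) { this.name = name; }

        // Returning the parent (this) is what makes chaining possible.
        public Node addChild(Node child) {
            children.add(child);
            return this;
        }
    }

    // The grandchild is added inside the daughter's parentheses.
    public static Node nested() {
        return new Node("parent")
            .addChild(new Node("son"))
            .addChild(new Node("daughter")
                .addChild(new Node("grandchild")));
    }

    // One parenthesis moved: the grandchild is now a sibling of the daughter.
    public static Node flat() {
        return new Node("parent")
            .addChild(new Node("son"))
            .addChild(new Node("daughter"))
            .addChild(new Node("grandchild"));
    }

    public static void main(String[] args) {
        System.out.println(nested().children.size()); // 2
        System.out.println(flat().children.size());   // 3
    }
}
```

Both versions compile, so nothing but careful indentation (or a check of the resulting tree) reveals which structure was actually built.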

Elements are constructed with their names. To accommodate namespaces, there are four constructors:

// Get a namespace reference
Namespace ns = Namespace.getNamespace("JavaXML", 
                              "http://www.oreilly.com/catalog/javaxml/");

// Create an element: JavaXML:Book
Element element1 = new Element("Book", ns);

// Create an element: JavaXML:Book
Element element2 = new Element("Book", "JavaXML",
                              "http://www.oreilly.com/catalog/javaxml/");

// Create an element: Book
Element element3 = new Element("Book", "http://www.oreilly.com/catalog/javaxml/");

// Create an element: Book
Element element4 = new Element("Book");

The first two Element instances, element1 and element2, have equivalent names, as the Element class will handle storing the supplied name and namespace. The third instance, element3, is assigned to the default namespace, and that namespace is given a URI. The fourth instance creates an Element without a namespace.

Element content is set using setContent(String content). This replaces any existing content within the Element, including any Element children. To add the String as an additional “piece” of the Element’s overall mixed content, use the addChild(String content) method.

One powerful feature of JDOM is that Elements can be added and removed by manipulating the List returned from an invocation of getChildren( ). Here the last “naughty” child is removed from the root (to set an example for the others):

// Get the root Element
Element root = doc.getRootElement(  );

// Get all "naughty" children
List badChildren = root.getChildren("naughty");

// Get rid of the last naughty child
if (badChildren.size(  ) > 0) {
    badChildren.remove(badChildren.size(  )-1);
}
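What makes this work is that the returned List is live: changes to it write through to the parent. One way such a view could be built is sketched below in plain Java. This is an assumption-laden illustration, not JDOM's implementation; LiveListSketch and Node are hypothetical names, and the view supports only get, size, and remove.

```java
import java.util.AbstractList;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of a "live" child list; not how JDOM implements it.
public class LiveListSketch {

    public static class Node {
        public final String name;
        private final List<Node> content = new ArrayList<>();

        public Node(String name) { this.name = name; }

        public Node addChild(Node child) { content.add(child); return this; }

        // All children: the backing list itself, so changes write through.
        public List<Node> getChildren() { return content; }

        // Children with one name, as a live view over the backing list.
        public List<Node> getChildren(final String wanted) {
            return new AbstractList<Node>() {
                // Translate an index in the view to an index in the backing list.
                private int backingIndex(int index) {
                    int seen = 0;
                    for (int i = 0; i < content.size(); i++) {
                        if (content.get(i).name.equals(wanted) && seen++ == index) {
                            return i;
                        }
                    }
                    throw new IndexOutOfBoundsException("index " + index);
                }

                @Override public Node get(int index) {
                    return content.get(backingIndex(index));
                }

                @Override public int size() {
                    int n = 0;
                    for (Node c : content) {
                        if (c.name.equals(wanted)) n++;
                    }
                    return n;
                }

                // Removing from the view removes from the parent as well.
                @Override public Node remove(int index) {
                    return content.remove(backingIndex(index));
                }
            };
        }
    }

    public static void main(String[] args) {
        Node root = new Node("root")
            .addChild(new Node("naughty"))
            .addChild(new Node("nice"))
            .addChild(new Node("naughty"));

        List<Node> badChildren = root.getChildren("naughty");
        if (badChildren.size() > 0) {
            badChildren.remove(badChildren.size() - 1);
        }
        System.out.println(root.getChildren().size()); // 2
    }
}
```

A list that was merely a copy of the children would shrink on remove( ) while leaving the parent untouched, so the distinction matters whenever the List is edited directly.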

The Java 2 collection classes support features like set arithmetic and high-speed sorting, so while the convenience methods on JDOM objects are, well, convenient, for the advanced tasks, it’s useful to manipulate the List objects directly. We now can look at adding namespace mappings to our Document object, as well as adding and accessing JDOM Attributes.

Namespaces

The XML namespaces Recommendation defines the process by which namespace prefixes are mapped to URIs. For a namespace prefix to be used, the prefix must be mapped to a URI through the xmlns:[namespace prefix] attribute. When using JDOM, all prefix-to-URI mappings are handled automatically at output time.

You have seen that XML namespaces are handled through the org.jdom.Namespace class, which doubles as a factory for creating new namespaces:

Namespace ns = Namespace.getNamespace("prefix", "uri");

The ns object can then be used by Element and Attribute objects. Additionally, the Namespace class will only create new objects when needed; requests for existing namespaces receive a reference to the existing object.
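The "create only when needed" behavior is a cached factory. A minimal sketch of the idea follows, using a hypothetical stand-in class Ns rather than org.jdom.Namespace; the cache key format and the synchronization are assumptions made for the example.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of a caching factory; not the org.jdom.Namespace source.
public class NamespaceCacheSketch {

    public static final class Ns {
        // One shared instance per prefix/URI combination.
        private static final Map<String, Ns> CACHE = new HashMap<>();

        public final String prefix;
        public final String uri;

        // Private constructor: instances come only from getNamespace.
        private Ns(String prefix, String uri) {
            this.prefix = prefix;
            this.uri = uri;
        }

        // Return the cached instance, creating it on first request.
        public static synchronized Ns getNamespace(String prefix, String uri) {
            // '&' cannot appear in a prefix, so it is a safe separator here.
            String key = prefix + "&" + uri;
            Ns ns = CACHE.get(key);
            if (ns == null) {
                ns = new Ns(prefix, uri);
                CACHE.put(key, ns);
            }
            return ns;
        }
    }

    public static void main(String[] args) {
        Ns a = Ns.getNamespace("JavaXML", "http://www.oreilly.com/catalog/javaxml/");
        Ns b = Ns.getNamespace("JavaXML", "http://www.oreilly.com/catalog/javaxml/");
        System.out.println(a == b); // true: the second call reuses the first object
    }
}
```

Sharing instances this way keeps memory use flat in documents where thousands of elements carry the same namespace.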

Attributes

An attribute of an Element is retrieved using the getAttribute(String name) method. This method returns an Attribute object whose value is retrieved using getValue( ). The following code gets the “size” attribute on the given element.

element.getAttribute("size").getValue(  );

A variety of convenience methods are provided for accessing the attribute’s value as a specific data type. These include methods for the Java primitives, such as getIntValue( ), getFloatValue( ), getBooleanValue( ), and getByteValue( ). The methods throw a DataConversionException if the value does not exist or cannot be converted to the requested type. Each of these methods has a matching companion that accepts a default value, which is returned instead of throwing an exception if the requested conversion cannot be done. This code snippet retrieves the size as an int, or returns 0 if a conversion cannot occur:

element.getAttribute("size")
       .getIntValue(0);
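The strict and default-value styles of conversion can be compared side by side in plain Java. The sketch below is an assumption, not the JDOM Attribute class: AttributeValueSketch is a hypothetical name, and it uses the standard NumberFormatException where the text describes JDOM's DataConversionException.

```java
// Hypothetical helper contrasting strict and defaulting conversions.
public class AttributeValueSketch {

    // Strict form: throws NumberFormatException on missing or bad input.
    public static int getIntValue(String raw) {
        if (raw == null) {
            throw new NumberFormatException("no value present");
        }
        return Integer.parseInt(raw.trim());
    }

    // Defaulting form: falls back to defaultValue instead of throwing.
    public static int getIntValue(String raw, int defaultValue) {
        try {
            return getIntValue(raw);
        } catch (NumberFormatException e) {
            return defaultValue;
        }
    }

    public static void main(String[] args) {
        System.out.println(getIntValue("42", 0));    // 42
        System.out.println(getIntValue("large", 0)); // 0
    }
}
```

The defaulting form suits optional settings, while the strict form is the better choice when a missing or malformed value should stop processing.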

Adding attributes to an element is equally simple. An attribute can be added using an Element’s addAttribute(String name, String value) method, or you can use the more formal addAttribute(Attribute attribute) method. The Attribute constructor takes in the name of the Attribute to create (either as a single String parameter, or as a namespace prefix and local name) and the value to assign to the Attribute:

doc.getRootElement(  )
    .addAttribute("kernel", "2.2.14")                      // easy way
    .addAttribute(new Attribute("dist", "Red Hat 6.1"));   // formal way

Comments

The JDOM Comment object represents data that is not part of the functional data of the Document, but is used for human readability and convenience. In XML it’s represented by <!-- this syntax -->. Comments in JDOM are represented by the Comment class, with instances kept either at the document level or as children of an Element; in other words, both the JDOM Document object and its Elements can have comments.

To obtain the comments for a Document, the getContent( ) method is provided, which returns a List containing all the Comment objects of the document as well as the root Element. Comments placed before the root appear in the list before the root, and those placed after the root appear after it. To obtain the comments for an Element, getMixedContent( ) should be called, which returns all Comment, Element, and String (textual data) objects nested within the Element in the order in which they appear. As an example, assume we have the following XML file:

<?xml version="1.0"?>

<!-- A comment at the root level: Java and XML, by Brett McLaughlin -->
<JavaXML:Book xmlns:JavaXML="http://www.oreilly.com/catalog/javaxml/">
  <JavaXML:Title>Java and XML</JavaXML:Title>

  <!-- A comment nested within the JavaXML:Book element: Contents -->
  <JavaXML:Contents>
     You're reading the contents!
  </JavaXML:Contents>
</JavaXML:Book>

Normally, the comments are not needed by applications, but should they be, this code would retrieve them:

List docContent = doc.getContent(  );
List elemContent = root.getMixedContent(  );

for (int i=0, size=docContent.size(  ); i<size; i++) {
    Object o = docContent.get(i);
    if (o instanceof Comment) {
        Comment c = (Comment)o;
        String text = c.getText(  );
    }
}

for (int i=0, size=elemContent.size(  ); i<size; i++) {
    Object o = elemContent.get(i);
    if (o instanceof Comment) {
        Comment c = (Comment)o;
        String text = c.getText(  );
    }
}

The Comment constructor takes in the text of the comment as its sole argument. The Document object provides a means for comments to be added through the addComment(Comment) method, and the Element class provides addChild(Comment) for the same purpose:

// Create the Comment
Comment docComment = new Comment("A comment at the root level");

// Add the comment to the Document object
doc.addComment(docComment);

// Create another Comment
Comment elemComment = new Comment("A comment nested within an element");

// Add the comment to an Element
doc.getRootElement(  )
   .getChild("Contents")
   .addChild(elemComment);


[7] Support for inline constraints is likely to be added in a minor revision of JDOM, which may be available at the time of this book’s publication.

[8] JDOM does support ProcessingInstruction objects nested within Elements in a Document. These nested PIs are not returned through the Document-level PI methods; because nested PIs are relatively uncommon, they are not specifically addressed here.
