Once we have our initial Document object (either from instantiating one
directly or building one using the JDOM input classes), we can act on the
Document independently of any particular format or API. There are no ties
to SAX, DOM, or the original format of the data. There is also no coupling
to the output format, as we will see in the next section. Any JDOM Document
object can be output to any format desired!
The Document object itself has methods that deal with the four components
it can have: a DocType (referencing an external DTD, or providing internal
definitions), ProcessingInstructions, a root Element, and Comments. Each of
these objects maps to an XML equivalent, and provides a Java representation
of those constructs in XML.
The JDOM DocType object is a simple representation of a DOCTYPE declaration
in an XML document. Assume we have the following XHTML file:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en">
  <!-- etc -->
</html>
This code will print out the element, public ID, and system ID from the
JDOM DocType object that maps to the declaration:
DocType docType = doc.getDocType( );
System.out.println("Element: " + docType.getElementName( ));
System.out.println("Public ID: " + docType.getPublicID( ));
System.out.println("System ID: " + docType.getSystemID( ));
Its output is:
Element: html
Public ID: -//W3C//DTD XHTML 1.0 Transitional//EN
System ID: http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd
JDOM 1.0 supports referencing external DTDs, but does not yet allow inline
definition of constraints.[7] A DocType object can be created with the name
of the element being constrained (typically the root element of the
document), and a system and public ID may be supplied to specify the
location of an external DTD to reference. We can add a reference to the
Document object with the following code:
Document doc = new Document(new Element("html"));
doc.setDocType(new DocType(
    "html",
    "-//W3C//DTD XHTML 1.0 Transitional//EN",
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"));
The DocType object is automatically created by the selected Builder
implementation if the JDOM Document is constructed from existing XML data.
The ProcessingInstruction class provides a Java representation of an XML
PI, with simple accessor and mutator methods. You can get a list of all
PIs[8] from a Document using the following code:
// Get all PIs
List pis = doc.getProcessingInstructions( );
// Iterate through them, printing out target and data
for (int i=0, size=pis.size( ); i<size; i++) {
    ProcessingInstruction pi = (ProcessingInstruction)pis.get(i);
    String target = pi.getTarget( );
    String data = pi.getData( );
    System.out.println(target + ": " + data);
}
You can also retrieve a list of all PIs with a specific target name using
getProcessingInstructions(String target).
A PI can be constructed by providing the target and data to the
ProcessingInstruction constructor:
ProcessingInstruction pi =
    new ProcessingInstruction("cocoon-process", "type=\"xslt\"");
This would result in the following PI representation:
<?cocoon-process type="xslt"?>
There are several additional helper methods added to the class. It is common to supply the data for a PI in name/value pairs, as in the following example:
<?xml-stylesheet href="XSL\JavaXML.wml.xsl" type="text/xsl" media="wap"?>
To accommodate this, the ProcessingInstruction class provides a constructor
that accepts a Map of values:
Map map = new HashMap( );
map.put("href", "XSL\\JavaXML.wml.xsl"); // escape the '\'
map.put("type", "text/xsl");
map.put("media", "wap");
ProcessingInstruction pi = new ProcessingInstruction("xml-stylesheet", map);
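To make the relationship between the Map and the resulting PI data concrete, here is a minimal sketch that uses only the JDK. The PiDataBuilder class and its toData( ) helper are hypothetical names invented for this illustration; they are not part of JDOM and only approximate what a Map-based constructor might do internally.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class PiDataBuilder {

    // Hypothetical helper (not a JDOM method): joins name/value pairs
    // into PI data of the form name="value" name="value"
    static String toData(Map<String, String> pairs) {
        StringBuilder data = new StringBuilder();
        for (Map.Entry<String, String> pair : pairs.entrySet()) {
            if (data.length() > 0) {
                data.append(' ');
            }
            data.append(pair.getKey())
                .append("=\"")
                .append(pair.getValue())
                .append('"');
        }
        return data.toString();
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps insertion order, so the pairs come out
        // in the order they were added
        Map<String, String> map = new LinkedHashMap<>();
        map.put("href", "XSL\\JavaXML.wml.xsl"); // "\\" is a single backslash
        map.put("type", "text/xsl");
        map.put("media", "wap");

        System.out.println("<?xml-stylesheet " + toData(map) + "?>");
    }
}
```

A LinkedHashMap is used here only so that the order of the pairs is predictable; with a plain HashMap the pairs could appear in any order.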
The ProcessingInstruction class also has convenience methods to retrieve
the data of the PI in name/value pair format. The most basic of these is
the getValue( ) method. This method takes the name of the name/value pair
being searched for in the PI’s data, and returns its value if located, or
an empty String if the name/value pair cannot be found. For example, the
following code would determine the media type for the xml-stylesheet PI
shown earlier:
String mediaType = pi.getValue("media");
The resulting value would be the String “wap”, which can then be used
throughout the application.
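The lookup described above can be sketched with a regular expression over the raw PI data. This is only an illustration under stated assumptions: PiDataReader and its static getValue( ) are hypothetical stand-ins written against the JDK, not JDOM's own implementation, and the empty-String result for a missing name simply mirrors the behavior described in the text.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PiDataReader {

    // Matches name="value" or name='value'; group 1 is the name,
    // group 2 a double-quoted value, group 3 a single-quoted value
    private static final Pattern PAIR = Pattern.compile(
        "([\\w.:-]+)\\s*=\\s*(?:\"([^\"]*)\"|'([^']*)')");

    // Hypothetical stand-in for a getValue(name) lookup on PI data
    static String getValue(String data, String name) {
        Matcher m = PAIR.matcher(data);
        while (m.find()) {
            if (m.group(1).equals(name)) {
                return m.group(2) != null ? m.group(2) : m.group(3);
            }
        }
        return ""; // no such pair: return an empty String
    }

    public static void main(String[] args) {
        String data =
            "href=\"XSL\\JavaXML.wml.xsl\" type=\"text/xsl\" media=\"wap\"";
        System.out.println(getValue(data, "media"));             // wap
        System.out.println(getValue(data, "charset").isEmpty()); // true
    }
}
```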
Since the data of a PI is not required to be in name/value pair form,
getData( ) is also provided, which returns the raw String data for the
ProcessingInstruction object. Adding ProcessingInstructions to a JDOM
Document object can be done in any of the following ways:
Document doc = new Document(new Element("root")).addProcessingInstruction(
new ProcessingInstruction("instruction-1", "one way"))
.addProcessingInstruction("instruction-2", "convenient way");
Here, a PI is added through:
addProcessingInstruction(ProcessingInstruction pi)
by supplying a created ProcessingInstruction object, and through the convenience method:
addProcessingInstruction(String target, String data)
which performs the same task using the supplied data.
The core of any Document is the data within it, which is enclosed within
that Document’s elements. The JDOM Element class is the Java representation
of one of those elements, and provides access to all the data for the
element it represents. A JDOM Element instance is namespace-aware, and all
methods that operate upon the Element class and its Attributes can be
invoked with a single String name, or the String local name of the Element
and a Namespace reference (which we look at next). In other words, the
following methods are all available to an Element instance:
// Create Element
Element element = new Element("elementName");
// Create Element with namespace
Element nsElement = new Element("elementName",
    Namespace.getNamespace("JavaXML",
                           "http://www.oreilly.com/catalog/javaxml/"));
// Add an attribute
element.addAttribute("attributeName");
element.addAttribute("attributeName",
    Namespace.getNamespace("JavaXML",
                           "http://www.oreilly.com/catalog/javaxml/"));
// Search for attributes with a specific name
List attributes = element.getAttributes("searchName");
The root element for a document is retrieved from the JDOM Document using
doc.getRootElement( ). Each Element then provides methods to retrieve its
children, chiefly getChildren( ). For convenience, the Element class
provides several variations on getChildren( ), providing a means to
retrieve a specific Element through its namespace and local name, to
retrieve all Elements with a specific name in the default namespace, or to
retrieve all nested Elements regardless of name:
public class Element {
    // Retrieve all nested Elements for this Element
    public List getChildren( );
    // Retrieve all nested Elements with the specified name
    // (in the default namespace)
    public List getChildren(String name);
    // Retrieve all nested Elements with the specified name
    // and namespace
    public List getChildren(String name, Namespace ns);
    // Retrieve the Element with the specified name - if multiple
    // Elements exist with this name, return the first
    public Element getChild(String name)
        throws NoSuchElementException;
    // Retrieve the Element with the specified name and namespace - if
    // multiple Elements exist with this name, return the first
    public Element getChild(String name, Namespace ns)
        throws NoSuchElementException;
    // Other methods
}
The versions that retrieve a specific Element throw a
NoSuchElementException if no matching child exists; the versions that
return a List return an empty List instead. Children can be retrieved by
name (with or without namespace), or all children can be retrieved
regardless of name. To retrieve a child by name, use getChild( ), and to
retrieve all children, use getChildren( ). Consider the following XML
document:
<?xml version="1.0"?>
<linux-config>
  <gui>
    <window-manager>
      <name>Enlightenment</name>
      <version>0.16.2</version>
    </window-manager>
    <window-manager>
      <name>KWM for KDE</name>
      <version>1.1.2</version>
    </window-manager>
  </gui>
  <sound>
    <card>
      <name>Sound Blaster Platinum</name>
      <irq>7</irq>
      <dma>0</dma>
      <io start="D800" stop="D81F" />
    </card>
  </sound>
</linux-config>
When the document structure is known ahead of time, as in this example, a
specific Element and its value can be retrieved from the JDOM Document
object easily:
Element root = doc.getRootElement( );
String windowManager = root.getChild("gui")
                           .getChild("window-manager")
                           .getChild("name")
                           .getContent( );
String soundCardIRQ = root.getChild("sound")
                          .getChild("card")
                          .getChild("irq")
                          .getContent( );
Note that here, only the first element named window-manager will be
returned, which is the defined behavior of getChild(String name). To get
all elements with a name, getChildren(String name) should be used:
List windowManagers = root.getChild("gui")
                          .getChildren("window-manager");
When an Element has pure textual data, it can be retrieved through the
getContent( ) method as demonstrated in the previous example. When an
Element has only Element children, they can be retrieved using
getChildren( ).
In the fairly rare case that an Element has a combination of text content,
child elements, and comments, it’s said to have mixed content. The mixed
content of an Element can be obtained through the getMixedContent( )
method. This method returns a List of the content that contains String,
Element, ProcessingInstruction, and Comment objects.
Technically, getContent( ) actually returns the String data held within an
Element. This can be seen as different from the content of the Element
itself. Additionally, getChildren( ) technically only returns the nested
Elements, not all the child objects of an Element. The task of retrieving
all content of an Element is left to the more complicated
getMixedContent( ) method. This simplification eases the task of
manipulating XML files for Java developers, removing the need to perform
instanceof operations on all method call results. The method names then,
while not technically accurate, are modeled after developer and user
patterns.
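To show why mixed content calls for instanceof checks, the following sketch walks a heterogeneous List the way a caller of getMixedContent( ) would have to. The nested Element and Comment classes are deliberately tiny stand-ins so that the example runs with the JDK alone; they are assumptions made for illustration and not the JDOM classes themselves.

```java
import java.util.ArrayList;
import java.util.List;

public class MixedContentSketch {

    // Minimal stand-ins for the JDOM classes (illustration only)
    static class Element {
        final String name;
        Element(String name) { this.name = name; }
    }

    static class Comment {
        final String text;
        Comment(String text) { this.text = text; }
    }

    // Each item must be type-checked before it can be used
    static String describe(List<Object> mixed) {
        StringBuilder out = new StringBuilder();
        for (Object o : mixed) {
            if (o instanceof String) {
                out.append("text[").append(((String) o).trim()).append("] ");
            } else if (o instanceof Element) {
                out.append("element[").append(((Element) o).name).append("] ");
            } else if (o instanceof Comment) {
                out.append("comment[").append(((Comment) o).text).append("] ");
            }
        }
        return out.toString().trim();
    }

    public static void main(String[] args) {
        List<Object> mixed = new ArrayList<>();
        mixed.add("Some text ");
        mixed.add(new Element("b"));
        mixed.add(new Comment("note"));
        System.out.println(describe(mixed));
        // text[Some text] element[b] comment[note]
    }
}
```

Compare this with getContent( ) and getChildren( ), where the return type alone tells the caller what to expect.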
Elements are commonly added to other Elements through the addChild(Element)
method. You can add several elements to a JDOM Document at once:
element
.addChild(new Element("son").setContent("snips and snails"))
.addChild(new Element("daughter").setContent("sugar and spice")
.addChild(new Element("grandchild"))
);
This example chains together the adding of elements for convenience. This
shorthand is possible because addChild( ) returns the Element to which it
was added. You must be very careful when placing parentheses so this
technique will work correctly. With one mismatched parenthesis, what were
supposed to be siblings may become parent and child! Child elements can be
removed using the methods removeChild( ) and removeChildren( ). They take
the same parameters as getChild( ) and getChildren( ).
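The effect of parenthesis placement on chained calls can be seen in a small, self-contained sketch. The Node class below is a hypothetical stand-in whose addChild( ) returns the node it was called on, which is the behavior the text describes for Element; it is not JDOM code.

```java
import java.util.ArrayList;
import java.util.List;

public class ChainingSketch {

    // Hypothetical stand-in for Element: addChild returns the parent
    static class Node {
        final String name;
        final List<Node> children = new ArrayList<>();

        Node(String name) { this.name = name; }

        Node addChild(Node child) {
            children.add(child);
            return this; // returning the parent is what enables chaining
        }
    }

    // Both calls are made on "parent", so son and daughter are siblings
    static Node siblings() {
        return new Node("parent")
            .addChild(new Node("son"))
            .addChild(new Node("daughter"));
    }

    // One parenthesis moved: the second addChild is now called on "son",
    // so daughter becomes a grandchild of "parent"
    static Node nested() {
        return new Node("parent")
            .addChild(new Node("son")
                .addChild(new Node("daughter")));
    }

    public static void main(String[] args) {
        System.out.println(siblings().children.size()); // 2
        System.out.println(nested().children.size());   // 1
    }
}
```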
Elements are constructed with their names. To accommodate namespaces, there
are four constructors:
// Get a namespace reference
Namespace ns = Namespace.getNamespace("JavaXML",
    "http://www.oreilly.com/catalog/javaxml/");
// Create an element: JavaXML:Book
Element element1 = new Element("Book", ns);
// Create an element: JavaXML:Book
Element element2 = new Element("Book", "JavaXML",
    "http://www.oreilly.com/catalog/javaxml/");
// Create an element: Book
Element element3 = new Element("Book",
    "http://www.oreilly.com/catalog/javaxml/");
// Create an element: Book
Element element4 = new Element("Book");
The first two Element instances, element1 and element2, have equivalent
names, as the Element class will handle storing the supplied name and
namespace. The third instance, element3, is assigned to the default
namespace, and that namespace is given a URI. The fourth instance creates
an Element without a namespace.
Element content is set using setContent(String content). This replaces any
existing content within the Element, including any Element children. To add
the String as an additional “piece” of the Element’s overall mixed content,
use the addChild(String content) method.
One powerful feature of JDOM is that Elements can be added and removed by
manipulating the List returned from an invocation of getChildren( ). Here
the last “naughty” child is removed from the root (to set an example for
the others):
// Get the root Element
Element root = doc.getRootElement( );
// Get all "naughty" children
List badChildren = root.getChildren("naughty");
// Get rid of the last naughty child
if (badChildren.size( ) > 0) {
    badChildren.remove(badChildren.size( )-1);
}
The Java 2 collection classes support features like set arithmetic and
high-speed sorting, so while the convenience methods on JDOM objects are,
well, convenient, for the advanced tasks it’s useful to manipulate the List
objects directly. We can now look at adding namespace mappings to our
Document object, as well as adding and accessing JDOM Attributes.
The XML namespaces Recommendation defines the process by which namespace
prefixes are mapped to URIs. For a namespace prefix to be used, the prefix
should be mapped to a URI through the xmlns:[namespace prefix] attribute.
In using JDOM, all namespace-prefix-to-URI mappings are handled
automatically by JDOM at output time.
You have seen that XML namespaces are handled through the
org.jdom.Namespace class, which doubles as a factory for creating new
namespaces:
Namespace ns = Namespace.getNamespace("prefix", "uri");
The ns object can then be used by Element and Attribute objects.
Additionally, the Namespace class will only create new objects when needed;
requests for existing namespaces receive a reference to the existing
object.
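The reuse of existing namespace objects is an instance of a caching factory. The sketch below shows one way such a factory could work, assuming a simple map keyed on prefix and URI. The Ns class and its cache are hypothetical and written only for this illustration; the text does not describe how org.jdom.Namespace implements the behavior.

```java
import java.util.HashMap;
import java.util.Map;

public class NamespaceCacheSketch {

    // Hypothetical stand-in for a namespace class with a caching factory
    static final class Ns {
        private static final Map<String, Ns> CACHE = new HashMap<>();

        final String prefix;
        final String uri;

        // Private constructor: instances come only from getNamespace()
        private Ns(String prefix, String uri) {
            this.prefix = prefix;
            this.uri = uri;
        }

        static synchronized Ns getNamespace(String prefix, String uri) {
            // '&' cannot appear in a prefix, so the key is unambiguous
            String key = prefix + "&" + uri;
            Ns ns = CACHE.get(key);
            if (ns == null) {
                ns = new Ns(prefix, uri); // create only when needed
                CACHE.put(key, ns);
            }
            return ns;
        }
    }

    public static void main(String[] args) {
        Ns first = Ns.getNamespace("JavaXML",
            "http://www.oreilly.com/catalog/javaxml/");
        Ns second = Ns.getNamespace("JavaXML",
            "http://www.oreilly.com/catalog/javaxml/");
        System.out.println(first == second); // true: same object reused
    }
}
```

Because each prefix/URI pair maps to a single shared object, comparing two namespaces can be done by reference in this sketch.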
An attribute of an Element is retrieved using the getAttribute(String name)
method. This method returns an Attribute object whose value is retrieved
using getValue( ). The following code gets the “size” attribute on the
given element:
element.getAttribute("size").getValue( );
A variety of convenience methods are provided for accessing the attribute’s
value as a specific data type. These include methods for the Java
primitives, such as getIntValue( ), getFloatValue( ), getBooleanValue( ),
and getByteValue( ). The methods throw a DataConversionException if the
value does not exist or could not be converted to the requested type. There
are matching companions for each of these methods that allow a default
value to be passed in, which is returned instead of throwing an exception
if the requested data conversion cannot be done. This code snippet
retrieves the size as an int, or returns 0 if a conversion cannot occur:
element.getAttribute("size") .getIntValue(0);
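The pairing of a throwing accessor with a default-value companion can be sketched as follows. The Attr class and the nested DataConversionException are stand-ins defined here so the example compiles with the JDK alone; the conversion logic is an assumption about how such methods might be written, not JDOM's implementation.

```java
public class AttributeValueSketch {

    // Stand-in for the exception named in the text (illustration only)
    static class DataConversionException extends Exception {
        DataConversionException(String message) { super(message); }
    }

    // Hypothetical stand-in for an attribute holding a String value
    static class Attr {
        private final String value;

        Attr(String value) { this.value = value; }

        // Strict version: fails loudly if the value is not an int
        int getIntValue() throws DataConversionException {
            try {
                return Integer.parseInt(value.trim());
            } catch (NumberFormatException e) {
                throw new DataConversionException(
                    "Not an int: " + value);
            }
        }

        // Lenient companion: falls back to the supplied default
        int getIntValue(int defaultValue) {
            try {
                return getIntValue();
            } catch (DataConversionException e) {
                return defaultValue;
            }
        }
    }

    public static void main(String[] args) {
        System.out.println(new Attr("12").getIntValue(0));    // 12
        System.out.println(new Attr("large").getIntValue(0)); // 0
    }
}
```

The strict form suits cases where a bad value is a real error; the default form keeps calling code free of try/catch blocks when a fallback is acceptable.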
Adding attributes to an element is equally simple. An attribute can be
added using an Element’s addAttribute(String name, String value) method, or
you can use the more formal addAttribute(Attribute attribute) method. The
Attribute constructor takes in the name of the Attribute to create (either
as a single String parameter, or as a namespace prefix and local name) and
the value to assign to the Attribute:
doc.getRootElement( ).addAttribute("kernel", "2.2.14") // easy way
.addAttribute(new Attribute("dist", "Red Hat 6.1")); // formal way
The JDOM Comment object represents data that is not part of the functional
data of the Document, but is used for human readability and convenience. In
XML it’s represented by <!-- this syntax -->. Comments in JDOM are
represented by the Comment class with instances kept either at the document
level, or as children of an Element; in other words, both the JDOM Document
object and its Elements can have comments.
To obtain the comments for a Document, the getContent( ) method is
provided, which returns a List containing all the Comment objects of the
document as well as the root Element. Comments placed before the root
appear in the list before the root, and those placed after the root appear
after it. To obtain the comments for an Element, getMixedContent( ) should
be called, which returns all Comment, Element, and String (textual data)
objects nested within the Element in the order in which they appear. As an
example, assume we have the following XML file:
<?xml version="1.0"?>
<!-- A comment at the root level: Java and XML, by Brett McLaughlin -->
<JavaXML:Book xmlns:JavaXML="http://www.oreilly.com/catalog/javaxml/">
  <JavaXML:Title>Java and XML</JavaXML:Title>
  <!-- A comment nested within the JavaXML:Book element: Contents -->
  <JavaXML:Contents>
    You're reading the contents!
  </JavaXML:Contents>
</JavaXML:Book>
Normally, the comments are not needed by applications, but should they be, this code would retrieve them:
List docContent = doc.getContent( );
List elemContent = root.getMixedContent( );
for (int i=0, size=docContent.size( ); i<size; i++) {
    Object o = docContent.get(i);
    if (o instanceof Comment) {
        Comment c = (Comment)o;
        String text = c.getText( );
    }
}
for (int i=0, size=elemContent.size( ); i<size; i++) {
    Object o = elemContent.get(i);
    if (o instanceof Comment) {
        Comment c = (Comment)o;
        String text = c.getText( );
    }
}
The Comment constructor takes in the text of the comment as its sole
argument. The Document object provides a means for comments to be added
through the addComment(Comment) method, and the Element class provides
addChild(Comment) for the same purpose:
// Create the Comment
Comment docComment = new Comment("A comment at the root level");
// Add the comment to the Document object
doc.addComment(docComment);
// Create another Comment
Comment elemComment = new Comment("A comment nested within an element");
// Add the comment to an Element
doc.getRootElement( )
   .getChild("Contents")
   .addChild(elemComment);
[7] Support for inline constraints is likely to be added in a minor revision of JDOM, which may be available at the time of this book’s publication.
[8] JDOM does support ProcessingInstruction objects nested within Elements in a Document. These nested PIs are not returned through the Document-level PI methods; because nested PIs are relatively uncommon, they are not specifically addressed here.