Reading XML content with namespaces

XML namespaces resemble Java packages in that they provide an additional context for grouping a set of elements. We already noted some differences in namespace handling between the XmlParser and XmlSlurper classes in the Reading XML using XmlParser and Reading XML using XmlSlurper recipes.

In this recipe, we dig a bit deeper into the details of XML namespace support in Groovy.

Getting ready

Let's use the same shakespeare.xml file we used for the Reading XML using XmlParser and Reading XML using XmlSlurper recipes.

How to do it...

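When you look up elements by string name, XmlParser requires you to specify each element name exactly as it appears in the parsed XML, including the prefix used in the actual XML content. This makes the code fragile: a prefix is only a document-local alias for a namespace URI, so the lookup stops matching as soon as a document binds the same namespace to a different prefix.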
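As a small illustration of that fragility, the following sketch parses two documents that bind the same namespace URI to different prefixes and looks up the author by its prefixed string name. The two inline documents are hypothetical stand-ins (the real shakespeare.xml is not reproduced here), and the import assumes Groovy 3 or later, where XmlParser lives in the groovy.xml package; on Groovy 2.x the import can be dropped.

```groovy
import groovy.xml.XmlParser  // assumption: Groovy 3+; not needed on Groovy 2.x

// Hypothetical documents: same namespace URI, different prefixes
def withBib = '''<bib:bibliography xmlns:bib="http://bibliography.org">
  <bib:author>William Shakespeare</bib:author>
</bib:bibliography>'''

def withB = '''<b:bibliography xmlns:b="http://bibliography.org">
  <b:author>William Shakespeare</b:author>
</b:bibliography>'''

def first = new XmlParser().parseText(withBib)
def second = new XmlParser().parseText(withB)

// The string lookup compares the prefix literally, so it only matches
// the document that happens to use the 'bib' prefix
def foundInFirst = first.'bib:author'.text()
def foundInSecond = second.'bib:author'.text()

println "document using 'bib': '${foundInFirst}'"
println "document using 'b':   '${foundInSecond}'"
```

The second lookup comes back empty even though both documents describe the same element in the same namespace.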

To make XmlParser-based code more robust with respect to namespaces, we can use the groovy.xml.Namespace class, as shown in the following code:

import groovy.xml.Namespace

def xmlSource = new File('shakespeare.xml')
def bibliography = new XmlParser().parse(xmlSource)

// Bind each namespace URI to a Namespace helper object
def bib = new Namespace('http://bibliography.org', 'bib')
def lit = new Namespace('http://literature.org', 'lit')

// bib.author and lit.play yield qualified names used for the lookup
println bibliography[bib.author].text()
// Count the plays whose year is later than 1592
println bibliography[lit.play].findAll {
  it[lit.year].text().toInteger() > 1592
}.size()

XmlSlurper has a similar API for declaring the prefixes and namespaces required to navigate the nodes, shown in the following code:

def xmlSource = new File('shakespeare.xml')
def bibliography = new XmlSlurper().parse(xmlSource)

// Map the prefixes used in the queries below to their namespace URIs
bibliography.declareNamespace(
  bib: 'http://bibliography.org',
  lit: 'http://literature.org')

println bibliography.'bib:author'
// Count the plays whose year is later than 1592
println bibliography.'lit:play'.findAll {
  it.'lit:year'.toInteger() > 1592
}.size()

Both scripts print the same output:
```
William Shakespeare
3
```

How it works...

Both of the previous code snippets extract the author's name and the number of plays written after 1592 from our reference bibliography data XML document.

In the case of XmlParser, we declare two instances of the groovy.xml.Namespace class. When we fetch a property from a Namespace object and use it as a subscript (for example, bibliography[bib.author]), the following happens:

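The bib.author expression returns a QName value. This is Groovy's own QName class (groovy.xml.QName in Groovy 2.x; later releases moved it to groovy.namespace.QName), not javax.xml.namespace.QName.
The subscript expression bibliography[bib.author] is translated by Groovy into a call to the getAt method of the groovy.util.Node class. This method accepts a QName as an argument and returns a NodeList holding every child node whose name matches the QName (the list is empty if nothing matches), as shown next:
```
def name = bib.author                        // a Groovy QName
NodeList nodes = bibliography.getAt(name)    // same as bibliography[bib.author]
```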
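The two steps above can be tried out in isolation with the following sketch. The inline XML is a hypothetical stand-in for shakespeare.xml (its element layout and year values are assumptions made for illustration), the prefixes b and l are deliberately different from those in the document, and the imports assume Groovy 3 or later.

```groovy
import groovy.xml.Namespace
import groovy.xml.XmlParser  // assumption: Groovy 3+; not needed on Groovy 2.x

// Hypothetical stand-in for shakespeare.xml; the years are illustrative only
def xml = '''<bib:bibliography xmlns:bib="http://bibliography.org"
    xmlns:lit="http://literature.org">
  <bib:author>William Shakespeare</bib:author>
  <lit:play><lit:year>1589</lit:year></lit:play>
  <lit:play><lit:year>1594</lit:year></lit:play>
  <lit:play><lit:year>1595</lit:year></lit:play>
</bib:bibliography>'''

def bibliography = new XmlParser().parseText(xml)

// Only the namespace URIs need to agree with the document; the prefixes do not
def b = new Namespace('http://bibliography.org', 'b')
def l = new Namespace('http://literature.org', 'l')

def authorName = b.author                      // step 1: build a QName
def authors = bibliography.getAt(authorName)   // step 2: same as bibliography[b.author]

def laterPlays = bibliography[l.play].findAll {
    it[l.year].text().toInteger() > 1592
}

println authorName.getClass().simpleName   // the QName class
println authors.getClass().simpleName      // the NodeList class
println authors.text()
println laterPlays.size()
```

Because the QName comparison is based on the namespace URI and the local name, the lookup keeps working even though the code never mentions the bib or lit prefixes.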

In the case of XmlSlurper, the groovy.util.slurpersupport.GPathResult class instance (returned by the parse method) has an additional method to declare namespaces called, not surprisingly, declareNamespace.

Note

Unlike XmlParser, the XmlSlurper implementation (or, more specifically, GPathResult) does not force you to depend on namespaces or prefixes at all. You can refer to elements and attributes by their local names, and resort to namespace prefixes only when the same local name appears under different namespaces.

If you try to use a fully qualified name (for example, bib:author) before declaring the namespace on the GPathResult, you'll get no result back. Also, the namespace prefixes defined by declareNamespace do not have to match the prefixes appearing in the actual XML file.

The declareNamespace method takes a map of prefixes and namespaces. When those are defined, you can use them to reference elements using their fully qualified names.
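The behavior described in the note can be sketched as follows. The inline XML is again a hypothetical stand-in for shakespeare.xml, the prefixes b and l are chosen to differ from the document's own, and the import assumes Groovy 3 or later.

```groovy
import groovy.xml.XmlSlurper  // assumption: Groovy 3+; not needed on Groovy 2.x

// Hypothetical stand-in for shakespeare.xml; the years are illustrative only
def xml = '''<bib:bibliography xmlns:bib="http://bibliography.org"
    xmlns:lit="http://literature.org">
  <bib:author>William Shakespeare</bib:author>
  <lit:play><lit:year>1589</lit:year></lit:play>
  <lit:play><lit:year>1594</lit:year></lit:play>
  <lit:play><lit:year>1595</lit:year></lit:play>
</bib:bibliography>'''

def bibliography = new XmlSlurper().parseText(xml)

// Local names work without any namespace declaration
def byLocalName = bibliography.author.text()

// A prefix that has not been declared matches nothing
def undeclaredCount = bibliography.'b:author'.size()

// Declare prefixes of our own choosing for the two namespace URIs
bibliography.declareNamespace(
    b: 'http://bibliography.org',
    l: 'http://literature.org')

def byDeclaredPrefix = bibliography.'b:author'.text()
def laterPlays = bibliography.'l:play'.findAll {
    it.'l:year'.toInteger() > 1592
}.size()

println byLocalName
println undeclaredCount
println byDeclaredPrefix
println laterPlays
```

Declaring prefixes pays off mainly when two namespaces share a local name; otherwise the plain local-name lookup is usually enough.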

There's more...

If you plan to switch between XmlParser and XmlSlurper implementations and you need to parse XML that uses namespaces, then the safest approach is to use the *: prefix for element or attribute queries. For example:

println bibliography.'*:author'.text()
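To see the wildcard prefix working with both parsers, here is a minimal sketch that runs the same queries against an XmlParser tree and an XmlSlurper result. The inline XML is a hypothetical stand-in for shakespeare.xml, the countLaterPlays closure is a helper introduced only for this illustration, and the imports assume Groovy 3 or later.

```groovy
import groovy.xml.XmlParser   // assumption: Groovy 3+; not needed on Groovy 2.x
import groovy.xml.XmlSlurper  // assumption: Groovy 3+; not needed on Groovy 2.x

// Hypothetical stand-in for shakespeare.xml; the years are illustrative only
def xml = '''<bib:bibliography xmlns:bib="http://bibliography.org"
    xmlns:lit="http://literature.org">
  <bib:author>William Shakespeare</bib:author>
  <lit:play><lit:year>1589</lit:year></lit:play>
  <lit:play><lit:year>1594</lit:year></lit:play>
  <lit:play><lit:year>1595</lit:year></lit:play>
</bib:bibliography>'''

def parsed = new XmlParser().parseText(xml)
def slurped = new XmlSlurper().parseText(xml)

// Hypothetical helper: the same query text is used for either root type
def countLaterPlays = { root ->
    root.'*:play'.findAll {
        it.'*:year'.text().toInteger() > 1592
    }.size()
}

def authorFromParser = parsed.'*:author'.text()
def authorFromSlurper = slurped.'*:author'.text()
def playsFromParser = countLaterPlays(parsed)
def playsFromSlurper = countLaterPlays(slurped)

println "XmlParser:  ${authorFromParser}, ${playsFromParser}"
println "XmlSlurper: ${authorFromSlurper}, ${playsFromSlurper}"
```

The trade-off is that the wildcard ignores the namespace entirely, so it cannot tell apart two elements that share a local name but belong to different namespaces.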
