Chapter 10. Namespaces and XQuery

Namespaces are an important part of XML, and it is essential to understand the concepts behind namespaces in order to query XML documents that use them. This chapter first provides a refresher on namespaces in XML input documents in general. It then covers the use of namespaces in queries: how to declare and refer to them, and how to control their appearance in your results.

XML Namespaces

Namespaces are used to identify the vocabulary to which XML elements and attributes belong, and to disambiguate names from different vocabularies. For example, both the XHTML and XSL-FO vocabularies have a table element, but it has a different structure in each vocabulary. Some XML documents combine elements from multiple vocabularies, and namespaces make it possible to distinguish between them.

Namespaces are defined by a W3C recommendation called Namespaces in XML. Two versions are available: 1.0 and 1.1. XQuery implementations may support either version; check your product's documentation to determine which version is supported.

Namespace URIs

A namespace is identified by a URI (Uniform Resource Identifier) reference.[*] A namespace URI is most commonly an HTTP URL, such as http://datypic.com/prod. It could also be a Uniform Resource Name (URN), which might take the form urn:prod-datypic-com.

The use of a URI helps to ensure the uniqueness of the name. A person who owns the domain name datypic.com is likely to have some control over that domain and can avoid duplicate or conflicting namespace URIs within it. By contrast, if namespaces could be defined as any string, the likelihood of collisions would be much higher. For this reason, using relative URI references (such as prod or ../prod) is discouraged in Namespaces 1.0 and deprecated in Namespaces 1.1.

However, the use of URIs for namespaces has created some confusion. Most people, seeing a namespace http://datypic.com/prod, assume that they can access that URL in a browser and expect to get something back: a description of the namespace, or perhaps a schema. This is not necessarily the case; there is no requirement for a namespace URI to be dereferenceable. No parser, schema validator, or query tool would dereference that URL expecting to retrieve any useful information. Instead, the URI serves simply as a name.

For two namespace URIs to be considered the same, they must consist of exactly the same characters. Although http://datypic.com/prod and http://datypic.com/prod/ (with a trailing slash) would be considered "equivalent" by most people, they are different namespace URIs. Likewise, namespace URIs are case-sensitive, so http://datypic.com/prod is different from http://datypic.com/proD.
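The character-by-character rule can be checked with a small sketch. Python and its standard xml.etree.ElementTree module are assumptions made here for illustration (the chapter itself uses only XML and XQuery). ElementTree reports an element's name as {namespace-URI}local-name, so the URI is part of the name being compared.

```python
import xml.etree.ElementTree as ET


def expanded_name(xml_text):
    """Return the root element's name in ElementTree's {uri}local form."""
    return ET.fromstring(xml_text).tag


# Three documents that differ only in the spelling of the namespace URI.
plain = expanded_name('<p:product xmlns:p="http://datypic.com/prod"/>')
slash = expanded_name('<p:product xmlns:p="http://datypic.com/prod/"/>')
upper = expanded_name('<p:product xmlns:p="http://datypic.com/proD"/>')

print(plain)           # {http://datypic.com/prod}product
print(plain == slash)  # False: the trailing slash makes a different URI
print(plain == upper)  # False: the comparison is case-sensitive
```

No URI normalization takes place; the two names match only if the URI strings are identical.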

Declaring Namespaces

Namespaces are declared in XML documents using namespace declarations. A namespace declaration, which looks similar to an attribute, maps a short prefix to a namespace name. That prefix is then used before element and attribute names to indicate that they are in a particular namespace. Example 10-1 shows a document that contains two namespace declarations.

Example 10-1. Namespace declarations

<cat:catalog xmlns:cat="http://datypic.com/cat"
             xmlns:prod="http://datypic.com/prod">
  <cat:number>1446</cat:number>
  <prod:product>
    <prod:number>563</prod:number>
    <prod:name prod:language="en">Floppy Sun Hat</prod:name>
  </prod:product>
</cat:catalog>

The first namespace declaration maps the prefix cat to the namespace http://datypic.com/cat, while the second maps the prefix prod to the namespace http://datypic.com/prod. The cat and prod prefixes precede the names of elements and attributes in the document to indicate their namespace. There are two different number elements, in different namespaces. The one attribute in the document, language, is also prefixed, indicating that it is in the http://datypic.com/prod namespace.
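To see how a parser resolves these prefixes, the following sketch loads Example 10-1 and lists the resulting names. It assumes Python's standard xml.etree.ElementTree module, which is not part of the chapter's toolset; the prefixes c and p in the lookup map are hypothetical and chosen only for this example.

```python
import xml.etree.ElementTree as ET

EXAMPLE_10_1 = """\
<cat:catalog xmlns:cat="http://datypic.com/cat"
             xmlns:prod="http://datypic.com/prod">
  <cat:number>1446</cat:number>
  <prod:product>
    <prod:number>563</prod:number>
    <prod:name prod:language="en">Floppy Sun Hat</prod:name>
  </prod:product>
</cat:catalog>
"""

root = ET.fromstring(EXAMPLE_10_1)

# Every element name is reported as {namespace-URI}local-name.
element_names = [elem.tag for elem in root.iter()]
for name in element_names:
    print(name)

# Map prefixes of our own choosing to the URIs to find each number element.
ns = {"c": "http://datypic.com/cat", "p": "http://datypic.com/prod"}
cat_number = root.find("c:number", ns).text
prod_number = root.find("p:product/p:number", ns).text
print(cat_number, prod_number)

# The prefixed attribute carries its namespace in its name as well.
name_attributes = dict(root.find("p:product/p:name", ns).attrib)
print(name_attributes)
```

The two number elements have the same local name but are told apart by their namespace URIs.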

It is important to understand that the prefixes are arbitrary and have no technical significance. Although some XML languages have conventional prefixes, such as xsl for XSLT, you can actually choose any prefix you want for your XML documents. The document shown in Example 10-2 is considered the equivalent of Example 10-1.

Example 10-2. Alternate prefixes

<foo:catalog xmlns:foo="http://datypic.com/cat"
             xmlns:bar="http://datypic.com/prod">
  <foo:number>1446</foo:number>
  <bar:product>
    <bar:number>563</bar:number>
    <bar:name bar:language="en">Floppy Sun Hat</bar:name>
  </bar:product>
</foo:catalog>
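The equivalence of the two documents can be demonstrated by comparing what a namespace-aware parser sees in each. This is a minimal sketch that assumes Python's standard xml.etree.ElementTree module; the helper names_and_values is hypothetical and compares only element names, attributes, and trimmed text.

```python
import xml.etree.ElementTree as ET

DOC_CAT_PROD = """\
<cat:catalog xmlns:cat="http://datypic.com/cat"
             xmlns:prod="http://datypic.com/prod">
  <cat:number>1446</cat:number>
  <prod:product>
    <prod:number>563</prod:number>
    <prod:name prod:language="en">Floppy Sun Hat</prod:name>
  </prod:product>
</cat:catalog>
"""

DOC_FOO_BAR = """\
<foo:catalog xmlns:foo="http://datypic.com/cat"
             xmlns:bar="http://datypic.com/prod">
  <foo:number>1446</foo:number>
  <bar:product>
    <bar:number>563</bar:number>
    <bar:name bar:language="en">Floppy Sun Hat</bar:name>
  </bar:product>
</foo:catalog>
"""


def names_and_values(xml_text):
    """List (element name, attributes, trimmed text) in document order."""
    root = ET.fromstring(xml_text)
    return [
        (elem.tag, dict(elem.attrib), (elem.text or "").strip())
        for elem in root.iter()
    ]


same = names_and_values(DOC_CAT_PROD) == names_and_values(DOC_FOO_BAR)
print(same)  # True: the prefixes are gone once names are resolved to URIs
```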

Prefixes must follow the same rules as XML names, in that they must start with a letter or underscore and can contain only certain characters (letters, digits, underscores, hyphens, and periods). They also may not start with the letters xml in upper- or lowercase. Generally, prefixes are kept short for clarity, usually two to four characters.

Default Namespace Declarations

You can also designate a particular namespace as the default, meaning that any unprefixed elements are in that namespace. To declare a default namespace, you simply leave the colon and prefix off the xmlns in the namespace declaration. In this example:

<product xmlns="http://datypic.com/prod">
  <number>563</number>
  <name language="en">Floppy Sun Hat</name>
</product>

the product, number, and name elements are in the http://datypic.com/prod namespace, because they are unprefixed and that is the default namespace. Default namespace declarations and regular namespace declarations can be used together in documents.

However, default namespace declarations do not apply to unprefixed attribute names. Therefore, the language attribute is not in any namespace, even though you might expect it to be in the default namespace.
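The difference between unprefixed elements and unprefixed attributes shows up as soon as the document is parsed. The sketch below assumes Python's standard xml.etree.ElementTree module, which the chapter does not use, and simply inspects the names the parser reports.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring("""\
<product xmlns="http://datypic.com/prod">
  <number>563</number>
  <name language="en">Floppy Sun Hat</name>
</product>
""")

# All three unprefixed elements pick up the default namespace.
tags = [elem.tag for elem in root.iter()]
print(tags)

# The unprefixed attribute does not: its name has no {uri} part.
name_elem = root.find("{http://datypic.com/prod}name")
attr_names = list(name_elem.attrib)
print(attr_names)  # ['language']

# Searching for name without a namespace finds nothing.
unqualified_hits = root.findall("name")
print(len(unqualified_hits))  # 0
```

The empty result in the last step is a common surprise when querying documents that use a default namespace.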

Namespaces and Attributes

An attribute name can also be in a namespace. This is less common than an element in a namespace, because often attributes are considered to be indirectly associated with the namespace of the element they are on, and therefore don't need to be put in a namespace themselves. For example, general-purpose attributes in the XSLT and XML Schema vocabularies are never prefixed.

However, certain attributes, sometimes referred to informally as global attributes, can appear in many different vocabularies and are therefore in namespaces. Examples include the xml:lang attribute, which can be used in any XML document to indicate natural language, and the xsi:schemaLocation attribute, which identifies the location of the schema for a document. It makes sense that these attributes should be in namespaces because they appear on elements that are in different namespaces.

If an attribute name is prefixed, it is associated with the namespace that is mapped to that prefix. A significant difference between elements and attributes, however, is that default namespace declarations do not apply to attribute names. Therefore, an unprefixed attribute name is always in no namespace, not the default namespace. It may seem that an attribute should automatically be in the namespace of the element that carries it, but it is considered to be in no namespace for the purposes of querying and even schema validation.

The product element shown in Example 10-3 has two attributes: app:id and dept. The app:id attribute is, as you would expect, in the http://datypic.com/app namespace. The dept attribute, because it is not prefixed, is in no namespace. This is true regardless of the fact that there is a default namespace declaration that applies to the product element itself.

Example 10-3. Namespaces and attributes

<product xmlns="http://datypic.com/prod"
         xmlns:app="http://datypic.com/app"
         app:id="P123" dept="ACC">
...
</product>
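As a rough check of how the two attributes are named after parsing, the following sketch assumes Python's standard xml.etree.ElementTree module (an assumption for illustration, not a tool used in this chapter). The product element is written as an empty element here because its content is omitted in the example.

```python
import xml.etree.ElementTree as ET

# Same start tag as Example 10-3; the element content is left out.
root = ET.fromstring("""\
<product xmlns="http://datypic.com/prod"
         xmlns:app="http://datypic.com/app"
         app:id="P123" dept="ACC"/>
""")

attributes = dict(root.attrib)
print(attributes)

# The prefixed attribute must be looked up with its namespace URI.
app_id = root.get("{http://datypic.com/app}id")
# The unprefixed attribute is found by its plain name...
dept = root.get("dept")
# ...and not in the default namespace that applies to the element.
dept_in_default_ns = root.get("{http://datypic.com/prod}dept")

print(app_id, dept, dept_in_default_ns)  # P123 ACC None
```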

Namespace Declarations and Scope

Namespace declarations are not required to appear in the outermost element of an XML document; they can appear on any element. The scope of a namespace declaration is the element on which it appears and any attributes or descendants of that element. In Example 10-4, there are two namespace declarations: one on catalog and one on product. The scope of the second namespace declaration is the prod:product element, from its start tag to its end tag; the prod prefix cannot be used outside this scope.

Example 10-4. Namespace declarations and scope (cat_ns.xml)

<catalog xmlns="http://datypic.com/cat">
  <number>1446</number>
  <prod:product xmlns:prod="http://datypic.com/prod">
    <prod:number>563</prod:number>
    <prod:name language="en">Floppy Sun Hat</prod:name>
  </prod:product>
</catalog>
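The effect of scope can be sketched by moving a prefixed element outside the element that declares the prefix. The code assumes Python's standard xml.etree.ElementTree module; the second document and the helper is_namespace_well_formed are hypothetical and exist only for this illustration.

```python
import xml.etree.ElementTree as ET

# Example 10-4: prod is used only inside the element that declares it.
IN_SCOPE = """\
<catalog xmlns="http://datypic.com/cat">
  <number>1446</number>
  <prod:product xmlns:prod="http://datypic.com/prod">
    <prod:number>563</prod:number>
    <prod:name language="en">Floppy Sun Hat</prod:name>
  </prod:product>
</catalog>
"""

# Hypothetical variant: prod:number now sits outside the declaring element.
OUT_OF_SCOPE = """\
<catalog xmlns="http://datypic.com/cat">
  <prod:number>563</prod:number>
  <prod:product xmlns:prod="http://datypic.com/prod"/>
</catalog>
"""


def is_namespace_well_formed(xml_text):
    """Return True if a namespace-aware parse succeeds, False otherwise."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


in_scope_ok = is_namespace_well_formed(IN_SCOPE)
out_of_scope_ok = is_namespace_well_formed(OUT_OF_SCOPE)
print(in_scope_ok)      # True
print(out_of_scope_ok)  # False: prod is not bound where it is used
```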

If a namespace declaration appears in the scope of another namespace declaration with the same prefix, it overrides it. This is not recommended for namespace declarations with prefixes because it is confusing. However, it is also possible to override the default namespace, which can be useful when a document consists of several subtrees in different namespaces. Example 10-5 illustrates this: the product element and its descendants are in a separate namespace but do not need to be prefixed.

Example 10-5. Overriding the default namespace

<catalog xmlns="http://datypic.com/cat">
  <number>1446</number>
  <product xmlns="http://datypic.com/prod">
    <number>563</number>
    <name language="en">Floppy Sun Hat</name>
  </product>
</catalog>
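To confirm which namespace each unprefixed element ends up in, the sketch below parses a document shaped like Example 10-5 and groups element names by namespace. It assumes Python's standard xml.etree.ElementTree module, which is outside the chapter's own toolset.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring("""\
<catalog xmlns="http://datypic.com/cat">
  <number>1446</number>
  <product xmlns="http://datypic.com/prod">
    <number>563</number>
    <name language="en">Floppy Sun Hat</name>
  </product>
</catalog>
""")

CAT = "{http://datypic.com/cat}"
PROD = "{http://datypic.com/prod}"

# Split each {uri}local name and collect the local names per namespace.
cat_names = [e.tag[len(CAT):] for e in root.iter() if e.tag.startswith(CAT)]
prod_names = [e.tag[len(PROD):] for e in root.iter() if e.tag.startswith(PROD)]

print(cat_names)   # ['catalog', 'number']
print(prod_names)  # ['product', 'number', 'name']
```

The inner number element belongs to the overriding default namespace, even though it looks identical to the outer one in the source.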

When using Namespaces 1.1, the namespace specified can be a zero-length string, as in xmlns:prod="". This has the effect of undeclaring the namespace mapped to prod; that prefix will no longer be available for use in that scope. Undeclaring prefixes is not permitted in Namespaces 1.0.

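As with regular namespace declarations, you can specify a zero-length string as the default namespace, as in xmlns="". This undeclares the default namespace, so unprefixed element names in that scope are in no namespace. Unlike undeclaring a prefix, this is permitted in both Namespaces 1.0 and 1.1.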
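Both kinds of undeclaration can be tried out in a short sketch. It assumes Python's standard xml.etree.ElementTree module, whose underlying expat parser follows Namespaces 1.0, so it is expected to accept xmlns="" but reject xmlns:prod="". The note, text, and item elements and the parses helper are hypothetical names used only here.

```python
import xml.etree.ElementTree as ET

# Undeclaring the default namespace puts note and text in no namespace.
root = ET.fromstring("""\
<catalog xmlns="http://datypic.com/cat">
  <note xmlns="">
    <text>no namespace here</text>
  </note>
</catalog>
""")
tags = [elem.tag for elem in root.iter()]
print(tags)  # ['{http://datypic.com/cat}catalog', 'note', 'text']


def parses(xml_text):
    """Return True if the parser accepts the document, False otherwise."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# Undeclaring a prefix is a Namespaces 1.1 feature; a 1.0 parser rejects it.
prefix_undeclared_ok = parses(
    '<catalog xmlns:prod="http://datypic.com/prod">'
    '<item xmlns:prod=""/>'
    '</catalog>'
)
print(prefix_undeclared_ok)  # False with a Namespaces 1.0 parser
```

A processor that supports Namespaces 1.1 would accept the second document.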



[*] Or IRI (Internationalized Resource Identifier) reference if your processor supports Namespaces 1.1. IRIs allow a wider, more international set of characters. The term URI is used in this book (and in the XQuery specification) to mean "URI or IRI."
