Chapter 5. Putting Namespaces to Use

 

Namespaces and whitespace; anything that ends with “space” in XML is a pain in the butt.

 
 --Jason Hunter

Although it’s true that namespaces have caused their fair share of confusion in the XML community, they nonetheless represent a reasonable solution to a tricky problem that is inherent in XML. As you know, XML allows you to create custom markup languages, which are languages that contain elements and attributes of your own creation. XML is incredibly flexible in this regard, but there is nothing stopping two people from creating markup languages with very similar, if not identical, elements and attributes. What happens if you need to use both of these markup languages in a single document? There would obviously be a clash between identically named elements and attributes in the languages. Fortunately, as you will learn in this hour, namespaces provide an elegant solution to this problem.

In this hour, you’ll learn

  • Why namespaces are important to XML

  • How namespace names are guaranteed to be unique

  • How to declare and reference namespaces in XML documents

  • How to use namespaces to merge schemas

Understanding Namespaces

As a young kid I was often confused by the fact that two people could have the same name yet not be related. It just didn’t register with me that it’s possible for two families to exist independently of one another with the same last name. Of course, now I understand why it’s possible, but I’m still kind of bummed by the fact that I’m not the only Michael Morrison walking around. In fact, I’m not even close to being the most famous Michael Morrison—the real name of John Wayne, the famous actor, was actually Marion Michael Morrison. But I digress.

The reason I bring up the issue of people having the same names yet not being related is because it parallels the problem in XML when different markup languages have elements and attributes that are named the same. The XML problem is much more severe, however, because XML applications aren’t smart enough to judge the difference between the context of elements from different markup languages that share the same name. For example, a tag named <goal> would have a very different meaning in a sports markup language than the same tag in a markup language for a daily planner. If you ever used these two markup languages within the same application, it would be very important for the application to know when you’re talking about a goal in hockey and when you’re talking about a personal goal. The responsibility falls on the XML developer to ensure that uniqueness abounds when it comes to the elements and attributes used in documents. Fortunately, namespaces make it possible to enforce such uniqueness without too much of a hassle.

A namespace is a collection of element and attribute names that can be used in an XML document. To draw a comparison between an XML namespace and the real world, if you considered the first names of all the people in your immediate family, they would belong to a namespace that encompasses your last name. When I call my brother by his first name, Steve, it is implied that his last name is Morrison because he is within the Morrison namespace. XML namespaces are similar because they represent groups of names for related elements and attributes. Most of the time an individual namespace corresponds directly to a custom markup language, but that doesn’t necessarily have to be the case. You also know that namespaces aren’t a strict requirement of XML documents, as you haven’t really used them throughout the book thus far.

The purpose of namespaces is to eliminate name conflicts between elements and attributes. To better understand how this type of name clash might occur in your own XML documents, consider an XML document that contains information about a video and music collection. You might use a custom markup language unique to each type of information (video and music), which means that each language would have its own elements and attributes. However, you are using both languages within the context of a single XML document, which is where the potential for problems arises. If both markup languages include an element named title that represents the title of a video or music compilation, there is no way for an XML application to know which language you intended to use for the element. The solution to this problem is to assign a namespace to each of the markup languages, which will then provide a clear distinction between the elements and attributes of each language when they are used.

In order to fully understand namespaces, you need a solid grasp on the concept of scope in XML documents. The scope of an element or attribute in a document refers to the relative location of the element or attribute within the document. If you visualize the elements in a document as an upside-down tree that begins at the top with the root element, child elements of the root element appear just below the root element as branches (see Figure 5.1). Each element in a “document tree” is known as a node. Nodes are very important when it comes to processing XML documents because they determine the relationship between parent and child elements. The scope of an element refers to its location within this hierarchical tree of elements. So, when I refer to the scope of an element or attribute, I’m talking about the node in which the element or attribute is stored.

Figure 5.1. An XML document coded in ETML can be visualized as a hierarchical tree of elements, where each leaf in the tree is known as a node.

In this figure, the hypothetical ETML example markup language first mentioned in Hour 3, “Defining Data with DTD Schemas,” is used to demonstrate how an XML document consists of a hierarchical tree of elements. Each node in the tree of an XML document has its own scope, and can therefore have its own namespace.

Scope is important to namespaces because it’s possible to use a namespace within a given scope, which means it affects only elements and attributes beneath a particular node. Contrast this with a namespace that has global scope, which means the namespace applies to the entire document. Any guess as to how you might establish a global namespace? It’s easy—you just associate it with the root element, which by definition houses the remainder of the document. You learn much more about scope as it applies to namespaces throughout the remainder of this hour. Before you get to that, however, it’s time to learn where namespace names come from.
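
To make the idea of scope a little more concrete, here is a minimal sketch that uses Python's standard xml.etree.ElementTree parser, which is my choice for illustration and not something this hour depends on. The collection and shelf element names are hypothetical, and the declaration syntax itself is explained later in the hour. The point is simply that a prefix declared on one element can be used on that element and its descendants, but not elsewhere in the document.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the "mus" prefix is declared on <shelf>,
# so it is in scope for <shelf> and everything inside it.
SCOPED = """
<collection>
  <shelf xmlns:mus="http://www.michaelmorrison.com/ns/music">
    <mus:title>Cake</mus:title>
  </shelf>
</collection>
""".strip()

# Here the same prefix is used by a sibling of <shelf>,
# which is outside the scope of the declaration.
OUT_OF_SCOPE = """
<collection>
  <shelf xmlns:mus="http://www.michaelmorrison.com/ns/music"/>
  <mus:title>Cake</mus:title>
</collection>
""".strip()


def parses(text):
    """Return True if the namespace-aware parser accepts the document."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


print(parses(SCOPED))        # True
print(parses(OUT_OF_SCOPE))  # False: the prefix is unbound at that point
```

Moving the declaration up to the collection root element would give it global scope, and both documents would then be accepted.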

Naming Namespaces

The whole point of namespaces is that they provide a means of establishing unique identifiers for elements and attributes. It is therefore imperative that each and every namespace have a unique name. Obviously, there would be no way to enforce this rule if everyone was allowed to make up their own names out of thin air, so a clever naming scheme was established that ties namespaces to URIs (Uniform Resource Identifiers). URIs usually reference physical resources on the Internet, and a URI built on a domain name that you control is unique because no one else owns that domain. So, the name of a namespace is simply a URI. For example, my web site is located at http://www.michaelmorrison.com. To help guarantee name uniqueness in any XML documents that I create, I could associate the documents with a namespace based on that address:

<mediacollection xmlns:mov="http://www.michaelmorrison.com/ns/movies">

The ns in the namespace name http://www.michaelmorrison.com/ns/movies stands for “namespace” and is often used in URL namespace names. It isn’t a necessity but it’s not a bad idea in terms of being able to quickly identify namespaces. If you don’t want to use a URL as the basis for a namespace name, you could also use a URN (Uniform Resource Name) to guarantee uniqueness. URNs are another kind of URI; they differ from URLs in that they define a unique, location-independent name for a resource that may map to one or more URLs. Following is an example of using a URN to specify a namespace for my web site:

<mediacollection xmlns:mov="urn:michaelmorrison.com:ns:movies">

Keep in mind that a namespace doesn’t actually point to a physical resource, even if its URI does. In other words, the only reason namespaces are named after URIs is because URIs are guaranteed to be unique—they could just as easily be named after social security numbers. This means that within a domain name you can create URIs that don’t actually reference physical resources. So, although there may not be a directory named pets on my web server, I can still use a URI named http://www.michaelmorrison.com/ns/pets to name a namespace. The significance is that the michaelmorrison.com domain name is mine and is therefore guaranteed to be unique. This is important because it allows you to organize XML documents based upon their respective namespaces while guaranteeing uniqueness among the namespace names.
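
If you’d like to see for yourself that a namespace name is nothing more than an identifier, the following sketch may help. It assumes Python's standard xml.etree.ElementTree parser, which reports each element name as {namespace}local. The parser never tries to visit the URI; it only compares the text of the name, so the URL and URN forms shown earlier identify two different namespaces.

```python
import xml.etree.ElementTree as ET

# The same element, once with the URL-style namespace name
# and once with the URN-style namespace name from this hour.
URL_DOC = ('<mov:title xmlns:mov="http://www.michaelmorrison.com/ns/movies">'
           'Raising Arizona</mov:title>')
URN_DOC = ('<mov:title xmlns:mov="urn:michaelmorrison.com:ns:movies">'
           'Raising Arizona</mov:title>')

url_title = ET.fromstring(URL_DOC)
urn_title = ET.fromstring(URN_DOC)

# ElementTree expands each qualified name to "{namespace}local".
print(url_title.tag)  # {http://www.michaelmorrison.com/ns/movies}title
print(urn_title.tag)  # {urn:michaelmorrison.com:ns:movies}title

# Different namespace names mean different elements,
# even though the prefix and local name are identical.
print(url_title.tag == urn_title.tag)  # False
```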

Declaring and Using Namespaces

Namespaces are associated with documents by way of elements, which means that you declare a namespace for a particular element with the scope you want for the namespace. More specifically, you use a namespace declaration, which looks a lot like an attribute of the element. In many cases you want a namespace to apply to an entire document, which means you’ll use the namespace declaration with the root element. A namespace declaration takes the following form:

xmlns:Prefix="NameSpace"

The xmlns attribute is what notifies an XML processor that a namespace is being declared. The NameSpace portion of the namespace declaration is where the namespace itself is identified. This portion of the declaration identifies a URI that guarantees uniqueness for elements and attributes used within the scope of the namespace declaration.

The Prefix part of the namespace declaration allows you to set a prefix that will serve as a shorthand reference for the namespace throughout the scope of the element in which the namespace is declared. The prefix of a namespace is optional and ultimately depends on whether you want to use qualified or unqualified element and attribute names throughout a document. A qualified name includes the Prefix portion of the namespace declaration and consists of two parts: the prefix and the local portion of the name. Examples of qualified names include mov:title, mov:director, and mov:rating. To use qualified names, you must provide Prefix in the namespace declaration. Following is a simple example of a qualified name:

<mov:title>Raising Arizona</mov:title>

By the Way

Declaring a namespace in an XML document is a little like declaring a variable in a programming language—the declared namespace is available for use but doesn’t actually enter the picture until you specify an element with a qualified name.

In this example, the prefix is mov and the local portion of the name is title. Unqualified names don’t include a prefix and are either associated with a default namespace or no namespace at all. The prefix of the namespace declaration isn’t required when declaring a default namespace. Examples of unqualified names are title, director, and rating. Unqualified names in a document look no different than if you weren’t using namespaces at all. The following code shows how the movie example would be coded using unqualified names:

<title>Raising Arizona</title>

Notice that in this example the <title> and </title> tags are used so that you would never know a namespace was involved. In this case, you are either assuming a default namespace is in use or that there is no namespace at all.
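
The three possibilities just described (a qualified name, an unqualified name in a default namespace, and an unqualified name in no namespace) can be compared side by side. This is only a sketch that assumes Python's standard xml.etree.ElementTree parser; the split_name helper is a hypothetical convenience function of my own, not part of any library.

```python
import xml.etree.ElementTree as ET

MOVIES = "http://www.michaelmorrison.com/ns/movies"


def split_name(tag):
    """Split ElementTree's '{namespace}local' tag into (namespace, local)."""
    if tag.startswith("{"):
        namespace, local = tag[1:].split("}", 1)
        return namespace, local
    return None, tag


# Qualified name: the mov prefix ties title to the namespace.
qualified = ET.fromstring(
    f'<mov:title xmlns:mov="{MOVIES}">Raising Arizona</mov:title>')

# Unqualified name with a default namespace in scope.
defaulted = ET.fromstring(
    f'<title xmlns="{MOVIES}">Raising Arizona</title>')

# Unqualified name with no namespace declared at all.
bare = ET.fromstring('<title>Raising Arizona</title>')

print(split_name(qualified.tag))  # (MOVIES, 'title')
print(split_name(defaulted.tag))  # (MOVIES, 'title')
print(split_name(bare.tag))       # (None, 'title')
```

The first two elements have exactly the same name as far as the parser is concerned, even though only one of them uses a prefix.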

It’s important to clarify why you would use qualified or unqualified names because the decision to use one or the other determines the manner in which you declare a namespace. There are two different approaches to declaring namespaces:

  • Default declaration: The namespace is declared without a prefix; all unqualified element names within its scope are assumed to be in the namespace. Unprefixed attributes are not affected by a default namespace.

  • Explicit declaration: The namespace is declared with a prefix; all element and attribute names associated with the namespace must use the prefix as part of their qualified names or else they are not considered part of the namespace.

The next sections dig a little deeper into these namespace declarations.

Default Namespaces

Default namespaces represent the simpler of the two approaches to namespace declaration. A default namespace declaration is useful when you want to apply a namespace to an entire document or section of a document. When declaring a default namespace, you don’t use a prefix with the xmlns attribute. Instead, elements are specified with unqualified names and are therefore assumed to be part of the default namespace. In other words, a default namespace declaration applies to all unqualified elements within the scope in which the namespace is declared. Following is an example of a default namespace declaration for a movie collection document:

<mediacollection xmlns="http://www.michaelmorrison.com/ns/movies">
  <movie type="comedy" rating="PG-13" review="5" year="1987">
    <title>Raising Arizona</title>
    <comments>A classic one-of-a-kind screwball love story.</comments>
  </movie>

  <movie type="comedy" rating="R" review="5" year="1988">
    <title>Midnight Run</title>
    <comments>The quintessential road comedy.</comments>
  </movie>
</mediacollection>

In this example, the http://www.michaelmorrison.com/ns/movies namespace is declared as the default namespace for the movie document. This means that all the unqualified elements in the document (mediacollection, movie, title, and so on) are assumed to be part of the namespace. A default namespace can also be set for any other element in a document, in which case it applies only to that element and its children. For example, you could set a namespace for one of the title elements, which would override the default namespace that is set in the mediacollection element. Following is an example of how this is done:

<mediacollection xmlns="http://www.michaelmorrison.com/ns/movies">
  <movie type="comedy" rating="PG-13" review="5" year="1987">
    <title>Raising Arizona</title>
    <comments>A classic one-of-a-kind screwball love story.</comments>
  </movie>

  <movie type="comedy" rating="R" review="5" year="1988">
    <title xmlns="http://www.michaelmorrison.com/ns/title">Midnight Run</title>
    <comments>The quintessential road comedy.</comments>
  </movie>
</mediacollection>

Notice in the title element for the second movie element that a different namespace is specified. This namespace applies only to the title element and overrides the namespace declared in the mediacollection element. Although this admittedly simple example doesn’t necessarily make a good argument for why you would override a namespace, it can be a bigger issue in documents where you mix different XML languages.
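
To confirm how the override plays out, you could run a trimmed copy of the document through a namespace-aware parser. The sketch below assumes Python's standard xml.etree.ElementTree parser and simply lists the title and comments elements in document order.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the document above: the second title element
# declares its own default namespace.
DOC = """
<mediacollection xmlns="http://www.michaelmorrison.com/ns/movies">
  <movie type="comedy" rating="PG-13" review="5" year="1987">
    <title>Raising Arizona</title>
  </movie>
  <movie type="comedy" rating="R" review="5" year="1988">
    <title xmlns="http://www.michaelmorrison.com/ns/title">Midnight Run</title>
    <comments>The quintessential road comedy.</comments>
  </movie>
</mediacollection>
""".strip()

root = ET.fromstring(DOC)

# Collect the expanded names of the title and comments elements.
names = [el.tag for el in root.iter()
         if el.tag.endswith("}title") or el.tag.endswith("}comments")]

for name in names:
    print(name)
# {http://www.michaelmorrison.com/ns/movies}title
# {http://www.michaelmorrison.com/ns/title}title
# {http://www.michaelmorrison.com/ns/movies}comments
```

The comments element that follows the overriding title is back in the movies namespace, because the override ends where the title element ends.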

By the Way

Generally speaking, default namespaces work better when you’re dealing with a single namespace. When you start incorporating multiple namespaces, it is better to explicitly refer to each namespace using a prefix.

Explicit Namespaces

An explicit namespace is useful whenever you want exacting control over the elements and attributes that are associated with a namespace. This is often necessary in documents that rely on multiple schemas because there is a chance of having a name clash between elements and attributes defined in the two schemas. Explicit namespace declarations require a prefix that is used to distinguish elements and attributes that belong to the namespace being declared. The prefix in an explicit declaration is used as a shorthand notation for the namespace throughout the scope in which the namespace is declared. More specifically, the prefix is paired with the local element or attribute name to form a qualified name of the form Prefix:Local. Following is the movie example with qualified element and attribute names:

<mediacollection xmlns:mov="http://www.michaelmorrison.com/ns/movies">
  <mov:movie mov:type="comedy" mov:rating="PG-13" mov:review="5" mov:year="1987">
    <mov:title>Raising Arizona</mov:title>
    <mov:comments>A classic one-of-a-kind screwball love story.</mov:comments>
  </mov:movie>
  <mov:movie mov:type="comedy" mov:rating="R" mov:review="5" mov:year="1988">
    <mov:title>Midnight Run</mov:title>
    <mov:comments>The quintessential road comedy.</mov:comments>
  </mov:movie>
</mediacollection>

The namespace in this code is explicitly declared by the shorthand name mov in the namespace declaration; this is evident in the fact that the name mov is specified after the xmlns keyword. Once the namespace is declared, you can use it with any element and attribute names that belong in the namespace, which in this case is all of them except the mediacollection root element. Because mediacollection has no prefix and no default namespace is declared, it belongs to no namespace at all.
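
Two details of explicit declarations are easy to check with a parser: an element without the prefix stays outside the namespace, and the prefix itself is only shorthand. The following sketch assumes Python's standard xml.etree.ElementTree parser; the film prefix in the second document is a hypothetical alternative to mov.

```python
import xml.etree.ElementTree as ET

MOVIES = "http://www.michaelmorrison.com/ns/movies"

# Trimmed copy of the document above.
DOC = """
<mediacollection xmlns:mov="http://www.michaelmorrison.com/ns/movies">
  <mov:movie mov:type="comedy">
    <mov:title>Raising Arizona</mov:title>
  </mov:movie>
</mediacollection>
""".strip()

# Same document, but with a different prefix bound to the same namespace.
RENAMED = """
<mediacollection xmlns:film="http://www.michaelmorrison.com/ns/movies">
  <film:movie film:type="comedy">
    <film:title>Raising Arizona</film:title>
  </film:movie>
</mediacollection>
""".strip()

root = ET.fromstring(DOC)
tags = [el.tag for el in root.iter()]
print(tags[0])  # mediacollection (unprefixed, so no namespace)
print(tags[1])  # {http://www.michaelmorrison.com/ns/movies}movie

# A prefixed attribute carries the namespace in its name too.
movie = root[0]
print(movie.attrib)

# The parser sees identical names regardless of which prefix was used.
renamed_tags = [el.tag for el in ET.fromstring(RENAMED).iter()]
print(tags == renamed_tags)  # True
```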

I mentioned earlier that one of the primary reasons for using explicit namespaces is when multiple schemas are being used in a document. In this situation, you will likely declare both namespaces explicitly and then use them appropriately to identify elements and attributes throughout the document. Example 5.1 is a media collection document that combines movie and music information into a single format.

Example 5.1. The Media Collection Example Document

 1: <?xml version="1.0"?>
 2:
 3: <mediacollection xmlns:mov="http://www.michaelmorrison.com/ns/movies"
 4:   xmlns:mus="http://www.michaelmorrison.com/ns/music">
 5:   <mov:movie mov:type="comedy" mov:rating="PG-13" mov:review="5"
 6:     mov:year="1987">
 7:     <mov:title>Raising Arizona</mov:title>
 8:     <mov:comments>A classic one-of-a-kind screwball love story.
 9:     </mov:comments>
10:   </mov:movie>
11:
12:   <mov:movie mov:type="comedy" mov:rating="R" mov:review="5" mov:year="1988">
13:     <mov:title>Midnight Run</mov:title>
14:     <mov:comments>The quintessential road comedy.</mov:comments>
15:   </mov:movie>
16:
17:   <mus:music mus:type="indy" mus:review="5" mus:year="1990">
18:     <mus:title>Cake</mus:title>
19:     <mus:artist>The Trash Can Sinatras</mus:artist>
20:     <mus:label>Polygram Records</mus:label>
21:     <mus:comments>Excellent acoustical instruments and extremely witty
22:       lyrics.</mus:comments>
23:   </mus:music>
24:
25:   <mus:music mus:type="rock" mus:review="5" mus:year="1991">
26:     <mus:title>Travelers and Thieves</mus:title>
27:     <mus:artist>Blues Traveler</mus:artist>
28:     <mus:label>A&amp;M Records</mus:label>
29:     <mus:comments>The best Blues Traveler recording, period.</mus:comments>
30:   </mus:music>
31: </mediacollection>

By the Way

Although attributes are considered parts of elements, they don’t automatically take on the namespace of the element in which they appear. As the media collection example reveals, an attribute requires a namespace prefix in order to be referenced as part of a namespace.

In this code, the mov and mus namespaces (lines 3 and 4) are explicitly declared in order to correctly identify the elements and attributes for each type of media. Notice that without these explicit namespaces it would be difficult for an XML processor to tell the difference between the title and comments elements because they are used in both movie and music entries.
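
Here is a rough sketch of how an application might tell the two kinds of title elements apart. It assumes Python's standard xml.etree.ElementTree parser and an abbreviated copy of the media collection document; the prefixes in the NS dictionary belong to the Python code and only need to map to the same namespace names that the document uses.

```python
import xml.etree.ElementTree as ET

# Abbreviated copy of the media collection document.
DOC = """
<mediacollection xmlns:mov="http://www.michaelmorrison.com/ns/movies"
  xmlns:mus="http://www.michaelmorrison.com/ns/music">
  <mov:movie mov:type="comedy" mov:year="1987">
    <mov:title>Raising Arizona</mov:title>
  </mov:movie>
  <mus:music mus:type="indy" mus:year="1990">
    <mus:title>Cake</mus:title>
    <mus:artist>The Trash Can Sinatras</mus:artist>
  </mus:music>
</mediacollection>
""".strip()

# Prefix-to-namespace map used when searching the parsed tree.
NS = {
    "mov": "http://www.michaelmorrison.com/ns/movies",
    "mus": "http://www.michaelmorrison.com/ns/music",
}

root = ET.fromstring(DOC)

movie_titles = [t.text for t in root.findall(".//mov:title", NS)]
music_titles = [t.text for t in root.findall(".//mus:title", NS)]
plain_titles = root.findall(".//title")

print(movie_titles)  # ['Raising Arizona']
print(music_titles)  # ['Cake']
print(plain_titles)  # [] because no title element lacks a namespace
```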

By the Way

When you actually start using XML languages that aren’t of your own creation, you’ll specify a namespace that doesn’t involve your own URI. For example, the namespace for the SVG (Scalable Vector Graphics) markup language that you learn about in the next hour is http://www.w3.org/2000/svg.

Just to help hammer home the distinction between default and explicit namespace declarations, let’s take a look at one more example. This time the media collection declares the movie namespace as the default namespace and then explicitly declares the music namespace using the mus prefix. The end result is that the movie elements don’t require a prefix when referenced, whereas the music elements and attributes do. (The unprefixed attributes of the movie elements aren’t placed in the default namespace; they simply belong to the elements on which they appear.) Check out the code in Example 5.2 to see what I mean.

Example 5.2. A Different Version of the Media Collection Example Document That Declares the Movie Namespace as a Default Namespace

 1: <?xml version="1.0"?>
 2:
 3: <mediacollection xmlns="http://www.michaelmorrison.com/ns/movies"
 4:   xmlns:mus="http://www.michaelmorrison.com/ns/music">
 5:   <movie type="comedy" rating="PG-13" review="5" year="1987">
 6:     <title>Raising Arizona</title>
 7:     <comments>A classic one-of-a-kind screwball love story.</comments>
 8:   </movie>
 9:
10:   <movie type="comedy" rating="R" review="5" year="1988">
11:     <title>Midnight Run</title>
12:     <comments>The quintessential road comedy.</comments>
13:   </movie>
14:
15:   <mus:music mus:type="indy" mus:review="5" mus:year="1990">
16:     <mus:title>Cake</mus:title>
17:     <mus:artist>The Trash Can Sinatras</mus:artist>
18:     <mus:label>Polygram Records</mus:label>
19:     <mus:comments>Excellent acoustical instruments and extremely witty
20:       lyrics.</mus:comments>
21:   </mus:music>
22:
23:   <mus:music mus:type="rock" mus:review="5" mus:year="1991">
24:     <mus:title>Travelers and Thieves</mus:title>
25:     <mus:artist>Blues Traveler</mus:artist>
26:     <mus:label>A&amp;M Records</mus:label>
27:     <mus:comments>The best Blues Traveler recording, period.</mus:comments>
28:   </mus:music>
29: </mediacollection>

The key to this code is the default namespace declaration, which is identified by the lone xmlns attribute (line 3); the xmlns:mus attribute explicitly declares the music namespace (line 4). When the xmlns attribute is used by itself with no associated prefix, it is declaring a default namespace, which in this case is the movie namespace.
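
If you want to verify which names end up in which namespace, the following sketch parses an abbreviated copy of this document. It assumes Python's standard xml.etree.ElementTree parser, and qname is a hypothetical helper of my own that builds the {namespace}local form of a name. Pay attention to the attributes: the unprefixed movie attributes are not pulled into the default namespace, whereas the prefixed music attributes are in the music namespace.

```python
import xml.etree.ElementTree as ET

MOVIES = "http://www.michaelmorrison.com/ns/movies"
MUSIC = "http://www.michaelmorrison.com/ns/music"

# Abbreviated copy of the document: movies use the default
# namespace, music uses the explicit mus prefix.
DOC = """
<mediacollection xmlns="http://www.michaelmorrison.com/ns/movies"
  xmlns:mus="http://www.michaelmorrison.com/ns/music">
  <movie type="comedy" year="1987">
    <title>Raising Arizona</title>
  </movie>
  <mus:music mus:type="indy" mus:year="1990">
    <mus:title>Cake</mus:title>
  </mus:music>
</mediacollection>
""".strip()


def qname(namespace, local):
    """Build the '{namespace}local' form that ElementTree uses."""
    return "{" + namespace + "}" + local


root = ET.fromstring(DOC)
movie = root.find(qname(MOVIES, "movie"))
music = root.find(qname(MUSIC, "music"))

# The unprefixed root element picks up the default (movie) namespace.
print(root.tag == qname(MOVIES, "mediacollection"))  # True

# Unprefixed attributes stay out of the default namespace...
print(sorted(movie.attrib))  # ['type', 'year']

# ...whereas prefixed attributes are in the music namespace.
print(sorted(music.attrib) == [qname(MUSIC, "type"), qname(MUSIC, "year")])  # True
```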

Summary

If you’re familiar with the old sitcom Newhart, you no doubt remember the two brothers who were both named Darryl. Although brothers with the same first name make for good comedy, similar names in XML documents can be problematic. I’m referring to name clashes that can occur when elements and attributes are named the same across multiple custom markup languages. This problem can be easily avoided by using namespaces, which allow you to associate elements and attributes with a unique name. Namespaces are an important part of XML because they solve the problem of name clashing among XML documents.

This hour introduced you to namespaces and also gave you some practical insight regarding how they are used in XML documents. You began the hour by learning the basics of namespaces and their significance to XML. From there you learned how namespaces are named. You then found out how to declare and use namespaces in documents. And finally, the hour concluded by revisiting XSD schemas and uncovering a few interesting tricks involving schemas and namespaces.

Q&A

Q.

When a name clash occurs in an XML document, why can’t an XML processor resolve it by looking at the scope of the elements and attributes, as opposed to requiring namespaces?

A.

Although it is technically possible for an XML processor to resolve an element or attribute based solely on its scope, it isn’t a good idea to put that much faith in the processor. Besides, there are some situations where this simply isn’t possible. For example, what if the element causing the name clash is the root element in a document? Because it has a global scope, there is no way to determine the schema to which it belongs.

Q.

Do I have to use a namespace to uniquely identify the elements and attributes in my custom markup language?

A.

No. In fact, if you plan on using your XML documents internally and never sharing them with others, there really is no pressing need to declare a unique namespace. However, if you choose to incorporate multiple XML-based markup languages within a single document or application, you’ll need to use namespaces to keep things straight and not confuse the XML processor.

Workshop

The Workshop is designed to help you anticipate possible questions, review what you’ve learned, and begin learning how to put your knowledge into practice.

Quiz

1.

Why are namespaces named after URIs?

2.

What is the general form of a namespace declaration?

3.

What is the difference between default and explicit namespace declaration?

Quiz Answers

1.

Namespaces are named after URIs because URIs are guaranteed to be unique.

2.

The general form of a namespace declaration is xmlns:Prefix="NameSpace". The colon and prefix are omitted when declaring a default namespace.

3.

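A default namespace declaration has no prefix and applies to all unqualified element names within its scope, which makes it useful when you want to apply a namespace to an entire document or section of a document. An explicit namespace declaration includes a prefix that must appear on every element and attribute name belonging to the namespace, which makes it useful whenever you want exacting control over the elements and attributes that are associated with a namespace.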

Exercises

1.

Using a domain name that you or your company owns, determine a unique namespace name that you could use with the Tall Tales document from previous hours.

2.

Modify the Tall Tales document so that the elements and attributes defined in its schema are associated with the namespace you just created.
