Chapter 9

XPath 1.0 and XPath 2.0

9.1 Introduction

The XML Path Language – XPath, as it is more commonly known – was first published as a recommendation1 by the W3C in 1999. According to its specification, XPath was created “to provide a common syntax for functionality shared between XSL Transformations [XSLT] and XPointer” (see Chapter 7, “Managing XML: Transforming and Connecting”), and its purpose is “to address parts of an XML document.” Like nearly all of the W3C specifications, XPath “operates on the abstract, logical structure of an XML document, rather than its surface syntax.”

What does it mean to say that XPath is used to “address parts of an XML document”? If we simply replace address with locate or identify or even point to, the meaning would be the same. Because querying facilities in general function to locate or identify certain information, it’s easy to see that XPath is itself a sort of query language.

The first part of this chapter deals with XPath 1.0 and the second part handles XPath 2.0.2 Even though XPath 2.0 is poised for approval as a recommendation in early 2006, we expect that many people will continue to use XPath 1.0 for some time to come. (This expectation is due in large part to the existence of only a few XPath 2.0 engines.) In addition, we find that a good understanding of the concepts in XPath 1.0 leads to faster understanding of both XPath 2.0 and XQuery 1.0.

XPath, as you just read, was designed to be a language for addressing parts of XML documents, providing functionality for other specifications, particularly XSLT. The dependence of XSLT on XPath led to the XSL Working Group (WG) having the responsibility for specifying XPath (in consultation with other WGs). As we demonstrate in Section 9.2, XPath can quite reasonably be viewed as a language for querying XML documents. As you’ll see in this chapter, XPath is used to query only one document at a time – that is, it’s suitable not for finding documents of interest but for finding desired information within a known document.

In 1999, the W3C established the XML Query Working Group, with the charter to develop a language designed specifically for querying XML documents – the XML query language now known as XQuery. The requirements for this new language implied significant capabilities beyond those available in XPath 1.0, and the XSL WG recognized that many of those new capabilities would be quite useful for the planned new version of XSLT. As a result, the two WGs agreed to accept joint responsibility for designing and specifying a new version of XPath, XPath 2.0.

As development proceeded, it became obvious that the requirements for XPath 2.0 were for the most part a subset of those for XQuery and that the two languages should be designed – and specified – together. In fact, because one language is very nearly a subset of the other, both specifications are generated from the same source files, via a variety of techniques that allow the production of one specification or the other as needed.

In addition to the significant commonality in the syntax of the two languages, they share a number of other specifications, including the data model, formal semantics, and functions and operators (all discussed in Chapter 10, “Introduction to XQuery 1.0”). Consequently, our coverage of XPath 2.0 in this chapter is relatively brief, since most of the aspects of the language are covered in that other chapter.

9.2 XPath 1.0

As you read in Section 9.1 above, XPath is a language for addressing parts of XML documents. Throughout Section 9.2, you’ll learn the details of how that addressing is performed in XPath 1.0. But we think it’s a good idea to introduce you to the appearance of XPath expressions before delving into the details. In fact, because XPath 1.0 is the foundation on which XPath 2.0 is built and because XQuery is so closely related to XPath 2.0, the concepts and syntax discussed in this section apply to XQuery as well.

The notation chosen for XPath deliberately bears a resemblance to the notation used by some computer operating systems for referencing files and directories (or, if you prefer, folders) in a file system. A typical XPath expression looks something like this:

/company/employee[@id="123"]/salary

In a file system’s path notation, that might identify the file named “salary” in the “employee” subdirectory of the “company” directory, ignoring for the moment the notation “employee[@id=“123”].” Analogously, XPath interprets that expression to mean the “salary” element that is a child of an “employee” element having an “id” attribute whose value is “123” that is itself a child of an element named “company.”
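To make the analogy concrete, here is a minimal Python sketch that locates the “salary” child of the “employee” element whose “id” attribute is “123” beneath “company.” The document contents (the two employees and their salary figures) are invented for illustration. The sketch uses the standard library's ElementTree module, which supports only a subset of XPath and evaluates paths relative to an element, so the path is written relative to the <company> element rather than as an absolute path.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: element names follow the expression in the text;
# the employees and salary values are invented for this sketch.
COMPANY_XML = """
<company>
  <employee id="122"><salary>51000</salary></employee>
  <employee id="123"><salary>64000</salary></employee>
</company>
"""

company = ET.fromstring(COMPANY_XML)

# Relative to <company>, select the <salary> child of the <employee>
# element whose id attribute equals "123".
salaries = company.findall("./employee[@id='123']/salary")
print([s.text for s in salaries])
```

Each step narrows the search one level deeper in the hierarchy, just as each component of a file system path does.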

The notation also resembles that used for URLs (Uniform Resource Locators) on the web. In this case, the first component identifies a primary resource, often the identification of a server somewhere. Subsequent components identify resources and subresources available at that server (or other primary resource).

In the file system notation, the URL notation, and the XPath notation, the context – directory structure, resource structure, or XML document – in which the expression is evaluated is strictly hierarchical, so each step in the path “drills down” deeper into the hierarchy.

Like any other query language, XPath3 comprises a number of different facets that are used together to locate some specific piece of data. Among the most interesting of these components are the context in which an XPath expression is evaluated, the steps within an XPath expression that navigate among a document’s structure, a number of axes that direct the navigation, predicates that filter out unwanted parts of the document, and expressions that express various sorts of operations in the language. We’ll discuss each of these and more in the next few sections.

Before digging into the components of XPath and their syntax and semantics, you need to know that XPath, like most W3C specifications, does not operate on the serialized, character string, form of an XML document. Instead, XPath 1.0 operates on the Infoset (see Chapter 6, “The XML Information Set [Infoset] and Beyond”) corresponding to a serialized document. Furthermore, the results of an XPath expression are not serialized XML but Infoset fragments. (More precisely, XPath 1.0 operates on instances of the XPath 1.0 Data Model, which is derived from the Infoset. Appendix B of the XPath 1.0 specification defines how an Infoset is mapped onto the XPath 1.0 Data Model.) In this chapter, to illustrate the behavior of various XPath expressions, we represent the source data as a serialized document and represent the result as though it had been serialized; we also employ a convention of indentation that highlights the relationships of child elements to their parents.

Section 9.2.8, “Putting the Pieces Together,” may help you gain a better overall picture of how XPath 1.0 does its job.

9.2.1 Expressions

The principal concept in XPath is the expression. An expression is always evaluated in a context (see Section 9.2.2), and it evaluates to a value (the recommendation calls this “an object”) that has one of four possible types: node set (an ordered collection of nodes), a Boolean value, a number, or a string.

There are a number of different kinds of expressions. Perhaps the most important is the path expression, which we cover in Section 9.2.3.

A second kind, which we might loosely call the value expression, includes:

• String and numeric literals

• Variable references

• Function invocations

• Logical expressions (“and” and “or”)

• Comparison expressions4 (=, <, >, <=, >=, and !=)

• Arithmetic expressions (+, -, *, div [because / is used for other purposes in path expressions], and mod)

The third kind of expression in XPath is the node set expression:

• Node set expressions (|, pronounced “union” – combines two node sets into one)

String literals are any sequence of characters enclosed with quotation marks, either double quotation marks (“…”) or single quotation marks (‘… ’). Literals that are enclosed in double quotation marks cannot contain double quotation marks (which would be interpreted as ending the literal). Similarly, literals enclosed in single quotation marks cannot contain single quotation marks. In some contexts (such as XPath expressions that appear in an XML attribute), literals cannot contain certain other characters, most prominently < and &. When you need to use a less-than sign, an ampersand, or a quotation mark (double-quote or apostrophe) of the same sort that encloses your literal, you can represent them by means of a character entity reference notation, such as &lt;, &amp;, or &quot; and &apos;, respectively. You can also use a character reference notation (don’t blame us for the confusingly similar phrases – that’s the way the XML recommendation defines them), such as &#x3C;, &#x26;, or &#x22; and &#x27;, respectively. Here are some examples:

• “My favorite film is rarely shown on television.”

• ‘The films shown on television are often bowdlerized.’

• “Do you like the music in ‘The Rose’?”

• ‘What movie had the tag line “Be afraid. Be very afraid.”?’

• ‘Is the title “Bonnie and Clyde” or “Bonnie & Clyde”?’
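The escaping described above is done for the benefit of the XML parser, not for XPath: by the time an expression stored in an attribute reaches an XPath processor, the references have already been replaced by the characters they stand for. The following Python sketch illustrates this with the standard library's XML parser. The <rule> element and its test attribute are hypothetical, chosen only to give the expression somewhere to live.

```python
import xml.etree.ElementTree as ET

# Hypothetical host element; its "test" attribute holds an XPath expression
# written with entity references for <, ", and &.
HOST_XML = '<rule test="$cost &lt; 19.95 and title = &quot;Bonnie &amp; Clyde&quot;"/>'
rule = ET.fromstring(HOST_XML)

# The parser hands back the expression with the references resolved.
print(rule.get("test"))

# The same expression written with hexadecimal character references.
SAME_XML = '<rule test="$cost &#x3C; 19.95 and title = &#x22;Bonnie &#x26; Clyde&#x22;"/>'
same_rule = ET.fromstring(SAME_XML)
print(same_rule.get("test") == rule.get("test"))
```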

Numbers in XPath are always treated as double-precision floating-point values. Numeric literals are either: a sequence of digits; a sequence of digits followed by a decimal point; a decimal point followed by a sequence of digits; or a sequence of digits, followed by a decimal point, followed by another sequence of digits. (In XPath, the decimal point is always a period, also called a full stop, rather than the comma used in many countries. Furthermore, XPath does not employ commas or periods to separate groups of digits, such as the three-digit groups – thousands, millions, etc. – common in many Western societies.) Some examples:

• 42

• 451.

• 3.14159

• .33333
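The four forms of numeric literal can be captured in a single pattern. The Python sketch below is one way to check a candidate literal against those forms and convert it to a double-precision value; the function name parse_number is ours, not part of XPath. Note that a leading minus sign is not part of a literal (negation is an operator) and that the forms listed above include no exponent notation.

```python
import re

# Digits, optionally followed by a point and more digits, or a point
# followed by digits. No sign, no exponent, no digit-group separators.
NUMBER = re.compile(r"(?:[0-9]+(?:\.[0-9]*)?|\.[0-9]+)")

def parse_number(literal):
    """Return the double-precision value of an XPath 1.0 numeric literal."""
    if NUMBER.fullmatch(literal) is None:
        raise ValueError("not an XPath 1.0 numeric literal: " + repr(literal))
    return float(literal)

for text in ("42", "451.", "3.14159", ".33333"):
    print(text, "->", parse_number(text))
```

Strings such as “1,000” or “1e3” are rejected by this sketch, which matches the description above: the period is the only punctuation a numeric literal may contain.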

Variable references are, syntactically, a dollar sign ($) followed by a QName5 that names a variable provided by the external context from which XPath is invoked. An obvious example is:

• $var1

The value of a variable can be of any type supported by XPath: string, double-precision floating point, Boolean, or node set. It can also be of any type supported by the invoking environment.

A function invocation is, syntactically, a function name followed by a matching pair of parentheses. Here are the principal characteristics of a function invocation:

• The parentheses may or may not enclose an argument or a comma-separated list of arguments.

• Every argument is an expression.

• The value of each argument of a function can be of any type supported by XPath (string, double-precision floating point, Boolean, or node set).

• The value returned by a function is also permitted to be any one of those data types.

• Those values may sometimes have other types, depending on the environment in which XPath is being invoked.

• Function names are QNames, but they cannot be equivalent to the name of any of these node types: comment, text, processing-instruction, or node. (You’ll read more about XPath functions in Section 9.2.7.)

Some examples of function invocations:

• fn:upper-case($name-variable)

• myfns:longest-movie(fn:doc(“http://example.com/movies”))

• true()

Logical, comparison, and arithmetic expressions are familiar to most programmers, as in the following example, which returns true if the value of $cost is less than 19.95 or if the value of $length is greater than 30 minutes less than the length of the longest movie (otherwise, it returns false):

image

The values of logical and comparison expressions are always of type Boolean, while the value of an arithmetic expression is always double-precision floating point.
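Because every XPath 1.0 number is a double, the arithmetic operators behave like IEEE 754 floating-point operations rather than like integer arithmetic. The Python sketch below mimics div and mod under that reading: div never raises an error on division by zero, and mod takes the sign of its left operand (the remainder of a truncating division), which differs from Python's own % operator. The helper names and the sample values bound to the variables are assumptions made for illustration.

```python
import math

def xpath_div(a, b):
    """Floating-point division in the IEEE 754 style used by XPath 1.0."""
    a, b = float(a), float(b)
    if math.isnan(a) or math.isnan(b):
        return math.nan
    if b == 0.0:
        if a == 0.0:
            return math.nan
        # Nonzero divided by zero: infinity carrying the combined sign.
        sign = math.copysign(1.0, a) * math.copysign(1.0, b)
        return math.copysign(math.inf, sign)
    return a / b

def xpath_mod(a, b):
    """Remainder of a truncating division; the sign follows the left operand."""
    a, b = float(a), float(b)
    if b == 0.0 or math.isinf(a) or math.isnan(a) or math.isnan(b):
        return math.nan
    return math.fmod(a, b)

# Hypothetical variable bindings for the comparison described in the text.
cost, length, longest_movie = 24.95, 150.0, 170.0
result = cost < 19.95 or length > longest_movie - 30

print(xpath_div(1, 0), xpath_mod(-5, 2), result)
```

With these sample values the first comparison is false and the second is true, so the logical expression as a whole evaluates to a Boolean true.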

Arguably, node set expressions are the most important type of expression, largely because they are returned by path expressions; we discuss these along with paths and steps in Section 9.2.3.

9.2.2 Contexts

If you’re searching a document containing information about movies, then the most fundamental context of your searches is that document. However, once you’ve located information in a document that narrows your search a bit – such as a particular movie, or the cast of a particular movie – the additional parts of your search will typically use other nodes – perhaps the <movie> node or the <actors> node – as the context for those further search operations.

The XPath specification states that the context comprises five items:

1. A (single) node, which may be any of the seven node types (root nodes, element nodes, text nodes, attribute nodes, namespace nodes, processing instruction nodes, and comment nodes).

2. A pair of integers: the context position (that is, the position of the context node within its parent node, if any) and the context size (the number of child nodes within the parent of the context node).

3. A set of variable bindings that define a mapping from variable names to variable values. Variables are never created in an XPath expression, but are supplied by the external environment (such as XSLT).

4. A function library. The XPath recommendation defines a core library of 27 functions, but invoking environments are allowed to add more functions.

5. The set of namespace declarations that are in scope for the expression. Each namespace declaration provides a mapping between a namespace prefix and a namespace URI.

Example 9-1 helps to illustrate the concepts of context position and size. Let’s consider the <actor> element representing the actor whose name is Tommy Lee Jones. Assuming we have somehow located that node, then:

• That node is the context node.

• The context size is 6 (the number of nodes that are children of the <actors> element).

• The context position is 3 (the <actor> element representing Jones is the third of those 6 children of the <actors> element).

Example 9-1   Determining Context Position and Context Size

image
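The arithmetic behind context position and context size can be sketched in a few lines of Python. The document below is a stand-in, not the original example: it assumes an <actors> element with six <actor> children, the third of which represents Tommy Lee Jones, and the other five names are placeholders. As in the discussion above, only element children are counted; whitespace between elements is ignored.

```python
import xml.etree.ElementTree as ET

# Stand-in document: six <actor> children, with Tommy Lee Jones third.
# The other names are placeholders invented for this sketch.
ACTORS_XML = """
<actors>
  <actor>Placeholder One</actor>
  <actor>Placeholder Two</actor>
  <actor>Tommy Lee Jones</actor>
  <actor>Placeholder Four</actor>
  <actor>Placeholder Five</actor>
  <actor>Placeholder Six</actor>
</actors>
"""

actors = ET.fromstring(ACTORS_XML)
siblings = list(actors)  # the element children of <actors>

context_node = next(a for a in siblings if a.text == "Tommy Lee Jones")
context_size = len(siblings)
context_position = siblings.index(context_node) + 1  # positions start at 1

print(context_position, context_size)
```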

Consider some expression that identifies one or more <actor> elements found in Example 9-1. If that expression contains a second expression, then the first expression is called the containing expression and the second is often called a subexpression. At the time when a subexpression is evaluated, there are several items in its evaluation context:

• The variable bindings

• The function library

• The set of namespace declarations

These items are always the same as the corresponding items in the context in which the containing expression is evaluated. They are (effectively) inherited from the containing expression’s context.

By contrast, the context node, the context position, and the context size of the subexpression’s context may be the same as or different from those values in the containing expression’s context (depending on the nature of the subexpression).

Evaluation of every XPath expression occurs within a context. The “outermost” expression (that is, the expression that is not contained within any other expression) must be given a node from the external environment that caused the expression to be evaluated. Subexpressions get their context node from their containing expression.

Several kinds of expressions, particularly steps of path expressions, may cause a different node to become the context node. For example, an expression that is given the <actors> element in Example 9-1 as its context node might begin with a step expression that causes an <actor> element to become the context node. When a different node becomes the context node, the context position and context size are recomputed based on the new context. It is also possible for the context position and context size to be changed when the context node does not change. Only one kind of expression causes this to happen – predicates, which are covered in Section 9.2.6.

9.2.3 Paths and Steps

Given the name of the language being discussed in this chapter – XPath, or XML Path Language – it’s not surprising that the most important kind of expression in the language is the path expression, also known as the location path. There are two sorts of location paths: relative location paths and absolute location paths.

Relative location paths (the full term is tedious to say over and over, so we’ll call them relative paths from here on) are a sequence of steps separated by a slash, or solidus, character: “/.” Relative paths are evaluated relative to the “current” context node. An identifying characteristic of relative paths is that they do not start with a slash.

Absolute paths comprise a leading slash, optionally followed by a relative path. The leading slash means “start the evaluation of this path expression using the root node of the document being queried as the context node.”

The notion of path is the very essence of XPath. The “elevator speech” about path expressions goes something like this:

• Start with some context, possibly the root of a document, possibly some element within a document.

• Find out what its children are, either by name or by position.

• Filter out some or all of those children based on one or more criteria.

• Repeat as necessary.

To explore this concept, let’s consider the XML document illustrated in Example 9-2.

Example 9-2   Reduced movie Example

image

The absolute path expression “/” means “address/locate/identify the root of the document.” Therefore, the path expression “/movies” will find the <movies> node that is immediately beneath the root of the document. However, “/movie” or “/yearReleased” will never find anything, because the root has no element children of those names.

If the current context node happens to be the <director> node associated with the movie An American Werewolf in London, then the relative path expression “familyName” means “address the element node or nodes that are children of the current context node and that are named ‘familyName’.”

Each step expression has three parts:

• An axis, which we cover in Section 9.2.4, that determines the navigation within the abstract tree that represents the XML document.

• A node test, discussed in Section 9.2.5, that specifies the name or the type (or both) of the nodes that are to be identified by the step.

• Zero or more predicates that provide further criteria by which the step identifies the nodes of interest.

Each step in a location path can be envisioned as navigating from some current context node to one or more other nodes (that is, nodes in a node set). There are a significant number of ways in which the steps can navigate to the new node or nodes. Each axis specifies how the path expression determines the next node or nodes from the current context node.

The complete syntax of a step expression is:

• An axis name and a node test, separated by a double colon (::)

• Zero or more predicates, each of which is enclosed in square brackets ([…])

For example, the step expression child::familyName[2] uses the child axis, a node test that will identify element nodes named familyName, and a predicate that causes selection of the second node that satisfies the node test, if it exists.

When a step expression is evaluated, the axis, combined with the node test, is applied to the current context node, producing a node set. In our example, the child axis creates a node set containing all, and only, those nodes that are children of the current context node. The node test familyName causes all nodes without that name to be removed from the node set.

If the step contains predicates, then that node set is filtered by applying the first predicate to each node in the node set, eliminating nodes that do not satisfy the predicate. That filtering operation produces a new node set. The next predicate, if any, serves to filter that new node set, producing yet another new node set. This continues until all predicates have been applied. In our example, the predicate [2] is merely an abbreviated notation equivalent to “position( ) = 2” (the position( ) function returns the context position of each node, in turn, in the node set). In other words, if there are two or more predicates, they must all evaluate to true in order to retain the nodes identified by the axis/node test combination.

The result of the step expression is the set of all nodes along the specified axis for which the node test is satisfied and all of the predicates are true. If no nodes are identified after application of the axis and evaluation of the node test and predicates, then the result of the step expression is an empty node set. In our example, if only one child node of the current context node is named familyName (or if there are no such nodes), then application of the predicate [2] would cause the result of the step expression to be an empty node set.

If the resulting node set is not empty, then each node in that node set is used in turn as the current context node for the next step in the path expression, if any. The result of that next step is the union of the node sets produced by applying the step to each node in the previous step’s node set.
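The evaluation of a single step can be sketched as a small Python function. This is an illustration of the procedure just described, not how any particular XPath engine is implemented: it handles only the child axis with a name test, and it models each predicate as a function of a node, its position, and the size of the node set being filtered. The function name child_step and the sample <director> elements are assumptions.

```python
import xml.etree.ElementTree as ET

def child_step(context_nodes, name, predicates=()):
    """Apply child::name[p1][p2]... to each context node and union the results."""
    result = []
    for context in context_nodes:
        # Axis plus node test: the element children with the requested name.
        node_set = [child for child in context if child.tag == name]
        # Each predicate filters the node set left by the previous one, and
        # positions are recomputed against that filtered node set.
        for predicate in predicates:
            size = len(node_set)
            node_set = [node for position, node in enumerate(node_set, start=1)
                        if predicate(node, position, size)]
        # Union: add each surviving node once.
        for node in node_set:
            if not any(node is seen for seen in result):
                result.append(node)
    return result

# Hypothetical context nodes: one director with two familyName children,
# another with only one.
two_names = ET.fromstring(
    "<director><familyName>First</familyName>"
    "<familyName>Second</familyName><givenName>Given</givenName></director>")
one_name = ET.fromstring(
    "<director><familyName>Only</familyName><givenName>Given</givenName></director>")

def second(node, position, size):
    return position == 2  # what the predicate [2] abbreviates

found = child_step([two_names], "familyName", [second])
empty = child_step([one_name], "familyName", [second])
print([node.text for node in found], empty)
```

The second call returns an empty list, mirroring the empty node set that results when fewer than two children satisfy the node test.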

Let’s follow a specific example. Suppose we want to determine the family name of the director of An American Werewolf in London. Starting with the first bullet in the earlier algorithm, the absolute path expression “/movies” will find the <movies> node that is immediately beneath the root of the document. The context node, after evaluating that path expression, is that <movies> node. But we’re clearly not done; we need more steps in our path.

Steps in a path are separated by slashes, as we learned earlier in this section, so we can update our path expression to “/movies/,” after which we must place a relative path expression. Since the children of the movies node all seem to be named movie, our path can be updated to “/movies/movie.” But we don’t want all of the movie nodes – only the one dealing with a specific film.

A predicate is just the thing to handle this requirement. We must add a predicate that filters out all movie nodes whose title child node does not have the value representing the film we want. The predicate looks like this: [title=“…”]. Our updated path expression is now “/movies/movie[title=“An American Werewolf in London”].” At this point, the context node is the specific movie node for that film.

But we’re interested in information about the director of that film, so we navigate to the director node: “/movies/movie[title=“An American Werewolf in London”]/director.”

And, finally, we can navigate to and retrieve the director’s family name: “/movies/movie[title=“An American Werewolf in London”]/director/familyName.”
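The same path can be built up step by step in Python. The document below is an approximation assembled from the details given in this chapter (three movies, each with a myStars attribute, a title, a year, and a director); the second movie's title, two of the release years, and all of the myStars values are assumptions. ElementTree evaluates paths relative to an element, so each path here starts at the <movies> element with “./” instead of beginning with “/movies”.

```python
import xml.etree.ElementTree as ET

# Approximate document. Assumed values: the title "The Thing", the years
# 1982 and 1980, and every myStars value.
MOVIES_XML = """
<movies>
  <movie myStars="5">
    <title>An American Werewolf in London</title>
    <yearReleased>1981</yearReleased>
    <director><familyName>Landis</familyName><givenName>John</givenName></director>
  </movie>
  <movie myStars="4">
    <title>The Thing</title>
    <yearReleased>1982</yearReleased>
    <director><familyName>Carpenter</familyName><givenName>John</givenName></director>
  </movie>
  <movie myStars="5">
    <title>The Shining</title>
    <yearReleased>1980</yearReleased>
    <director><familyName>Kubrick</familyName><givenName>Stanley</givenName></director>
  </movie>
</movies>
"""

movies = ET.fromstring(MOVIES_XML)
wanted = "./movie[title='An American Werewolf in London']"

all_movies = movies.findall("./movie")            # every movie child
one_movie = movies.findall(wanted)                # filtered by the predicate
directors = movies.findall(wanted + "/director")  # one more step
family_names = [f.text for f in movies.findall(wanted + "/director/familyName")]

print(len(all_movies), len(one_movie), len(directors), family_names)
```

Each added step or predicate narrows the node set, from three movies to one movie, to that movie's director, and finally to the director's family name.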

9.2.4 Axes and Shorthand Notations

XPath defines a modestly large, perhaps intimidating, number of axes along which step expressions determine how to identify a node set from the current context node; many of the axes depend on the document order6 of the XML tree. Let’s list them, along with a very brief statement of what they do, before examining some of them in more detail:

• child – identifies every child node of the context node; attribute nodes and namespace nodes are not children of any node.

• descendant – identifies every descendant node of the context node (this includes child nodes, the nodes that are child nodes of those child nodes, and so forth until all offspring are identified); naturally, attribute nodes and namespace nodes are not included.

• parent – identifies the parent node, if any, of the context node; both attribute nodes and namespace nodes have a parent (even though they are not children of their parent!).

• ancestor – identifies the parent node, as well as that node’s parent, and so forth until the root of the tree has been identified.

• following-sibling – identifies all nodes that are siblings of the context node (that is, they have the same parent node) that appear, in document order, after the context node; if the context node is an attribute node or a namespace node, then the following-sibling axis produces an empty node set.

• preceding-sibling – identifies all nodes that are siblings of the context node that appear, in document order, before the context node; if the context node is an attribute node or a namespace node, then the preceding-sibling axis produces an empty node set.

• following – identifies every node in the document that appears, in document order, after the context node, excluding all descendant nodes, attribute nodes, and namespace nodes of the context node.

• preceding – identifies every node in the document that appears, in document order, before the context node, excluding all ancestor nodes, attribute nodes, and namespace nodes of the context node.

• attribute – identifies every attribute node belonging to the context node; the attribute axis produces an empty node set unless the context node is an element node.

• namespace – identifies every namespace node belonging to the context node; the namespace axis produces an empty node set unless the context node is an element node.

• self – identifies only the context node.

• descendant-or-self – identifies the context node and all of its descendants.

• ancestor-or-self – identifies the context node and all of its ancestors.

Using the XML document in Example 9-2 and the corresponding XML tree illustrated in Figure 9-1, let’s explore these axes. Some of the terminology we employ in this exploration might be unfamiliar. We urge you to keep a bookmark in Chapter 6, “The XML Information Set (Infoset) and Beyond,” particularly at Table 6-2 – Tree-Related Terminology.

image

Figure 9-1 XML Tree Representing <movies> Example.

• child – The <movies> element has three children, each of them a <movie> element. Each of the <givenName> elements has one child, which is a text node. (Note that all of these axes identify nodes by reference; the nodes they identify include not only the nodes themselves, but also their entire subtree of descendants. That is why we can apply additional steps.)

• descendant – The first <movie> element has nine descendants: (1) the <title> element, (2) its child text node (“An American Werewolf in London”), (3) the <yearReleased> element node, (4) its child text node (“1981”), (5) the <director> element node, (6) its <familyName> child element node, (7) its child text node (“Landis”), (8) the <director> node’s <givenName> child element node, and (9) its child text node (“John”).

• parent – Each <director> element has a parent that is a <movie> element node.

• ancestor – Each <director> element has three ancestors: a <movie> element node, the <movies> element node, and the root node.

• following-sibling – The <director> element nodes do not have any following siblings. But the <yearReleased> elements each have a following sibling that is a <director> element node.

• preceding-sibling – The <title> element nodes do not have any preceding siblings. But the <yearReleased> elements each have a preceding sibling that is a <title> element node. Among the <movie> element nodes, two have a preceding sibling and two have a following sibling (and one has one of each).

• following – The <yearReleased> element node whose text node child contains “1981” has 25 following nodes: the <director> element node that is the <yearReleased> element node’s following sibling, the <familyName> and <givenName> element nodes that are children of that <director> node, the text node children of those two element nodes, the <movie> element nodes having descendant <familyName> nodes whose child text nodes contain “Carpenter” and “Kubrick,” and all of their descendants.

• preceding – The <yearReleased> element node whose text node child contains “1981” has two preceding nodes: its preceding sibling element node <title> and that element node’s child text node. Note that neither that <yearReleased> element node’s parent <movie> element node nor its ancestor <movies> element node is a preceding node.

• attribute – Each of the <movie> nodes in this example has one attribute, so the attribute axis of each of those nodes contains one attribute node, named myStars.

• namespace – None of the nodes in this example have namespaces, so the namespace axis of each element node is empty.

• self – The self axis for the <yearReleased> element node whose text node child contains “1981” contains exactly one node: the <yearReleased> node itself.

• descendant-or-self – The descendant-or-self axis for the <yearReleased> element node whose text node child contains “1981” contains two nodes: the <yearReleased> element node whose text node child contains “1981” and that same text node child.

• ancestor-or-self – The ancestor-or-self axis for the <yearReleased> element node whose text node child contains “1981” contains four nodes: the <yearReleased> element node whose text node child contains “1981,” its parent <movie> element node, its grandparent <movies> element node, and the root node.
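The node counts quoted in this list can be checked with a short Python sketch. It builds a minimal tree of root, element, and text nodes from an approximation of the movies document (the second title, two of the years, and the myStars values are assumptions) and implements several axes by hand. Attribute and namespace nodes are not modeled, and whitespace-only text between elements is ignored, as it is in the counts above. The class and function names are ours.

```python
import xml.etree.ElementTree as ET

MOVIES_XML = """
<movies>
  <movie myStars="5"><title>An American Werewolf in London</title>
    <yearReleased>1981</yearReleased>
    <director><familyName>Landis</familyName><givenName>John</givenName></director></movie>
  <movie myStars="4"><title>The Thing</title>
    <yearReleased>1982</yearReleased>
    <director><familyName>Carpenter</familyName><givenName>John</givenName></director></movie>
  <movie myStars="5"><title>The Shining</title>
    <yearReleased>1980</yearReleased>
    <director><familyName>Kubrick</familyName><givenName>Stanley</givenName></director></movie>
</movies>
"""

class Node:
    """Minimal tree node; kind is 'root', 'element', or 'text'."""
    def __init__(self, kind, label, parent=None):
        self.kind, self.label, self.parent, self.children = kind, label, parent, []
        if parent is not None:
            parent.children.append(self)

def build(elem, parent):
    """Copy an ElementTree element into Node form, keeping non-blank text."""
    node = Node("element", elem.tag, parent)
    if elem.text and elem.text.strip():
        Node("text", elem.text.strip(), node)
    for child in elem:
        build(child, node)
    return node

def descendants(node):
    out = []
    for child in node.children:
        out.append(child)
        out.extend(descendants(child))
    return out

def ancestors(node):
    out = []
    while node.parent is not None:
        node = node.parent
        out.append(node)
    return out

root = Node("root", "/")
build(ET.fromstring(MOVIES_XML), root)
document_order = [root] + descendants(root)

def following(node):
    skip = {id(n) for n in descendants(node)}
    start = document_order.index(node) + 1
    return [n for n in document_order[start:] if id(n) not in skip]

def preceding(node):
    skip = {id(n) for n in ancestors(node)}
    return [n for n in document_order[:document_order.index(node)] if id(n) not in skip]

def following_siblings(node):
    siblings = node.parent.children if node.parent is not None else []
    return siblings[siblings.index(node) + 1:] if siblings else []

first_movie = root.children[0].children[0]
year_1981, director = first_movie.children[1], first_movie.children[2]

print(len(descendants(first_movie)), len(ancestors(director)),
      len(following(year_1981)), len(preceding(year_1981)),
      [n.label for n in following_siblings(year_1981)])
```

Under these assumptions the sketch reports nine descendants for the first movie, three ancestors for its director, and 25 following and two preceding nodes for the 1981 <yearReleased> element, in agreement with the list above.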

Now let’s put the knowledge we’ve gained so far into practice. Let’s ask the question “In what years were all of these movies released?” The answer, framed as an XPath expression in the notation we’ve seen so far, is shown in Example 9-3.

Example 9-3   Path Expression to Find yearReleased Nodes

/child::movies/child::movie/child::yearReleased

Remember that the leading slash (/) means “start with the root of the document” and that the syntax of a step expression is an axis name (child, in this example) followed by a double colon (::) followed by a node test. (In this example, the node test is the name of the element, but you’ll learn in Section 9.2.5 about other kinds of node tests.) Also, recall that step expressions are separated by slashes.

Our example path expression has a leading slash followed by three step expressions. Therefore, its interpretation is:

• Starting with the root of the document, first create a node set containing every child element node whose name is movies (there is only one such node).

• Next, using each node in the first node set in turn as a new context node, create a node set containing every child element node whose name is movie (there are three such nodes).

• Finally, using each node in the second node set as a new context node, create a third node set containing every child element node whose name is yearReleased (there are three such nodes, one per movie node).

The answer to our query is that third node set, which we might envision as suggested in Result 9-1.

Result 9-1   Result of Path Expression to Find yearReleased Nodes

image
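The three node sets in that interpretation can be traced explicitly in Python. The document is an approximation of the movies example (the second title, two of the years, and the myStars values are assumptions), so the years printed are illustrative. ElementTree has no separate object for the root node, so the first step is modeled by checking the name of the document element.

```python
import xml.etree.ElementTree as ET

# Approximate document; "The Thing", 1982, 1980, and the myStars values are assumed.
MOVIES_XML = """
<movies>
  <movie myStars="5"><title>An American Werewolf in London</title>
    <yearReleased>1981</yearReleased></movie>
  <movie myStars="4"><title>The Thing</title>
    <yearReleased>1982</yearReleased></movie>
  <movie myStars="5"><title>The Shining</title>
    <yearReleased>1980</yearReleased></movie>
</movies>
"""

document_element = ET.fromstring(MOVIES_XML)

# Step 1: children of the root node named movies (at most the document element).
first = [document_element] if document_element.tag == "movies" else []
# Step 2: for each node in the first node set, its children named movie.
second = [m for node in first for m in node if m.tag == "movie"]
# Step 3: for each node in the second node set, its children named yearReleased.
third = [y for node in second for y in node if y.tag == "yearReleased"]

# The last two steps written as a single relative path.
shorthand = document_element.findall("./movie/yearReleased")

print(len(first), len(second), [y.text for y in third])
```

The list comprehension form and the single path select the same three <yearReleased> elements; the path is simply a more compact way of writing the nested iteration.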

What about asking about all of the ancestors of the familyName element node whose child text node contains “Carpenter”? From looking at Figure 9-1, we see that the first ancestor encountered is the parent element node director. The next ancestor is that director node’s parent element node movie. The next is that movie node’s parent element node movies. And the final ancestor is the movies node’s parent node, the document root.

Assuming that the context node is that familyName element node, this query is expressed as the relative path expression in Example 9-4. As you’ll learn in Section 9.2.5, the function-like notation “node( )” is a node test that means “any node is acceptable, regardless of its name or type.”

Example 9-4   Path Expression to Find Ancestor Nodes

ancestor::node()

Now, the result of that simple query is not necessarily what you might expect. You might expect to envision the results as shown in Result 9-2.

Result 9-2   Possible Result of Path Expression to Find Ancestor Nodes

image

Reality is slightly more complex, though. The result shown in Result 9-2 is correct, but its implications are not obvious. The result, as seen in detail in Result 9-3, is actually:

• The root node and all of its children (and all of their descendants), followed by

• The <movies> node and all of its children (and all of their descendants), followed by

• The appropriate <movie> node and all of its children (and all of their descendants), finally followed by

• The appropriate <director> node and all of its children (and all of their descendants)

Observe the way that we’ve indented the results to illustrate the four different kinds of ancestor – the root node, the <movies> node, the <movie> node, and the <director> node. (The indentation is not part of the result; it is merely our presentation style to help demonstrate the various results and their relationships to one another. In addition, our comments in italics within parentheses are not part of the result; they are our way of showing you where the value of the root node begins and ends. Similarly, the XML comments are not part of the result.)

Result 9-3   Actual Result of Path Expression to Find Ancestor Nodes

image

image

image

It’s important to understand the complexity of this answer because you may frequently want to “drill down” from some ancestor node into one of its descendants. Again, assuming that the context node is that same familyName element node (the one whose child text node contains “Carpenter”), we can discover the names of the movies represented by the following siblings of “this movie,” as illustrated in Example 9-5.

Example 9-5   Taking Advantage of the Actual Results

image

The result of the query in Example 9-5 is seen in Result 9-4.

Result 9-4   Result of “Drill Down” Path Expression

The Shining

If you examine the path expression in Example 9-5, you’ll see that it first finds the context node’s ancestor named “movie,” then finds all of the following siblings of that node (there happens to be only one), then finds the child elements of that new movie node that are named “title,” and finally selects the text node child of that title node – that’s what the node test “text( )” does, as you’ll read in Section 9.2.5.

If the result of that ancestor axis did not include all of the found nodes’ descendants, then this query would have been impossible to evaluate. Frankly, it would be much more surprising if XPath did not behave this way, because “returning” the ancestor <movie> node requires that the entire node, meaning it and all of its descendants (which are simply part of that node), be returned.
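To make the “drill down” idea concrete, here is a hedged Python sketch. The document is our assumed fragment of the sample data, the parent_of map is our own helper (ElementTree elements do not record their parents), and the final steps follow the prose description of Example 9-5 rather than its exact text. Because ElementTree has no root node, the ancestor list stops at the movies element.

```python
import xml.etree.ElementTree as ET

# Assumed fragment of the sample document (two of the three movies).
MOVIES_XML = """
<movies>
  <movie><title>The Thing</title>
    <director><familyName>Carpenter</familyName><givenName>John</givenName></director>
  </movie>
  <movie><title>The Shining</title>
    <director><familyName>Kubrick</familyName><givenName>Stanley</givenName></director>
  </movie>
</movies>
"""

doc = ET.fromstring(MOVIES_XML)

# Hypothetical helper: map every element to its parent element.
parent_of = {child: parent for parent in doc.iter() for child in parent}

# The context node: the familyName element whose text is "Carpenter".
context = next(e for e in doc.iter("familyName") if e.text == "Carpenter")

# Walk the ancestor axis, nearest ancestor first.
ancestors = []
node = context
while node in parent_of:
    node = parent_of[node]
    ancestors.append(node)
ancestor_names = [a.tag for a in ancestors]

# Drill down: ancestor named movie, its following siblings, their title text.
movie = next(a for a in ancestors if a.tag == "movie")
siblings = list(parent_of[movie])
following = siblings[siblings.index(movie) + 1:]
titles = [t.text for m in following for t in m.findall("title")]
print(ancestor_names, titles)
```

The drill-down works only because the ancestor found is the movie element itself, with its place in the tree intact, rather than a detached copy of its name.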

Axes can be forward axes or reverse axes. Axes that contain only the context node and/or nodes that follow the context node in document order are forward axes; axes that contain only the context node and/or nodes that precede the context node in document order are reverse axes. Thus, the child, descendant, descendant-or-self, following, following-sibling, attribute, and namespace axes are all forward axes, while the parent, ancestor, ancestor-or-self, preceding, and preceding-sibling axes are all reverse axes. The self axis could be considered either a forward axis or a reverse axis – the concept is irrelevant, since that axis can never contain more than one node.

When traversing a forward axis such as the child axis, the first node encountered in document order along that axis is in position 1, the second is in position 2, and so forth. When traversing a reverse axis such as the preceding-sibling axis, the first node encountered in reverse document order along that axis (which would be the last node encountered in document order were the nodes being traversed along a forward axis) is in position 1, the next is in position 2, and so forth. In spite of the convention of counting nodes along a reverse axis in reverse document order, the nodes returned by a step along a reverse axis are still returned in (forward) document order.
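The numbering convention can be sketched with a plain Python list standing in for the children of one parent, in document order. The function names and the sibling names a through d are hypothetical; the point is only that positions along a reverse axis count backward from the context node, while the selected nodes are still delivered in document order.

```python
def axis_positions(siblings, context_index, axis):
    """Number the nodes on a sibling axis, starting from position 1."""
    if axis == "following-sibling":
        ordered = siblings[context_index + 1:]       # forward: document order
    elif axis == "preceding-sibling":
        ordered = siblings[:context_index][::-1]     # reverse: nearest first
    else:
        raise ValueError("this sketch handles only the two sibling axes")
    return {position: node for position, node in enumerate(ordered, start=1)}


def step(siblings, context_index, axis, position=None):
    """Select along the axis, then return the selection in document order."""
    numbered = axis_positions(siblings, context_index, axis)
    if position is None:
        selected = list(numbered.values())
    else:
        selected = [numbered[position]] if position in numbered else []
    return sorted(selected, key=siblings.index)


siblings = ["a", "b", "c", "d"]   # hypothetical children, in document order
context = 3                        # the context node is "d"

print(axis_positions(siblings, context, "preceding-sibling"))
print(step(siblings, context, "preceding-sibling", position=1))
print(step(siblings, context, "preceding-sibling"))
```

With "d" as the context node, position 1 on the preceding-sibling axis is "c", yet the unfiltered step still yields "a", "b", "c" in that order.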

The syntax for using axes is often lengthy and cumbersome, so XPath provides some shorthand notations7 to make the job of writing path expressions a little less tedious. Not all axes have shorthand notations, but the most common ones do. The effect of one of these shorthand notations is identical to the corresponding full notation. The shorthand notations are:

• nodename – A step expression may contain a node name without an axis name (and without the double colons that separate axis names from node names). This is a shorthand for child::nodename, so /movies means “start at the root node and locate every element node child named movies of the root node.” (As we saw earlier, when a slash appears as the first character in a path expression, it has the meaning “start at the root node.” When it appears elsewhere in a path expression, it serves to separate two step expressions from one another.)

• //nodename – A step expression whose node name is preceded by a second slash (without an axis name or the double colons) is equivalent to specification of the descendant-or-self axis; strictly speaking, // is shorthand for /descendant-or-self::node( )/. Thus /movies//familyName means “start at the root node, find all element node children named movies of the root node, and then find every element node descendant named familyName.” Similarly, //givenName means “start at the root node and find every element descendant named givenName.” (Recall that the first slash between two step expressions is just the separator, so it is the second slash that really means “descendant-or-self.”)

• @nodename – A step expression that contains an “at” sign – which is called by different names in various countries – followed by a node name (without an axis name or the double colons) is equivalent to specification of the attribute axis. Therefore, movie/@myStars means “start at the context node, find every element node child named movie, and then find every attribute node named myStars that belongs to one of those elements.” (Arguably, the “@” notation was chosen because Americans call it the “at” sign and that syllable is the first syllable of the word “attribute.”)

• . – For the sake of readability, it is sometimes convenient to make explicit the fact that you want the path expression to start operating at the context node. If you wish to do this, the period, or full stop, (.) serves the purpose. This notation is equivalent to specification of self::node( ). Therefore, the path expression ./director means “starting with the context node, find all element child nodes named director.” Readers familiar with some computer file systems will recognize the inspiration for this notation, which indicates “this directory” in those file systems.

• .. – A step expression that contains two consecutive periods, or full stops, is equivalent to the use of the parent axis. The path expression ../movie/yearReleased means exactly the same thing as the path expression parent::node()/child::movie/child::yearReleased, and it returns the yearReleased children of the context node’s siblings named movie (and of the context node itself, if it is a movie element). (That raises this question: Are those the nieces and nephews of the context node?) This notation was also inspired by analogous usage in some computer file systems.
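Several of these shorthand notations can be tried out with Python's xml.etree.ElementTree module, which accepts a limited subset of the abbreviated syntax. The sketch below rests on some assumptions: the document is our cut-down stand-in for the sample data, ElementTree requires a leading period before a double slash, and it cannot return attribute nodes, so the get( ) method stands in for the @ notation.

```python
import xml.etree.ElementTree as ET

# Assumed, cut-down stand-in for the sample document.
doc = ET.fromstring(
    '<movies>'
    '<movie myStars="5"><title>An American Werewolf in London</title>'
    '<director><familyName>Landis</familyName></director></movie>'
    '<movie myStars="4"><title>The Thing</title>'
    '<director><familyName>Carpenter</familyName></director></movie>'
    '</movies>'
)

# nodename: shorthand for child::nodename (context node is <movies>).
movies = doc.findall("movie")

# //nodename: every familyName descendant (ElementTree spells it .//).
family_names = doc.findall(".//familyName")

# . : start explicitly at the context node (here, the first movie).
directors = movies[0].findall("./director")

# .. : the parent of each director element.
parents = doc.findall(".//director/..")

# @nodename: ElementTree exposes attributes through get(), not as nodes.
stars = [movie.get("myStars") for movie in movies]

print(len(movies), [f.text for f in family_names], stars)
```

A full XPath 1.0 engine would accept the same abbreviations exactly as they are written in this section.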

9.2.5 Node Tests

Every axis has a principal node type. The principal node type for axes that can contain elements is element. The principal node type for axes that cannot contain elements is the type of the nodes that the axis can contain – the only two axes with this property are the attribute axis, which can contain only attribute nodes, and the namespace axis, which can contain only namespace nodes. A node test is a way of testing the result of traversing an axis to determine whether the nodes in which you’re interested have been returned.

There are two sorts of node tests:

• Name tests

• Node type tests

A name test provides a way for you to instruct a step expression that you’re only interested in nodes with a particular name. A name test is, syntactically, a QName. It is true if and only if the type of the node is the principal node type of the axis specified in the step expression and the expanded name of the node is equivalent to the expanded name of the supplied QName. (An expanded name is a tuple comprising the URI associated with the QName’s prefix part, if any, and the QName’s local part.) For example, the step expression child::director selects the director element children of the context node. If the context node is one of the <movie> nodes, the step expression attribute::myStars identifies that node’s attribute named myStars.

Name tests come in a couple of other flavors as well. The name test “*” is true for any node of the principal node type, no matter what its QName happens to be. For example, if the context node is a <movie> node, the step expression child::* selects all element children, including the <title> node, the <yearReleased> node, and the <director> node.

Since name tests usually involve QNames, let’s explore the implications associated with that kind of name. Recall that a QName is, syntactically, a namespace prefix followed by a colon followed by a “local” name. The namespace prefix and local name are both instances of NCName (no-colon name – a name without a colon). For example, example:movies might be the namespace-qualified name of a document of movies defined by somebody other than ourselves.

But that namespace prefix has to be associated with a “namespace name,” which is always some sort of URI. If example is a namespace prefix, it might be associated with the URI http://entertainment.example.com/multimedia/. Note that (as Gertrude Stein famously said about Oakland, California) there is no “there” there. That is, the URI is not required to resolve to an actual page on the web; it’s nothing more than an identifier. (Many people consider it good web etiquette to place an actual web page at the address indicated by a namespace URI, if only to inform a human reader of the intent of that address. Such pages are often referred to as namespace documents.)

The name test example:* selects all nodes of the principal node type whose namespace name (the URI) is the namespace URI associated with the namespace prefix example. Note that those nodes might have prefixes other than example; that matters not at all, because it’s only the associated namespace URI that is used for the name test. Similarly, the name test *:familyName selects all nodes of the principal node type whose local name is familyName, regardless of their namespaces. (Be aware that the *:familyName form is defined by the XPath 2.0 grammar; XPath 1.0 name tests are limited to *, prefix:*, and QNames.)
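A small Python sketch may help to show that only the namespace URI matters in a name test. The document here is invented for illustration: it binds two different prefixes to the same URI and also contains one movie element in no namespace. ElementTree reports expanded names in the form {URI}local, and the prefix p used in the query is our own arbitrary choice.

```python
import xml.etree.ElementTree as ET

URI = "http://entertainment.example.com/multimedia/"

# Invented document: two prefixes bound to one URI, plus an unqualified movie.
doc = ET.fromstring(
    '<example:movies xmlns:example="%s" xmlns:other="%s">'
    '<example:movie/><other:movie/><movie/>'
    '</example:movies>' % (URI, URI)
)

# ElementTree stores the expanded name: the URI in braces, then the local part.
expanded_names = [child.tag for child in doc]

# The prefix in the query ("p") need not match any prefix in the document;
# it is resolved to the URI before names are compared.
matches = doc.findall("p:movie", {"p": URI})

print(expanded_names)
print(len(matches))
```

The two qualified movie elements match even though neither uses the prefix p, while the unqualified movie element does not.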

Node type tests allow you to instruct step expressions to select only nodes of a specified type. For example, the node type test comment( ) is true for all comment nodes, the node type test text( ) is true for all text nodes, and the node type test processing-instruction( ) is true for all processing instruction nodes, while the node type test node( ) is true for nodes of any type. Recall that the step expression “/*”, because it is merely a shorthand for child::*, identifies only element children. By contrast, /node( ), as well as child::node( ), identifies all node children, including text, comment, and processing instruction nodes. Neither form ever identifies attribute or namespace nodes, because those nodes are not on the child axis; they can be reached only through the attribute and namespace axes.

A processing instruction node type test can include a string literal within the parentheses; if it does, then it matches only those processing instructions whose name (also known as its target) is equal to the literal. Thus the node type test processing-instruction("xml-stylesheet") matches all processing instruction nodes whose name, or target, is xml-stylesheet.

The way that you write a path expression can sometimes give slightly surprising results, especially when parentheses come into play. Consider the following two path expressions:

• //director[3]

• (//director)[3]

The first of those expressions can be read like this: Select all director element nodes anywhere in the document that are the third director child of their parent, including all of their descendants. The result in this case is an empty node set, because our sample data contains no movie that has three directors.

By contrast, the second expression is read: Select all director element nodes anywhere in the document, and then identify the third (in document order) of those director nodes, including all of its descendants. With the data in Example 9-2, that result is:

image
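The difference between the two expressions can be observed with ElementTree, whose positional predicate also counts position among the children of each parent. In this sketch the document is our assumed stand-in for the sample data (one director per movie), and ordinary Python list indexing plays the role of the parenthesized form.

```python
import xml.etree.ElementTree as ET

# Assumed stand-in for the sample document: one director per movie.
doc = ET.fromstring(
    '<movies>'
    '<movie><director><familyName>Landis</familyName></director></movie>'
    '<movie><director><familyName>Carpenter</familyName></director></movie>'
    '<movie><director><familyName>Kubrick</familyName></director></movie>'
    '</movies>'
)

# //director[3]: directors that are the third director child of their parent.
per_parent = doc.findall(".//director[3]")

# (//director)[3]: collect every director first, then take the third overall.
# Python lists are zero-based, so index 2 is position 3.
all_directors = doc.findall(".//director")
third_overall = all_directors[2]

print(len(per_parent), third_overall.find("familyName").text)
```

The first query finds nothing because no movie has three directors; the second finds the director of the third movie.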

9.2.6 Predicates

In XPath, as in all computer languages, a predicate is an expression that evaluates to true or false. (In some languages, there may be a third possible result to indicate that the result cannot be determined from the information provided; in SQL, for example, some predicates evaluate to unknown if the expression being evaluated includes null values.) It is appropriate to think of predicates as filters, because they exclude objects (nodes, for instance) for which the predicate evaluates to any value other than true. Predicates are applied to node sets that are returned by evaluating the node test with respect to the specified axis and the context node. They may reduce the number of nodes in the node set by eliminating nodes for which the predicate does not return true, but they can never add to the nodes in a node set. When a predicate is applied to each node in a node set in turn, that node is treated as the context node for the purpose of evaluating the predicate, while the context size is the number of nodes in the node set and the context position is the position of the node within that node set with respect to the specified axis.

Syntactically, a predicate is represented as an ordinary XPath expression surrounded by square brackets, as you saw in Section 9.2.3. Of course, ordinary XPath expressions may have types other than Boolean – string, number, and node set, to be precise. XPath includes rules for determining a Boolean value from the result of any XPath expression.

• If the type of the expression’s result is number, then the predicate is true if and only if the value of that number is equal to the context position. Therefore, the predicate [3] is equivalent to the predicate [position( ) = 3], where position( ) is an XPath function that returns the context position. Please note that the first position is always position 1 (not position 0, as in some languages).

• If the type of the expression’s result is string, then the predicate is true if and only if the length of the string is greater than zero (that is, there is at least one character in the string). For example, considering the XML document from Example 9-2, if the predicate [string(title)] were applied to the expression /movies/child::movie, it would be true for all of the movies, since they all have title element children whose string value is not the zero-length string.

• If the type of the expression’s result is node set, then the predicate is true if and only if the node set contains at least one node. The implication of this rule is that you can easily test whether the current context node has at least one of a given type of node as a child, as an attribute node, or as a namespace node. Again considering the XML document from Example 9-2: If the predicate [descendant::familyName] were applied to the expression /movies/child::movie, it would be true because movie elements do have a descendant element called familyName; however, the predicate [descendant::dogName] applied to the same expression would return false because those movie elements do not have a descendant element called dogName.
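The three conversion rules can be summarized in a short Python sketch. Everything here is our own modeling, not part of XPath itself: a node set is represented by a list, a predicate by a function of the node, its context position, and the context size, and the titles are assumed to appear in this order in the sample document.

```python
def predicate_is_true(value, context_position):
    """Apply the rules for turning a predicate's result into true or false."""
    if isinstance(value, bool):          # test bool first: it is also an int
        return value
    if isinstance(value, (int, float)):  # number: compare with the position
        return value == context_position
    if isinstance(value, str):           # string: true unless zero-length
        return len(value) > 0
    return len(value) > 0                # node set (a list): true unless empty


def filter_nodes(nodes, predicate):
    """Keep the nodes for which the predicate holds; never add any."""
    size = len(nodes)
    kept = []
    for position, node in enumerate(nodes, start=1):
        if predicate_is_true(predicate(node, position, size), position):
            kept.append(node)
    return kept


titles = ["An American Werewolf in London", "The Thing", "The Shining"]

# [3] behaves like [position( ) = 3].
third = filter_nodes(titles, lambda node, position, size: 3)
# [position( ) = last( )] keeps only the final node.
final = filter_nodes(titles, lambda node, position, size: position == size)

print(third, final)
```

Note that positions start at 1, so the numeric predicate 3 keeps the third title, not the fourth.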

9.2.7 XPath Functions

XPath supplies us with a number of built-in functions. Implementations, as well as the host environment from which XPath is invoked, are free to supply additional functions.

First, let’s expand a bit on the description of function invocations that you read in Section 9.2.1. Functions are invoked in XPath as part of a step expression, and the notation is entirely familiar to programmers: function-name (argument, argument, …). The function-name, of course, serves to identify the function to be invoked. Each argument is evaluated and, if necessary, converted to the data type required by the corresponding parameter of the function. (If the number of arguments is not the same as the number of function parameters or if any of the arguments cannot be converted to the proper data type, that’s an error.)

Since function invocations are just another sort of XPath expression, they must return a value of a particular type. The result of a function expression is the value returned by the function itself.

The XPath 1.0 specification categorizes functions according to the sorts of objects on which they operate, so we’ll do the same here.

Some XPath functions are focused on node sets:

• last( ) – returns a number equal to the current context size.

• position( ) – returns a number equal to the current context position.

• count(nodeset) – returns a number equal to the number of nodes in the node set specified by the argument.

• id(object) – If the argument identifies a node set, then this function first takes the string value of each node in that node set and (recursively) applies the id( ) function to the resulting string value; the result is the union of the sets of nodes that are returned from those applications of the id( ) function. If the argument has any other type, it is first converted to a string that is split along white-space boundaries (if any) into a list of tokens; the result is a node set containing every node in the same document as the context node that has an attribute of type ID whose value is equal to any of those tokens.

• namespace-uri(nodeset?) – returns a string equal to the URI component of the expanded name of the first node (in document order) in the node set identified by the argument. If the optional nodeset argument is empty or if the first node in the node set identified by that argument does not have an expanded name or if the namespace URI of that first node is null, then the function returns a zero-length string. If the argument is not provided, then the context node is the node set used by the function.

• local-name(nodeset?) – returns a string equal to the local name component of the expanded name of the first node (in document order) in the node set identified by the argument. If the optional nodeset argument is empty or if the first node in the node set identified by that argument does not have an expanded name, then the function returns a zero-length string. If the argument is not provided, then the context node is the node set used by the function.

• name(nodeset?) – returns a string containing a QName that represents the expanded name of the first node (in document order) in the node set identified by the argument, with respect to the namespace declarations in effect for that node. In most cases, the QName will contain the namespace prefix that was used in the original XML document; however, if the namespace represented by that prefix was declared for multiple prefixes, then the function might use any of those prefixes in the QName. If the optional nodeset argument is empty or if the first node in the node set identified by that argument does not have an expanded name, then the function returns a zero-length string. If the argument is not provided, then the context node is the node set used by the function.

Another group of functions concerns itself with string values:

• string(object?) – returns the object (that is, some value, node, or node set) converted to a string. If the object is a node set (a single node is a node set with only one member), then the function returns the string value of the first node (in document order) of the node set, or a zero-length string if the node set is empty. If the object is a Boolean, then the value false is converted to the string “false” and the value true is converted to the string “true.” If the object is a string, then its value is returned. If the object is a number, then it is converted to that number’s string representation, corresponding approximately to the notation defined in IEEE 854.8 For the details, we suggest that you consult the XPath 1.0 specification. If the argument is not provided, then the function operates on the node set containing only the context node.

    For example, using the XML document in Example 9-2, string((//director)[2]) would return CarpenterJohn. (The parentheses matter: as explained in Section 9.2.5, //director[2] would select only a director that is the second director child of its parent, and our sample data contains no such node.)

• concat(string, string, …) – returns the string that results from concatenating all of the arguments together (in the order supplied).

    For example, concat('Director : ', (//director)[2]/familyName) would return Director : Carpenter.

• starts-with(string, string) – returns true if the value of the first argument contains as its leading characters the value of the second argument; otherwise, it returns false.

    Invoking starts-with(string((//director)[2]), "John") returns false, but invoking starts-with(string((//director)[2]), "Car") returns true.

• contains(string, string) – returns true if the value of the first argument contains anywhere within it the value of the second argument; otherwise, it returns false.

    The expression contains(string((//director)[2]), "rJ") returns true.

• substring-before(string, string) – returns the portion of the value of the first argument that occurs before the first occurrence of the value of the second argument; if the value of the second argument doesn’t appear as part of the value of the first argument, the function returns the zero-length string.

    If you invoke substring-before(string((//director)[2]), "John"), you’ll get the string Carpenter.

• substring-after(string, string) – returns the portion of the value of the first argument that occurs after the first occurrence of the value of the second argument; if the value of the second argument doesn’t appear as part of the value of the first argument, the function returns the zero-length string.

    Evaluation of substring-after(string((//director)[2]), "John") returns the zero-length string, because no characters follow “John” in that value.

• substring(string, number, number?) – returns the portion of the value of the first argument starting with the position indicated by the value of the second argument (the first character is at position 1) – if the third argument is not supplied, then the returned value includes all characters in the value of the first argument from that starting position onward; if the third argument is provided, then its value determines the maximum number of characters returned. If the value of the second argument is not an integer, then it is rounded to the nearest integer, as if by the round( ) function. If the third argument is specified, then the position of the last character returned is less than the rounded value of the second argument plus the rounded value of the third argument.

    substring(string((//director)[2]), 4, 7) yields the string penterJ.

• string-length(string?) – returns the length, in characters, of the value of the argument; if no argument is supplied, the length of the string value of the context node is returned.

    string-length(string((//director)[2])) is 13, the number of characters in CarpenterJohn.

• normalize-space(string?) – returns the value of the argument with white space normalized (meaning that all leading and trailing white space is removed, and each sequence of white space within the value is replaced by a single space character); if no argument is supplied, then the function operates on the string value of the context node.

    normalize-space("  My favorite   film is   not on DVD!  ") yields the string "My favorite film is not on DVD!".

• translate(string, string, string) – returns the value of the first argument after replacing each occurrence of a character that appears in the value of the second argument with the character at the corresponding position in the value of the third argument; if the value of the third argument is shorter than the value of the second argument, then characters in the value of the first argument that appear in the “excess” portion of the value of the second argument are simply deleted from the returned value.

    Use of translate(string((//director)[2]), "Jh", "R") results in CarpenterRon. Note that the J in John has been translated to an R and that the h in John has been eliminated entirely.
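The less intuitive string functions are easy to get wrong, so we offer the following Python sketch of their behavior. The function names are our own, the sketch assumes finite numeric arguments, and the value CarpenterJohn is the string value of the second director that the examples above rely on.

```python
import math


def xpath_round(x):
    """Round to the nearest integer; ties go toward positive infinity."""
    return math.floor(x + 0.5)


def substring(s, start, length=None):
    """Characters whose 1-based position p satisfies start <= p < start + length."""
    first = xpath_round(start)
    if length is None:
        return "".join(ch for p, ch in enumerate(s, start=1) if p >= first)
    end = first + xpath_round(length)
    return "".join(ch for p, ch in enumerate(s, start=1) if first <= p < end)


def substring_before(s, t):
    """Text before the first occurrence of t, or "" if t does not occur."""
    i = s.find(t)
    return s[:i] if i >= 0 else ""


def substring_after(s, t):
    """Text after the first occurrence of t, or "" if t does not occur."""
    i = s.find(t)
    return s[i + len(t):] if i >= 0 else ""


def translate(s, source, target):
    """Map characters of source to target; excess source characters are deleted."""
    mapping = {}
    for i, ch in enumerate(source):
        if ch not in mapping:                 # the first occurrence wins
            mapping[ch] = target[i] if i < len(target) else ""
    return "".join(mapping.get(ch, ch) for ch in s)


value = "CarpenterJohn"
print(substring(value, 4, 7))
print(substring_before(value, "John"), repr(substring_after(value, "John")))
print(translate(value, "Jh", "R"))
```

Notice that substring-before( ) and substring-after( ) never return a Boolean value: when the second argument does not occur in the first, the answer is simply the zero-length string.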

Yet another group of functions deals with Boolean values:

• boolean(object) – returns a Boolean value computed from the value of the argument. If the type of the argument is a node set, then the function returns true if and only if the node set has at least one node. If the type of the argument is string, then the function returns true if and only if the string contains at least one character. If the type of the argument is Boolean – well, the function returns that value. If the type of the argument is number, then the function returns true if and only if the value of the argument is not positive zero, negative zero, or NaN (not a number).

• not(object) – returns the Boolean value true if the value of the argument is false and returns false if the value of the argument is true.

• true( ) – returns the Boolean value true.

• false( ) – returns the Boolean value false.

• lang(string) – returns true if and only if the language of the context node, as expressed by an xml:lang attribute on the context node (or if the context node has no such attribute, the nearest ancestor node with such an attribute), is the same as or is a sublanguage of the language indicated by the value of the argument (ignoring case). If there is no applicable xml:lang attribute, then the function returns false.

The final group of functions return numeric values:

• number(object?) – returns the value of the argument, converted to a number. If the argument is a number, then its value is returned. If the argument is a string whose value corresponds to a valid representation of a number in XPath, then the function returns the corresponding number; other strings are converted to NaN. If the argument is a Boolean, then true is converted to 1 (one) and false is converted to 0 (zero). If the argument is a node set, then the string value of the first node, in document order, of the node set is used as the effective value of the argument. If no argument is supplied, then the function operates on the node set containing only the context node.

• sum(nodeset) – returns the sum of the numbers that result from converting the string value of each node in the node set to a number.

• floor(number) – returns the largest integer number (that is, the number closest to positive infinity) that is not greater than the value of the argument.

• ceiling(number) – returns the smallest integer number (that is, the number closest to negative infinity) that is not less than the value of the argument.

• round(number) – returns the integer number that is closest to the value of the argument. If there are two possible values, then the one closest to positive infinity is returned.
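Two of these functions behave differently from their nearest Python counterparts, which the following sketch tries to make visible. The function names are our own. The pattern reflects our reading of the XPath 1.0 number syntax (optional white space, an optional minus sign, digits with an optional decimal point, and no exponent or plus sign), and the rounding helper ignores negative zero.

```python
import math
import re

# Optional white space, optional minus sign, digits with optional fraction.
_XPATH_NUMBER = re.compile(r"[ \t\r\n]*-?([0-9]+(\.[0-9]*)?|\.[0-9]+)[ \t\r\n]*")


def number(text):
    """Convert a string the way number( ) does: anything else becomes NaN."""
    return float(text) if _XPATH_NUMBER.fullmatch(text) else math.nan


def xpath_round(x):
    """Nearest integer; a tie goes to the value closest to positive infinity."""
    if math.isnan(x) or math.isinf(x):
        return x
    return float(math.floor(x + 0.5))


def xpath_sum(string_values):
    """Sum of the string values of a node set, each converted by number( )."""
    return sum(number(value) for value in string_values)


print(number(" -3.5 "), number("1e3"), number("+1"))
print(xpath_round(2.5), xpath_round(-2.5), round(2.5))
print(xpath_sum(["5", "4"]))
```

Python's built-in round( ) rounds a tie to the nearest even integer, so round(2.5) is 2, whereas the XPath round( ) function returns 3 for the same argument.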

9.2.8 Putting the Pieces Together

Before leaving the subject of XPath 1.0, let’s consider a few examples that illustrate the various concepts we’ve discussed in this part of the chapter. These examples are all based on the XML document contained in Example 9-2.

In this section, each example contains the XPath expression being illustrated and its results (using our indentation convention – with a reminder that the actual results are not serialized into character strings at all but remain in the more abstract form of an instance of the XPath 1.0 data model).

Example 9-6   Average Rating of Movies Directed by “John”

image

Let’s look in detail at the expression in Example 9-6. To compute an average, we apply the time-honored mechanism of adding up a collection of values and then dividing that sum by the number of values in that collection; notice that the arguments given to the sum( ) function and the count( ) function are identical. In both cases, the argument should be read as follows:

• Starting at the root of the document, create a node set containing all child element nodes named movies.

• Create a second node set containing, for every node in the first node set (there will never be more than one, because the root node never has more than one child element node), every child element node named movie (the second node set contains three element nodes).

• Create a third node set containing every node in the second node set that satisfies the predicate. The predicate should be understood to say that, for each node in the second node set:

– Create a fourth node set containing every child element node named director (there are three such nodes).

– For all nodes in the fourth node set, create a fifth node set containing every child element node named givenName (again, there are three such nodes).

– If the string value of any node in the fifth node set is equal to “John” (there are two such nodes), the predicate is satisfied for the node being considered in the second node set (and that node is thus included in the third node set).

• Create a sixth node set containing, for every node in the third node set, the attribute node named myStars (there are two such nodes).

The sum( ) function, as described earlier, “returns the sum of the numbers that result from converting the string value of each node in the node set to a number.” The string values of the two nodes in the sixth node set are “5” and “4,” respectively, and the results of converting those string values to numbers are 5 and 4, respectively. The count( ) function counts the number of nodes in the node set; that count is, of course, 2. Therefore, the expression finally divides (5 + 4) by 2 and returns 4.5.
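The same computation can be traced in Python. This is a sketch under assumptions: the document is our stand-in for the sample data (the ratings 5 and 4 come from the discussion above, while the third rating is invented), and ordinary list comprehensions take the place of the predicate and the attribute step.

```python
import xml.etree.ElementTree as ET

# Assumed stand-in for the sample document; the third rating is invented.
doc = ET.fromstring(
    '<movies>'
    '<movie myStars="5"><director><givenName>John</givenName></director></movie>'
    '<movie myStars="4"><director><givenName>John</givenName></director></movie>'
    '<movie myStars="3"><director><givenName>Stanley</givenName></director></movie>'
    '</movies>'
)

# Second node set: every movie child of the movies element.
movies = doc.findall("movie")

# Third node set: movies for which some director/givenName equals "John".
john_movies = [
    movie for movie in movies
    if any(name.text == "John" for name in movie.findall("director/givenName"))
]

# Sixth node set: the myStars attribute of each movie in the third node set.
stars = [m.get("myStars") for m in john_movies if m.get("myStars") is not None]

# sum( ) div count( ): convert each string value to a number, then divide.
average = sum(float(value) for value in stars) / len(stars)
print(stars, average)
```

The sum and the count are both taken over the same list of attribute values, which mirrors the identical arguments given to sum( ) and count( ) in the expression.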

Example 9-7   Titles of Movies with High Ratings

image

In Example 9-7, assuming that the context node is the <movies> node, the expression:

• Builds a node set containing every child element named movie.

• Creates a second node set containing, for every node in the first node set, the child element nodes named title.

• Creates a third node set containing every node in the second node set that satisfies the predicate. The predicate, for each node in the second node set:

– Creates a fourth node set containing the parent of the node (from the second node set) being considered. There is one such node (every node has no more than one parent).

– Creates a fifth node set containing, for all of the nodes in the fourth node set, the attribute named myStars. (There is one such node.)

– If the value of any node in the fifth node set is greater than 3, then the predicate is satisfied for the node being considered from the second node set and that node is included into the third node set.

The string( ) function returns the string value of the first node in the third node set, which is an element node named title. Intuitively, one might expect the string( ) function to return the string value of all nodes (there are two of them) in the third node set, strung together: “An American Werewolf in LondonThe Thing.” However, the string( ) function was described earlier this way: “If the object is a node set, then the function returns the string value of the first node (in document order) of the node set.” Consequently, only the first node that satisfies the predicate is used to produce the result.
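A brief Python sketch shows the “first node only” behavior. The document is again our assumed stand-in for the sample data (the third rating is invented), and string_value is a hypothetical helper that mimics string( ) applied to a node set.

```python
import xml.etree.ElementTree as ET

# Assumed stand-in for the sample document; the third rating is invented.
doc = ET.fromstring(
    '<movies>'
    '<movie myStars="5"><title>An American Werewolf in London</title></movie>'
    '<movie myStars="4"><title>The Thing</title></movie>'
    '<movie myStars="3"><title>The Shining</title></movie>'
    '</movies>'
)


def string_value(node_set):
    """Mimic string( ): the string value of the first node, or "" if empty."""
    if not node_set:
        return ""
    return "".join(node_set[0].itertext())


# Titles whose parent movie has a myStars value greater than 3.
high_rated = [
    title
    for movie in doc.findall("movie")
    for title in movie.findall("title")
    if float(movie.get("myStars")) > 3
]

first_only = string_value(high_rated)
every_title = [string_value([title]) for title in high_rated]
print(first_only)
print(every_title)
```

Two title nodes satisfy the predicate, but only the first contributes to the value that string( ) returns; obtaining every title requires processing the nodes one at a time.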

Example 9-8   Titles of Movies with High Ratings and Low Ratings

image

In Example 9-8, the two instances of the string( ) function are evaluated very much according to the process described for Example 9-7, and the results are concatenated together with a single space between them. Don’t forget that the first string( ) function returns the string value of the first node in the resulting node set.

Example 9-9   Manipulating the Titles of Movies with High Ratings and Low Ratings

image

By now, you should be able to work through the expression in Example 9-9. Consult the descriptions of the substring(), substring-after( ), and translate( ) functions in Section 9.2.7 as you work out this example.

Example 9-10   Is There a Movie with a Rating of Fewer Than Four Stars?

image

In Example 9-10, the expression first forms a node set containing all nodes that have an attribute whose name is myStars and whose value is less than 4. The boolean( ) function returns true because the constructed node set is not empty.

Interestingly, the boolean( ) function is not necessary in this case. The expression “//@myStars<4” also returns true. We leave it as an exercise for our readers to determine why this shorter expression behaves like the first.

9.3 XPath 2.0 Components

XPath 2.0 is a significant improvement over XPath 1.0. In adding the major enhancements, a small number of incompatibilities were introduced, most of which can be avoided if the environment that invokes XPath9 simply sets a “backwards compatibility” flag.

XPath shares a great deal of syntax and semantics with XQuery, allowing most of XPath to be described in Chapter 11, “XQuery 1.0 Definition.” Our relatively brief discussion of XPath 2.0 in this chapter focuses on those aspects of XPath that are distinct from XQuery – particularly identifying the language features that are absent in XPath but present in XQuery.

The most significant driving factor involved in the differences between XPath 1.0 and XPath 2.0 is probably the change in data model. XPath 1.0, as you read in Section 9.2, based its data model on the Infoset (however, you’ll recall that the XPath 1.0 data model is not exactly the same as the Infoset). By contrast, XPath 2.0 is defined with respect to the XPath 2.0 and XQuery 1.0 Data Model (for the remainder of this chapter, we’ll simply call it the “Data Model”), about which you will read in Chapter 10, “Introduction to XQuery 1.0.”

9.3.1 Expressions

One consequence of having a much richer data model is that the number of types of expressions grew. In XPath 1.0 (see Section 9.2.1), there are three kinds of expressions: node set expressions that allow formation of a union of two node sets; value expressions that operate on string, numeric, and Boolean values; and path expressions.

In XPath 2.0, node sets have been replaced with sequences, which are among the most important concepts of the Data Model. A node set contains zero or more nodes, no node can appear in the node set more than once (that is, no duplicates are possible), and the nodes are not in any particular order. A sequence, by contrast, allows a node to appear more than once (duplicates are permitted), and the nodes in the sequence are in a particular order; in addition, sequences can contain nodes, atomic values, or any mixture of the two. The so-called set expressions that operate on sequences of nodes include the union operator; XPath still allows this operator to be represented by the vertical bar “|” but also allows it to be spelled out: union. Two new operators have been added: intersection (spelled intersect), which returns a sequence containing only those nodes that appear in both of the source sequences, and difference (spelled except), which returns a sequence containing only those nodes that occur in the first source sequence but not in the second.
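The behavior of the three set operators can be sketched in Python. This is a simplified model under our own assumptions: each node is represented as a (document position, name) pair so that the position stands in for node identity, and the function names are hypothetical (except_ carries an underscore only because except is a Python keyword).

```python
def in_document_order(nodes):
    """Remove duplicate nodes and sort the rest by document position."""
    return sorted(set(nodes))


def union(a, b):
    return in_document_order(list(a) + list(b))


def intersect(a, b):
    return in_document_order(n for n in a if n in b)


def except_(a, b):
    return in_document_order(n for n in a if n not in b)


# Sequences may hold duplicates and need not be in document order.
first = [(3, "c"), (1, "a"), (1, "a")]
second = [(2, "b"), (3, "c")]

all_nodes = union(first, second)
shared = intersect(first, second)
only_in_first = except_(first, second)
```

In this model, each operator returns its nodes without duplicates and in document order, whatever the order of the operands.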

Value expressions have been enhanced significantly in XPath 2.0. The most fundamental changes are driven by the adoption of the Data Model. The Data Model, as you’ll learn in Chapter 10, provides a much larger collection of data types, which are based on the types supported by XML Schema Part 2;10 additional types are defined by the Data Model itself. To support the new set of data types, a number of new operators have been provided. A much larger collection of “built-in” functions has been provided, many of them to support the new data types. Additional functions, called external functions, can be supplied by XPath implementations and even by users.

Path expressions in XPath 2.0 serve the same purpose as in XPath 1.0. Path expressions are still composed of a sequence of steps, and steps (which we prefer to call step expressions) still comprise the same three components: an axis, a node test, and zero or more predicates. However, XPath 2.0 extends this by allowing a step to be any expression that evaluates to a sequence of nodes, without an axis being involved at all.

In addition, the slash “/” that was described in Section 9.2.3 as a separator between step expressions now behaves more like a true operator. Recall that XPath 1.0’s steps produced node sets and that sets have no particular order; XPath 1.0 generally processed the nodes in document order, but that was not a property of the node sets themselves. In XPath 2.0, sequences are inherently ordered, and the slash operator causes duplicates to be eliminated and the nodes in the sequence to be rearranged into document order.
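The effect of the slash operator on a sequence of nodes can be sketched in Python. The document, the id attributes, and the slash helper are all assumptions of this sketch; it models only duplicate elimination and reordering, not the full evaluation of a path expression.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: two movies, each with two children.
DOC = ET.fromstring(
    "<movies>"
    "<movie id='m1'><title/><year/></movie>"
    "<movie id='m2'><title/><year/></movie>"
    "</movies>"
)

# Document position of every element, and a child-to-parent map.
order = {node: i for i, node in enumerate(DOC.iter())}
parent = {child: p for p in DOC.iter() for child in p}


def slash(nodes, step):
    """Apply step to each node, then drop duplicates and sort by position."""
    found = []
    for node in nodes:
        found.extend(step(node))
    return sorted(set(found), key=lambda n: order[n])


# Start from the four leaf elements in reverse document order.
leaves = [n for n in DOC.iter() if n.tag in ("title", "year")][::-1]

# Stepping to the parent reaches each movie twice, in reverse order...
movies = slash(leaves, lambda n: [parent[n]])
ids = [m.get("id") for m in movies]
```

Each movie element is reached twice by the parent step, yet it appears only once in the result, and the result is in document order even though the input was not.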

Node tests are still name tests or kind tests. In XPath 1.0, name tests could be specified in three forms: a QName, *, and NCName:*. XPath 2.0 adds one more: *:NCName (all nodes with a specified local name, regardless of the namespace in which they are defined).
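The four forms of name test can be compared side by side in a small Python sketch. It is a simplification under stated assumptions: a node name is modeled as a (namespace URI, local name) pair, an unprefixed QName is taken to mean “no namespace,” the namespace URIs are invented, and matches is a hypothetical helper.

```python
def matches(test, node_name, namespaces):
    """Check a name test against a (namespace_uri, local_name) pair.

    namespaces maps each declared prefix to its namespace URI.
    """
    uri, local = node_name
    if test == "*":                      # any name
        return True
    prefix, sep, name = test.rpartition(":")
    if not sep:                          # unprefixed QName (assumed: no namespace)
        return uri == "" and local == test
    if prefix == "*":                    # *:NCName, any namespace
        return local == name
    if name == "*":                      # NCName:*, any local name
        return uri == namespaces[prefix]
    return uri == namespaces[prefix] and local == name   # prefixed QName


ns = {"m": "urn:example:movies"}         # hypothetical namespace binding
title_in_m = ("urn:example:movies", "title")
title_elsewhere = ("urn:example:other", "title")
```

With these definitions, m:title matches only title_in_m, while *:title matches both names.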

XPath 2.0 adds three new kinds of expression: sequence expressions, the conditional expression, and type expressions. A sequence expression is one that manipulates sequences. The XQuery Data Model, which introduces the concept of sequences, is discussed in detail in Chapter 10. A sequence is an ordered collection of items, which may be atomic values, nodes, or even mixtures of both; unlike the nodes of a node set, which XPath 1.0 generally processes in document order, the items in a sequence are not necessarily in document order.

Sequence Expressions

There are several varieties of sequence expression:

• , (a comma) – Sequence concatenation, construction of a sequence from other sequences

• to – Numeric range, producing a sequence of consecutive values starting with the value of the first argument and ending with the value of the second argument

• some and every – Quantified expressions, evaluating whether at least one item, or all items, respectively, in a sequence satisfies a specified condition

Arguably the most powerful sort of sequence expression is:

• for and return – Application of an expression to every item in a sequence, returning the results of each such application in a sequence that contains all of the results in the order in which they were generated

The for expression, accompanied by the return expression, is closely related to the FLWOR expression in XQuery (which we discuss in detail in Chapter 11, “XQuery 1.0 Definition”), but it is significantly limited by comparison. This pair of expressions, as defined for XPath, is important enough to justify its own section, Section 9.3.2.
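For readers who think more readily in a general-purpose language, the comma, to, some, and every expressions have rough Python analogues. The sketch below models a sequence as a Python list; the helper names seq, to, some, and every are our own and are not part of any XPath library.

```python
def seq(*parts):
    """Comma operator: concatenate, flattening any nested sequences."""
    out = []
    for part in parts:
        if isinstance(part, list):
            out.extend(seq(*part))
        else:
            out.append(part)
    return out


def to(first, last):
    """Range expression: consecutive integers, empty when first > last."""
    return list(range(first, last + 1))


def some(items, condition):
    """Quantified expression: at least one item satisfies the condition."""
    return any(condition(i) for i in items)


def every(items, condition):
    """Quantified expression: all items satisfy the condition."""
    return all(condition(i) for i in items)


flat = seq(1, [2, 3], [], 4)      # sequences never nest
one_to_five = to(1, 5)
backwards = to(5, 1)              # an empty sequence
```

Note that every is true and some is false when the sequence is empty.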

The Conditional Expression

The conditional expression is better known as the if expression:

if (expr1) then expr2 else expr3

Unlike in many languages, the else clause is mandatory. The semantics are exactly what you expect: The first expression, expr1, is evaluated. If it evaluates to true, then the second expression, expr2, is evaluated and is the value of the if expression; if the first expression evaluates to false, then the third expression, expr3, is evaluated and is the value of the if expression.

XPath determines the (Boolean) value of the first expression using the semantics of the effective Boolean value of that expression. In general, every value of that expression evaluates to true, with these exceptions: the empty sequence, a single zero-length string (xs:string and xdt:untypedAtomic), a single number (xs:decimal, xs:float, and xs:double) whose value is 0, a single floating-point number (xs:float and xs:double) whose value is NaN (not a number), and a single Boolean whose value is false. An error is raised if the expression produces more than one atomic value.
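The rules for the effective Boolean value, together with the if expression that relies on them, can be sketched in Python. This follows the rules as stated in the preceding paragraph, with two assumptions of our own: a sequence is modeled as a Python list, and a sequence that starts with a node is treated as true. Node, effective_boolean_value, and if_expr are hypothetical names.

```python
class Node:
    """Placeholder for any node; only the fact that it is a node matters."""


def effective_boolean_value(sequence):
    if len(sequence) == 0:
        return False                      # the empty sequence
    if isinstance(sequence[0], Node):
        return True                       # assumption: node-led sequences are true
    if len(sequence) > 1:
        raise TypeError("more than one atomic value")
    value = sequence[0]
    if isinstance(value, bool):           # test bool before int: bool is an int
        return value
    if isinstance(value, str):
        return len(value) > 0             # zero-length string is false
    if isinstance(value, (int, float)):
        # NaN is the only number that is unequal to itself.
        return value == value and value != 0
    return True                           # any other single value


def if_expr(test, then_branch, else_branch):
    """Evaluate only the branch selected by the test's effective Boolean value."""
    if effective_boolean_value(test):
        return then_branch()
    return else_branch()


answer = if_expr([""], lambda: "non-empty", lambda: "empty")
```

Passing the branches as functions mirrors the fact that only one of expr2 and expr3 is evaluated.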

Type Expressions

Type expressions deal with the data types defined for XPath, including the types that are built into the Data Model and other types that are defined in XML Schemas associated with the context in which an XPath expression is evaluated. Every value in the Data Model is an instance of some type and is inherently a member of a sequence (an individual item is actually a sequence of length 1). XPath uses the term sequence type to talk about items. An item is either a node or an atomic value. The Data Model provides two generalized item types: item(), which allows any sort of item at all, and empty(), which prohibits every kind of item.

The type expressions used in XPath include:

• Expressions related to converting values to a new data type

• Expressions dealing with determining the data type of a value

In XPath, as in XQuery and SQL, the expression that converts an atomic value of one atomic data type into a corresponding value of another atomic type is called a cast. Neither XPath nor XQuery supports any form of error recovery, so any attempt to cast a value into an inappropriate type results in an error that causes evaluation of the “outermost” expression to terminate. Run-time failures are generally a bad idea, and many languages, especially query languages, strive to minimize the possibility of such failures. XPath and XQuery provide a castable expression that allows a query to determine whether a cast will succeed before actually performing the cast:

image

There are a number of limitations on permissible casts. Some limitations are absolute – it is a type error to attempt to cast a value whose type is xs:dateTime into the xs:NCName type, because no value of xs:dateTime could ever be a valid xs:NCName value. Other limitations depend on actual values – casting a value of xs:string into xs:decimal will fail unless the xs:string value has the same lexical form as a valid literal for xs:decimal values.
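The idea of testing with castable before casting can be illustrated in Python for the xs:string to xs:decimal case. This is a sketch under our own assumptions: the function names are hypothetical, the regular expression is our rendering of the lexical form of xs:decimal (an optional sign and digits with an optional decimal point, but no exponent), and leading and trailing whitespace is simply stripped.

```python
import re
from decimal import Decimal

# Optional sign, then digits with an optional fraction, or a bare fraction.
_DECIMAL_FORM = re.compile(r"[+-]?([0-9]+(\.[0-9]*)?|\.[0-9]+)")
_WHITESPACE = " \t\r\n"


def castable_as_decimal(text):
    """Report whether the cast would succeed, without attempting it."""
    return _DECIMAL_FORM.fullmatch(text.strip(_WHITESPACE)) is not None


def cast_as_decimal(text):
    """Perform the cast; an unsuitable value is an error."""
    if not castable_as_decimal(text):
        raise ValueError(f"cannot cast {text!r} to a decimal")
    return Decimal(text.strip(_WHITESPACE))


raw_values = ["3.14", "1e5", "abc", " 42 "]
# Guard each cast so that no value can cause a run-time failure.
converted = [cast_as_decimal(v) if castable_as_decimal(v) else None
             for v in raw_values]
```

The value "1e5" is rejected because exponent notation belongs to the floating-point types rather than to xs:decimal.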

The other components of XPath are philosophically the same as they were in XPath 1.0, meaning that they serve the same purpose with essentially the same syntax. The differences in them are caused by factors we mentioned earlier, such as the adoption of the Data Model. For example, in XPath 1.0, determination of effective Boolean values did not have to contend with decimal numbers or single-precision floating-point values, while XPath 2.0’s use of the Data Model brings those data types into consideration.

9.3.2 The for and return Expressions

The for expression and the sequence data type defined in the Data Model are closely related. The for expression always returns a sequence of zero or more items, and the sequence data type is most powerful when a mechanism is provided to iterate through the items in a sequence. When coupled with the return expression (which, in XPath, it always is), the for expression produces a sequence of items – not necessarily nodes – in much the same way that step expressions and the other sequence expressions do.

Consider the for expression in Example 9-11, which uses the XML document given in Example 9-2.

Example 9-11   Using the for Expression

image

The variable $m is the range variable of the expression, while the value of the path expression //movies[yearReleased="1984"] is the binding sequence, and the expression following return is the return expression. The result of this for expression is the result of evaluating the return expression once for every item in the binding sequence. In this case, the result is shown in Result 9-5.

Result 9-5   Result of Simple for Expression

image

A note about Result 9-5: The for expression in Example 9-11 returns a sequence of items. In this case, each of the items is a string value. The expression does not insert a new line or even a space between the two string values. However, to ensure that the result of the expression is clear, we have illustrated the result on two lines.
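The roles of the range variable, the binding sequence, and the return expression map closely onto a Python list comprehension. The sketch below does not reproduce Example 9-11; the document, its element names, and the chosen year are assumptions made purely for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical document, not the one shown in Example 9-2.
DOC = ET.fromstring(
    "<movies>"
    "<movie><title>An American Werewolf in London</title>"
    "<yearReleased>1981</yearReleased></movie>"
    "<movie><title>The Thing</title>"
    "<yearReleased>1982</yearReleased></movie>"
    "<movie><title>Blade Runner</title>"
    "<yearReleased>1982</yearReleased></movie>"
    "</movies>"
)

# Binding sequence: the movies that satisfy the predicate.
binding_sequence = DOC.findall("./movie[yearReleased='1982']")

# m plays the part of the range variable; the expression before "for"
# plays the part of the return expression.
result = [m.findtext("title") for m in binding_sequence]
```

As in XPath, the result is a sequence with one item per item of the binding sequence, and nothing is inserted between the items.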

It’s worth observing that the for expression in Example 9-11 is both a valid XPath 2.0 expression and a valid XQuery 1.0 expression. If shown without any context in which to evaluate it, we could not tell you whether it was XPath 2.0 or XQuery 1.0 – because it is both. This characteristic is true of virtually all XPath 2.0 expressions. The only exception is that XPath 2.0 supports, in backwards-compatibility mode only, a namespace:: axis, while XQuery does not.

XPath allows for expressions to be nested, in which the result is produced by evaluating the “inner” for expression once for each item in the result of the “outer” for expression, and the inner return clause produces one item for each item in the result of all those evaluations of the inner for expression. XPath provides a syntactic shorthand for nesting for expressions: The sequence “$var in expression” can be repeated, with multiple instances of that sequence separated by commas.
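The equivalence between nested for expressions and the comma-separated shorthand has a direct counterpart in Python. In this sketch, which uses invented values, the two-clause comprehension corresponds to the shorthand and the explicit loops correspond to the nested form.

```python
outer = [1, 2]
inner = ["a", "b"]

# Shorthand form: both range variables in one expression.
shorthand = [f"{x}{y}" for x in outer for y in inner]

# Nested form: the inner loop runs once for each item of the outer one.
nested = []
for x in outer:
    for y in inner:
        nested.append(f"{x}{y}")
```

Both forms produce the same items in the same order: the inner sequence is traversed completely for each item of the outer sequence.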

XPath 2.0 offers considerably more power than XPath 1.0. Here are some of the more obvious new capabilities introduced by XPath 2.0.

• There is a dependence on the Data Model, implying sequences and new data types.

• Node tests can now test the type of a node and not merely its name.

• Function calls can be used in place of step expressions.

• It introduces several new operators (such as operators that test the positional relationship between two nodes, the idiv operator, and the new set operators).

• It includes new expression types (the for expression explored earlier, the if expression also discussed earlier, and quantified expressions using some and every).

• The library of built-in functions available for use is much enlarged, and user-defined functions are possible.

In Chapter 11, “XQuery 1.0 Definition,” you’ll read much more about the XPath expressions discussed in this section.

9.4 XPath 2.0 and XQuery 1.0

In Section 9.1, we told you that “one language is a subset of the other.” To be very clear about that relationship, XPath 2.0 is a subset of XQuery 1.0. Both languages are free of side effects (except possibly for side effects caused by the invocation of external functions). Because they are both functional languages, expressions written in them can be arbitrarily nested. That is, XQuery expressions can be used within other XQuery expressions, and XPath expressions can appear within XQuery expressions. Because XPath is a subset of XQuery, the second part of that previous statement is redundant: (virtually) every XPath expression is an XQuery expression.

The converse is not true, since XQuery has significantly more features than XPath 2.0 (and even more differences from XPath 1.0). XQuery, as you’ll read in Chapter 11, “XQuery 1.0 Definition,” provides many more expressions. For example, XPath 2.0 supports the for expression with the following syntax (using the extended BNF notation that the XPath 2.0 specification uses):

image

where “ExprSingle” is a BNF nonterminal symbol that corresponds to a single expression (as opposed to a comma-separated list of expressions).

By comparison, XQuery 1.0 provides a similar but extended variant called a FLWOR (For, Let, Where, Order by, Return) expression. Using the same EBNF notation, it looks like this:

image

The definitions of ForClause, LetClause, WhereClause, and OrderByClause are, respectively:

image

By contrast with XPath 2.0’s for expression, XQuery’s FLWOR expression provides the abilities to define variables without creating a loop over a sequence, to filter the results with a predicate, and to specify an ordering of the results.
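The three additional abilities of FLWOR can be pictured with a loose Python analogy. The data and field names below are invented for this sketch, and the comments indicate which FLWOR clause each line loosely corresponds to; it is not a translation of any particular query.

```python
# Hypothetical data standing in for a sequence of movie elements.
movies = [
    {"title": "The Thing", "year": 1982, "stars": 5},
    {"title": "An American Werewolf in London", "year": 1981, "stars": 4},
    {"title": "A Forgettable Sequel", "year": 1983, "stars": 2},
]

selected = []
for m in movies:                               # for: iterate over the sequence
    title = m["title"]                         # let: bind a variable, no loop
    if m["stars"] >= 4:                        # where: filter with a predicate
        selected.append((m["year"], title))

selected.sort(key=lambda pair: pair[0])        # order by: sort the survivors
result = [title for _, title in selected]      # return: build the result
```

XPath 2.0’s for expression corresponds to only the first and last of these lines; the binding, the filter, and the ordering have no counterpart in it.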

9.5 Chapter Summary

In this chapter, we’ve described both versions of XPath. XPath 1.0 was covered in some detail, while XPath 2.0 was discussed somewhat less thoroughly. In Chapter 11, “XQuery 1.0 Definition,” we discuss XQuery in detail and consequently discuss XPath 2.0 in more detail than in this chapter.

XPath is, as we’ve seen, a language for addressing parts of XML documents. The nature of that “addressing” makes XPath a query language. While the ability to express complex queries has improved significantly between XPath 1.0 and XPath 2.0, it remains somewhat limited when compared to more powerful languages, such as XQuery.


1. XML Path Language (XPath) Version 1.0 (Cambridge, MA: World Wide Web Consortium, 1999). Available at: http://www.w3.org/TR/xpath.

2. W3C Candidate Recommendation of XML Path Language (XPath) Version 2.0 (Cambridge, MA: World Wide Web Consortium, 2005). Available at: http://www.w3.org/TR/xpath20.

3. Throughout Section 9.2, the unqualified word “XPath” must be interpreted as “XPath 1.0.”

4. In XPath 1.0, these comparison operators have “existential” semantics. That characteristic means that, for operands that are not singletons, if there exists any value in the first operand that satisfies the comparison with respect to any value in the second operand, then the comparison is true. Thus, if a set of values (1, 2, 3) is compared to another set of values (2, 4, 6) for equality, the answer is true because the value 2 in the first set is equal to the value 2 in the second set. Surprisingly, if the two sets are compared for inequality, the answer is also true, because there is at least one value in the first set that is unequal to at least one value in the second set (1 is not equal to 4, for example).

5. Namespaces in XML 1.0 (Cambridge, MA: World Wide Web Consortium, 1999). Available at: http://www.w3.org/TR/REC-xml-names.

6. The term document order is defined in the XPath 1.0 specification to be “the order in which the first character of the XML representation of each node occurs in the XML representation of the document after expansion of general entities.” The root node is the first node; element nodes precede their children; attribute and namespace nodes precede the children of the element node; and namespace nodes precede attribute nodes. The relative positions of attribute nodes and of namespace nodes are not defined. Reverse document order is, quite logically, the reverse of document order.

7. In fact, the discussions and examples in this chapter that precede Section 9.2.4 are all done with shorthand notations.

8. ANSI/IEEE Std. 854:1987, IEEE Standard for Radix-Independent Floating-Point Arithmetic (New York: American National Standards Institute, 1987).

9. Throughout Section 9.3, the unqualified word XPath must be interpreted as XPath 2.0.

10. XML Schema Part 2: Datatypes (Cambridge, MA: World Wide Web Consortium, 2001). Available at: http://www.w3.org/TR/2001/REC-xmlschema-2-20010502/.
