The XPointer Specification

XPointer is a complex specification for addressing into the internal structures of XML documents. In XLink, a resource is referred to by a Uniform Resource Identifier (URI). In the common case, the URI is a URL followed by an optional query and then an optional fragment identifier. An XPointer is used as that fragment identifier: it is appended to the URL after a # character and addresses a part of the document the URL identifies.

Note

At the time of this writing, the XPointer specification is a candidate recommendation. The latest specification can be found at http://www.w3c.org/TR.


XPointers operate on the tree structure defined by the elements and other markup of an XML document. From the discussion of the Document Object Model in Chapter 3, you should know that an XML document contains seven types of nodes: root nodes, element nodes, text nodes, attribute nodes, namespace nodes, processing instruction nodes, and comment nodes. The purpose of an XPointer is to refer to a particular portion of this tree, sometimes in relation to another part. In general, XPointers select a portion of the tree with axes and predicates. An axis selects a node or group of nodes by its relationship to a starting node. A predicate then tests either the selected nodes or nodes relative to the selected nodes. The XPointer specification builds on another specification, called XPath, which defines a common syntax used by both XPointer and the Extensible Stylesheet Language Transformations (XSLT) specification, discussed in Chapter 5, "Java and the Extensible Stylesheet Language (XSL)."

XPath

XPath defines a language for creating expressions that operate on an XML document tree. The most important type of expression is the location path. There are two types of location paths: absolute and relative. A location path consists of a sequence of location steps separated by slashes (/). A location step has three parts: an axis, a node-test, and zero or more predicates. Here is an example of an absolute location path that selects the chapter child of the root, provided it has a title attribute with the value Introduction:

xpointer(/child::chapter[attribute::title='Introduction'])

In the example, the leading / is the absolute location of the root of the document. The term child is the axis. The double colon (::) separates the axis from the node-test. The node-test is the term chapter. The predicate is enclosed in square brackets. As starting points for an absolute location path, XPath provides / for the root and id("name") to locate a specific element by its unique ID.

Here is another example:

/descendant::para

This location path selects all the para elements in the document. In this example, descendant is the axis and para is the node-test.
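
The two absolute location paths above can be tried out with the javax.xml.xpath API that ships with the JDK. The following is a minimal sketch, not the book's own code: the class name LocationPathDemo and the sample document are made up for illustration, and only the XPath expression inside xpointer(...) is evaluated, because the JDK evaluates XPath expressions rather than complete XPointers.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class LocationPathDemo {
    // Hypothetical sample document (an assumption, not taken from the text).
    static final String SAMPLE =
        "<chapter title='Introduction'>"
      + "<para>one</para>"
      + "<section><para>two</para><para>three</para></section>"
      + "</chapter>";

    /** Returns how many nodes the location path selects in the given XML text. */
    static int count(String xml, String locationPath) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                .evaluate(locationPath, doc, XPathConstants.NODESET);
            return nodes.getLength();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // The root has one chapter child with title='Introduction'.
        System.out.println(count(SAMPLE, "/child::chapter[attribute::title='Introduction']")); // 1
        // descendant reaches para elements at any depth.
        System.out.println(count(SAMPLE, "/descendant::para")); // 3
    }
}
```

Changing the predicate value to a title that does not occur, such as 'Summary', makes the first path select nothing, which shows that the predicate filters the nodes the axis and node-test have already chosen.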

An axis works with respect to a context node. The context node is established either by an absolute location (such as / or id()) or by the preceding location step. The following keywords are the available axes:

  • child—Identifies a child node of the context node.

  • descendant—Nodes appearing anywhere in the content of the context node.

  • parent—Identifies the parent node of the context node.

  • ancestor—The ancestors of the context node: its parent, its parent's parent, and so on up to the root node.

  • preceding—Nodes that come before the context node in document order, excluding its ancestors and any attribute or namespace nodes.

  • following—Nodes that come after the context node in document order, excluding its descendants and any attribute or namespace nodes.

  • preceding-sibling—Identifies sibling nodes that share their parent with the context node and appear before it.

  • following-sibling—Identifies sibling nodes that share their parent with the context node and appear after it.

  • attribute—Attributes of the context node; the axis will be empty unless the context node is an element.

  • namespace—Contains the namespace nodes of the context node; the axis will be empty unless the context node is an element.

  • self—The context node.

  • descendant-or-self—Contains the context node and the descendants of the context node.

  • ancestor-or-self—Contains the context node and the ancestors of the context node; thus, the ancestor-or-self axis will always include the root node.

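An axis is either a forward axis or a reverse (backward) axis. If the axis only ever contains the context node or nodes that come after it in document order, it is a forward axis. If the axis only ever contains the context node or nodes that come before it in document order, it is a reverse axis. The ancestor, ancestor-or-self, preceding, and preceding-sibling axes are reverse axes; all the others are forward axes.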
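
To make the axes concrete, the sketch below evaluates several of them relative to a chosen context node, again with the JDK's javax.xml.xpath API. The class AxisDemo, its helper methods, and the sample document are assumptions made for illustration. It also shows one practical consequence of axis direction: inside a predicate, position() counts away from the context node, so on a reverse axis position 1 is the nearest preceding node.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class AxisDemo {
    // Hypothetical sample document: two chapters holding four para elements.
    static final String SAMPLE =
        "<book>"
      + "<chapter><para>p1</para><para>p2</para><para>p3</para></chapter>"
      + "<chapter><para>p4</para></chapter>"
      + "</book>";

    /** Evaluates a relative expression with the para whose text is paraText as context node. */
    private static Object eval(String paraText, String expr, javax.xml.namespace.QName type) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(SAMPLE)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            Node context = (Node) xpath.evaluate(
                "//para[. = '" + paraText + "']", doc, XPathConstants.NODE);
            return xpath.evaluate(expr, context, type);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    /** Number of nodes selected by the relative location path. */
    static int count(String paraText, String path) {
        return ((NodeList) eval(paraText, path, XPathConstants.NODESET)).getLength();
    }

    /** String value of the expression (for a node-set, of its first node). */
    static String value(String paraText, String expr) {
        return (String) eval(paraText, expr, XPathConstants.STRING);
    }

    public static void main(String[] args) {
        System.out.println(count("p2", "ancestor::*"));       // 2: chapter and book
        System.out.println(count("p2", "following::para"));   // 2: p3 and p4
        System.out.println(count("p2", "preceding::para"));   // 1: p1
        System.out.println(value("p2", "name(parent::*)"));   // chapter
        // Reverse axis: position 1 is the nearest sibling before the context node.
        System.out.println(value("p3", "preceding-sibling::para[position() = 1]")); // p2
        // Forward axis: position 1 is the nearest sibling after the context node.
        System.out.println(value("p1", "following-sibling::para[position() = 1]")); // p2
    }
}
```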

A node-test filters nodes from an axis if those nodes do not meet certain criteria. Here are the possible node-tests:

  • A qualified name—Keeps only the nodes whose name matches exactly. For example, child::para selects all the para elements that are children of the context node. The qualified name may include a namespace prefix.

  • One of four node type tests: comment(), text(), processing-instruction(), and node()—The first three keep the nodes of the matching type; node() keeps nodes of any type.

  • An asterisk (*)—A wildcard that keeps every node of the axis's principal node type: attributes on the attribute axis, namespaces on the namespace axis, and elements on every other axis.

  • An asterisk as the local part of a qualified name—For example, child::bk:* selects all the child elements whose namespace is the one bound to the bk prefix.
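
The node-tests listed above can be compared side by side by applying each one to the children of the same element. This is a sketch under stated assumptions: the class NodeTestDemo, the sample document, and the namespace URI urn:example:book are invented for the example. Prefixes in an XPath expression are resolved through a NamespaceContext supplied by the caller, not through the declarations in the document, so the sketch binds bk explicitly.

```java
import java.io.StringReader;
import java.util.Collections;
import java.util.Iterator;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class NodeTestDemo {
    // Hypothetical document: one comment, one processing instruction,
    // one text node, two bk:* elements, and one element without a namespace.
    static final String SAMPLE =
        "<doc xmlns:bk='urn:example:book'>"
      + "<!--note--><?target data?>text"
      + "<bk:title>T</bk:title><bk:author>A</bk:author><para>P</para>"
      + "</doc>";

    /** Counts the nodes selected by the given location step applied to /doc. */
    static int count(String step) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // needed for prefixed name tests
            Document doc = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(SAMPLE)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            xpath.setNamespaceContext(new NamespaceContext() {
                public String getNamespaceURI(String prefix) {
                    return "bk".equals(prefix) ? "urn:example:book" : XMLConstants.NULL_NS_URI;
                }
                public String getPrefix(String uri) { return null; }
                public Iterator<String> getPrefixes(String uri) {
                    return Collections.emptyIterator();
                }
            });
            NodeList nodes = (NodeList) xpath.evaluate(
                "/doc/" + step, doc, XPathConstants.NODESET);
            return nodes.getLength();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(count("child::para"));                     // 1
        System.out.println(count("child::*"));                        // 3 elements
        System.out.println(count("child::bk:*"));                     // 2
        System.out.println(count("child::text()"));                   // 1
        System.out.println(count("child::comment()"));                // 1
        System.out.println(count("child::processing-instruction()")); // 1
        System.out.println(count("child::node()"));                   // 6
    }
}
```

Note that child::* returns 3 while child::node() returns 6: the asterisk keeps only elements on the child axis, whereas node() also keeps the comment, the processing instruction, and the text node.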

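A predicate filters a node-set with respect to an axis to refine the selection. Each predicate expression is evaluated for every node in the set and converted to a Boolean value (true or false); a numeric result counts as true only when it equals the node's context position, so [2] is shorthand for [position() = 2]. Predicates can call the functions of the core function library, which are listed in Table 4.6.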

Table 4.6. XPath Core Function Library
Function Prototype | Description
number last() | Returns a number equal to the context size.
number position() | Returns a number equal to the context position.
number count(node-set) | Returns the number of nodes in the node-set argument.
node-set id(object) | Selects elements by their unique ID.
string local-name(node-set?) | Returns the local part of the expanded name of the first node in the node-set.
string namespace-uri(node-set?) | Returns the namespace URI of the expanded name of the first node in the node-set.
string name(node-set?) | Returns the qualified name of the first node in the node-set.
string string(object?) | Converts an object to a string.
string concat(string, string, string*) | Returns the concatenation of its arguments.
boolean starts-with(string, string) | Returns true if the first argument string starts with the second argument string.
boolean contains(string, string) | Returns true if the first argument string contains the second argument string.
string substring-before(string, string) | Returns the substring of the first argument string that precedes the first occurrence of the second argument string.
string substring-after(string, string) | Returns the substring of the first argument string that follows the first occurrence of the second argument string.
string substring(string, number, number?) | Returns the substring of the first argument that starts at the position given by the second argument (counting from 1) and has the length given by the third argument; if the third argument is omitted, the substring runs to the end of the string.
number string-length(string?) | Returns the number of characters in the string.
string normalize-space(string?) | Returns the argument with leading and trailing whitespace removed and sequences of whitespace replaced by a single space.
string translate(string, string, string) | Returns the first argument string with each character that occurs in the second argument string replaced by the character at the corresponding position in the third argument string; characters with no counterpart in the third argument are removed.
boolean boolean(object) | Converts its argument to a Boolean.
boolean not(boolean) | Returns the negation of its argument.
boolean true() | Returns true.
boolean false() | Returns false.
boolean lang(string) | Returns true if the language of the context node, as given by the xml:lang attribute in effect, is the same as or a sublanguage of the argument.
number number(object?) | Converts its argument to a number.
number sum(node-set) | Returns the sum obtained by converting the string value of each node in the node-set to a number.
number floor(number) | Returns the largest integer not greater than the argument.
number ceiling(number) | Returns the smallest integer not less than the argument.
number round(number) | Returns the number that is closest to the argument and that is an integer.
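
A few of the functions in Table 4.6 can be exercised directly, since an XPath expression may consist of a function call alone. The sketch below is illustrative only: the class CoreFunctionDemo, its helper names, and the small order document (used by sum() and count()) are assumptions, and the string arguments are arbitrary sample values.

```java
import java.io.StringReader;
import javax.xml.namespace.QName;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class CoreFunctionDemo {
    // Hypothetical document used by the node-set functions.
    static final String SAMPLE =
        "<order><price>1.5</price><price>2.5</price></order>";

    /** Evaluates the expression against SAMPLE and returns the requested result type. */
    private static Object eval(String expr, QName type) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(SAMPLE)));
            return XPathFactory.newInstance().newXPath().evaluate(expr, doc, type);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    static String string(String expr) { return (String) eval(expr, XPathConstants.STRING); }
    static double number(String expr) { return (Double) eval(expr, XPathConstants.NUMBER); }
    static boolean bool(String expr) { return (Boolean) eval(expr, XPathConstants.BOOLEAN); }

    public static void main(String[] args) {
        System.out.println(string("concat('X', 'Pointer', '!')"));          // XPointer!
        System.out.println(string("substring('12345', 2, 3)"));             // 234
        System.out.println(string("substring-before('1999/04/01', '/')"));  // 1999
        System.out.println(string("substring-after('1999/04/01', '/')"));   // 04/01
        System.out.println(string("translate('bar', 'abc', 'ABC')"));       // BAr
        System.out.println(string("normalize-space('  a   b ')"));          // a b
        System.out.println(number("string-length('XPath')"));               // 5.0
        System.out.println(number("sum(/order/price)"));                    // 4.0
        System.out.println(number("count(/order/price)"));                  // 2.0
        System.out.println(number("ceiling(2.1)"));                         // 3.0
        System.out.println(bool("starts-with('XPointer', 'XP')"));          // true
        System.out.println(bool("not(contains('XPointer', 'oint'))"));      // false
    }
}
```

The translate() call shows why the function is best thought of as a character-by-character mapping: a becomes A and b becomes B wherever they occur, while r, which is not in the second argument, is left alone.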

Let's examine a complete example:

<!DOCTYPE  SCREENPLAY [
<!ELEMENT  SCREENPLAY  (#PCDATA | SPEAKER  |  DIRECTOR)* >
<!ATTLIST  SCREENPLAY
           ID            ID  #IMPLIED>
<!ELEMENT  SPEAKER  (#PCDATA) >
<!ELEMENT  DIRECTOR (#PCDATA)> ]>
<SCREENPLAY  ID="Miller1">
   <SPEAKER> Linda </SPEAKER>
   You didn't crash the car, did you?
   <DIRECTOR> Willy looks irritated. </DIRECTOR>
   <SPEAKER> Willy </SPEAKER>
   I said nothing happened. Didn't you hear me?
</SCREENPLAY>

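Now, let's create some XPointers into this document.

xpointer(id('Miller1')/child::SPEAKER[position() = 2])
     selects the second SPEAKER element, whose content is "Willy".

xpointer(id('Miller1')/child::text()[position() = 2])
     selects the second child text node. If whitespace-only text nodes
             are disregarded, this is "I said nothing happened. Didn't
             you hear me?". A processor that keeps the whitespace between
             the tags also counts the whitespace-only text node that
             follows the SCREENPLAY start tag, in which case the second
             text node is "You didn't crash the car, did you?".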
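
The XPath part of these two XPointers can be checked against the screenplay with the JDK's javax.xml.xpath API. This is a sketch with two labeled assumptions: the class ScreenplayDemo is invented for the example, and the DTD is left out for brevity, so the sketch starts from /child::SCREENPLAY in place of id('Miller1'), which relies on the DTD declaring ID as an ID-typed attribute. The results are passed through normalize-space() so that surrounding whitespace does not clutter the output. Because the DOM built here keeps whitespace-only text nodes, the sketch also shows a predicate that skips them.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class ScreenplayDemo {
    // The screenplay from the text, without its DTD (an assumption of this sketch).
    static final String SCREENPLAY =
        "<SCREENPLAY ID=\"Miller1\">\n"
      + "   <SPEAKER> Linda </SPEAKER>\n"
      + "   You didn't crash the car, did you?\n"
      + "   <DIRECTOR> Willy looks irritated. </DIRECTOR>\n"
      + "   <SPEAKER> Willy </SPEAKER>\n"
      + "   I said nothing happened. Didn't you hear me?\n"
      + "</SCREENPLAY>";

    /** Returns the whitespace-normalized string value of the first node the path selects. */
    static String select(String locationPath) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(SCREENPLAY)));
            return XPathFactory.newInstance().newXPath()
                .evaluate("normalize-space(" + locationPath + ")", doc);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Second SPEAKER child.
        System.out.println(select("/child::SCREENPLAY/child::SPEAKER[position() = 2]"));
        // Second text child when whitespace-only text nodes are counted.
        System.out.println(select("/child::SCREENPLAY/child::text()[position() = 2]"));
        // Second text child that contains something other than whitespace.
        System.out.println(select(
            "/child::SCREENPLAY/child::text()[normalize-space()][position() = 2]"));
    }
}
```

The extra [normalize-space()] predicate keeps only text nodes whose normalized value is non-empty, so the position() test that follows it counts the two spoken lines and nothing else.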
