Simple location paths

Expressions often identify items by their location in the document structure. A 'path' is a series of steps to a target location. A location path may burrow down into the structure, skip over siblings, or work back up the structure.

Relative paths

A relative path is one that starts from an existing location in the document structure. The element that is ultimately targeted depends entirely upon where the starting point is. The same relative expression selects different elements when applied from different context locations.



The simplest form of relative path is a reference to an element name. In the following case, the reference is to any Paragraph element that happens to be a child element of the currently selected element:

para

This is in fact the simplest possible form of a 'node test'. In this case, a test for the presence of an element node with the name 'para' is made. This is also an abbreviation of the expression 'child::para', which makes the meaning more explicit.

To re-emphasize the point, this expression does not select children of a paragraph, but children of the current element that have the name 'para'. The current element may, perhaps, be a Chapter element:

<chapter>
  <title>A TITLE</title>
  <para>First paragraph.</para>
  <para>Second paragraph.</para>
  ...
</chapter>
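
As a rough illustration of how the context decides what 'para' selects, the sketch below uses Python's standard xml.etree.ElementTree module, which supports a limited subset of XPath. The sample document and the variable names are assumptions made for this example, and the Note element is added only to supply a second context.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, modelled on the chapter fragment above.
chapter = ET.fromstring(
    "<chapter>"
    "<title>A TITLE</title>"
    "<para>First paragraph.</para>"
    "<para>Second paragraph.</para>"
    "<note><para>Nested paragraph.</para></note>"
    "</chapter>"
)

# With the chapter as the context, 'para' matches only its direct children.
chapter_paras = [p.text for p in chapter.findall("para")]

# The same expression applied from a different context selects different elements.
note = chapter.find("note")
note_paras = [p.text for p in note.findall("para")]

print(chapter_paras)
print(note_paras)
```

The paragraph nested inside the Note element is not a child of the chapter, so it appears only when the note is the context.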

Some applications that could use XPath may have no concept of a current element. For example, an expression may be a query that searches the entire document. On these occasions, a relative path will never be appropriate.

Some other applications may process a document from start to finish, selecting each element in turn, so that all elements eventually take their turn to become the current element. In this scenario, the expression shown above would apply to all paragraphs in the document, as they would all be children of elements selected during the process.

An objection may be raised to the suggestion made above that all elements are the children of other elements. It might seem that the root element has no parent, so that a Book root element could never be selected by the expression 'book'. But even the root element has a parent node. This node represents the entire document, including any markup, such as comments and processing instructions, that may surround the root element.

Multiple steps

When an expression includes a number of steps, these steps are separated from each other using the '/' symbol. The steps in the expression are read from left to right, and each step in the path creates a new context for the remaining parts of the path. The following example includes two steps. First, the Book element is selected and made the current context, then each Title element directly within the Book element is selected:

book/title

Both steps in this expression refer to child elements. However, it is important to emphasize that the '/' symbol itself does not denote a parent/child relationship. It merely serves to separate the steps, and is also used in many other circumstances. This distinction is much more obvious when the more verbose form of expression is used:

child::book/child::title
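
The way each step establishes the context for the next can be sketched with Python's standard xml.etree.ElementTree module. This is an illustration under assumptions: the Library wrapper element is hypothetical and stands in for the context node, and ElementTree accepts only the abbreviated syntax, so the 'child::' form is not used here.

```python
import xml.etree.ElementTree as ET

# Hypothetical document; <library> plays the role of the context element.
library = ET.fromstring(
    "<library>"
    "<book><title>Book One</title>"
    "<chapter><title>Chapter One</title></chapter></book>"
    "<book><title>Book Two</title></book>"
    "</library>"
)

# One expression with two steps.
direct = [t.text for t in library.findall("book/title")]

# The same selection performed one step at a time: every <book> found by
# the first step becomes the context for the second step.
stepwise = [t.text for b in library.findall("book") for t in b.findall("title")]

print(direct)
print(stepwise)
```

Both lists contain only the book titles. The chapter title is not a direct child of a Book element, so neither form selects it.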

Wildcard steps

Sometimes, the names of elements between the context element and the required descendant may not be known, but this does not need to be an obstacle. The '*' symbol can be used as a 'wildcard', standing in for any element name (like the joker in some card games).

This technique can be used to select elements in a number of different contexts simultaneously. For example, it may be necessary to select all chapter titles, and also the title of an introduction. The first expression below is equivalent to the combination of the two more explicit expressions that follow it:

book/*/title
book/intro/title
book/chapter/title

But this approach can be dangerous. The example above may inadvertently also select titles in other structures, such as an Appendix element that follows the Chapter elements. Used with care, though, it is a powerful technique. The unabbreviated version simply adds the asterisk symbol to 'child::':

child::*
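
The risk described above can be seen in a small experiment with Python's standard xml.etree.ElementTree module. The document content, the Library wrapper (a stand-in for the context node) and the helper function titles() are all assumptions made for this sketch.

```python
import xml.etree.ElementTree as ET

# Hypothetical document with an appendix following the chapters.
library = ET.fromstring(
    "<library><book>"
    "<title>Book Title</title>"
    "<intro><title>Introduction</title></intro>"
    "<chapter><title>Chapter One</title></chapter>"
    "<chapter><title>Chapter Two</title></chapter>"
    "<appendix><title>Appendix A</title></appendix>"
    "</book></library>"
)

def titles(path):
    """Return the text of every element selected by path (illustrative helper)."""
    return [t.text for t in library.findall(path)]

wildcard = titles("book/*/title")
explicit = titles("book/intro/title") + titles("book/chapter/title")

print(wildcard)
print(explicit)
```

The wildcard version also picks up the appendix title, which the two explicit expressions leave out. The book's own title is selected by neither, because it is a child of the Book element rather than a grandchild.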
					

Although multiple asterisks can be used to indicate unknown elements at several levels in the expression, it is necessary to know in advance exactly how many levels deep the required elements will be. This approach clearly does not work when the elements to be selected lie at different levels within the document structure. Instead, a more powerful feature allows selection of all descendants. In the abbreviated syntax, two slashes are used to indicate this intent, '//'. In the following example, paragraphs that occur anywhere within a chapter are selected:

chapter//para


   <chapter>
     <para>...</para>
     <note><para>...</para></note>
     <para>...</para>
   </chapter>
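
The difference between a child step, a fixed number of wildcard steps, and '//' can be compared side by side. The sketch below uses Python's standard xml.etree.ElementTree module; the paragraph text and the enclosing Book element are assumptions added so that the results can be told apart.

```python
import xml.etree.ElementTree as ET

# Hypothetical document mirroring the chapter fragment above.
book = ET.fromstring(
    "<book><chapter>"
    "<para>One</para>"
    "<note><para>Two</para></note>"
    "<para>Three</para>"
    "</chapter></book>"
)

def texts(path):
    """Return the text of every element selected by path (illustrative helper)."""
    return [p.text for p in book.findall(path)]

children = texts("chapter/para")       # exactly one level below the chapter
grandchildren = texts("chapter/*/para")  # exactly two levels below the chapter
descendants = texts("chapter//para")   # any depth below the chapter

print(children, grandchildren, descendants)
```

Only the '//' form gathers paragraphs from both levels in a single expression.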

It is also possible to use this technique at the beginning of an expression. However, for reasons that will become clear later, using '//' at the beginning would not produce the desired effect. Instead, it is first necessary to explicitly declare that the starting point is the current element. This is done using a full-point, '.'. The following example selects all paragraphs within the current element:

.//para
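
A minimal sketch of this expression in use, again assuming Python's standard xml.etree.ElementTree module and a made-up document. The first chapter is chosen as the current element, so paragraphs in the second chapter should not be selected.

```python
import xml.etree.ElementTree as ET

# Hypothetical two-chapter document.
book = ET.fromstring(
    "<book>"
    "<chapter><para>A</para><section><para>B</para></section></chapter>"
    "<chapter><para>C</para></chapter>"
    "</book>"
)

# Treat the first chapter as the current element.
current = book.find("chapter")

# './/para' selects paragraphs at any depth, but only inside the current element.
inside_current = [p.text for p in current.findall(".//para")]

print(inside_current)
```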

Parents and grandparents

The parent of the context element is represented by two full-point symbols, '..'. This mechanism is typically used to help select siblings of the current element. The following example selects a Title element that shares the same parent as the context element (here assumed to be a Para element):

../title
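
The parent step can be tried out with Python's standard xml.etree.ElementTree module, with one caveat that shapes this sketch: ElementTree elements hold no reference to their parent, so a search cannot climb above the element it starts from. The example therefore starts at the chapter, steps down to the paragraph, and then applies '../title' from there. The document content is an assumption.

```python
import xml.etree.ElementTree as ET

# Hypothetical chapter containing a title and a paragraph.
chapter = ET.fromstring(
    "<chapter>"
    "<title>Chapter Title</title>"
    "<para>First paragraph.</para>"
    "</chapter>"
)

# 'para' moves down to the paragraph, '..' moves back up to its parent,
# and 'title' then selects the paragraph's sibling.
sibling_titles = [t.text for t in chapter.findall("para/../title")]

print(sibling_titles)
```

In a full XPath implementation the paragraph itself could be the context, and '../title' alone would give the same result.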



To move up to the grandparent element, the '..' notation is simply repeated as a further step. The following expression selects the title of a chapter, when the current element is embedded within a section of the chapter:

../../title
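
The effect of repeating the parent step can be checked with Python's standard xml.etree.ElementTree module. As an assumption of this sketch, the search starts at the chapter and first steps down to the embedded paragraph, because ElementTree cannot climb above the element a search starts from. The document content is made up for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical chapter with a paragraph embedded in a section.
chapter = ET.fromstring(
    "<chapter>"
    "<title>Chapter Title</title>"
    "<section>"
    "<title>Section Title</title>"
    "<para>Embedded paragraph.</para>"
    "</section>"
    "</chapter>"
)

# One '..' step from the paragraph reaches the section.
one_up = [t.text for t in chapter.findall("section/para/../title")]

# Two '..' steps from the paragraph reach the chapter.
two_up = [t.text for t in chapter.findall("section/para/../../title")]

print(one_up)
print(two_up)
```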



This technique for accessing ancestors of the context element can be clumsy when many intermediate levels exist, and does not work at all when the number of levels to be traversed varies. A more advanced technique for accessing ancestors is described later.

Absolute paths

In some circumstances a relative path is not suitable. For example, it may be necessary to select the title of the book, irrespective of the current context. The location relative to the document as a whole may be known, whereas the offset from the current location (if there is a current location) may not. In this case, an absolute path is more appropriate. This is a path that begins at a fixed 'landmark'. Essentially, an absolute path is the same as a relative path, except for the first step, which identifies such a landmark.

One kind of absolute path begins with a '/' symbol, indicating that the landmark is the root of the document:

/book/title



Note that placing '//' at the beginning of an expression selects all occurrences of a specific element type within the document. This form is necessary when the expression is not being employed by an application that traverses the document structure and applies the expression to each element it finds:

//para
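
The two approaches can be compared in a short sketch using Python's standard xml.etree.ElementTree module. One assumption needs stating: ElementTree does not accept an absolute path on an element, so './/para' evaluated from the root element is used here as the nearest equivalent of '//para'. The document content is made up for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical document with paragraphs at two different depths.
book = ET.fromstring(
    "<book><chapter>"
    "<para>A</para>"
    "<note><para>B</para></note>"
    "<para>C</para>"
    "</chapter></book>"
)

# A traversing application: visit every element in turn and apply the
# relative expression 'para' with that element as the context.
visited = [p.text for elem in book.iter() for p in elem.findall("para")]

# A single query over the whole document.
searched = [p.text for p in book.findall(".//para")]

print(sorted(visited))
print(searched)
```

Both approaches reach the same paragraphs, although the traversal reports them in the order in which their parents are visited rather than in document order.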

The other kind of absolute path is one that begins from a specific 'anchor point' in the document. An element that has a unique identifier can be targeted using the 'id()' function:

id("para33")/...



Note that, for this to work, the identifier attribute must be declared in a DTD (or other modelling language) to be an identifier attribute.
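
Python's standard xml.etree.ElementTree module does not implement the 'id()' function, so the sketch below only approximates the idea with an attribute test. It assumes the identifier attribute is named 'id' and does not check any DTD declaration. The document content and the Emph element are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical document in which each paragraph carries an 'id' attribute.
book = ET.fromstring(
    "<book><chapter>"
    "<para id='para32'>Before.</para>"
    "<para id='para33'>Target with <emph>emphasis</emph>.</para>"
    "</chapter></book>"
)

# Approximation of id("para33"): find the element whose 'id' attribute matches.
anchor = book.find(".//*[@id='para33']")

# The remaining steps are then relative to the anchor element.
emphasised = [e.text for e in anchor.findall("emph")]

print(anchor.tag, emphasised)
```

Unlike a true 'id()' lookup, this attribute test would match any attribute called 'id', whether or not it is declared as an identifier.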
