Text nodes represent the character data content within elements. Every adjacent string of characters within element content makes up a single text node. Text nodes can be both queried and constructed in XQuery, although these expressions have limited usefulness.
A text node does not have any children, and its parent is an element. In Example 21-8, the desc element has three children:
A text node whose content is Our (ending with a space)
A child element i
A text node whose content is shirt! (starting with a space)
The i element itself has one child: a text node whose content is favorite.
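The structure described above can be inspected with a short query. This is a minimal sketch that assumes the desc element from Example 21-8 looks like the inline element constructed here; the variable name $desc and the label strings are illustrative only.

```xquery
(: Assumed stand-in for the desc element of Example 21-8 :)
let $desc := <desc>Our <i>favorite</i> shirt!</desc>
for $child in $desc/node()
return
  if ($child instance of text())
  then concat("text node: [", $child, "]")
  else concat("element node: ", name($child))
(: yields three strings:
   "text node: [Our ]", "element node: i", "text node: [ shirt!]" :)
```

The brackets in the output make the trailing and leading spaces of the two text nodes visible.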
The string value of a text node is its content, as an instance of xs:string. Its typed value is the same as the string value, except that it is of type xs:untypedAtomic rather than xs:string.
Text nodes do not have names, so calling any of the name-related functions with a text node will result in the empty sequence or a zero-length string, depending on the function.
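As a small illustration of the string value, typed value, and name-related behavior just described, the following sketch builds a standalone text node and tests it. The content "shirt" and the variable name $t are arbitrary choices for the example.

```xquery
let $t := text { "shirt" }
return (
  string($t) instance of xs:string,        (: true: string value is an xs:string :)
  data($t) instance of xs:untypedAtomic,   (: true: typed value is untyped :)
  empty(node-name($t)),                    (: true: node-name returns the empty sequence :)
  name($t) eq "",                          (: true: name returns a zero-length string :)
  local-name($t) eq ""                     (: true: so does local-name :)
)
```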
If your document has no DTD or schema, any whitespace appearing between the tags in your source XML will be translated into text nodes. This is true even if it is just there to indent the document. For example, the following b:header element node:
<b:header> <b:date>2006-10-15</b:date> </b:header>
has three children. The first and third children are text nodes that contain only whitespace, and the second child is, of course, the b:date element node. If a DTD or schema is used, and the element's type allows only child elements (no character data content), then the whitespace will be discarded and b:header will not have text node children.
In the data model, there are never two adjacent text nodes with the same parent; all adjacent text is merged into a single text node. This means that if you construct a new element using:
<example>{1}{2}{3}</example>
the resulting example element will have only one text node child, whose value is 123. There is also no such thing as an empty text node, so the element constructor:
<example>{""}</example>
will result in an element with no children at all.
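Both rules can be checked directly. The sketch below reuses the two constructors from the text; the variable names $merged and $empty are introduced only for this illustration.

```xquery
let $merged := <example>{1}{2}{3}</example>
let $empty := <example>{""}</example>
return (
  count($merged/text()),   (: 1: the three values end up in a single text node :)
  string($merged),         (: "123" :)
  count($empty/node())     (: 0: no empty text node is created :)
)
```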
Text nodes can be queried using path expressions. The text( ) kind test can be used to specifically ask for text nodes. For example:
doc("desc.xml")//text( )
will return all three text nodes in the document, while:
doc("desc.xml")/desc/text( )
will return only the two text nodes that are children of desc.
The node( ) kind test will return text nodes as well as all other node kinds. For example:
doc("desc.xml")/desc/node( )
will return a sequence consisting of the first text node, the i element node, and the second text node. This is in contrast to *, which selects child element nodes only.
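The differences among these tests are easy to see by counting what each one selects. This sketch assumes the same desc content as Example 21-8 but builds it inline instead of reading desc.xml, so the paths start at the $desc element itself instead of at a document node.

```xquery
(: Assumed stand-in for the desc element of Example 21-8 :)
let $desc := <desc>Our <i>favorite</i> shirt!</desc>
return (
  count($desc//text()),   (: 3: both children of desc plus the text inside i :)
  count($desc/text()),    (: 2: only the text node children of desc :)
  count($desc/node()),    (: 3: text node, i element, text node :)
  count($desc/*)          (: 1: only the i element :)
)
```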
The text( ) keyword can also be used in sequence types to match text nodes. For example, to display the content of a text node as a string, you could use the function shown in Example 21-9. The use of the text( ) sequence type in the function signature ensures that only text nodes are passed to this function.
Example 21-9. Function that displays text nodes
declare function local:displayTextNodeContent
  ($textNode as text( )) as xs:string {
  concat("Content of the text node is ", $textNode)
};
A text node will also match the node( ) and item( ) sequence types.
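A possible call to the function from Example 21-9 is sketched below. The function is repeated so that the query is self-contained, and the text node content "abc" is an arbitrary value chosen for the illustration.

```xquery
declare function local:displayTextNodeContent
  ($textNode as text()) as xs:string {
  concat("Content of the text node is ", $textNode)
};

(: A text node matches the declared parameter type :)
local:displayTextNodeContent(text { "abc" })
(: returns "Content of the text node is abc" :)
```

Passing something that is not a text node, such as an element or a string, does not match the text( ) sequence type and raises a type error instead of returning a result.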
Because text nodes contain all the data content of elements, it may seem that the text( ) kind test would be used frequently and would be covered earlier in this book. However, because of atomization and casting, it is often unnecessary to ask explicitly for the text nodes. For example, the expression:
doc("catalog.xml")//product[name/text( )="Floppy Sun Hat"]
has basically the same effect as:
doc("catalog.xml")//product[name="Floppy Sun Hat"]
because the name element is atomized before being compared to the string Floppy Sun Hat. Likewise, the expression:
distinct-values(doc("catalog.xml")//product/number/text( ))
is very similar to:
distinct-values(doc("catalog.xml")//product/number)
because the function conversion rules call for atomization of the number elements.
One difference is that text nodes, when atomized, result in untyped values, while element nodes will take on the type specified in the schema. Therefore, if your number element is of type xs:integer, the second distinct-values expression above will compare the numbers as integers. The first expression will compare them as untyped values, which, according to the rules of the distinct-values function, means that they are treated like strings.
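The effect of this difference can be sketched without a schema by casting values explicitly. The values "0557" and "557" are hypothetical; the explicit casts stand in for what atomizing a text node (untyped) and a schema-validated xs:integer element (typed) would produce.

```xquery
let $untyped := (xs:untypedAtomic("0557"), xs:untypedAtomic("557"))
let $typed   := (xs:integer("0557"), xs:integer("557"))
return (
  count(distinct-values($untyped)),  (: 2: compared as strings, so the values differ :)
  count(distinct-values($typed))     (: 1: compared as integers, so the values are equal :)
)
```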
Not only is it almost always unnecessary to use the node test text( ), it sometimes yields surprising results. For example, the expression:
doc("catalog.xml")//product[4]/desc/text( )
returns two text nodes, Our (ending with a space) and shirt! (starting with a space), and omits the word favorite, because only the text nodes that are direct children of the desc element are included. If /text( ) is left out of the expression, the string value of the selected desc element is Our favorite shirt!.
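The sketch below makes the difference concrete. It assumes the desc element has the content shown in Example 21-8 and builds it inline instead of reading it from catalog.xml.

```xquery
(: Assumed stand-in for the desc element of the fourth product :)
let $desc := <desc>Our <i>favorite</i> shirt!</desc>
return (
  string-join($desc/text(), ""),  (: "Our  shirt!": two spaces, and favorite is missing :)
  string($desc)                   (: "Our favorite shirt!" :)
)
```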
There are some cases where the text( ) sequence type does come in handy, though. One case is when you are working with mixed content and want to work with each text node specifically. For example, suppose you wanted to modify the product catalog to change all the i elements to em elements (without knowing in advance where i elements appear). You could use the recursive function shown in Example 21-10.
Example 21-10. Testing for text nodes
declare function local:change-i-to-em
  ($node as element()) as node( ) {
  element {node-name($node)} {
    $node/@*,
    for $child in $node/node( )
    return if ($child instance of text( ))
           then $child
           else if ($child instance of element(i))
           then <em>{$child/@*,$child/node( )}</em>
           else if ($child instance of element( ))
           then local:change-i-to-em($child)
           else ( )
  }
};
The function checks all the children of an element node. If it encounters a text node, it copies it as is. If it encounters an element child, it recursively calls itself to process that child element's children. When it encounters an i element, it constructs an em element and includes the original children of i.
It is important in this case to test for text nodes because the desc element has mixed content; it contains both text nodes and child element nodes. If you throw away the text nodes, it changes the content of the document.
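To see the function from Example 21-10 at work, it can be applied to a small piece of mixed content. The function is repeated here so that the query is self-contained, and the inline desc element is an assumed stand-in for the one in the catalog.

```xquery
declare function local:change-i-to-em
  ($node as element()) as node() {
  element {node-name($node)} {
    $node/@*,
    for $child in $node/node()
    return if ($child instance of text())
           then $child                                    (: keep text as is :)
           else if ($child instance of element(i))
           then <em>{$child/@*, $child/node()}</em>       (: rename i to em :)
           else if ($child instance of element())
           then local:change-i-to-em($child)              (: recurse into other elements :)
           else ()
  }
};

local:change-i-to-em(<desc>Our <i>favorite</i> shirt!</desc>)
(: returns <desc>Our <em>favorite</em> shirt!</desc> :)
```

Without the first branch, which copies text nodes, the result would contain only the em element and the words Our and shirt! would be lost.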
You can also construct text nodes, using a text node constructor. The syntax of a text node constructor, shown in Figure 21-4, consists of an expression enclosed by text{ and }. For example, the expression:
text{concat("Sequence number: ", $seq)}
will construct a text node whose content is Sequence number: 1, assuming that $seq is bound to the value 1.
The value of the expression used in the constructor is atomized (if necessary) and cast to xs:string
. Text node constructors have limited usefulness in XQuery because they are created automatically in element constructors using literal text or expressions that return atomic values. For example, the expression:
<example>{concat("Sequence number: ", $seq)}</example>
will automatically create a text node as a child of the example element node. No explicit text node constructor is needed.
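The two approaches can be compared side by side. In this sketch, $seq is bound to 1 as an assumption, and the variable names $explicit and $implicit are introduced only for the comparison.

```xquery
let $seq := 1
let $explicit := text { concat("Sequence number: ", $seq) }
let $implicit := <example>{ concat("Sequence number: ", $seq) }</example>
return (
  $explicit instance of text(),            (: true: built by the text node constructor :)
  $implicit/node() instance of text(),     (: true: created automatically by the element constructor :)
  string($explicit) eq string($implicit)   (: true: both contain "Sequence number: 1" :)
)
```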