You can also insert your own XML elements and attributes into the query results using XML constructors. There are two kinds of XML constructors: direct constructors, which use familiar XML-like syntax, and computed constructors, which allow you to dynamically generate the XML names used in the results.
A direct element constructor is a constructor of the first kind; it specifies an XML element (optionally with attributes) using XML-like syntax, as shown in Example 5-3. The result of the query is an XHTML fragment that presents the selected data.
Example 5-3. Constructing elements using XML-like syntax
Query
<html>
  <h1>Product Catalog</h1>
  <ul>{
    for $prod in doc("catalog.xml")/catalog/product
    return <li>number: {data($prod/number)}, name: {data($prod/name)}</li>
  }</ul>
</html>
Results
<html>
  <h1>Product Catalog</h1>
  <ul>
    <li>number: 557, name: Fleece Pullover</li>
    <li>number: 563, name: Floppy Sun Hat</li>
    <li>number: 443, name: Deluxe Travel Bag</li>
    <li>number: 784, name: Cotton Dress Shirt</li>
  </ul>
</html>
The h1, ul, and li elements appear in the results as XML elements. The h1 element constructor simply contains the literal characters Product Catalog, which appear in the results as the content of h1. The ul element constructor, on the other hand, contains another XQuery expression enclosed in curly braces. This is known as an enclosed expression, and its value becomes the content of the ul element in the results. In this case, the enclosed expression evaluates to a sequence of li elements, which then appear as children of the ul element in the results. An enclosed expression may also evaluate to one or more atomic values, which appear in the results as character data.
The li element constructor contains a combination of both literal characters (the strings "number: " and ", name: ") and enclosed expressions, each of which evaluates to an atomic value. Any element constructor content outside curly braces is considered a literal, no matter how much it looks like an expression.
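To make the last point concrete, here is a minimal sketch (not taken from the catalog examples) in which the same characters appear both outside and inside curly braces. Only the part inside the braces is evaluated.

```xquery
(: "1 + 1 = " is outside the braces, so it is copied literally;
   {1 + 1} is an enclosed expression, so it is evaluated. :)
<li>1 + 1 = {1 + 1}</li>
```

The result is <li>1 + 1 = 2</li>.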
Direct element constructors use a syntax that looks very much like XML. The tags use the same angle-bracket syntax, the names must be valid XML names, and every start tag must have a matching end tag that is properly nested. In addition, prefixed names can be used, and even namespace declarations included. As with regular XML, the attributes of a direct element constructor must have unique names. But there are a few differences from real XML. For example, expressions within curly braces can use the < operator without escaping it.
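As a small illustration of that last difference, the following sketch uses an unescaped < operator inside an enclosed expression. The comparison and the strings are made up for the illustration.

```xquery
(: Inside the braces, < is the XQuery comparison operator
   and does not need to be written as &lt;. :)
<li>{ if (3 < 5) then "3 is smaller" else "3 is not smaller" }</li>
```

This returns <li>3 is smaller</li>.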
As shown in Example 5-3, element constructors can contain literal characters, other element constructors, and enclosed expressions, in any combination.
Literal characters are characters that appear outside of enclosed expressions in element constructor content. Literal characters from Example 5-3 include the string Product Catalog in the h1 element constructor, and the string ", name: " in the li element constructor.
In addition, the literal characters can include character and predefined entity references such as &#x20; and &lt;, as well as CDATA sections (described in Chapter 21). As in XML content, the literal characters may not include unescaped less-than (<) or ampersand (&) characters; they must be escaped using &lt; and &amp;, respectively.
When a curly brace is to be included literally in the content of an element, it must be escaped by doubling it, that is, {{ for the left curly brace, or }} for the right.
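The escaping rules above can be seen together in one constructor. This is only a sketch; the text content is invented for the illustration.

```xquery
(: {{ and }} produce literal braces, &lt; and &amp; produce
   the characters < and &, and {1 + 1} is still evaluated. :)
<p>{{x}} is literal, {1 + 1} is evaluated, a &lt; b, R&amp;D</p>
```

The constructed p element has the content "{x} is literal, 2 is evaluated, a < b, R&D". When serialized, the < and & characters are written as &lt; and &amp; again.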
Direct element constructors can also contain other direct element constructors. In Example 5-4, the html element constructor contains constructors for h1 and ul. They are included directly within the content of html, without curly braces. No special separator is used between them. The p element constructor contains a combination of character data content, a direct element constructor (for the element i), and an enclosed expression. As you can see, these three things can be intermingled as necessary.
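A constructor of the kind just described might look like the following sketch. It is an illustration written for this passage, not a reproduction of Example 5-4; the list item and the wording of the paragraph are assumptions.

```xquery
(: h1, ul, and p are nested directly inside html, with no braces
   and no separators. The p constructor mixes character data,
   a nested constructor (i), and an enclosed expression. :)
<html>
  <h1>Product Catalog</h1>
  <ul>
    <li>Fleece Pullover</li>
  </ul>
  <p>A <i>huge</i> list of {count(doc("catalog.xml")/catalog/product)} products.</p>
</html>
```

With the four-product catalog used in the other examples, the p element would read "A huge list of 4 products." with the word "huge" inside an i element.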
In Example 5-3, the enclosed expression of the ul element evaluates to a sequence of elements. In fact, it is possible for the enclosed expression to evaluate to a sequence of attributes or other nodes, atomic values, or even a combination of nodes and atomic values. It can even evaluate to a document node, in which case that document node is replaced by its children.
As you have seen with the li elements, elements in the sequence become children of the element being constructed (in this case, ul). Atomic values, on the other hand, become character data content. If the enclosed expression evaluates to a sequence of both elements and atomic values, as shown in Example 5-5, the result element has mixed content, with the order of the child elements and character data preserved.
Example 5-5. Enclosed expressions that evaluate to elements
Query
for $prod in doc("catalog.xml")/catalog/product
return <li>number: {$prod/number}</li>
Results
<li>number: <number>557</number></li>
<li>number: <number>563</number></li>
<li>number: <number>443</number></li>
<li>number: <number>784</number></li>
The prior examples used the data function in enclosed expressions to extract the values of the elements number and name. In this example, the number element is included without applying the data function. The results are somewhat different; instead of just the number value itself, the entire number element is included.
If an element constructor contains an enclosed expression that evaluates to one or more attributes, these attributes become attributes of the element under construction. This is exhibited in Example 5-6, where the enclosed expression {$prod/@dept} has been added at the beginning of the li constructor content.
Example 5-6. Enclosed expressions that evaluate to attributes
Query
for $prod in doc("catalog.xml")/catalog/product
return <li>{$prod/@dept}
number: {$prod/number}</li>
Results
<li dept="WMN">number: <number>557</number></li>
<li dept="ACC">number: <number>563</number></li>
<li dept="ACC">number: <number>443</number></li>
<li dept="MEN">number: <number>784</number></li>
The dept attribute appears in the results as an attribute of the li element rather than as content of the element. If the example had used the data function within the enclosed expression, the value of the dept attribute would have been the first character data content of the li element.
Enclosed expressions that evaluate to attributes must appear first in the element constructor content, before any other kinds of nodes.
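The ordering rule can be tried with a self-contained sketch like the one below. The inline $prod element is a stand-in for a product from the catalog, and the failing variant is left commented out.

```xquery
(: Stand-in for one product element from catalog.xml :)
let $prod := <product dept="ACC"><number>563</number></product>
return
  (: Valid: the attribute comes before any other content :)
  <li>{$prod/@dept}number: {data($prod/number)}</li>

(: Invalid: swapping the order so that the text "number: " precedes
   {$prod/@dept} raises type error XQTY0024, because the attribute
   node would follow a non-attribute node in the content. :)
```

The valid form returns <li dept="ACC">number: 563</li>.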
If an enclosed expression evaluates to one or more atomic values, those values are simply cast to xs:string and included as character data content of the element. When adjacent atomic values appear in the expression sequence, they are separated by a space in the element content. For example:
<li>{"x", "y", "z"}</li>
will return <li>x y z</li>, with spaces. To avoid this, you can use three separate expressions, as in:
<li>{"x"}{"y"}{"z"}</li>
Another option is to use the concat function to concatenate them together into a single expression, as in:
<li>{concat("x", "y", "z")}</li>
Enclosed expressions may include more than one subexpression inside the curly braces, using commas as separators. In Example 5-7, the enclosed expression in the li constructor contains four different subexpressions, separated by commas.
Example 5-7. Enclosed expressions with multiple subexpressions
Query
for $prod in doc("catalog.xml")/catalog/product
return <li>{$prod/@dept,"string",5+3,$prod/number}</li>
Results
<li dept="WMN">string 8<number>557</number></li>
<li dept="ACC">string 8<number>563</number></li>
<li dept="ACC">string 8<number>443</number></li>
<li dept="MEN">string 8<number>784</number></li>
The first subexpression, $prod/@dept, evaluates to an attribute, and therefore becomes an attribute of li.
The next two subexpressions, "string" and 5+3, evaluate to atomic values: a string and an integer, respectively. Note that they are separated by a space in the results.
The final subexpression, $prod/number, is an element, which is not separated from the atomic values by a space.
You have seen how attributes can be included with the result elements by including enclosed expressions that evaluate to attributes. Attributes can also be constructed directly using XML-like syntax. Attribute values can be specified using literal text or enclosed expressions, or a combination of the two.
In Example 5-8, class and dep attributes are added to the h1 and li elements, respectively. The class attribute of h1 simply includes literal text that is repeated in the results. The dep attribute of li, on the other hand, includes an enclosed expression that evaluates to the value of the dept attribute of that item. Do not let the quotes around the expression fool you; anything in curly braces is evaluated as an enclosed expression.
Example 5-8. Specifying attributes directly using XML-like syntax
Query
<html>
  <h1 class="itemHdr">Product Catalog</h1>
  <ul>{
    for $prod in doc("catalog.xml")/catalog/product
    return <li dep="{$prod/@dept}">number: {data($prod/number)}, name: {data($prod/name)}</li>
  }</ul>
</html>
Results
<html>
  <h1 class="itemHdr">Product Catalog</h1>
  <ul>
    <li dep="WMN">number: 557, name: Fleece Pullover</li>
    <li dep="ACC">number: 563, name: Floppy Sun Hat</li>
    <li dep="ACC">number: 443, name: Deluxe Travel Bag</li>
    <li dep="MEN">number: 784, name: Cotton Dress Shirt</li>
  </ul>
</html>
Note that the dep attribute will appear regardless of whether there is a dept attribute of the $prod element. If the $prod element has no dept attribute, the dep attribute's value will be a zero-length string. This is in contrast to Example 5-7, where li will have a dept attribute only if $prod has a dept attribute.
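The difference can be seen side by side in a small sketch. The inline $prod element, which deliberately has no dept attribute, is hypothetical and stands in for a catalog product.

```xquery
(: Hypothetical product with no dept attribute :)
let $prod := <product><number>784</number></product>
return (
  (: Direct attribute constructor: dep is always created :)
  <li dep="{$prod/@dept}">direct attribute</li>,
  (: Enclosed expression in content: nothing is added when
     $prod/@dept is the empty sequence :)
  <li>{$prod/@dept}enclosed expression</li>
)
```

The first constructor yields <li dep="">direct attribute</li>, while the second yields <li>enclosed expression</li> with no attribute at all.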
If literal text is used in a direct attribute constructor, it follows similar rules to the literal text in element constructors. Also, as with XML syntax, quote characters in attribute values must be escaped if they match the kind of quotes (single or double) used to delimit that value. However, you don't need to escape quotes appearing in an expression inside curly braces. The following example is valid because the inner pair of double quotes is inside curly braces:
<li dep="{substring-after($prod/@dept, "-")}"/>
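For quotes in the literal part of an attribute value, there are several ways to write the same value. The sketch below uses a made-up title attribute to show three of them.

```xquery
(
  (: Predefined entity reference for the double quote :)
  <li title="5&quot; by 7&quot;"/>,
  (: Doubling the delimiter character escapes it :)
  <li title="5"" by 7"""/>,
  (: Switching to single-quote delimiters avoids escaping :)
  <li title='5" by 7"'/>
)
```

All three constructors produce a title attribute whose value is 5" by 7".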
The evaluation of enclosed expressions in attribute values is slightly different from those in element content. Because attributes cannot themselves have children or attributes, the attribute value must evaluate to an atomic value. Therefore, if an enclosed expression in an attribute value evaluates to one or more elements or attributes, the value of the node(s) is extracted and converted to a string.
In Example 5-8, the enclosed expression {$prod/@dept} for the dep attribute of li evaluates to an attribute. The processor did not attempt to add a dept attribute to the dep attribute (which would not make sense). Instead, it extracted the value of the dept attribute and used this as the value of the dep attribute.
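The same extraction happens when the enclosed expression returns several nodes. In the following sketch, the info attribute name and the inline $prod element are assumptions made for the illustration.

```xquery
(: Stand-in for one product element from catalog.xml :)
let $prod :=
  <product dept="ACC"><number>563</number><name>Floppy Sun Hat</name></product>
return
  (: The expression returns an attribute node and an element node.
     Their values are extracted and joined with a single space. :)
  <li info="{$prod/@dept, $prod/number}">{data($prod/name)}</li>
```

This returns <li info="ACC 563">Floppy Sun Hat</li>.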
Just as in XML, you can specify multiple attributes on an element, as long as they have unique names. The order of the attributes is never considered significant in XML, so your attributes might not appear in your result document in the same order as you specified them in the query. There is no way to force the processor to preserve attribute order.
In addition to regular attributes, you can also include namespace declarations in direct element constructors. These namespace declaration attributes affect the element itself and all its descendants, and override any namespace declarations in the prolog or in outer element constructors. Example 5-9 shows the use of a namespace declaration in an element constructor. This is discussed in detail in the section "Namespace Declarations in Element Constructors" in Chapter 10.
Example 5-9. Using a namespace declaration in a constructor
Query
<xhtml:html xmlns:xhtml="http://www.w3.org/1999/xhtml">
<xhtml:h1 class="itemHdr">Product Catalog</xhtml:h1>
<xhtml:ul>{
for $prod in doc("catalog.xml")/catalog/product
return <xhtml:li class="{$prod/@dept}">number: {
data($prod/number)}</xhtml:li>
}</xhtml:ul>
</xhtml:html>
Results
<xhtml:html xmlns:xhtml="http://www.w3.org/1999/xhtml">
<xhtml:h1 class="itemHdr">Product Catalog</xhtml:h1>
<xhtml:ul>
<xhtml:li class="WMN">number: 557</xhtml:li>
<xhtml:li class="ACC">number: 563</xhtml:li>
<xhtml:li class="ACC">number: 443</xhtml:li>
<xhtml:li class="MEN">number: 784</xhtml:li>
</xhtml:ul>
</xhtml:html>
Suppose you want to include elements from the input document but want to make minor modifications such as adding or removing a child or attribute. To do this, a new element must be created using a constructor. For example, suppose you want to include product elements from the input document, but add an additional attribute id that is equal to the letter P plus the product number. The query shown in Example 5-10 accomplishes this.
Example 5-10. Adding an attribute to an element
Query
for $prod in doc("catalog.xml")/catalog/product[@dept = 'ACC']
return <product id="P{$prod/number}">
  {$prod/(@*, *)}
</product>
Results
<product dept="ACC" id="P563">
  <number>563</number>
  <name language="en">Floppy Sun Hat</name>
</product>
<product dept="ACC" id="P443">
  <number>443</number>
  <name language="en">Deluxe Travel Bag</name>
</product>
The query makes a new copy of the product element, which contains the enclosed expression {$prod/(@*, *)} to copy all of the attributes and child elements from the original product element. You could also use the broader expression {$prod/(@*, node())} to copy all the child nodes of the element, including text, comments, and processing instructions.
As another example, suppose you want to copy some product elements from the input document but remove the number child. This can be accomplished using the query in Example 5-11. The enclosed expression $prod/(@*, * except number) selects all the attributes and all of the child elements of product except number.
Example 5-11. Removing a child from an element
Query
for $prod in doc("catalog.xml")/catalog/product[@dept = 'ACC']
return <product>
  {$prod/(@*, * except number)}
</product>
Results
<product dept="ACC">
  <name language="en">Floppy Sun Hat</name>
</product>
<product dept="ACC">
  <name language="en">Deluxe Travel Bag</name>
</product>
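Adding a child follows the same pattern as adding an attribute. The sketch below copies the existing attributes and children and then appends a new element; the inStock element and its value are hypothetical and do not come from the catalog.

```xquery
for $prod in doc("catalog.xml")/catalog/product[@dept = 'ACC']
return
  (: Copy everything from the original, then add one new child.
     The copied attributes must come first in the content. :)
  <product>{$prod/(@*, *)}<inStock>true</inStock></product>
```

Each resulting product element would carry its original dept attribute and its number and name children, followed by the added inStock element.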
Additional examples of making "modifications" to elements and attributes can be found in the section "Copying Input Elements with Modifications" in Chapter 9.
Whitespace is often used in direct element constructors. For example, you may use line breaks and tabs to indent result XML elements for readability, or spaces to separate enclosed expressions. Sometimes the query author intends for whitespace to be significant (included in the results); sometimes it is just used for formatting the query for visual presentation.
Boundary whitespace is whitespace that occurs by itself (without any nonwhitespace characters) in direct element constructors. It may appear between two element constructor tags, between two enclosed expressions, or between a tag and an enclosed expression. It can be made up of any of the XML whitespace characters, namely space, tab, carriage return, and line feed.
For example, in the constructor shown in Example 5-12, there is boundary whitespace in the ul constructor between the ul start tag and the left curly brace, as well as between the right curly brace and the ul end tag. In the li constructor, there is boundary whitespace between the li start tag and the b start tag, between the b end tag and the left curly brace, and between the right curly brace and the li end tag.
Example 5-12. Constructor with boundary whitespace
<ul> { <li> <b> number:</b> { $prod/number } </li> } </ul>
With boundary whitespace discarded,[*] the results look something like:
<ul><li><b> number:</b><number>557</number></li></ul>
Note that the whitespace before the text number: is not discarded because it appears with other characters.
Whitespace inside enclosed expressions that is not in quotes is never considered significant. It is simply the normal whitespace allowed by XQuery syntax. In the ul constructor, the spaces between the left curly brace and the li start tag fall into this category. It is not technically considered boundary whitespace, and it is always discarded.
There is no boundary whitespace in attribute values. For example, in the expression:
<product dept=" {$d} "/>
the whitespace between the quotes and the enclosed expression is considered significant and therefore is preserved. The expression:
<product dept="{ $d }"/>
has no boundary whitespace either, only whitespace in an enclosed expression. This whitespace is not preserved. Line breaks are never preserved in attribute values; they are converted to spaces. This is a standard feature of XML itself, known as attribute value normalization.
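Both attribute cases can be checked with a short self-contained sketch. The value bound to $d is an assumption chosen for the illustration.

```xquery
let $d := "ACC"
return (
  (: Spaces outside the braces are literal attribute content :)
  <product dept=" {$d} "/>,
  (: Spaces inside the braces are ordinary XQuery whitespace :)
  <product dept="{ $d }"/>
)
```

The first constructor produces dept=" ACC " with the surrounding spaces kept, and the second produces dept="ACC".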
By default, a query processor discards all boundary whitespace. Sometimes you want to preserve the boundary whitespace in your query results because it is significant. The boundary-space declaration, specified in the query prolog, instructs the processor how to handle boundary whitespace in direct element constructors.[†] Its syntax is shown in Figure 5-1.
The two valid values are:
preserve
This value results in boundary whitespace being preserved.
strip
This value results in boundary whitespace being deleted.
The default is strip. For example, the boundary-space declaration:
declare boundary-space preserve;
causes whitespace to be preserved. With this boundary-space declaration, the result of the constructor in Example 5-12 becomes:
<ul> <li> <b> number:</b> <number>557</number> </li> </ul>
Table 5-1 shows some additional examples of results with and without preserved whitespace.
Table 5-1. Stripping boundary whitespace
Expression | Value with boundary whitespace preserved | Value with boundary whitespace stripped
---|---|---
If you don't want to preserve all whitespace but wish to preserve it in one or more specific elements, you can do this in one of two ways. The first way is to include an enclosed expression that evaluates to whitespace. For example, <e>{" x "}</e> evaluates to <e> x </e>, regardless of the boundary-space declaration. This is because the whitespace is part of the value of the expression (the literal string).
Another method is to use a character reference to a whitespace character. Whitespace that is the result of a character reference is always considered significant. For example, <e>&#x20;{"x"}</e> always evaluates to <e> x</e>. Character references are described further in the section "XML Entity and Character References" in Chapter 21.
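The two techniques can be compared with plain boundary whitespace in a single query. This is a sketch; the strip setting is declared explicitly here only to make the default visible.

```xquery
declare boundary-space strip;

(
  (: Plain boundary whitespace around the braces: stripped :)
  <e> {"x"} </e>,
  (: Whitespace inside the string literal: part of the value :)
  <e>{" x "}</e>,
  (: Whitespace from a character reference: always significant :)
  <e>&#x20;{"x"}</e>
)
```

The three constructors return <e>x</e>, <e> x </e>, and <e> x</e>, respectively.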