Direct Element Constructors

You can also insert your own XML elements and attributes into the query results using XML constructors. There are two kinds of XML constructors: direct constructors, which use familiar XML-like syntax, and computed constructors, which allow you to generate the XML names used in the results dynamically.

A direct element constructor is a constructor of the first kind; it specifies an XML element (optionally with attributes) using XML-like syntax, as shown in Example 5-3. The result of the query is an XHTML fragment that presents the selected data.

Example 5-3. Constructing elements using XML-like syntax

Query
<html>
  <h1>Product Catalog</h1>
  <ul>{
    for $prod in doc("catalog.xml")/catalog/product
    return <li>number: {data($prod/number)}, name: {data($prod/name)}</li>
  }</ul>
</html>
Results
<html>
  <h1>Product Catalog</h1>
  <ul>
    <li>number: 557, name: Fleece Pullover</li>
    <li>number: 563, name: Floppy Sun Hat</li>
    <li>number: 443, name: Deluxe Travel Bag</li>
    <li>number: 784, name: Cotton Dress Shirt</li>
  </ul>
</html>

The h1, ul, and li elements appear in the results as XML elements. The h1 element constructor simply contains the literal characters Product Catalog, which appear in the results as the content of h1. The ul element constructor, on the other hand, contains another XQuery expression enclosed in curly braces. This is known as an enclosed expression, and its value becomes the content of the ul element in the results. In this case, the enclosed expression evaluates to a sequence of li elements, which then appear as children of the ul element in the results. An enclosed expression may also evaluate to one or more atomic values, which appear in the results as character data.

The li element constructor contains a combination of both literal characters (the strings number: and , name:) and enclosed expressions, each of which evaluates to an atomic value. Any element constructor content outside curly braces is considered a literal, no matter how much it looks like an expression.
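To make that last point concrete, here is a minimal sketch (not one of the numbered examples) in which the same characters appear once outside and once inside curly braces:

```xquery
(: Text outside the braces is copied literally; text inside is evaluated. :)
<li>1 + 2 = {1 + 2}</li>
```

This should return <li>1 + 2 = 3</li>. The first 1 + 2 is ordinary character data, while the second is an addition whose result, 3, becomes part of the element content.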

Direct element constructors use a syntax that looks very much like XML. The tags use the same angle-bracket syntax, the names must be valid XML names, and every start tag must have a matching end tag that is properly nested. In addition, prefixed names can be used, and namespace declarations can even be included. As with regular XML, the attributes of a direct element constructor must have unique names. But there are a few differences from real XML. For example, expressions within curly braces can use the < operator without escaping it.
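As a small illustration of that difference, the following sketch uses an unescaped less-than comparison inside an enclosed expression. The values compared are arbitrary.

```xquery
(: Inside the braces, < is the XQuery comparison operator, not markup. :)
<p>{if (3 < 5) then "less" else "not less"}</p>
```

This should return <p>less</p>. Outside the braces, the same character would have to be written as &lt;.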

As shown in Example 5-3, element constructors can contain literal characters, other element constructors, and enclosed expressions, in any combination.

Containing Literal Characters

Literal characters are characters that appear outside of enclosed expressions in element constructor content. Literal characters from Example 5-3 include the string Product Catalog in the h1 element constructor, and the string , name: in the li element constructor.

Literal characters can also include character references such as &#x20;, predefined entity references such as &lt;, and CDATA sections (described in Chapter 21). As in XML content, the literal characters may not include unescaped less-than (<) or ampersand (&) characters; they must be escaped using &lt; and &amp;, respectively.

When a curly brace is to be included literally in the content of an element, it must be escaped by doubling it, that is, {{ for the left curly brace, or }} for the right.
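The escaping rules above can be sketched in a single constructor. The element name pre and the text inside it are made up for illustration; the string function is used only to show the resulting content without serialization escaping.

```xquery
(: &lt; and &amp; escape markup characters; {{ and }} escape curly braces. :)
string(<pre>if (a &lt; b) {{ return a &amp; b; }}</pre>)
```

The result should be the string if (a < b) { return a & b; }, with each doubled brace reduced to a single brace and each entity reference replaced by the character it stands for.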

Containing Other Element Constructors

Direct element constructors can also contain other direct element constructors. In Example 5-4, the html element constructor contains constructors for h1 and ul. They are included directly within the content of html, without curly braces. No special separator is used between them. The p element constructor contains a combination of character data content, a direct element constructor (for the element i), and an enclosed expression. As you can see, these three things can be intermingled as necessary.

Example 5-4. Embedded direct element constructors

Query
<html>
  <h1>Product Catalog</h1>
  <p>A <i>huge</i> list of {count(doc("catalog.xml")//product)} products.</p>
</html>
Results
<html>
  <h1>Product Catalog</h1>
  <p>A <i>huge</i> list of 4 products.</p>
</html>
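Nesting works in both directions: an enclosed expression inside a direct constructor can itself contain direct constructors. The sketch below is a variation on Example 5-4 that uses a hardcoded count (an assumption, so that it runs without catalog.xml) to choose between two inline elements.

```xquery
let $count := 4
return
  <p>A {if ($count > 3) then <b>long</b> else <i>short</i>} list of {$count} products.</p>
```

With $count bound to 4, this should return <p>A <b>long</b> list of 4 products.</p>.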

Containing Enclosed Expressions

In Example 5-3, the enclosed expression of the ul element evaluates to a sequence of elements. In fact, it is possible for the enclosed expression to evaluate to a sequence of attributes or other nodes, atomic values, or even a combination of nodes and atomic values. It can even evaluate to a document node, in which case that document node is replaced by its children.
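The document node case is easy to try in isolation. This sketch builds a throwaway document node with a computed document constructor (used here only to create sample data; the element names are hypothetical) and then includes it in an element constructor.

```xquery
(: A document node with two element children, built only for illustration. :)
let $doc := document { <a/>, <b/> }
return <wrap>{$doc}</wrap>
```

The result should be <wrap><a/><b/></wrap>: the document node itself disappears, and its children become children of wrap.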

Enclosed expressions that evaluate to elements

As you have seen with the li elements, elements in the sequence become children of the element being constructed (in this case, ul). Atomic values, on the other hand, become character data content. If the enclosed expression evaluates to a sequence of both elements and atomic values, as shown in Example 5-5, the result element has mixed content, with the order of the child elements and character data preserved.

Example 5-5. Enclosed expressions that evaluate to elements

Query
for $prod in doc("catalog.xml")/catalog/product
return <li>number: {$prod/number}</li>
Results
<li>number: <number>557</number></li>
<li>number: <number>563</number></li>
<li>number: <number>443</number></li>
<li>number: <number>784</number></li>

The prior examples used the data function in enclosed expressions to extract the values of the elements number and name. In this example, the number element is included without applying the data function. The results are somewhat different; instead of just the number value itself, the entire number element is included.

Enclosed expressions that evaluate to attributes

If an element constructor contains an enclosed expression that evaluates to one or more attributes, these attributes become attributes of the element under construction. This is exhibited in Example 5-6, where the enclosed expression {$prod/@dept} has been added at the beginning of the li constructor content.

Example 5-6. Enclosed expressions that evaluate to attributes

Query
for $prod in doc("catalog.xml")/catalog/product
return <li>{$prod/@dept}number: {$prod/number}</li>
Results
<li dept="WMN">number: <number>557</number></li>
<li dept="ACC">number: <number>563</number></li>
<li dept="ACC">number: <number>443</number></li>
<li dept="MEN">number: <number>784</number></li>

The dept attribute appears in the results as an attribute of the li element rather than as content of the element. If the example had used the data function within the enclosed expression, the value of the dept attribute would instead have appeared as character data at the start of the li element's content.

Enclosed expressions that evaluate to attributes must appear first in the element constructor content, before any other kinds of nodes.
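The ordering rule can be checked with a small self-contained query. The inline product element below stands in for one product from catalog.xml.

```xquery
let $prod := <product dept="WMN"><number>557</number></product>
return <li>{$prod/@dept}number: {data($prod/number)}</li>
(: Moving the attribute after the text, as in
   <li>number: {$prod/@dept}</li>
   is expected to raise error XQTY0024, because the attribute
   would follow a text node in the content. :)
```

As written, the query should return <li dept="WMN">number: 557</li>.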

Enclosed expressions that evaluate to atomic values

If an enclosed expression evaluates to one or more atomic values, those values are simply cast to xs:string and included as character data content of the element. When adjacent atomic values appear in the expression sequence, they are separated by a space in the element content. For example:

<li>{"x", "y", "z"}</li>

will return <li>x y z</li>, with spaces. To avoid this, you can use three separate expressions, as in:

<li>{"x"}{"y"}{"z"}</li>

Another option is to use the concat function to concatenate them together into a single expression, as in:

<li>{concat("x", "y", "z")}</li>
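The same space-separation applies to non-string atomic values, and if you want a separator other than a space, the string-join function is one option. The following sketch shows both cases:

```xquery
(
  (: Three integers in one enclosed expression are separated by spaces. :)
  <li>{1, 2, 3}</li>,
  (: string-join produces a single string with the chosen separator. :)
  <li>{string-join(("x", "y", "z"), "-")}</li>
)
```

This should return <li>1 2 3</li> followed by <li>x-y-z</li>.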

Enclosed expressions with multiple subexpressions

Enclosed expressions may include more than one subexpression inside the curly braces, using commas as separators. In Example 5-7, the enclosed expression in the li constructor contains four different subexpressions, separated by commas.

Example 5-7. Enclosed expressions with multiple subexpressions

Query
for $prod in doc("catalog.xml")/catalog/product
return <li>{$prod/@dept,"string",5+3,$prod/number}</li>
Results
<li dept="WMN">string 8<number>557</number></li>
<li dept="ACC">string 8<number>563</number></li>
<li dept="ACC">string 8<number>443</number></li>
<li dept="MEN">string 8<number>784</number></li>

The first subexpression, $prod/@dept, evaluates to an attribute, and therefore becomes an attribute of li.

The next two subexpressions, "string" and 5+3, evaluate to atomic values: a string and an integer, respectively. Note that they are separated by a space in the results.

The final subexpression, $prod/number, evaluates to an element, which is not separated from the atomic values by a space.

Specifying Attributes Directly

You have seen how attributes can be included with the result elements by including enclosed expressions that evaluate to attributes. Attributes can also be constructed directly using XML-like syntax. Attribute values can be specified using literal text or enclosed expressions, or a combination of the two.

In Example 5-8, class and dep attributes are added to the h1 and li elements, respectively. The class attribute of h1 simply includes literal text that is repeated in the results. The dep attribute of li, on the other hand, includes an enclosed expression that evaluates to the value of the dept attribute of that item. Do not let the quotes around the expression fool you; anything in curly braces is evaluated as an enclosed expression.

Example 5-8. Specifying attributes directly using XML-like syntax

Query
<html>
<h1 class="itemHdr">Product Catalog</h1>
<ul>{
  for $prod in doc("catalog.xml")/catalog/product
  return <li dep="{$prod/@dept}">number: {data($prod/number)
             }, name: {data($prod/name)}</li>
}</ul>
</html>
Results
<html>
<h1 class="itemHdr">Product Catalog</h1>
<ul>
  <li dep="WMN">number: 557, name: Fleece Pullover</li>
  <li dep="ACC">number: 563, name: Floppy Sun Hat</li>
  <li dep="ACC">number: 443, name: Deluxe Travel Bag</li>
  <li dep="MEN">number: 784, name: Cotton Dress Shirt</li>
</ul>
</html>

Note that the dep attribute will appear regardless of whether there is a dept attribute of the $prod element. If the $prod element has no dept attribute, the dep attribute's value will be a zero-length string. This is in contrast to Example 5-7, where li will have a dept attribute only if $prod has a dept attribute.
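The difference is easiest to see with a product that has no dept attribute. The inline element and its number below are hypothetical test data.

```xquery
let $prod := <product><number>999</number></product>
return (
  (: Direct attribute: always present, here with an empty value. :)
  <li dep="{$prod/@dept}"/>,
  (: Attribute from an enclosed expression: absent when there is nothing to copy. :)
  <li>{$prod/@dept}</li>
)
```

This should return <li dep=""/> followed by <li/>.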

If literal text is used in a direct attribute constructor, it follows similar rules to the literal text in element constructors. Also, as with XML syntax, quote characters in attribute values must be escaped if they match the kind of quotes (single or double) used to delimit that value. However, you don't need to escape quotes appearing in an expression inside curly braces. The following example is valid because the inner pair of double quotes is inside curly braces:

 <li dep="{substring-after($prod/@dept, "-")}"/>
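For quotes in the literal part of an attribute value, two forms of escaping are available: doubling the quote character, or using a predefined entity reference. The attribute names and values in this sketch are made up for illustration.

```xquery
(: a uses doubled double quotes, b uses a doubled single quote, c uses &quot;. :)
let $e := <e a="say ""hi""" b='it''s' c="&quot;quoted&quot;"/>
return (string($e/@a), string($e/@b), string($e/@c))
```

The three strings returned should be say "hi", it's, and "quoted".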

The evaluation of enclosed expressions in attribute values is slightly different from those in element content. Because attributes cannot themselves have children or attributes, the content of an attribute value must be atomic. Therefore, if an enclosed expression in an attribute value evaluates to one or more elements or attributes, the value of the node(s) is extracted and converted to a string, with multiple values separated by spaces.

In Example 5-8, the enclosed expression {$prod/@dept} for the dep attribute of li evaluates to an attribute. The processor did not attempt to add a dept attribute to the dep attribute (which would not make sense). Instead, it extracted the value of the dept attribute and used this as the value of the dep attribute.
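The same extraction happens when the enclosed expression returns several nodes of different kinds. In this sketch, the inline product element stands in for catalog data and the attribute name info is hypothetical.

```xquery
let $prod := <product dept="ACC"><number>563</number><name>Floppy Sun Hat</name></product>
(: One attribute node and one element node inside an attribute value. :)
return <li info="{$prod/@dept, $prod/number}"/>
```

This should return <li info="ACC 563"/>: each node is reduced to its value, and the two values are joined with a single space.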

Just as in XML, you can specify multiple attributes on an element, as long as they have unique names. The order of the attributes is never considered significant in XML, so your attributes might not appear in your result document in the same order as you specified them in the query. There is no way to force the processor to preserve attribute order.

Declaring Namespaces in Direct Constructors

In addition to regular attributes, you can also include namespace declarations in direct element constructors. These namespace declaration attributes affect the element itself and all its descendants, and override any namespace declarations in the prolog or in outer element constructors. Example 5-9 shows the use of a namespace declaration in an element constructor. This is discussed in detail in the section "Namespace Declarations in Element Constructors" in Chapter 10.

Example 5-9. Using a namespace declaration in a constructor

Query
<xhtml:html xmlns:xhtml="http://www.w3.org/1999/xhtml">
  <xhtml:h1 class="itemHdr">Product Catalog</xhtml:h1>
  <xhtml:ul>{
    for $prod in doc("catalog.xml")/catalog/product
    return <xhtml:li class="{$prod/@dept}">number: {
                               data($prod/number)}</xhtml:li>
  }</xhtml:ul>
</xhtml:html>
Results
<xhtml:html xmlns:xhtml="http://www.w3.org/1999/xhtml">
  <xhtml:h1 class="itemHdr">Product Catalog</xhtml:h1>
  <xhtml:ul>
    <xhtml:li class="WMN">number: 557</xhtml:li>
    <xhtml:li class="ACC">number: 563</xhtml:li>
    <xhtml:li class="ACC">number: 443</xhtml:li>
    <xhtml:li class="MEN">number: 784</xhtml:li>
  </xhtml:ul>
</xhtml:html>
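A default namespace declaration (one with no prefix) can be used in the same way. The following sketch is an assumption-labeled variation on Example 5-9, not part of it; it checks the namespace of a constructed child element.

```xquery
let $page :=
  <html xmlns="http://www.w3.org/1999/xhtml">
    <h1>Product Catalog</h1>
  </html>
(: The h1 child inherits the default namespace declared on html. :)
return namespace-uri($page/*[1])
```

This should return http://www.w3.org/1999/xhtml. Be aware that a default namespace declared this way is also expected to apply to unprefixed element names in path expressions that appear inside the constructor, which can cause a path such as catalog/product to stop matching input elements that are in no namespace.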

Use Case: Modifying an Element from the Input Document

Suppose you want to include elements from the input document but want to make minor modifications such as adding or removing a child or attribute. To do this, a new element must be created using a constructor. For example, suppose you want to include product elements from the input document, but add an additional attribute id that is equal to the letter P plus the product number. The query shown in Example 5-10 accomplishes this.

Example 5-10. Adding an attribute to an element

Query
for $prod in doc("catalog.xml")/catalog/product[@dept = 'ACC']
return <product id="P{$prod/number}">
          {$prod/(@*, *)}
       </product>
Results
<product dept="ACC" id="P563">
  <number>563</number>
  <name language="en">Floppy Sun Hat</name>
</product>
<product dept="ACC" id="P443">
  <number>443</number>
  <name language="en">Deluxe Travel Bag</name>
</product>

The query makes a new copy of the product element, which contains the enclosed expression {$prod/(@*, *)} to copy all of the attributes and child elements from the original product element. You could also use the broader expression {$prod/(@*, node())} to copy all the child nodes of the element, including text, comments, and processing instructions.

As another example, suppose you want to copy some product elements from the input document but remove the number child. This can be accomplished using the query in Example 5-11. The enclosed expression $prod/(@*, * except number) selects all the attributes and all of the child elements of product except number.

Example 5-11. Removing a child from an element

Query
for $prod in doc("catalog.xml")/catalog/product[@dept = 'ACC']
return <product>
          {$prod/(@*, * except number)}
       </product>
Results
<product dept="ACC">
  <name language="en">Floppy Sun Hat</name>
</product>
<product dept="ACC">
  <name language="en">Deluxe Travel Bag</name>
</product>
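The two techniques can be combined to change the value of an existing attribute: exclude the original attribute from the copy and specify a replacement directly. This is a sketch using an inline product element in place of catalog.xml; lowercasing the department code is just an arbitrary modification.

```xquery
let $prod :=
  <product dept="ACC"><number>563</number><name language="en">Floppy Sun Hat</name></product>
return
  (: Copy everything except dept, and supply a new dept value directly. :)
  <product dept="{lower-case($prod/@dept)}">{$prod/(@* except @dept, *)}</product>
```

This should return a product element with dept="acc" and the original number and name children. Without the except clause, the element would end up with two dept attributes, which is expected to raise error XQDY0025.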

Additional examples of making "modifications" to elements and attributes can be found in the section "Copying Input Elements with Modifications" in Chapter 9.

Direct Element Constructors and Whitespace

Whitespace is often used in direct element constructors. For example, you may use line breaks and tabs to indent result XML elements for readability, or spaces to separate enclosed expressions. Sometimes the query author intends for whitespace to be significant (included in the results); sometimes it is just used for formatting the query for visual presentation.

Boundary whitespace

Boundary whitespace is whitespace that occurs by itself (without any nonwhitespace characters) in direct element constructors. It may appear between two element constructor tags, between two enclosed expressions, or between a tag and an enclosed expression. It can be made up of any of the XML whitespace characters, namely space, tab, carriage return, and line feed.

For example, in the constructor shown in Example 5-12, there is boundary whitespace in the ul constructor between the ul start tag and the left curly brace, as well as between the right curly brace and the ul end tag. In the li constructor, there is boundary whitespace between the li start tag and the b start tag, between the b end tag and the left curly brace, and between the right curly brace and the li end tag.

Example 5-12. Constructor with boundary whitespace

<ul>
  {  <li>  <b> number:</b> { $prod/number }  </li>   }
</ul>

With boundary whitespace discarded,[*] the results look something like:

<ul><li><b> number:</b><number>557</number></li></ul>

Note that the whitespace before the text number: is not discarded because it appears with other characters.

Whitespace inside enclosed expressions that is not in quotes is never considered significant. It is simply the normal whitespace allowed by XQuery syntax. In the ul constructor, the spaces between the left curly brace and the li start tag fall into this category. It is not technically considered boundary whitespace, and it is always discarded.

There is no boundary whitespace in attribute values. For example, in the expression:

<product dept="    {$d}     "/>

the whitespace between the quotes and the enclosed expression is considered significant and therefore is preserved. The expression:

<product dept="{    $d     }"/>

has no boundary whitespace either, only whitespace in an enclosed expression. This whitespace is not preserved. Line breaks are never preserved in attribute values; they are converted to spaces. This is a standard feature of XML itself, known as attribute value normalization.
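Because the whitespace in attribute values is hard to see in serialized output, this sketch measures it instead. The value bound to $d is arbitrary.

```xquery
let $d := "ACC"
(: Two spaces on each side of the braces: part of the attribute value. :)
let $outside := <product dept="  {$d}  "/>
(: Two spaces on each side of $d inside the braces: ordinary XQuery whitespace. :)
let $inside := <product dept="{  $d  }"/>
return (string-length($outside/@dept), string-length($inside/@dept))
```

The lengths returned should be 7 and 3.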

The boundary-space declaration

By default, a query processor discards all boundary whitespace. Sometimes you want to preserve the boundary whitespace in your query results because it is significant. The boundary-space declaration, specified in the query prolog, instructs the processor how to handle boundary whitespace in direct element constructors.[] Its syntax is shown in Figure 5-1.

The two valid values are:

preserve

This value results in boundary whitespace being preserved.

strip

This value results in boundary whitespace being deleted.

Figure 5-1. Syntax of a boundary-space declaration

The default is strip. For example, the boundary-space declaration:

declare boundary-space preserve;

causes whitespace to be preserved. With this boundary-space declaration, the result of the constructor in Example 5-12 becomes:

<ul>
  <li>  <b> number:</b> <number>557</number>  </li>
</ul>
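Since the declaration belongs to the prolog, it must come before the query body and applies to every direct element constructor in the module. A minimal complete query might look like this sketch:

```xquery
declare boundary-space preserve;

(: With preserve, the spaces around the enclosed expression are kept. :)
string-length(<e> {"x"} </e>)
```

This should return 3. If the declaration is changed to strip, the same expression should return 1.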

Table 5-1 shows some additional examples of results with and without preserved whitespace.

Table 5-1. Stripping boundary whitespace

Expression            | Value with boundary whitespace preserved | Value with boundary whitespace stripped
----------------------+------------------------------------------+----------------------------------------
<e>                   | <e>                                      | <e><c></c></e>
 <c></c>              |  <c></c>                                 |
</e>                  | </e>                                     |
<e> {"x"} </e>        | <e> x </e>                               | <e>x</e>
<e> {()} </e>         | <e>  </e>                                | <e></e>
<e>{"x"} {"y"}</e>    | <e>x y</e>                               | <e>xy</e>
<e> x {"y"}</e>       | <e> x y</e>                              | <e> x y</e>
<e>{" x "}</e>        | <e> x </e>                               | <e> x </e>
<e>{ "x" }</e>        | <e>x</e>                                 | <e>x</e>
<e> &#x20; {"x"}</e>  | <e>   x</e>                              | <e>   x</e>
<e> </e>              | <e> </e>                                 | <e></e>

Forcing boundary whitespace preservation

If you don't want to preserve all whitespace but wish to preserve it in one or more specific elements, you can do this in one of two ways. The first way is to include an enclosed expression that evaluates to whitespace. For example, <e>{" x "}</e> evaluates to <e> x </e>, regardless of the boundary-space declaration. This is because the whitespace is part of the value of the expression (the literal string).

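Another method is to use a character reference to a whitespace character. Whitespace that is the result of a character reference is always considered significant, and any literal whitespace adjacent to the reference is no longer boundary whitespace because it does not occur by itself. For example, <e> &#x20; {"x"}</e> always evaluates to an e element whose content is three spaces followed by x. Character references are described further in the section "XML Entity and Character References" in Chapter 21.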
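A common place where this matters is a single space between two enclosed expressions. The following sketch, with made-up variable names and values, compares the plain form with both workarounds, assuming the default strip policy is in effect.

```xquery
let $first := "Floppy"
let $last := "Hat"
return (
  (: The lone space is boundary whitespace and is stripped. :)
  <name>{$first} {$last}</name>,
  (: The space is the value of an enclosed expression, so it is kept. :)
  <name>{$first}{" "}{$last}</name>,
  (: The space comes from a character reference, so it is kept. :)
  <name>{$first}&#x20;{$last}</name>
)
```

This should return <name>FloppyHat</name> followed by two copies of <name>Floppy Hat</name>.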



[*] Although the boundary whitespace will be discarded, if you choose to serialize your results, your processor may add whitespace to indent them. Therefore, your results may vary.

[] By contrast, the xml:space attribute on a constructed element has no effect on boundary whitespace.
