When querying large documents or databases, it is important to tune queries to optimize performance. Implementations vary significantly in their ability to take clues from the query in order to optimize its evaluation. This section provides some general tips for improving query performance. For more specific tuning information for your XQuery processor, consult the documentation.
A let clause can be used to evaluate an expression once and bind the value to a variable that can be referenced many times. This can be much more efficient than evaluating the expression many times. For example, suppose you want to add a bargain-bin element to your results, but only if there are products whose price is less than 30. You first need to check whether any bargain products exist, and if so, construct a bargain-bin element and list the products in it. Example 15-3 shows an example of this.
Example 15-3. Avoid re-evaluating the same expression
Less efficient query
if (doc("prices.xml")/prices/priceList/prod[price < 30])
then <bargain-bin>{
  doc("prices.xml")/*/priceList/prod[price < 30]
}</bargain-bin>
else ( )
More efficient query
let $bargains := doc("prices.xml")/prices/priceList/prod[price < 30]
return if ($bargains)
       then <bargain-bin>{$bargains}</bargain-bin>
       else ( )
In the first query, similar path expressions appear in the if expression and in the bargain-bin element constructor. In the second query, the expression is evaluated once and bound to the variable $bargains, then referenced twice in the rest of the query. This is considerably more efficient, since the expensive expression need only be evaluated once. Using some XQuery implementations, the difference in performance can be dramatic, especially when the doc function is part of the expression.
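The benefit grows with the number of references to the variable. As a minimal sketch, the variant below references $bargains three times while still evaluating the path expression only once. The count attribute is a hypothetical addition for illustration and is not part of Example 15-3.

```xquery
(: Bind the expensive path expression once :)
let $bargains := doc("prices.xml")/prices/priceList/prod[price < 30]
return
  (: exists() tests for a non-empty sequence; count() and the
     enclosed expression reuse the value that is already bound :)
  if (exists($bargains))
  then <bargain-bin count="{count($bargains)}">{$bargains}</bargain-bin>
  else ()
```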
If you are not concerned about the order in which your results are returned, you can improve the performance of your query by not sorting. Some expressions, particularly path expressions and the union, intersect, and except expressions, always sort the results in document order unless they appear in an unordered expression or function. Example 15-4 shows two queries that select all the number and name elements from the catalog document.
Example 15-4. Avoid unnecessary sorting
Less efficient query
let $doc := doc("catalog.xml")
return $doc//number | $doc//name
More efficient query
unordered {
  let $doc := doc("catalog.xml")
  return $doc//(number|name)
}
The first query has two inefficiencies related to sorting:
It selects the elements without using an unordered expression, so each of the two path expressions sorts the elements in document order.
It performs a union of the two sequences, which causes them to be resorted in document order.
The more efficient query uses an unordered expression to indicate that the order of the elements does not matter. Even if you care about the order of the final results, there may be some steps along the way that can be unordered. More information on indicating that order is not significant can be found in the section "Indicating that Order Is Not Significant" in Chapter 7.
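As a sketch of an unordered intermediate step, the query below wraps only the path expression in an unordered expression and leaves the final order to an order by clause. It assumes the catalog document has product elements directly under catalog, each with a name child. Whether a processor actually takes advantage of the hint depends on the implementation.

```xquery
(: The path expression may return the products in any order;
   the order by clause alone determines the order of the results :)
for $prod in unordered { doc("catalog.xml")/catalog/product }
order by $prod/name
return $prod/name
```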
The use of the descendant-or-self axis (abbreviated //) in path expressions can be very expensive, because every descendant node must be checked. If the path to the desired descendant is known and consistent, it is far more efficient to specify the exact path. Example 15-5 shows an example of this situation.
Example 15-5. Avoid expensive path expressions
Less efficient query
doc("catalog.xml")//number
More efficient query
doc("catalog.xml")/catalog/product/number
The first query uses the // abbreviation to indicate all number descendants of the input document, while the second specifies the exact path to the number descendants. Use of the parent, ancestor, or ancestor-or-self axis can also be costly when using some XQuery implementations based on databases.
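A step along the parent axis can often be rewritten as a predicate on the parent itself. The two separate queries below are a sketch of that rewrite, assuming each number element is a child of a product element; the value 557 is only an illustrative product number. Under that assumption both queries select the same product elements.

```xquery
(: Less efficient in some implementations:
   locate the number, then step back up to its parent :)
doc("catalog.xml")/catalog/product/number[. = 557]/parent::product
```

```xquery
(: Forward steps only: test the number in a predicate on product :)
doc("catalog.xml")/catalog/product[number = 557]
```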
Using some XQuery implementations that are based on databases, predicates are more efficient than where clauses of FLWORs. An example of this is shown in Example 15-6.
Example 15-6. Use predicates instead of where clauses
Less efficient query
for $prod in doc("catalog.xml")//product
where $prod/@dept = "ACC"
order by $prod/name
return $prod/name
More efficient query
for $prod in doc("catalog.xml")//product[@dept = "ACC"]
order by $prod/name
return $prod/name
The first query uses a where clause $prod/@dept = "ACC" to filter out elements, while the second query uses the predicate [@dept = "ACC"]. The predicate is more efficient in some implementations because it filters out the elements before they are selected from the database and stored in memory.