Performance

When querying large documents or databases, it is important to tune queries for performance. Implementations vary significantly in how well they can use clues in the query to optimize its evaluation. This section provides some general tips for improving query performance. For tuning information specific to your XQuery processor, consult its documentation.

Avoid Reevaluating the Same or Similar Expressions

A let clause can be used to evaluate an expression once and bind the value to a variable that can be referenced many times. This can be much more efficient than evaluating the expression many times. For example, suppose you want to add a bargain-bin element to your results, but only if there are products whose price is less than 30. You first need to check whether any bargain products exist, and if so, construct a bargain-bin element and list the products in it. Example 15-3 shows a less efficient and a more efficient way to do this.

Example 15-3. Avoid reevaluating the same expression

Less efficient query
if (doc("prices.xml")/prices/priceList/prod[price < 30])
then <bargain-bin>{
       doc("prices.xml")/*/priceList/prod[price < 30]
     }</bargain-bin>
else ( )
More efficient query
let $bargains := doc("prices.xml")/prices/priceList/prod[price < 30]
return if ($bargains)
       then <bargain-bin>{$bargains}</bargain-bin>
       else ( )

In the first query, similar path expressions appear in the if expression and in the bargain-bin element constructor. In the second query, the expression is evaluated once and bound to the variable $bargains, then referenced twice in the rest of the query. This is considerably more efficient, since the expensive expression need only be evaluated once. Using some XQuery implementations, the difference in performance can be dramatic, especially when the doc function is part of the expression.

Avoid Unnecessary Sorting

If you are not concerned about the order in which your results are returned, you can improve the performance of your query by not sorting. Some expressions, particularly path expressions and the union, intersect, and except expressions, always sort the results in document order unless they appear in an unordered expression or function. Example 15-4 shows two queries that select all the number and name elements from the catalog document.

Example 15-4. Avoid unnecessary sorting

Less efficient query
let $doc := doc("catalog.xml")
return $doc//number | $doc//name
More efficient query
unordered {
  let $doc := doc("catalog.xml")
  return $doc//(number|name)
}

The first query has two inefficiencies related to sorting:

  • It selects the elements without using an unordered expression, so each of the two path expressions sorts the elements in document order.

  • It performs a union of the two sequences, which causes the combined result to be sorted again in document order.

The more efficient query uses an unordered expression to indicate that the order of the elements does not matter. Even if you care about the order of the final results, there may be some steps along the way that can be unordered. More information on indicating that order is not significant can be found in the section "Indicating that Order Is Not Significant" in Chapter 7.
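
As a sketch of that last point, the query below leaves the intermediate selection unordered and sorts only the final result. It assumes the name elements selected in Example 15-4 and uses the built-in unordered function. Whether a given processor actually takes advantage of the hint depends on the implementation.

```xquery
(: Sketch: the selection step is marked unordered, so the processor
   need not return the name elements in document order. Only the
   final result is sorted, by the order by clause. :)
for $name in unordered(doc("catalog.xml")//name)
order by string($name)
return $name
```

Here the document-order sort of the path expression would be wasted work, because the order by clause rearranges the results anyway.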

Avoid Expensive Path Expressions

The use of the descendant-or-self axis (abbreviated //) in path expressions can be very expensive, because every descendant node must be checked. If the path to the desired descendant is known and consistent, it is far more efficient to specify the exact path. Example 15-5 shows an example of this situation.

Example 15-5. Avoid expensive path expressions

Less efficient query
doc("catalog.xml")//number
More efficient query
doc("catalog.xml")/catalog/product/number

The first query uses the // abbreviation to indicate all number descendants of the input document, while the second specifies the exact path to the number descendants. Use of the parent, ancestor, or ancestor-or-self axis can also be costly when using some XQuery implementations based on databases.
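
Reverse axes can sometimes be avoided in a similar way. The following is an illustrative sketch only: the value 557 is hypothetical, and the rewrite assumes that number elements appear only as children of product elements under catalog, as in Example 15-5. It may help on implementations where reverse axes are costly, but this is not guaranteed.

```xquery
(: Forward-only form of the reverse-axis query
     doc("catalog.xml")//number[. = 557]/..
   It filters product elements by their number child instead of
   locating the number and stepping up to its parent. :)
doc("catalog.xml")/catalog/product[number = 557]
```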

Use Predicates Instead of where Clauses

In some database-backed XQuery implementations, predicates are more efficient than the where clauses of FLWORs. Example 15-6 shows the same query written both ways.

Example 15-6. Use predicates instead of where clauses

Less efficient query
for $prod in doc("catalog.xml")//product
where $prod/@dept = "ACC"
order by $prod/name
return $prod/name
More efficient query
for $prod in doc("catalog.xml")//product[@dept = "ACC"]
order by $prod/name
return $prod/name

The first query uses a where clause $prod/@dept = "ACC" to filter out elements, while the second query uses the predicate [@dept = "ACC"]. The predicate is more efficient in some implementations because it filters out the elements before they are selected from the database and stored in memory.
