Queries should be able to handle all possible combinations of values of input data. This includes handling any potential variations in the data, and considering the impact of missing or empty values.
It is important to consider variations in the input documents that may cause incorrect results or dynamic errors. Some common problems occur when:
For example, the expressions $prod[name eq "Floppy Sun Hat"]
and substring($prod/name, 1, 30)
raise an error if there is more than one name
child. The expression $prod/name != "Floppy Sun Hat"
evaluates to true
if two name
children exist and either one is not equal to Floppy Sun Hat
.
For example, the expression $prod/price - $prod/discount
returns the empty sequence if there is no discount
element.
For example, the expression max($prod/number)
raises a type error if the product number is N/A
instead of an integer.
Especially zero and negative numbers where a positive number was expected.
You should not assume that because an input document is validated by a schema it must be exactly as you expect. A schema can validate, for example, that a required element is present, but other assumptions might be made that cannot be validated by a schema. For example, the constraint "if the discounted
attribute is true
, a discount
child must appear" cannot be validated by XML Schema.
Individual data values may be missing from an input document. Sometimes these missing values are in error, and sometimes they are not. In either case, they need to be handled gracefully.
Suppose you are calculating the sale price of an item by taking the regular price, and applying a discount. Some product prices have discounts, but others do not. The absence of a discount could be represented in (at least) four ways:
A discount
element or attribute that is entirely absent from the input document
An empty element or an attribute whose value is a zero-length string—for example, <discount></discount>
, <discount/>
, or discount=""
An element that is marked with the attribute xsi:nil
—for example, <discount xsi:nil="true"></discount>
An element or attribute that has a default "missing" value such as N/A
or 0
—for example, <discount>N/A</discount>
An expression used to calculate a sequence of prices might be:
for $prod in doc("prices.xml")//prod return $prod/price - $prod/discount
In the case where the discount
element is absent,the value of the discount
expression is the empty sequence, and therefore the $prod/price - $prod/discount
expression also returns the empty sequence. You probably intended instead for the discount to default to zero, and for your expression to return the price itself if no discount was available.
Another problem that might occur when the discount value is missing is in calculating an average. To find the average discount, you might be tempted to use an expression like avg(doc("prices.xml")//discount)
. However, that function gives the average of the discount values that exist, ignoring any prices that do not have discounts.
The second possible scenario is that the discount
element appears, but it is empty (and it may or may not have an xsi:nil
attribute). It may appear in the input document as either <discount></discount>
or <discount/>
, which are equivalent elements in XML. This scenario also poses problems for the $prod/price - $prod/discount
expression. Assuming the discount
element is untyped, the processor attempts to cast the empty value to xs:double
for use of the arithmetic expression. This results in a type error because the value is not a valid number.
The final scenario is one where the discount
element contains a value such as N/A
to indicate that it is absent. A simple conditional expression can handle this scenario, as in:
for $prod in doc("prices.xml")//prod return if ($prod/
discount = "N/A") then$prod/
price else$prod/
price -$prod/
discount
Alternatively, if N/A
is used for a number of different elements, you could alter the if-absent
or if-empty
function to check for this value, too.
3.137.183.210