Chapter 17. Working with Numbers

A variety of numerical calculations can be performed using XQuery. This chapter describes the major numeric types, along with the operators and functions that act on numeric values. These include comparisons, arithmetic operations, and functions that operate on numeric values such as round and sum.

The Numeric Types

The four main numeric types supported in XQuery are xs:decimal, xs:integer, xs:float, and xs:double. All the operations and functions that can be performed on these types of numeric values can also be performed on values whose types are restrictions of these types. This includes user-defined types that appear in a schema, as well as the built-in derived types such as xs:positiveInteger and xs:unsignedByte. For a complete list and explanation of these built-in derived types, see Appendix B.

The xs:decimal Type

The type xs:decimal represents a signed decimal number of implementation-defined precision. Numeric literals that contain only digits and a decimal point (no letter E or e) are considered decimal numbers, with the type xs:decimal. For example, 25.5 and 25.0 are xs:decimal values.

The xs:integer Type

The type xs:integer represents a signed integer. The limit on how large an xs:integer value can be is implementation-defined. Numeric literals that contain only digits (no decimal points or the letter E or e) are considered integers, with the type xs:integer. For example, 25 is an xs:integer value.

In the type hierarchy, xs:integer is derived from xs:decimal. Therefore, anywhere XQuery is expecting an xs:decimal value, an xs:integer value may be used in its place because of subtype substitution.

The xs:float and xs:double Types

The type xs:float is patterned after IEEE single-precision 32-bit floating-point numbers, and xs:double is patterned after IEEE double-precision 64-bit floating-point numbers. The representation of both xs:float and xs:double values is a mantissa (a decimal number) followed by the character E or e, followed by an exponent, which must be an integer. For example, 3E2 represents 3 × 102, or 300. Numeric literals that contain an E or e are considered to have the type xs:double.

In addition, the following values are represented: INF (infinity), -INF (negative infinity), and NaN (not a number).

The xs:numeric Type

The type xs:numeric is a union type that includes xs:double, xs:float, and xs:decimal. It is most commonly used in the signatures of functions that accept any numeric type. For example, the abs function accepts an argument of type xs:numeric, which means that an xs:double, xs:float, or xs:decimal value can be passed to it. Because of the function conversion rules, xs:integer values are also allowed, since xs:integer is derived from xs:decimal.

Despite its xs prefix, this type is not defined by XML Schema. However, it is implicitly supported by XQuery 3.1 processors. In previous versions of XQuery, a special keyword numeric (no prefix) was used in built-in function signatures instead.

Constructing Numeric Values

How does a value become “numeric”? As with any type, a value may be assigned one of the numeric types in a number of ways, for example:

  • It may be selected from an input document that has a schema declaring it to have a numeric type.

  • It may be a numeric literal value that appears in the query and is not surrounded by quotes. For example, $price > 25.5 compares $price to the xs:decimal value 25.5.

  • It may be the result of a function that returns a number, such as count($products), which returns an xs:integer.

  • It may be the result of one of the standard constructor functions, such as:

    • xs:float("25.5E3"), which constructs an xs:float value from a string

    • xs:decimal($prod/price), which constructs an xs:decimal value from an element

  • It may be the result of an explicit cast, such as $prod/price cast as xs:decimal.

  • If it is untyped, it may be cast automatically when it is passed to a function, such as the sum function.

  • A random number can be generated with the random-number-generator function.

The number Function

In addition to the standard type constructors, the number function is useful for telling the processor to treat a node or atomic value as a number, regardless of its declared type (if any). It returns that argument cast as an xs:double. If no argument is provided, the number function uses the context node.

One difference between using the number function and the xs:double constructor is that the number function returns the xs:double value NaN in the case that the value cannot be cast to a numeric value, whereas the xs:double constructor raises an error. Table 17-1 shows some examples that use the number function.

Table 17-1. The number function
ExampleReturn value
number(doc("prices.xml")//prod[1]/price) 29.99
number(doc("prices.xml")//prod[1]/price/@currency) NaN
number("29.99") 29.99
number( () ) NaN

Numeric Type Promotion

If an operation, such as a comparison or arithmetic operation, is performed on values of two different primitive numeric types, one value is promoted to the type of the other value. Specifically, an xs:decimal value can be promoted to xs:float or xs:double, and an xs:float value to xs:double. For example, the expression 1.0 + 1.2E0 adds a decimal number to a floating-point number. The xs:decimal number (1.0) is promoted to xs:double before the expression is evaluated.

Numeric type promotion happens automatically in arithmetic expressions and comparison expressions. It is also used in calls to functions that expect numeric values. For example, if a function expects an xs:double value, you can pass it an xs:decimal value, and xs:decimal will be promoted to xs:double.

In addition to these specific promotion rules, any numeric value can be treated as if it has its type’s base type or any ancestor type. This is known as subtype substitution. For example, if in your schema you define a type myDecimal that is derived by restriction from xs:decimal, a myDecimal value can be added to an xs:decimal value, returning an xs:decimal value. This rule also applies to built-in types. For example, since xs:integer is derived from xs:decimal, an xs:integer value can be used anywhere an xs:decimal value is expected.

Comparing Numeric Values

Two numeric values can be compared using the general comparison operators: =, !=, <, <=, >, and >=. Values of different numeric types can be compared; one is promoted to the other’s type. Nodes that contain numeric values can also be compared using these operators; in that case, they are atomized to extract their typed values. Table 17-2 shows some examples of comparing numeric values.

Some caution should be used when comparing untyped values using the general comparison operators. When an untyped value is compared to a numeric value (for example, a numeric literal), it is cast to the numeric type. However, when two untyped values are compared, they are treated like strings. This means that, for example, the untyped value 100 would evaluate to less than the value 99. If you want to compare two untyped values, you must explicitly cast the value(s) to a numeric type, as shown in the fourth example in Table 17-2. The table assumes that prices.xml is untyped, i.e., has not been validated with a schema.

Table 17-2. Comparing numeric values (assuming untyped input)
ExampleReturn value
doc("prices.xml")//prod[3]/discount > 10 false
doc("prices.xml")//prod[3]/discount gt 10 Error XPTY0004
doc("prices.xml")//prod[3]/discount > doc("prices.xml")//prod[1]/discount true (it is comparing the string 3.99 to the string 10.00)
doc("prices.xml")//prod[3]/number(discount) > doc("prices.xml")//prod[1]/number(discount) false (it is comparing the number 3.99 to the number 10.00)
3 gt 2 true
1 = 1.0 true
xs:float("NaN") = xs:float("NaN") false
xs:string(xs:float("NaN")) = "NaN" true

Numeric values can also be compared using the value comparison operators: eq, ne, lt, le, gt, and ge. However, the value comparison operators treat every untyped operand like a string, even if the other operand is numeric. This means that if you want a numeric comparison, you have to say so, by using an explicit cast.

The value INF (positive infinity) is greater than all other values, and -INF (negative infinity) is less than all other values, but each equals itself. The value NaN cannot usefully be compared with any other value (including itself) by using comparison operators, because the result of the comparison operation is always false (unless the operator is !=, in which case it’s always true). To determine whether a value is NaN, you can compare its string value to the string NaN, as in string($myVal) = "NaN". In some functions, such as the distinct-values function, NaN is considered to be equal to itself.

Arithmetic Operations

The following typical arithmetic operations can be performed on numeric values:

  • Addition and subtraction using the plus (+) and the minus (-) sign

  • Negation of a single value using the minus sign (-)

  • Multiplication using the * operator

  • Division using the div operator

  • Integer division (with results truncated) using the idiv operator

  • Modulus (the remainder of a division) using the mod operator

Some of these arithmetic operators can be used on date and time types in addition to numeric types. Date/time arithmetic is described in “Using Arithmetic Operators on Dates, Times, and Durations”.

If the value NaN is involved in an arithmetic operation (and the other operand is not the empty sequence), the result is always NaN. If the empty sequence is used in an arithmetic operation, the result is always the empty sequence. It is important to understand that the empty sequence is different from zero. For example, $prod/price - $prod/discount is equal to the empty sequence (not the value of $prod/price) if there is no element that matches the $prod/discount path.

Arithmetic Operations on Multiple Values

Arithmetic operators cannot accept a sequence of more than one value as one of their operands. For example:

doc("prices.xml")//price * 2

will raise a type error because more than one price element is returned by the path expression. To perform an arithmetic operation on a sequence of values, you can put parentheses around the arithmetic operation, as in:

doc("prices.xml")//(price * 2)

which will perform the operation on each price element individually and return a sequence of doubled price values. You could get the same results by using a simple map operator, as in:

doc("prices.xml")//price ! (. * 2)

Arithmetic Operations and Types

When an operation is performed on two values that are the same type, the result is also a value of that type. For example, adding two xs:integer values results in an xs:integer. However, if an operation is performed on values of two different numeric types, one value is promoted to the type of the other value. For example, adding an xs:decimal to an xs:float results in an xs:float. This is true for all arithmetic operations except division of two xs:integer values, which results in an xs:decimal, and integer division, which always results in an xs:integer.

If an untyped value is used in an arithmetic operation, it is automatically cast to xs:double. For example, when adding the xs:integer 2 to the untyped value 3, the untyped value is cast to xs:double, and the result is the xs:double value 5. All non-numeric types must be explicitly cast to a numeric type before being used in an arithmetic operation.

Atomization occurs on the operands of arithmetic expressions. This means that the operations can be performed on nodes that contain numeric values, as well as numeric atomic values themselves. For example, an arithmetic expression might be ($price * 2) if $price is bound to a single node that contains a numeric value. For more information on atomization, see “Atomization”.

Precedence of Arithmetic Operators

Multiplication and division take precedence over addition and subtraction, as is customary in mathematical expressions. For example, 2 + 3 * 5 is equal to 2 + (3 * 5), or 17, rather than (2 + 3) * 5, or 25. The unary minus operator has precedence over all others. For example, - 3 + 5 is equal to 2, not -(3 + 5), or -8.

Multiplication and division operators (*, div, idiv, and mod) have equal precedence and are evaluated from left to right. Likewise, addition and subtraction operators have equal precedence and are evaluated from left to right. When in doubt, it is a good practice to use parentheses to delimit expressions for the sake of clarity.

Addition, Subtraction, and Multiplication

Addition, subtraction, and multiplication are straightforward. Table 17-3 shows some examples.

Table 17-3. Examples of arithmetic expressions
ExampleReturn valueReturn type
5 + 3 8 xs:integer
5 + 3.0 8 xs:decimal
5 + 3.0E0 8 xs:double
5 * 3 15 xs:integer
2 + 3 * 5 17 xs:integer
(2 + 3) * 5 25 xs:integer
- 3 + 5 2 xs:integer
() + 3 () N/A
doc("prices.xml")//prod[1]/price+5 34.99 xs:double
doc("prices.xml")//prod[1]/price-5 () N/A
doc("prices.xml")//prod[1]/price - 5 24.99 xs:double

Generally, you are not required to put whitespace before or after arithmetic operators. For example, price+5, with no spaces, is a valid expression meaning “the value of the price child plus 5.” However, there is a special rule for subtraction. Because the hyphen (-) is a valid character in XML names, it is necessary to put whitespace after any valid XML name that precedes it. For example, price-5 is interpreted as a single name, so to subtract, you should use price - 5 instead. (The space after the hyphen is technically unnecessary since a name cannot start with a hyphen, but it looks cleaner.)

Division

There are two division operators: div and idiv. A slash (/) cannot be used to indicate division because the / operator is used to delimit steps in a path expression. The div operator is used to perform division of the first operand (the dividend) by the second operand (the divisor). If both numbers being divided are xs:integer-based values, the result is an xs:decimal. Otherwise, normal type promotion rules apply, and the type of the result is the same as the type of the operands.

The idiv operator is used to divide two numbers and obtain the integer portion of the division result. The operands can have any numeric type. If the result of the division is not an even integer, the decimal portion of the number is truncated rather than rounded. For example, (14 div 4) is equal to 3.5, but (14 idiv 4) is equal to 3.

Table 17-4 shows examples of the div and idiv operators.

Table 17-4. Examples of the div and idiv operators
ExampleReturn valueReturn type
14 div 4 3.5 xs:decimal
14 idiv 4 3 xs:integer
-14 idiv 4 -3 xs:integer
14.0 div 3.5 4.0 xs:decimal
14.0 idiv 3.5 4 xs:integer
() div 3 () N/A
14 div 0 Error FOAR0001N/A
xs:float("14") div 0 INF xs:float
xs:double("INF") div 2 INF xs:double
xs:float("NaN") div 2 NaN xs:float

Attempting to divide by zero will raise error FOAR0001 when using the idiv operator or when using the div operator with values of type xs:integer or xs:decimal. Dividing by zero using the div operator on values of type xs:float or xs:double will not raise an error; it will return NaN (if the dividend is 0), or INF or -INF.

When using idiv, error FOAR0002 will be raised if either operand is NaN, or if the divisor is INF or -INF.

Modulus (Remainder)

The mod operator is used to obtain the remainder after dividing the first operand (the dividend) by the second operand (the divisor). For example, (14 mod 4) equals 2. The sign of the result is the same as the sign of the first operand. Table 17-5 shows examples of the mod operator.

Table 17-5. Examples of the mod operator
ExampleReturn valueReturn type
14 mod 4 2 xs:integer
-14 mod 4 -2 xs:integer
14 mod -4 2 xs:integer
14.9 mod 2.1 0.2 xs:decimal
14.5E1 mod 2E1 5 xs:double
xs:float("14") mod 0 NaN xs:float
xs:double("INF") mod 2 NaN xs:double
14 mod () () N/A
14 mod xs:double("INF") 14 xs:double

Special rules, depicted in Table 17-6, apply when one of the operands is INF, -INF, or 0. The cases marked "NaN or division by zero” depend on the types of the operands. If the operands are of type xs:decimal or xs:integer, a “division by zero” error FOAR0001 is raised. Otherwise, NaN is returned.

Table 17-6. Results for the mod operator
Divisor______________Dividend______________
INF or -INFFinite number0 or -0
INF or -INF NaN The dividendThe dividend
Finite number NaN The remainder 0
0 or -0 NaN NaN or division by zeroNaN or division by zero

Functions on Numbers

XQuery provides built-in functions that operate on numeric values. Some operate on single numeric values. They are summarized in Table 17-7 and covered in more detail in Appendix A. Each of these functions returns a numeric value whose type is the same as its argument, or the empty sequence if the argument is the empty sequence.

Table 17-7. Functions on single numbers
Function nameDescription
round The argument rounded to the nearest whole number
round-half-to-even The argument rounded, with half values rounded to the nearest even number
floor The largest whole number that is not greater than the argument
ceiling The smallest whole number that is not smaller than the argument
abs The absolute value of the argument

Table 17-8 lists some additional functions that can be used to aggregate or summarize numeric data. These functions accept a sequence of numeric values and return a single numeric result. All of these functions will automatically cast untyped values to xs:double, so it is not necessary to perform any explicit casting to have the values treated like numbers.

Table 17-8. Functions on sequences of numbers
Function nameDescription
avg The average of a sequence of numbers
sum The sum of a sequence of numbers
min The minimum value of a sequence of numbers
max The maximum value of a sequence of numbers

Table 17-9 lists the trigonometric and exponential functions that were introduced in version 3.0. They are all in the namespace http://www.w3.org/2005/xpath-functions/math, commonly associated with the prefix math.

Table 17-9. Trigonometric and exponential functions
Function nameDescription
math:acos The arc cosine of the argument
math:asin The arc sine of the argument
math:atan The arc tangent of the argument
math:atan2 The arc tangent based on two arguments
math:cos The cosine of the argument
math:exp The value of ex, where x is the argument
math:exp10 The value of 10x, where x is the argument
math:log The natural logarithm of the argument
math:log10 The base-ten logarithm of the argument
math:pi An approximation of the mathematical constant π (pi)
math:pow The result of raising the first argument to the power of the second
math:sin The sine of the argument
math:sqrt The non-negative square root of the argument
math:tan The tangent of the argument

Formatting Numbers

XQuery 3.0 adds some capabilities to format numbers according to specified patterns. They consist of two functions: format-integer and format-number. There is additionally a prolog declaration that specifies defaults for formatting decimal numbers.

Formatting Integers

The format-integer function formats an integer according to the pattern you specify in the $picture (second) argument. This can be used to pad integers with leading zeros, insert grouping separators between segments of an integer, or convert integers to other formats like letters or Roman numerals.

The $picture argument can take one of several forms. It can be a decimal digit pattern that shows optional and mandatory digits, along with grouping separators. For example, the pattern 0000 represents four mandatory digits, so the result will contain at least four digits (padding with leading zeros if necessary). The pattern #,##0 uses the optional digit sign #, which will not insert leading zeros but will cause a comma to be inserted between each group of three digits that is present.

It can instead be a letter indicating a style of number, for example a to use lowercase letters, I to use uppercase Roman numerals, or w to use lowercase words (“one”, “two”, etc.)

Ordinal numbers (1st, 2nd, etc.) are handled by appending ;o to the end of the $picture argument.

Table 17-10 shows some examples. A much more detailed explanation of the syntax of the format-integer function and its $picture argument can be found in Appendix A, in the “format-integer” section.

Table 17-10. Examples of format-integer
ExampleReturn valuePurpose
format-integer(123, '0000') 0123 Adding leading zeros
format-integer(12345678, '#,##0') 12,345,678 Inserting commas between groups
format-integer(4, 'a') d Using other numbering systems (letters, Roman numerals, etc.)
format-integer(4, 'Ww') Four Using words
format-integer(15, '0;o') 15th Using ordinal numbers

Formatting Decimal Numbers

The format-number function formats a number according to the pattern in the $picture argument, and optionally according to the decimal format in the $decimal-format-name argument. It can be used to format values of any numeric type, but for integers it may be best to use the format-integer function described in the previous section, because of its support for different number styles.

The $picture argument can be used to pad numbers with leading or trailing zeros, insert grouping separators between segments of a number, or show numbers with exponents. It is a decimal digit pattern that shows optional and mandatory digits, along with grouping separators. For example, the pattern 0000 represents four mandatory digits, so the result will contain at least four digits (padding with leading zeros if necessary). The pattern #,##0 uses the optional digit sign #, which will not insert leading zeros but will cause a comma to be inserted between each group of three digits that is present.

The optional $decimal-format-name argument allows the specification of a decimal format, which controls other aspects of formatting the number, including the characters used for the decimal separator and grouping separator, the value to show when the number is NaN, and so on.

Table 17-11 shows some examples. A much more detailed explanation of the syntax of the $picture argument and decimal formats can be found in Appendix A, in the “format-number” section.

Table 17-11. Examples of format-number
ExampleReturn valuePurpose
format-number(12, '0000.00') 0012.00 Padding with leading and trailing zeros
format-number(12345.6, '#,###.0') 12,345.6 Inserting commas between groups
format-number(-1, "#,##0.00;(#,##0.00)") (1.00) Using different formats for negative numbers
format-number(0.18, '0%') 18% Calculating percentages
format-number(12, 'Number: 0') Number: 12 Inserting other characters before or after

The Decimal Format Declaration

A decimal format declaration can be used to override the default values for decimal separators, grouping separators, and other settings used by the format-number function. For example, the default value for a decimal separator is a period, and the default value for a grouping separator is a comma, so a decimal number might be formatted as 1,000.50. To reverse these two characters, so that the number would be formatted as 1.000,50 as is customary in some countries, the following default decimal declaration could be used:

declare default decimal-format decimal-separator="," grouping-separator=".";

The syntax of a decimal format declaration is shown in Figure 17-1. It is possible to specify a default decimal declaration, as shown in the previous example, which is used by the two-argument version of the format-number. If you need to format numbers multiple different ways in the same query, you can also name your decimal format declarations and pass that name to the format-number function as the third argument.

Figure 17-1. Syntax of a decimal format declaration

Any of the available properties shown in the syntax diagram can be specified. Most of the properties must be a single character, except for infinity and NaN, which can be strings of any length. The characters must be unique; it is not acceptable to use the same character for two different purposes. If a property is not included in the decimal format declaration, a default value is used.

An explanation of each of the keywords is provided in Table 17-12.

Table 17-12. Keywords a decimal format declaration
KeywordDefaultAllowed valuesPurpose
decimal-separator Period (“.”)A single characterThe decimal separator character used in both the picture string and the formatted number
grouping-separator Comma (“,”)A single characterThe grouping (thousands) separator character in both the picture string and the formatted number
percent Percent sign (“%”)A single characterThe percent sign character used in both the picture string and the formatted number
per-mille Per-mille character (‰) (Unicode codepoint 2030)A single characterThe per-mille character in both the picture string and the formatted number
zero-digit Zero (“0”)A character that is a digit with the numeric value zeroThe first character of the digit family (i.e., the same 10 consecutive codepoints defined by Unicode) to use as the mandatory digit character in the picture string
digit #A single characterThe optional digit character used in the picture string
pattern-separator Semicolon (“;”)A single characterThe pattern separator symbol, which separates positive and negative sub-pictures in a picture string
exponent-separator Lowercase letter “e”A single characterThe exponent separator character used to separate the mantissa from the exponent in both the picture string and in the formatted number (version 3.1 and later)
infinity “Infinity”A stringThe string used to represent infinity in the formatted number
minus-sign - (hyphen-minus character)A single characterThe character used for the minus sign in the formatted number
NaN “NaN”A stringThe string used to represent “not-a-number” (NaN) in the formatted number
..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
3.16.130.201