Appendix A. Data Model and Type System Reference

Introduction

This appendix lists features of the XQuery Data Model and type system, sorted alphabetically and cross-referenced for convenience. See Chapter 2 for an introduction to these topics. This appendix also summarizes the sequence type syntax used by XQuery expressions such as function definitions, typeswitch, and cast as. Most of these operators are described in Chapter 9. Section A.5 covers all of the built-in atomic types, including value ranges and lexical forms.

Overview

The XQuery Data Model consists of sequences of items (including the empty sequence). Items are nodes or atomic values. XQuery defines 7 node kinds and 50 atomic types.

XQuery provides two different ways to refer to types: single type and sequence type. The single type syntax is used only in cast as and castable as expressions. The sequence type syntax is used in all other expressions: typeswitch, instance of, treat as, function definitions, and variable types.

A sequence type consists of either the empty() type test (which matches only the empty sequence) or else a single type together with an optional occurrence indicator. A single type is a subset of sequence type, consisting of a type test with either no occurrence indicator or else the question mark (?). The occurrence indicators are listed in Table A.1.

Table A.1. Occurrence indicators in sequence types

Occurrence indicator	Meaning
(none)	Exactly one
`?`	Zero or one
`*`	Zero or more
`+`	One or more
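
The occurrence indicators in Table A.1 amount to bounds on the length of a sequence. The following is a minimal sketch in Python rather than XQuery; the helper name matches_occurrence is hypothetical, and it checks only the item count, not the item type.

```python
def matches_occurrence(sequence, indicator=""):
    """Return True if the sequence length satisfies the occurrence indicator.

    indicator is "" (exactly one), "?" (zero or one), "*" (zero or more),
    or "+" (one or more). Only the count is checked, not the item type.
    """
    n = len(sequence)
    if indicator == "":
        return n == 1
    if indicator == "?":
        return n <= 1
    if indicator == "*":
        return True
    if indicator == "+":
        return n >= 1
    raise ValueError(f"unknown occurrence indicator: {indicator!r}")
```

For example, an empty Python list (standing in for the empty sequence) satisfies ? and * but not + or the bare type.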

The item type can be an atomic type name (any QName), or a node kind type test. It can also be the special type test item(), which matches any item, or the type test node(), which matches any node.

XQuery defines one type test for each node kind: attribute(), comment(), document-node(), element(), namespace(), processing-instruction(), and text(). Without arguments, these node kind tests match any node of the corresponding kind. Additionally, the attribute(), document-node(), element(), and processing-instruction() node kind tests accept optional arguments. Table A.2 lists all of these. See Chapter 9 for additional examples.

Table A.2. Type names used in single types and sequence types

Type name	Meaning
A qualified name	Atomic type with that name.
`item()`	Any item.
`node()`	Any node kind.
`xdt:anyAtomicType`	Any atomic type.
`xdt:untypedAtomic`	Untyped atomic data.
`attribute()`	Any attribute node.
`attribute(@nodename)`	Any attribute node with the given name. The name may be a qualified name or the wildcard * (matching any name).
`attribute(@nodename, typename)`	Any attribute node with the given name and type. The node and type names may be qualified names or the wildcard * (matching any name or any type, respectively).
`attribute(context@QName)`	Any attribute node matching the type specified by schema context path and type name.
`comment()`	Any comment node.
`document-node()`	Any document node.
`document-node(elementtest)`	Any document node whose content matches the given element node test.
`element()`	Any element node.
`element(nametest)`	Any element node with the given name. The name may be a qualified name or the wildcard * (matching any name).
`element(nodename, typename)`	Any element node with the given name and type. The node and type names may be qualified names or the wildcard * (matching any name or any type, respectively).
`element(nodename, typename nillable)`	Any element node with the given name and type that is nillable. The name and type test may be qualified names or the wildcard *.
`element(context QName)`	Any element node matching the type specified by schema context path and type name.
`namespace()`	Any namespace node.
`processing-instruction()`	Any processing instruction node.
`processing-instruction(string)`	Any processing instruction node with the given target, as in `<?target content?>`.
`text()`	Any text node.

Node Kinds

XQuery supports all seven XML node kinds and uses the XML syntax to construct them; for some node kinds, XQuery also provides an alternate syntax. Table A.3 summarizes both constructor styles. See Chapter 7 for additional examples and construction rules.

Table A.3. Constructors for each node kind

Node kind	Constructor(s)
attribute	`attribute name { content-expr }`; `attribute { name-expr } { content-expr }`; `name="value"` (in element constructor only)
comment	`comment { content-expr }`; `<!-- content -->`
document	`document { content-expr }`
element	`element name { content-expr }`; `element { name-expr } { content-expr }`; `<name attributes />`; `<name attributes>content</name>`
namespace	`namespace prefix { uri-expr }`; `xmlns="uri"` (in element constructor only); `xmlns:prefix="uri"` (in element constructor only)
processing instruction	`processing-instruction { target-expr } { content-expr }`; `<?target content?>`
text	`text { content-expr }`

Every node belongs to a tree (possibly containing only that node). If the root node of the tree is an element, then the tree is a fragment; otherwise, the root node is a document node, and the tree is a document. Every node has a unique node identity, and all nodes are ordered relative to one another (document order). Nodes in the same tree are ordered in left-depth-first order; nodes from different trees may have any order, but the order doesn't change within the execution of a query.

Table A.4 lists the built-in node properties. For some node kinds, the property value is always the empty sequence, written (). For some properties, the value is always a constant (such as node kind). Otherwise, the sequence type of the property is listed.

Table A.4. Node properties by node kind

	Attribute	Comment	Document	Element	Namespace	Processing Instruction	Text
attributes	`()`	`()`	`()`	`attribute()*`	`()`	`()`	`()`
base-uri	`xs:anyURI?`	`xs:anyURI?`	`xs:anyURI?`	`xs:anyURI?`	`xs:anyURI?`	`xs:anyURI?`	`xs:anyURI?`
children	`()`	`()`	`node()*`	`node()*`	`()`	`()`	`()`
namespaces	`()`	`()`	`()`	`namespace()*`	`()`	`()`	`()`
nilled	`()`	`()`	`()`	`xs:boolean`	`()`	`()`	`()`
node-kind	`"attribute"`	`"comment"`	`"document"`	`"element"`	`"namespace"`	`"processing-instruction"`	`"text"`
node-name	`xs:QName`	`()`	`()`	`xs:QName`	`xs:QName?`	`()`	`()`
parent	`element()?`	`(element() | document-node())?`	`()`	`(element() | document-node())?`	`element()?`	`(element() | document-node())?`	`(element() | document-node())?`
string-value	`xs:string`	`xs:string`	`xs:string`	`xs:string`	`xs:string`	`xs:string`	`xs:string`
typed-value	`xdt:anyAtomicType?`	`()`	`()`	`xdt:anyAtomicType?`	`()`	`()`	`xdt:anyAtomicType?`
unique-id	`()`	`()`	`()`	`xs:ID?`	`()`	`()`	`()`

Table A.5. Expressions that access node properties

	Expression	See also
attributes	`attribute::*`; `@*`	Chapter 3, Appendix B
base-uri	`fn:base-uri()`	Appendix C
children	`child::node()`; `node()`	Chapter 3, Appendix B
namespaces	`fn:get-in-scope-namespaces()`	Appendix C
nilled	`instance of element(*, nilled)`; `self::*[@xsi:nil="true"]`	Chapters 2 and 9
node-kind	`typeswitch` node kind tests	Chapters 2 and 9
node-name	`fn:node-name()`; `fn:name()`	Appendix C
parent	`parent::*`; `..`	Chapter 3
string-value	`fn:string()`	Chapter 9, Appendix C
typed-value	`fn:data()`	Chapter 9, Appendix C
unique-id	`fn:unique-id()`	Appendix C

The XQuery Data Model is an abstraction and not all of its properties are directly accessible in XQuery. For example, node identity and node order aren't directly accessible as values. Expressions that access values in the data model are summarized in Table A.5. See Chapter 2 for more information.

Atomic Types

Figure A.1 shows the entire XQuery atomic type hierarchy, and Table A.6 lists the meaning of every type. Arrows indicate inheritance (also known as derivation). Type names in bold are the “types you need to know” from Chapter 2. Grey boxes indicate built-in types that cannot be constructed (because they derive by list). All other types derive by restriction, and can be cast to and constructed.

Figure A.1. The XQuery atomic type hierarchy

Table A.6. The numeric types and their ranges

Type	Meaning	Range
`xs:float`	Single-precision floating-point	m × 2^E, where -2²⁴ < m < 2²⁴ and -149 <= E <= 104
`xs:double`	Double-precision floating-point	m × 2^E, where -2⁵³ < m < 2⁵³ and -1075 <= E <= 970
`xs:decimal`	Arbitrary-precision fixed-point (base 10)	`Implementation-defined`
`xs:integer`	Arbitrary-precision integer	`Implementation-defined`
`xs:positiveInteger`	Arbitrary-precision positive integer	`> 0`
`xs:nonNegativeInteger`	Arbitrary-precision non-negative integer	`>= 0`
`xs:negativeInteger`	Arbitrary-precision negative integer	`< 0`
`xs:nonPositiveInteger`	Arbitrary-precision non-positive integer	`<= 0`
`xs:byte`	1-byte signed integer	-128 to 127 (-2⁷ to 2⁷-1)
`xs:short`	2-byte signed integer	-32768 to 32767 (-2¹⁵ to 2¹⁵-1)
`xs:int`	4-byte signed integer	-2147483648 to 2147483647 (-2³¹ to 2³¹-1)
`xs:long`	8-byte signed integer	-9223372036854775808 to 9223372036854775807 (-2⁶³ to 2⁶³-1)
`xs:unsignedByte`	1-byte unsigned integer	0 to 255 (0 to 2⁸-1)
`xs:unsignedShort`	2-byte unsigned integer	0 to 65535 (0 to 2¹⁶-1)
`xs:unsignedInt`	4-byte unsigned integer	0 to 4294967295 (0 to 2³²-1)
`xs:unsignedLong`	8-byte unsigned integer	0 to 18446744073709551615 (0 to 2⁶⁴-1)
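
The fixed-width integer ranges in Table A.6 all follow from the bit width and signedness of the type. The following Python sketch (not XQuery) derives them; the names FIXED_WIDTH, value_range, and in_range are hypothetical helpers introduced only for illustration.

```python
# (bit width, signed?) for each fixed-width integer type in Table A.6.
FIXED_WIDTH = {
    "xs:byte": (8, True),
    "xs:short": (16, True),
    "xs:int": (32, True),
    "xs:long": (64, True),
    "xs:unsignedByte": (8, False),
    "xs:unsignedShort": (16, False),
    "xs:unsignedInt": (32, False),
    "xs:unsignedLong": (64, False),
}


def value_range(type_name):
    """Return the inclusive (low, high) range of a fixed-width integer type."""
    bits, signed = FIXED_WIDTH[type_name]
    if signed:
        # Two's complement: one bit is spent on the sign.
        return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
    return 0, 2 ** bits - 1


def in_range(type_name, value):
    """Return True if the integer value fits in the named type."""
    low, high = value_range(type_name)
    return low <= value <= high
```

A value outside the computed range is one that a cast to the corresponding derived type would reject.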

The prefix xs is bound to the namespace http://www.w3.org/2001/XMLSchema and the prefix xdt is bound to the namespace http://www.w3.org/2003/11/xpath-datatypes.

Primitive Type Conversions

This section summarizes all of the XQuery type conversions used by the cast as expression when converting to atomic types. Each table shows the conversions used from the row type (source type) to the column type (target type). The letters used in Tables A.7, A.9, and A.11 correspond to the rules listed in Tables A.8, A.10, and A.12, respectively. A blank entry means the value is unchanged by the conversion (only its type changes).

All other type conversions not mentioned below or shown in Tables A.7 through A.12 result in errors. Some implementations detect these as compile-time (static) errors, and others as run-time (dynamic) errors.

Converting to xs:anySimpleType or xdt:untypedAtomic is the same as converting to xs:string, except that list types cannot be converted to xdt:untypedAtomic.

Except for xs:NOTATION, every type can be converted to any of the three types xs:string, xs:anySimpleType, and xdt:untypedAtomic without error, by taking its canonical representation. These three types can also be converted to any other type except xs:NOTATION (possibly resulting in an error), by attempting to parse the value according to the other type's lexical format. See the individual entries for each type in Section A.5 for canonical and lexical representations.

The two binary types, xs:base64Binary and xs:hexBinary, can be converted to xs:boolean; see their descriptions in Section A.5 for details.

Table A.7. XQuery numeric type conversion chart

from \ to	xs:boolean	xs:float	xs:double	xs:integer	xs:decimal
xs:boolean		`A`	`A`	`A`	`A`
xs:float	`B`		`C`	`F`	`G`
xs:double	`B`	`D`		`F`	`G`
xs:integer	`B`	`E`	`C`
xs:decimal	`B`	`E`	`C`	`F`

Table A.8. XQuery numeric type conversion rules

Rule	Meaning
`A`	True converts to the number `1`, false converts to the number `0` in the target type.
`B`	Positive or negative zero and `NaN` convert to false; all others convert to true.
`C`	The string representation of the number is parsed as `xs:double`.
`D`	The string representation is parsed as `xs:float` (possibly losing precision). If the value exceeds the maximum or minimum float value, then the result is `INF` or `-INF`, respectively. If underflow occurs, the result is `0`.
`E`	The string representation is parsed as `xs:float` (possibly losing precision).
`F`	The fractional part, if any, is discarded, and the remaining value converted to `xs:integer`. If the value is too large or too small to be represented, or if the value is infinite or `NaN`, then an error is raised.
`G`	The number is converted to the closest decimal value that the implementation can represent (arbitrarily chosen when there is a tie). If the number is infinite or `NaN`, or if it is too large or too small to be represented, then an error is raised.
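
Rules B and F in Table A.8 can be modeled in a few lines. The following is a minimal Python sketch, not XQuery; the function names are hypothetical, and because Python integers are unbounded, the "too large or too small" error case of rule F is not modeled.

```python
import math


def number_to_boolean(x):
    """Rule B: zero (of either sign) and NaN become false; all else true."""
    return not (x == 0 or math.isnan(x))


def number_to_integer(x):
    """Rule F: discard the fractional part; NaN and infinity are errors."""
    if math.isnan(x) or math.isinf(x):
        raise ValueError("cannot convert NaN or infinity to an integer")
    return math.trunc(x)
```

Note that rule F truncates toward zero rather than rounding, so -3.7 becomes -3 in this sketch.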

Table A.9. XQuery duration type conversion chart

from \ to	xdt:dayTimeDuration	xdt:yearMonthDuration
xs:duration	`A`	`B`
xdt:dayTimeDuration		`X`
xdt:yearMonthDuration	`X`

Table A.10. XQuery duration type conversion rules

Rule	Meaning
`A`	Only the day and time parts of the duration are kept.
`B`	Only the year and month parts of the duration are kept.
`X`	No conversion possible (error).

Table A.11. XQuery calendar type conversion chart

from \ to	xs:dateTime	xs:date	xs:time	xs:gYear	xs:gYearMonth	xs:gMonth	xs:gDay	xs:gMonthDay
xs:dateTime		`C`	`D`	`E`	`F`	`G`	`H`	`I`
xs:date	`A`		`X`	`E`	`F`	`G`	`H`	`I`
xs:time	`B`	`X`		`X`	`X`	`X`	`X`	`X`

Table A.12. XQuery calendar type conversion rules

Rule	Meaning
`A`	The `xs:dateTime` has time `00:00:00`, remaining parts from the `xs:date`.
`B`	The `xs:dateTime` has date equal to `fn:current-date()`.
`C`	Only the date and time zone parts of the original `xs:dateTime` value.
`D`	Only the time and time zone parts of the original `xs:dateTime` value.
`E`	Only the year and time zone parts of the original value.
`F`	Only the year, month, and time zone parts of the original value.
`G`	Only the month and time zone parts of the original value.
`H`	Only the day and time zone parts of the original value.
`I`	Only the month, day, and time zone parts of the original value.
`X`	No conversion possible (error).

Of course, every type can be converted to itself or any of its supertypes without changing value. Types can also be converted to derived types, provided that they satisfy the additional restrictions of the derived type (otherwise an error is raised).

Built-in Atomic Types

In this section there is a one-line description of each concrete type followed by its type constructor, its lexical format, and its canonical form (when it differs from the lexical one). The lexical form is a regular expression describing the formats that are accepted when converting from string to this type. The canonical form describes the format used when converting to string. Except for the xs:NOTATION type, every type has a lexical and canonical form. These forms are described using the XQuery regular expression syntax (see Appendix D).

The three types xdt:anyAtomicType, xs:anySimpleType, and xs:anyType are abstract and so are not listed here. No atomic value has exactly one of these types, although every atomic type is derived from one of them. See Chapter 2 for details.

The xs:anyURI type represents a Uniform Resource Identifier (URI) Reference, as defined by RFC 2396 and RFC 2732 (see the Bibliography for complete references). In XQuery, the xs:anyURI type is used to name collations, collections, and documents.

Example A.1. xs:anyURI

xs:anyURI('http://www.awprofessional.com/')
xs:anyURI('ftp://ftp.w3.org/')
xs:anyURI('urn:relative/path')

An xs:anyURI value can be constructed from any string value that contains only ASCII characters, except for the following excluded characters: the special punctuation characters <>"{}|^` and control characters (U+0000 through U+001F and U+007F). The space character (U+0020) is allowed, but escaping it as %20 is recommended. In addition, the punctuation characters ;/?:@&=+$, are reserved, so escaping them is also recommended.

The sequence %HH where H is a hexadecimal digit is used to encode the character U+00HH. Disallowed characters (including non-ASCII characters) must be first converted to UTF-8, then hex-encoded as necessary to be represented as xs:anyURI values.
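
The escaping procedure described above (convert to UTF-8, then hex-encode each byte as %HH) can be illustrated outside XQuery. The following Python sketch uses the standard library function urllib.parse.quote; the wrapper name escape_for_uri is hypothetical, and passing safe="" is an assumption made here so that reserved characters such as / are escaped as well.

```python
from urllib.parse import quote


def escape_for_uri(text):
    """Percent-encode text for use inside a URI value.

    quote() encodes the text as UTF-8 and writes every byte that is not a
    letter, digit, or one of "_.-~" as %HH. With safe="" the slash is
    escaped too.
    """
    return quote(text, safe="")
```

With this sketch, a space becomes %20, and the non-ASCII character é (U+00E9) becomes the two escaped UTF-8 bytes %C3%A9.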

Note that xs:anyURI is a distinct type from xs:string, and isn't a subtype of it. A cast or constructor function must be used to convert one to the other.

Unfortunately, URI values are notoriously difficult to work with; for example, on some systems file system paths are case-sensitive while on others they are not. In general, it isn't possible to determine whether two URI values point to the same resource or not. Consequently, implementations are given some latitude in how they handle URI values. For example, some implementations may disallow xs:anyURI("foo") but allow xs:anyURI("urn:foo"), while others may allow both.

For more information, see RFC 2396 and RFC 2732 (complete references are provided in the Bibliography).

See also: fn:base-uri(), fn:collection(), fn:default-collation(), fn:doc(), fn:escape-uri(), and fn:resolve-uri() in Appendix C.

The xs:base64Binary type represents base64-encoded binary data, as defined by RFC 2045. It consists of a sequence of ASCII characters A-Z, a-z, 0-9, the punctuation characters +/ and possibly one or two trailing = characters for padding; all other characters are ignored. Base64-encoding is a popular format for embedding arbitrary binary data in XML (among other things).

Each character represents 6 bits of data (most significant bit first); four consecutive characters thus encode 3 bytes of binary data. When the original input data has a length not divisible by three, trailing = symbols are used to pad the base64 encoding of the data. Trailing bits are zero-padded as necessary.

Example A.2. xs:base64Binary

xs:base64Binary("ABBA")     (: encodes the three bytes 0, 16, 64 :)
xs:base64Binary("+XQuerY=") (: encodes 249, 116, 46, 122, 182 :)
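
The byte values in the comments of Example A.2 can be checked with any base64 decoder. The following is a small Python sketch (not XQuery) that uses the standard base64 module; the helper name decode_to_bytes is hypothetical.

```python
import base64


def decode_to_bytes(text):
    """Decode base64 text and return the byte values as a list of ints."""
    # b64decode returns a bytes object; list() exposes the numeric values.
    return list(base64.b64decode(text))


print(decode_to_bytes("ABBA"))      # [0, 16, 64]
print(decode_to_bytes("+XQuerY="))  # [249, 116, 46, 122, 182]
```

In the second value, the single trailing = signals that the final group of four characters carries only two bytes of data.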

XQuery doesn't provide any functions for working with base64- or hex-encoded binary data, but Chapter 10 includes a few user-defined functions that do so.

See also: xs:hexBinary and Chapter 10.

The xs:boolean type represents a single boolean value (true or false).

In addition to the numeric and string type conversions summarized in Table A.7, values of type xs:hexBinary and xs:base64Binary can also be converted to xs:boolean. They convert to true if their string value is "1", false if their string value is "0", and otherwise result in an error.

Example A.3. xs:boolean

xs:boolean("true")    => true
xs:boolean("1")       => true
xs:boolean(0)         => false
xs:boolean(0.0)       => false
xs:boolean(42)        => true
xs:boolean("ja")      => error
xs:boolean("")        => error

Casting to xs:boolean or using the xs:boolean() constructor differs from using the fn:boolean() function. Casting to xs:boolean converts "0" and "false" to false and "1" and "true" to true, and raises an error for every other string value. In contrast, the built-in fn:boolean() function returns true for all non-empty strings and false for the empty string. (The fn:boolean() function can also be applied to non-singleton and non-atomic values, such as the empty sequence or nodes, while xs:boolean() cannot.)
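
The difference between the two string conversions can be modeled side by side. The following is a minimal Python sketch, not XQuery; both function names are hypothetical, only single string arguments are handled, and whitespace handling is not modeled.

```python
def cast_string_to_boolean(s):
    """Model of casting a string to xs:boolean: only four strings are valid."""
    if s in ("1", "true"):
        return True
    if s in ("0", "false"):
        return False
    raise ValueError(f"cannot cast {s!r} to xs:boolean")


def boolean_of_string(s):
    """Model of fn:boolean() on a single string: true unless it is empty."""
    return len(s) > 0
```

Under this model the string "false" casts to false but has a true result from boolean_of_string, because it is non-empty.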
