Five functions, summarized in Table 17-6, concatenate and split apart strings.
Table 17-6. Functions that concatenate and split apart strings
Name | Description |
---|---|
concat | Concatenates two or more strings |
string-join | Concatenates a sequence of strings, optionally using a separator |
tokenize | Breaks a single string into a sequence of strings, using a specified separator |
codepoints-to-string | Converts a sequence of Unicode code-point values to a string |
string-to-codepoints | Converts a string to a sequence of Unicode code-point values |
Strings can be concatenated using one of two functions: concat or string-join. XQuery does not allow operators such as + or & to concatenate strings (the || concatenation operator became available only with XQuery 3.0). The concat function accepts individual string arguments and concatenates them. This function is unique in that it accepts a variable number of arguments. For example:
concat("a", "b", "c")
returns the string abc. The string-join function, on the other hand, accepts a sequence of strings. For example:
string-join( ("a", "b", "c"), "")
also returns the string abc. In addition, string-join allows a separator to be passed as the second argument. For example:
string-join( ("a", "b", "c"), "/")
returns the string a/b/c.
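As a minimal sketch of the difference between the two functions, the query below builds the same string both ways. The variable $parts and its sample values are hypothetical, chosen only for illustration.

```xquery
(: $parts is hypothetical sample data, not taken from the text :)
let $parts := ("2024", "01", "15")
return (
  (: concat takes each piece as a separate argument :)
  concat($parts[1], "-", $parts[2], "-", $parts[3]),
  (: string-join takes the whole sequence plus a separator :)
  string-join($parts, "-")
)
(: both items in the result are the string 2024-01-15 :)
```

When the number of strings is not known in advance, string-join is usually the more convenient choice, because concat requires each argument to be written out individually.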
Strings can be split apart, or tokenized, using the tokenize function. This function breaks a string into a sequence of strings, using a regular expression to designate the separator character(s). For example:
tokenize("a/b/c", "/")
returns a sequence of three strings: a, b, and c. Regular expressions such as \s, which represents a whitespace character (space, line feed, carriage return, or tab), and \W, which represents a nonword character (anything other than a letter or digit), are often used with this function. A list of useful regular expressions for tokenization can be found in Appendix A in the discussion of the tokenize function. Table 17-7 shows some examples of the tokenize function.
Table 17-7. Examples of the tokenize function
Example |
Return value |
---|---|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
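The following sketch illustrates how the separator pattern affects the result. The input strings are hypothetical and are not the entries of Table 17-7.

```xquery
(
  (: \s+ treats each run of whitespace as one separator :)
  tokenize("a b   c", "\s+"),             (: ("a", "b", "c") :)

  (: adjacent separators produce a zero-length string between them :)
  tokenize("a,b,,c", ","),                (: ("a", "b", "", "c") :)

  (: joining with the same separator restores the original string :)
  string-join(tokenize("a/b/c", "/"), "/")   (: "a/b/c" :)
)
```

Note that the second argument is a regular expression, so a separator that is itself a regex metacharacter (such as . or |) needs to be escaped.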
Strings can be constructed from a sequence of Unicode code-point values (expressed as integers) using the codepoints-to-string function. For example:
codepoints-to-string((97, 98, 99))
returns the string abc. The string-to-codepoints function performs the opposite; it converts a string to a sequence of code points. For example:
string-to-codepoints("abc")
returns a sequence of three integers: 97, 98, and 99.
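Because the two functions are inverses, they can be combined to manipulate a string one character at a time. As one possible illustration (not an example from the original text), the query below reverses a string by reversing its code points with the built-in reverse function:

```xquery
(: convert to code points, reverse the sequence, convert back :)
codepoints-to-string(reverse(string-to-codepoints("abc")))
(: returns the string cba :)
```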