Concatenating and Splitting Strings

Five functions, summarized in Table 17-6, concatenate and split apart strings.

Table 17-6. Functions that concatenate and split apart strings

Name	Description
`concat`	Concatenates two or more strings
`string-join`	Concatenates a sequence of strings, optionally using a separator
`tokenize`	Breaks a single string into a sequence of strings, using a specified separator
`codepoints-to-string`	Converts a sequence of Unicode code-point values to a string
`string-to-codepoints`	Converts a string to a sequence of Unicode code-point values

Concatenating Strings

Strings can be concatenated using one of two functions: concat or string-join. XQuery 1.0 has no string concatenation operator, so operators such as +, &, or || cannot be used to concatenate strings (the || operator was added later, in XQuery 3.0). The concat function accepts individual string arguments and concatenates them. This function is unique among the built-in functions in that it accepts a variable number of arguments. For example:

concat("a", "b", "c")

returns the string abc. The string-join function, on the other hand, accepts a sequence of strings. For example:

string-join( ("a", "b", "c"), "")

also returns the string abc. In addition, string-join allows a separator to be passed as the second argument. For example:

string-join( ("a", "b", "c"), "/")

returns the string a/b/c.
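
The difference between the two functions matters most when the number of strings is not known in advance. The following query is a minimal sketch that builds the same path both ways; the variable $parts and its values are hypothetical sample data, not part of the examples above. It also shows two details of concat: an empty sequence argument is treated as a zero-length string, and a non-string atomic value is converted to a string.

```xquery
(: $parts is hypothetical sample data used only for illustration :)
let $parts := ("usr", "local", "bin")
return (
  (: concat takes each string as a separate argument :)
  concat("/", $parts[1], "/", $parts[2], "/", $parts[3]),
  (: string-join takes the whole sequence plus a separator :)
  concat("/", string-join($parts, "/")),
  (: an empty sequence passed to concat counts as a zero-length string :)
  concat("a", (), "c"),
  (: a non-string atomic value is converted to a string by concat :)
  concat("item-", 42)
)
```

The first two expressions both return /usr/local/bin, the third returns ac, and the fourth returns item-42. The string-join form keeps working if $parts grows or shrinks, whereas the concat form has to be rewritten.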

Splitting Strings Apart

Strings can be split apart, or tokenized, using the tokenize function. This function breaks a string into a sequence of strings, using a regular expression to designate the separator character(s). For example:

tokenize("a/b/c", "/")

returns a sequence of three strings: a, b, and c. Regular expressions such as \s, which represents a whitespace character (space, line feed, carriage return, or tab), and \W, which represents a nonword character (anything other than a letter or digit), are often used with this function. A list of useful regular expressions for tokenization can be found in Appendix A in the discussion of the tokenize function. Table 17-7 shows some examples of the tokenize function.

Table 17-7. Examples of the tokenize function

Example	Return value
`tokenize("a b c", "\s")`	`("a", "b", "c")`
`tokenize("a b c", "\s+")`	`("a", "b", "c")`
`tokenize("a-b--c", "-")`	`("a", "b", "", "c")`
`tokenize("-a-b-", "-")`	`("", "a", "b", "")`
`tokenize("a/ b/ c", "[/\s]+")`	`("a", "b", "c")`
`tokenize("2006-12-25T12:15:00", "[-T:]")`	`("2006", "12", "25", "12", "15", "00")`
`tokenize("Hello, there.", "\W+")`	`("Hello", "there", "")`
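
As the table shows, a separator at the start or end of the string, or two separators in a row, produces zero-length strings in the result. When those are not wanted, a predicate can filter them out. The following is a minimal sketch under that assumption; the variable names and the input value are hypothetical.

```xquery
(: hypothetical input with a leading separator and a doubled separator :)
let $path := "/usr//local/bin"
let $all := tokenize($path, "/")
(: keep only the tokens that are not zero-length strings :)
let $nonEmpty := $all[. != ""]
return (
  count($all),                  (: 5: "", "usr", "", "local", "bin" :)
  count($nonEmpty),             (: 3: "usr", "local", "bin" :)
  string-join($nonEmpty, "-")   (: usr-local-bin :)
)
```

Using the pattern "/+" instead would collapse the doubled separator, but the leading separator would still yield a zero-length first token, so the predicate is the more general way to discard empty tokens.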

Converting Between Code Points and Strings

Strings can be constructed from a sequence of Unicode code-point values (expressed as integers) using the codepoints-to-string function. For example:

codepoints-to-string((97, 98, 99))

returns the string abc. The string-to-codepoints function performs the opposite; it converts a string to a sequence of code points. For example:

string-to-codepoints("abc")

returns a sequence of three integers: 97, 98, and 99.
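
Because the two functions are inverses of each other, they can be combined to manipulate a string one character at a time. The following sketch reverses a string by reversing its code points with the built-in reverse function; the variable names are hypothetical and the example is only one possible use.

```xquery
(: hypothetical example: reverse a string by way of its code points :)
let $word := "abc"
let $points := string-to-codepoints($word)    (: 97, 98, 99 :)
return (
  codepoints-to-string(reverse($points)),     (: cba :)
  (: converting back without changes returns the original string :)
  codepoints-to-string($points) = $word       (: true :)
)
```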
