The replace function allows parenthesized sub-expressions (also known as groups) to be referenced by number in the replacement string. In the $replacement string, you can use the variables $1, $2, $3, etc. to represent (in order) the parenthesized expressions in $pattern. This is very useful when replacing strings on the condition that they come directly before or after another string. For example, you might want to change instances of the word Chap to the word Sec, but only those that are followed by a space and a digit. This technique can also be used to reformat data for presentation. Table 18-16 shows some examples.
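As a rough illustration of the two uses just described, the following sketch applies replace to input strings made up for this example. They are assumptions chosen for illustration, not values taken from the table.

```xquery
(: Change "Chap" to "Sec" only where it is followed by a space and a digit.
   $1 re-inserts the digit captured by the parenthesized group. :)
replace("Chap 2 and Chapter 3", "Chap (\d)", "Sec $1"),
(: → "Sec 2 and Chapter 3" :)

(: Reformat a ten-digit phone number using three groups. :)
replace("2315551212", "(\d{3})(\d{3})(\d{4})", "($1) $2-$3")
(: → "(231) 555-1212" :)
```

In the first call, "Chapter 3" is left alone because "Chap" is not followed by a space and a digit there.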
get-matches-and-non-matches
The regular expression capabilities of XQuery allow you to determine whether a string matches a regular expression and to replace matches in a string. However, one feature it does not directly provide is the ability to retrieve the parts of a string that match a pattern. In XSLT 2.0, this can be achieved using the xsl:analyze-string instruction, which has no equivalent in XQuery. It can nevertheless be accomplished using the get-matches-and-non-matches function below, which returns a sequence of alternating match and non-match elements containing the strings that do and do not match a pattern. It starts with the entire string, constructs an element depending on whether the string begins with a match or a non-match, and recursively calls itself with the rest of the string.
This function depends on two other functions, also listed here:
index-of-match-first
This function determines where the first match (if any) occurs in the string. It does this by tokenizing the string and determining the length of the first token.
replace-first
This function replaces the first match in the string by prepending an anchor and a reluctant wildcard to the pattern. It is used by get-matches-and-non-matches to help determine the length of any particular match.
    declare namespace functx = "http://www.functx.com";

    (: Returns alternating match and non-match elements for $string :)
    declare function functx:get-matches-and-non-matches
      ($string as xs:string?, $regex as xs:string) as element()* {
      let $iomf := functx:index-of-match-first($string, $regex)
      return
        if (empty($iomf))
        then <non-match>{$string}</non-match>
        else if ($iomf > 1)
        then (<non-match>{substring($string, 1, $iomf - 1)}</non-match>,
              functx:get-matches-and-non-matches(
                substring($string, $iomf), $regex))
        else
          (: the string starts with a match; measure its length :)
          let $length := string-length($string) -
                         string-length(functx:replace-first($string, $regex, ''))
          return (<match>{substring($string, 1, $length)}</match>,
                  if (string-length($string) > $length)
                  then functx:get-matches-and-non-matches(
                         substring($string, $length + 1), $regex)
                  else ())
    };

    (: Position of the first match, or the empty sequence if there is none :)
    declare function functx:index-of-match-first
      ($arg as xs:string?, $pattern as xs:string) as xs:integer? {
      if (matches($arg, $pattern))
      then string-length(tokenize($arg, $pattern)[1]) + 1
      else ()
    };

    (: Replaces only the first match of $pattern in $arg :)
    declare function functx:replace-first
      ($arg as xs:string?, $pattern as xs:string,
       $replacement as xs:string) as xs:string {
      replace($arg, concat('(^.*?)', $pattern),
              concat('$1', $replacement))
    };
For example, calling this function with:
    functx:get-matches-and-non-matches('abc123def', '\d+')
returns a sequence of three elements:
    <non-match>abc</non-match>
    <match>123</match>
    <non-match>def</non-match>
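To make the roles of the two helper functions more concrete, the sketch below evaluates the built-in calls they wrap directly on the same input. The variable names are introduced here for illustration only and are not part of the functx functions.

```xquery
let $string := 'abc123def'
let $regex := '\d+'

(: Position of the first match: everything before it is the first token. :)
let $first-token := tokenize($string, $regex)[1]      (: 'abc' :)
let $index := string-length($first-token) + 1         (: 4 :)

(: Length of a match at the start of a string: remove the first match
   and compare the lengths before and after. :)
let $rest := substring($string, $index)               (: '123def' :)
let $without-first :=
  replace($rest, concat('(^.*?)', $regex), '$1')      (: 'def' :)
let $length := string-length($rest) - string-length($without-first)

return ($index, substring($rest, 1, $length))
(: → 4, "123" :)
```

Because the anchor ties the pattern to the start of the string, replace can match at most once here, which is what limits the replacement to the first match.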
Table 18-16. Examples of using replacement variables
Example | Return value |
---|---|
| |
| |
| |
| |
| |
The variables are bound in order from left to right based on the position of the opening parenthesis. The variable $0 can be used to represent the string matched by the entire regular expression. If the variable number exceeds the number of parenthesized sub-expressions in the regular expression, it is replaced with a zero-length string.
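The three rules in this paragraph can be sketched as follows. The input strings and patterns are assumptions made up for this illustration.

```xquery
(: Groups are numbered by the position of their opening parenthesis,
   so the outer group is $1 and the groups nested inside it are $2 and $3. :)
replace("2006-10-18", "((\d{4})-(\d{2}))-(\d{2})", "$1|$2|$3|$4"),
(: → "2006-10|2006|10|18" :)

(: $0 stands for the string matched by the whole expression. :)
replace("abc", "b", "[$0]"),
(: → "a[b]c" :)

(: $2 exceeds the single group in the pattern,
   so it is replaced with a zero-length string. :)
replace("abc", "(b)", "[$2]")
(: → "a[]c" :)
```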
If you wish to include the character $ in your replacement string, you must escape it with a backslash (i.e., \$), as shown in the fifth example. Backslashes must also be escaped in the $replacement string, as in \\.
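A minimal sketch of both escapes, using input values assumed for this illustration:

```xquery
(: \$ produces a literal dollar sign; $1 is still the first group. :)
replace("25", "(\d+)", "\$$1.00"),
(: → "$25.00" :)

(: \\ in the replacement produces a single literal backslash. :)
replace("a/b", "/", "\\")
(: → "a\b" :)
```

XQuery string literals do not treat the backslash as an escape character, so the literal "\\" really does contain two backslashes, which the replace function then reads as one escaped backslash.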