Representing Individual Characters

A single character can be used to represent itself in a regular expression. In this case, it is known as a normal character. For example, the regular expression d matches the letter d, and def matches the string def, as you might expect. Each of the three single characters (d, e, and f) is its own atom, and it can have a quantifier associated with it. For example, the regular expression d+ef matches the strings def, ddef, dddef, etc.
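As a rough illustration, the following sketch checks the d+ef example. It uses Python's re module as a stand-in for an XQuery processor, which is an assumption made only to keep the example runnable; the two regex flavors agree on the constructs shown here. The helper name matches_whole is hypothetical, and re.fullmatch is used so that the whole string must match the pattern.

```python
import re

# Each normal character is its own atom; the + quantifier applies only to d.
pattern = re.compile(r"d+ef")

def matches_whole(s: str) -> bool:
    """Return True if the entire string s matches d+ef."""
    return pattern.fullmatch(s) is not None

# Try the strings from the text plus two that should fail.
results = {s: matches_whole(s) for s in ["def", "ddef", "dddef", "ef", "deef"]}
print(results)
```

The strings ef and deef are rejected because the pattern requires at least one d and exactly one e.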

Certain characters have another meaning in a regular expression and must be escaped in order to be taken literally. For example, the asterisk (*) is treated as a quantifier unless it is escaped. These characters, called metacharacters, must be escaped (except when they are within square brackets): ., \, ?, *, +, |, ^, $, {, }, (, ), [, and ].
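The difference between an unescaped and an escaped asterisk can be sketched as follows. Python's re module again stands in for an XQuery processor (an assumption for runnability), and the helper whole is a hypothetical name. The third pattern shows the square-bracket exception, where the asterisk needs no backslash.

```python
import re

# Unescaped, * is a quantifier: zero or more f characters, then o.
quantified = re.compile(r"f*o")
# Escaped, \* stands for a literal asterisk.
literal = re.compile(r"f\*o")
# Inside square brackets, * is taken literally without a backslash.
bracketed = re.compile(r"f[*]o")

def whole(pattern: re.Pattern, s: str) -> bool:
    """Return True if the entire string s matches the pattern."""
    return pattern.fullmatch(s) is not None

checks = {
    "quantified/ffo": whole(quantified, "ffo"),
    "quantified/f*o": whole(quantified, "f*o"),
    "literal/f*o": whole(literal, "f*o"),
    "literal/ffo": whole(literal, "ffo"),
    "bracketed/f*o": whole(bracketed, "f*o"),
}
print(checks)
```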

These characters are escaped by preceding them with a backslash. This is referred to as a single-character escape because it matches exactly one character. For convenience, there are three additional single-character escapes for the whitespace characters tab (\t), line feed (\n), and carriage return (\r). Table 18-4 lists the single-character escapes.

Table 18-4. Single-character escapes

Escape sequence    Character
\\                 \
\|                 |
\.                 .
\-                 -
\^                 ^
\$ [a]             $
\?                 ?
\*                 *
\+                 +
\{                 {
\}                 }
\(                 (
\)                 )
\[                 [
\]                 ]
\n                 Line feed (#xA)
\r                 Carriage return (#xD)
\t                 Tab (#x9)

[a] This single-character escape can be used in XQuery, but not in XML Schema regular expressions.
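A few of the escapes in Table 18-4 can be tried out with the sketch below. It assumes Python's re module as a stand-in for an XQuery processor; the sample strings and variable names are hypothetical. The code points are built with chr so that the pattern's escape is what is being tested, not a Python string escape.

```python
import re

# \t, \n, and \r in the pattern stand for tab (#x9), line feed (#xA),
# and carriage return (#xD). Raw strings pass the backslashes through to re.
tab = re.compile(r"a\tb")
crlf = re.compile(r"line1\r\nline2")
# \$ and \. stand for a literal dollar sign and a literal period.
price = re.compile(r"\$5\.00")

tab_ok = tab.fullmatch("a" + chr(0x9) + "b") is not None
space_ok = tab.fullmatch("a b") is not None
crlf_ok = crlf.fullmatch("line1" + chr(0xD) + chr(0xA) + "line2") is not None
price_ok = price.fullmatch("$5.00") is not None
price_wrong = price.fullmatch("$5x00") is not None

print(tab_ok, space_ok, crlf_ok, price_ok, price_wrong)
```

Because the period is escaped, $5x00 is rejected; an unescaped period would have accepted it.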

You can also use the standard XML syntax for character references and predefined entity references in regular expressions, as long as they appear in quoted strings. For example, a space can be represented as &#x20;, and a less-than symbol (<) can be represented as &lt;. This can be useful for special characters. The syntax is described further in "XML Entity and Character References" in Chapter 21.
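In XQuery, such references are expanded when the quoted string is parsed, so the regular expression engine sees the actual character. Python string literals have no equivalent syntax, so the sketch below uses html.unescape as a stand-in for that parsing step. This is an assumption for illustration only: html.unescape also accepts HTML named entities that XML does not predefine. The helper name expand is hypothetical.

```python
import html
import re

def expand(literal: str) -> str:
    """Expand character and entity references, standing in for the
    expansion an XQuery processor performs on a quoted string."""
    return html.unescape(literal)

space = expand("&#x20;")           # the space character
less_than = expand("&lt;")         # the less-than symbol
pattern_text = expand("d&#233;f")  # d, e with acute accent (#xE9), f

pattern = re.compile(pattern_text)
accent_ok = pattern.fullmatch("d" + chr(0xE9) + "f") is not None
plain_ok = pattern.fullmatch("def") is not None

print(repr(space), less_than, pattern_text, accent_ok, plain_ok)
```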

Table 18-5 shows some examples of representing individual characters in regular expressions.

Table 18-5. Representing individual characters

Regular expression    Strings that match    Strings that do not match
d                     d                     g
d+efg+                defg, ddefgg          defgefg, deffgg
defg                  defg                  d, efg
d|e|f                 d, e, f               g
f*o                   fo, ffo, fffo         f*o
f\*o                  f*o                   fo, ffo, fffo
d&#233;f              déf                   def, df
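The rows of Table 18-5 can be checked mechanically with a sketch like the one below. It assumes that the table means whole-string matching, which is why re.fullmatch is used, and it relies on Python's re module as a stand-in for an XQuery processor. The escaped asterisk is written as f\*o, the character reference &#233; is written as the code point it denotes, and the names rows and row_holds are hypothetical.

```python
import re

E_ACUTE = chr(0xE9)  # the character that &#233; refers to

# (pattern, strings expected to match, strings expected not to match)
rows = [
    (r"d", ["d"], ["g"]),
    (r"d+efg+", ["defg", "ddefgg"], ["defgefg", "deffgg"]),
    (r"defg", ["defg"], ["d", "efg"]),
    (r"d|e|f", ["d", "e", "f"], ["g"]),
    (r"f*o", ["fo", "ffo", "fffo"], ["f*o"]),
    (r"f\*o", ["f*o"], ["fo", "ffo", "fffo"]),
    ("d" + E_ACUTE + "f", ["d" + E_ACUTE + "f"], ["def", "df"]),
]

def row_holds(pattern, should_match, should_not_match):
    """Return True if every listed string behaves as the row claims."""
    compiled = re.compile(pattern)
    return (all(compiled.fullmatch(s) is not None for s in should_match)
            and not any(compiled.fullmatch(s) is not None
                        for s in should_not_match))

all_rows_hold = all(row_holds(*row) for row in rows)
print(all_rows_hold)
```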
