In their simplest form, pattern facets may be used as enumerations
applied to the lexical space rather than to the value space.
If, for instance, you have a byte value that can take only the values
1, 5, or 15, the classical way to define such a datatype is to use
RELAX NG’s choice pattern:
<choice> <value type="byte">1</value> <value type="byte">5</value> <value type="byte">15</value> </choice>
or:
element foo { xsd:byte "1" | xsd:byte "5" | xsd:byte "15" }
This is the normal way to define the datatype when it should keep
both the lexical space and the value space of xsd:byte. Because the
values are compared in the value space, it is flexible enough to
accept instance documents with values such as 1, 5, and 15, but also
01 or 0000005.
As far as validation alone is concerned, if you want to exclude the
variants with leading zeros, you can use another datatype such as
token instead of xsd:byte in your choice pattern:
<choice> <value type="token">1</value> <value type="token">5</value> <value type="token">15</value> </choice>
or:
xsd:token "1" | xsd:token "5" | xsd:token "15"
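The difference between the two enumerations can be mimicked outside
RELAX NG. The following Python sketch is only an illustration, not the
way a RELAX NG processor is implemented: the function names are
hypothetical, the xsd:byte lexical check is a simplified assumption
(optional sign, decimal digits, range -128 to 127), and whitespace
collapsing is approximated with Python's str.split().

```python
import re

# Simplified assumption of the xsd:byte lexical form: optional sign, then digits.
BYTE_LEXICAL = re.compile(r"[+-]?[0-9]+")
ALLOWED_VALUES = {1, 5, 15}
ALLOWED_TOKENS = {"1", "5", "15"}


def collapse(text: str) -> str:
    """Approximate whitespace collapsing: trim and squeeze runs of whitespace."""
    return " ".join(text.split())


def matches_byte_enum(text: str) -> bool:
    """Compare in the value space, as the xsd:byte enumeration does."""
    lexical = collapse(text)
    if not BYTE_LEXICAL.fullmatch(lexical):
        return False
    value = int(lexical)
    return -128 <= value <= 127 and value in ALLOWED_VALUES


def matches_token_enum(text: str) -> bool:
    """Compare the collapsed string itself, as the token enumeration does."""
    return collapse(text) in ALLOWED_TOKENS


if __name__ == "__main__":
    for sample in ["1", "01", "0000005", "15", "7"]:
        print(sample, matches_byte_enum(sample), matches_token_enum(sample))
```

With these helpers, 01 and 0000005 pass the byte comparison because
they denote the values 1 and 5, while the token comparison rejects them
because the strings themselves differ.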
However, you might have good reasons to use xsd:byte. For example,
you can use it if you’re interested in type annotation and want to use
a RELAX NG processor supporting type annotation. That processor can
usefully report the datatype as xsd:byte and not xsd:token.
One of the peculiarities of the pattern facet is that it is the only
facet constraining the lexical space. If you have an application that
doesn’t like leading zeros, you can use pattern facets instead of
enumerations to define your datatype:
<data type="byte"> <param name="pattern">1|5|15</param> </data>
or:
xsd:byte {pattern = "1|5|15"}
Here, I am still using the xsd:byte datatype with its associated
semantics, but its lexical space is now constrained to accept only 1,
5, and 15, leaving out any variation that has the same value but a
different lexical representation.
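As a rough illustration of this combination, the Python sketch below
checks the lexical form against the expression first and the byte
range second. It is a simplified model under stated assumptions, not a
RELAX NG implementation: the function name is hypothetical, and
Python's re module stands in for the W3C XML Schema regular expression
engine, which is adequate only for an expression as simple as this one.

```python
import re

# Same expression as the pattern facet; it contains no syntax that
# differs between Python's re module and W3C XML Schema patterns.
PATTERN = re.compile(r"1|5|15")


def matches_restricted_byte(lexical: str) -> bool:
    """Accept only lexical forms that match the pattern and fit in a byte."""
    if not PATTERN.fullmatch(lexical):  # the facet must match the whole string
        return False
    return -128 <= int(lexical) <= 127


if __name__ == "__main__":
    for sample in ["1", "5", "15", "01", "0000005"]:
        print(sample, matches_restricted_byte(sample))
```

The forms 01 and 0000005 are rejected here even though they denote
allowed values, because the check is made on the characters and not on
the number.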
The pattern must match the whole lexical representation, which is an
important difference from Perl regular expressions, on which W3C XML
Schema pattern facets are built. A Perl expression such as /15/
matches any string containing 15, while the W3C XML Schema pattern
facet matches only the string equal to 15. The Perl expression
equivalent to this pattern facet is thus /^15$/.
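The same contrast exists in most regular expression libraries. As a
hedged illustration in Python (not Perl, and not a schema processor),
re.search behaves like the unanchored /15/, while re.fullmatch behaves
like the pattern facet; the two function names are hypothetical.

```python
import re


def contains_15(text: str) -> bool:
    """Unanchored search, comparable to the Perl expression /15/."""
    return re.search(r"15", text) is not None


def is_exactly_15(text: str) -> bool:
    """Whole-string match, comparable to the pattern facet "15"."""
    return re.fullmatch(r"15", text) is not None


if __name__ == "__main__":
    for sample in ["15", "2015", "150"]:
        print(sample, contains_15(sample), is_exactly_15(sample))
```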
This example has been carefully chosen to avoid using any
metacharacters within pattern facets other than the | alternation
operator. The metacharacters are:
. \ ? * + { } ( ) [ ] and |.
You’ll see the meaning of these characters later in
this chapter; however, for the moment, you just need to know that
each of these characters needs to be escaped by a leading backslash
to be used as a literal. For instance, to define a similar datatype
for a decimal whose lexical space is limited to 1 and 1.5, write:
<data type="decimal"> <param name="pattern">1|1\.5</param> </data>
or:
xsd:decimal {pattern = "1|1\.5"}
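To see why the backslash before the dot matters, the Python sketch
below compares the expression with and without the escape. Python's re
module is used here only as a stand-in for the W3C XML Schema
expression engine, and the helper name accepts is hypothetical.

```python
import re

UNESCAPED = re.compile(r"1|1.5")   # "." is a metacharacter: any character
ESCAPED = re.compile(r"1|1\.5")    # "\." is a literal dot


def accepts(pattern: re.Pattern, lexical: str) -> bool:
    """Return True when the whole lexical form matches the pattern."""
    return pattern.fullmatch(lexical) is not None


if __name__ == "__main__":
    for sample in ["1", "1.5", "125", "1x5"]:
        print(sample, accepts(UNESCAPED, sample), accepts(ESCAPED, sample))
```

Without the escape, a form such as 125 slips through, and 125 is also a
perfectly valid decimal, so the mistake would not be caught by the
datatype itself.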
A common source of errors is escaping normal characters, which
shouldn’t be escaped: you’ll see later that a leading backslash
changes their meaning. For instance, \P does not stand for the
character P: \p and \P introduce Unicode category escapes, so that
\p{P} matches any Unicode punctuation character.