In Chapter 6, I provided a detailed description of the text pattern and its behavior within interleave patterns. There’s another pattern that also describes text nodes and attaches datatypes to them. Even though this pattern will become more useful with the introduction of the datatype libraries in Chapter 8, it’s worth examining its core features right now to be sure you’ve touched on most of the definitions related to nodes.
The data pattern accepts a type attribute (as the value pattern does) and checks that the value is valid per this type. Since our two built-in types accept any value, the data pattern with built-in types is almost equivalent to a text pattern. However, unlike the text pattern, the data pattern doesn’t mean “zero or more text nodes” but instead “one text node.”

The data pattern has been designed to represent data. It’s forbidden in mixed-content models because the authors of the RELAX NG specification considered mixing data and elements poor practice. This restriction applies to all patterns that match a single text node (data, value, and list): they can never be associated with patterns matching sibling elements (elements that can have the same parent element in the same instance document). In practice, this means you can’t use a data pattern to describe content models such as:
<price><currency>USD</currency>20</price>
or:
<price>20<currency>Euro</currency></price>
Because the authors of the RELAX NG specification considered these content models poor practice, they advise reformulating them as:
<price>
  <amount>20</amount>
  <currency>USD</currency>
</price>
or:
<price currency="USD">20</price>
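As a rough illustration, the two reformulated content models could be described with schemas along the following lines. This is only a sketch: it assumes the built-in token type (no datatype library is declared), and a real schema would more likely use a stricter type for the amount once a datatype library is available. The first schema covers the variant with child elements:

```xml
<element name="price" xmlns="http://relaxng.org/ns/structure/1.0">
  <!-- Each child element holds a single text node checked by a data pattern -->
  <element name="amount">
    <data type="token"/>
  </element>
  <element name="currency">
    <data type="token"/>
  </element>
</element>
```

The second covers the variant with a currency attribute. Here the data pattern sits next to an attribute pattern rather than a sibling element, which the restriction doesn't forbid:

```xml
<element name="price" xmlns="http://relaxng.org/ns/structure/1.0">
  <!-- The attribute value is itself described by a data pattern -->
  <attribute name="currency">
    <data type="token"/>
  </attribute>
  <!-- The element's own content: one text node, no sibling elements -->
  <data type="token"/>
</element>
```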
This is the second time RELAX NG has given priority to good practices over the ability to describe all the combinations possible according to the XML recommendation. (The first was the absence of an “unordered noninterleaved” pattern, discussed in Chapter 6.) This case actually increases the complexity of the implementations of RELAX NG processors, which must check that data patterns aren’t included within mixed-content models. The support of data in mixed-content models would have been possible using the general algorithms without any additional complexity. The only benefit for RELAX NG processors is that they can skip whitespace occurring between two elements, but this benefit seems really minimal compared to the possibilities that are lost by this restriction.
This restriction appears to come from a strict distinction between data- and document-oriented applications of XML. Mixed content has been considered an aspect of document-oriented applications, which shouldn’t need datatypes, while datatypes are limited to data-oriented applications, which shouldn’t need mixed content.
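To make the restriction concrete, here is a sketch of the kind of schema it rules out: an attempt to describe the price element with a currency child element followed by the amount as bare text. It assumes the built-in token type, as before. A conforming RELAX NG processor is expected to reject the schema itself, before any instance document is validated; the exact error message depends on the processor.

```xml
<element name="price" xmlns="http://relaxng.org/ns/structure/1.0">
  <element name="currency">
    <data type="token"/>
  </element>
  <!-- Not allowed: a data pattern grouped with a sibling element pattern -->
  <data type="token"/>
</element>
```

Replacing the last data pattern with a text pattern would make the schema acceptable, at the cost of losing any datatype checking on the amount.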