You’ll see the restrictions of RELAX NG in Chapter 15, but I need to mention the principal restriction related to the interleave compositor, as it might affect you at some point if you combine mixed-content models. Let’s extend our title element to allow not only links (a elements) but also bold characters marked by a b element:
<title xml:lang="en">Being a <a href="http://dmoz.org/Recreation/Pets/Dogs/">Dog</a> Is a <b>Full-Time</b> <a href="http://dmoz.org/Business/Employment/Job_Search/">Job</a> </title>
Because text can appear before the a elements, between a and b, and after the b element, you might be tempted to write the following schema:
<element name="title">
  <interleave>
    <attribute name="xml:lang"/>
    <text/>
    <zeroOrMore>
      <element name="a">
        <attribute name="href"/>
        <text/>
      </element>
    </zeroOrMore>
    <text/>
    <zeroOrMore>
      <element name="b">
        <text/>
      </element>
    </zeroOrMore>
    <text/>
  </interleave>
</element>
or:
element title {
  attribute xml:lang {text}
  & text
  & element a {attribute href {text}, text} *
  & text
  & element b {text} *
  & text
}
Running the Jing validator against this schema raises the following error:
Error at URL "file:/home/vdv/xmlschemata-cvs/books/relaxng/examples/RngMorePatterns/interleave-restriction2.rnc", line number 1, column number 2: both operands of "interleave" contain "text"
This error results because there can be only one text pattern in each interleave pattern. You have seen that a text pattern matches zero or more text nodes, so a single text pattern already allows text anywhere among the interleaved elements. In this case, the remedy is simple enough: the schema must be rewritten as:
<element name="title">
  <interleave>
    <attribute name="xml:lang"/>
    <text/>
    <zeroOrMore>
      <element name="a">
        <attribute name="href"/>
        <text/>
      </element>
    </zeroOrMore>
    <zeroOrMore>
      <element name="b">
        <text/>
      </element>
    </zeroOrMore>
  </interleave>
</element>
or:
element title {
  attribute xml:lang {text}
  & text
  & element a {attribute href {text}, text} *
  & element b {text} *
}
This new schema is perfectly valid and does what we tried to do with our invalid schema.
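The same content model can also be sketched with the mixed pattern, which is shorthand for interleaving a single text pattern with its content. The compact-syntax variant below is an illustration rather than one of the original examples. Keeping the attribute outside the mixed pattern is a readability choice made here, not a requirement:

```rnc
# Illustrative variant: mixed { p } stands for text & p,
# so only one text pattern takes part in the interleave.
element title {
  attribute xml:lang {text},
  mixed {
    element a {attribute href {text}, text} *
    & element b {text} *
  }
}
```

The text patterns inside the a and b elements are not affected, because the restriction applies only to text patterns that are direct operands of the interleave, not to those nested inside child elements.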
In this example, diagnosing the problem was very simple, but in practice, the situation is often more complex. There can be conflicting text patterns belonging to different subpatterns of interleave or mixed patterns. When using pattern libraries (as shown in Chapter 10), the conflicting text patterns often belong to different RELAX NG grammars, making it still more difficult to pinpoint the problem. To make it even worse, the error messages from the RELAX NG processors are often quite cryptic, in this case telling you there are conflicting text patterns in interleave patterns without saying where they come from. Unfortunately, for now at least, you’ll have to figure this out by yourself.
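As a hedged illustration of how such a conflict can hide, assume two named patterns, hypothetically called links and bold here, that each look harmless on their own and could just as well live in separate grammars. Each one is mixed, so each contributes its own text pattern, and interleaving them would be expected to trigger the same kind of error as the one shown earlier:

```rnc
# Hypothetical pattern names; each definition is fine in isolation.
links = mixed { element a {attribute href {text}, text} * }
bold = mixed { element b {text} * }

# Both operands of & contain text, so this combination is rejected.
start = element title { attribute xml:lang {text}, (links & bold) }
```

One way out, under the same assumptions, is to remove mixed from both definitions and apply it once at the point where they are combined:

```rnc
links = element a {attribute href {text}, text} *
bold = element b {text} *

# A single text pattern, introduced by the one mixed wrapper.
start = element title { attribute xml:lang {text}, mixed { links & bold } }
```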
The reason behind the restriction of only one text pattern in each interleave pattern is to optimize RELAX NG implementations using the derivative method described by James Clark. When processing mixed-content models, instead of processing each text node, these implementations can simply remember that the content is mixed and ignore each text node. To do so, the implementation needs to be able to determine quickly whether a content model is mixed. That’s where the restriction makes a difference in terms of programming complexity and execution speed.