In the preceding sections, you have seen how an external grammar can be used as a single pattern. This is useful in cases in which you want to include a content model described by an external schema at a single point, not unlike when you mount a Unix filesystem. The description contained in the external grammar is mounted at the point where you make your reference.
The main drawback to this approach is that you can’t individually reuse the definitions contained in the external schema. To do so, you need a new pattern, with a different meaning, which will let you control how two grammars are merged into a single one.
In the simplest case, you will want to reuse patterns defined in common libraries of patterns without modifying them. Let’s say we have defined a grammar with some common patterns, common.rng, which can be reused in many different schemas, such as:
<?xml version="1.0" encoding="UTF-8"?>
<grammar xmlns="http://relaxng.org/ns/structure/1.0"
         datatypeLibrary="http://www.w3.org/2001/XMLSchema-datatypes">
  <define name="element-name">
    <element name="name">
      <data type="token" datatypeLibrary=""/>
    </element>
  </define>
  <define name="element-born">
    <element name="born">
      <data type="date"/>
    </element>
  </define>
  <define name="attribute-id">
    <attribute name="id">
      <data type="ID"/>
    </attribute>
  </define>
  <define name="content-person">
    <ref name="attribute-id"/>
    <ref name="element-name"/>
    <optional>
      <ref name="element-born"/>
    </optional>
  </define>
</grammar>
or common.rnc, in the compact syntax:
element-name = element name { token }
element-born = element born { xsd:date }
attribute-id = attribute id { xsd:ID }
content-person = attribute-id, element-name, element-born?
These schemas are obviously not meant to be used as standalone schemas: they have no start patterns and would be invalid. However, they contain definitions that can be used to write the schema of our library. To employ these definitions, use include patterns and provide a supporting framework. In the XML syntax, this looks like:
<?xml version="1.0" encoding="UTF-8"?>
<grammar xmlns="http://relaxng.org/ns/structure/1.0"
         datatypeLibrary="http://www.w3.org/2001/XMLSchema-datatypes">
  <include href="common.rng"/>
  <start>
    <element name="library">
      <oneOrMore>
        <element name="book">
          <ref name="attribute-id"/>
          <attribute name="available">
            <data type="boolean"/>
          </attribute>
          <element name="isbn">
            <data type="token" datatypeLibrary=""/>
          </element>
          <element name="title">
            <attribute name="xml:lang">
              <data type="language"/>
            </attribute>
            <data type="token" datatypeLibrary=""/>
          </element>
          <oneOrMore>
            <element name="author">
              <ref name="content-person"/>
              <optional>
                <ref name="element-died"/>
              </optional>
            </element>
          </oneOrMore>
          <zeroOrMore>
            <element name="character">
              <ref name="content-person"/>
              <ref name="element-qualification"/>
            </element>
          </zeroOrMore>
        </element>
      </oneOrMore>
    </element>
  </start>
  <define name="element-died">
    <element name="died">
      <data type="date"/>
    </element>
  </define>
  <define name="element-qualification">
    <element name="qualification">
      <data type="token" datatypeLibrary=""/>
    </element>
  </define>
</grammar>
The include pattern is translated to an include keyword in the compact syntax:
include "common.rnc"

start =
  element library {
    element book {
      attribute-id,
      attribute available { xsd:boolean },
      element isbn { token },
      element title {
        attribute xml:lang { xsd:language },
        token
      },
      element author { content-person, element-died? }+,
      element character { content-person, element-qualification }*
    }+
  }

element-died = element died { xsd:date }
element-qualification = element qualification { token }
Note that the name of the include pattern is slightly misleading. The include pattern here doesn’t include the external grammar directly. (You have seen that this was the job of the externalRef pattern.) Instead, it includes the content of the external grammar, performing a merge of both grammars. This is exactly what you need; it allows you to make references to the named patterns defined in the common.rng grammar.
The result of this inclusion is thus equivalent to the following monolithic schema:
<?xml version="1.0" encoding="UTF-8"?>
<grammar xmlns="http://relaxng.org/ns/structure/1.0"
         datatypeLibrary="http://www.w3.org/2001/XMLSchema-datatypes">
  <!-- Content of the included grammar -->
  <define name="element-name">
    <element name="name">
      <data type="token" datatypeLibrary=""/>
    </element>
  </define>
  <define name="element-born">
    <element name="born">
      <data type="date"/>
    </element>
  </define>
  <define name="attribute-id">
    <attribute name="id">
      <data type="ID"/>
    </attribute>
  </define>
  <define name="content-person">
    <ref name="attribute-id"/>
    <ref name="element-name"/>
    <optional>
      <ref name="element-born"/>
    </optional>
  </define>
  <!-- End of the included grammar -->
  <start>
    <element name="library">
      <oneOrMore>
        <element name="book">
          <ref name="attribute-id"/>
          <attribute name="available">
            <data type="boolean"/>
          </attribute>
          <element name="isbn">
            <data type="token" datatypeLibrary=""/>
          </element>
          <element name="title">
            <attribute name="xml:lang">
              <data type="language"/>
            </attribute>
            <data type="token" datatypeLibrary=""/>
          </element>
          <oneOrMore>
            <element name="author">
              <ref name="content-person"/>
              <optional>
                <ref name="element-died"/>
              </optional>
            </element>
          </oneOrMore>
          <zeroOrMore>
            <element name="character">
              <ref name="content-person"/>
              <ref name="element-qualification"/>
            </element>
          </zeroOrMore>
        </element>
      </oneOrMore>
    </element>
  </start>
  <define name="element-died">
    <element name="died">
      <data type="date"/>
    </element>
  </define>
  <define name="element-qualification">
    <element name="qualification">
      <data type="token" datatypeLibrary=""/>
    </element>
  </define>
</grammar>
or, in the compact syntax:
element-name = element name { token }
element-born = element born { xsd:date }
attribute-id = attribute id { xsd:ID }
content-person = attribute-id, element-name, element-born?

start =
  element library {
    element book {
      attribute-id,
      attribute available { xsd:boolean },
      element isbn { token },
      element title {
        attribute xml:lang { xsd:language },
        token
      },
      element author { content-person, element-died? }+,
      element character { content-person, element-qualification }*
    }+
  }

element-died = element died { xsd:date }
element-qualification = element qualification { token }
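The merge described above can be imitated in a few lines of Python. The sketch below is not how any RELAX NG processor is implemented; it only splices the children of an included grammar in place of the include element, using the standard xml.etree.ElementTree module. The function name expand_includes, the in-memory files table, and the trimmed-down grammars are assumptions made for illustration, and the sketch ignores redefinitions and the inheritance of attributes such as datatypeLibrary.

```python
import xml.etree.ElementTree as ET

RNG_NS = "http://relaxng.org/ns/structure/1.0"
NS = {"rng": RNG_NS}

# Trimmed-down stand-ins for common.rng and the including schema.
COMMON = f"""<grammar xmlns="{RNG_NS}">
  <define name="element-name"><element name="name"><text/></element></define>
  <define name="attribute-id"><attribute name="id"/></define>
</grammar>"""

LIBRARY = f"""<grammar xmlns="{RNG_NS}">
  <include href="common.rng"/>
  <start><element name="library"><ref name="attribute-id"/></element></start>
</grammar>"""


def expand_includes(grammar, load):
    """Replace every top-level <include> by the children of the grammar it names."""
    merged = []
    for child in list(grammar):
        if child.tag == f"{{{RNG_NS}}}include":
            # Recurse so that includes inside the included grammar are expanded too.
            included = expand_includes(load(child.get("href")), load)
            merged.extend(list(included))
        else:
            merged.append(child)
    for child in list(grammar):
        grammar.remove(child)
    grammar.extend(merged)
    return grammar


files = {"common.rng": COMMON}  # hypothetical in-memory "file system"
flat = expand_includes(ET.fromstring(LIBRARY), lambda href: ET.fromstring(files[href]))
define_names = [d.get("name") for d in flat.findall("rng:define", NS)]
print(define_names)
```

After the call, flat no longer contains an include element, and the definitions from the common grammar sit next to the start pattern, which is the shape of the monolithic schema shown above.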
In the previous example, we were lucky. The definitions of the common patterns included matched exactly what we needed. In the real world, this isn’t always the case. It is quite handy to be able to replace definitions found in the grammar that we’re including when they might conflict with other aspects of our schema design.
Let’s say that we have already written this very flat version of our schema, called library.rng:
<?xml version="1.0" encoding="UTF-8"?>
<grammar xmlns="http://relaxng.org/ns/structure/1.0"
         datatypeLibrary="http://www.w3.org/2001/XMLSchema-datatypes">
  <start>
    <ref name="element-library"/>
  </start>
  <define name="element-library">
    <element name="library">
      <zeroOrMore>
        <ref name="element-book"/>
      </zeroOrMore>
    </element>
  </define>
  <define name="element-book">
    <element name="book">
      <ref name="attribute-id"/>
      <ref name="attribute-available"/>
      <ref name="element-isbn"/>
      <ref name="element-title"/>
      <oneOrMore>
        <ref name="element-author"/>
      </oneOrMore>
      <zeroOrMore>
        <ref name="element-character"/>
      </zeroOrMore>
    </element>
  </define>
  <define name="element-author">
    <element name="author">
      <ref name="content-person"/>
      <optional>
        <ref name="element-died"/>
      </optional>
    </element>
  </define>
  <define name="element-character">
    <element name="character">
      <ref name="content-person"/>
      <ref name="element-qualification"/>
    </element>
  </define>
  <define name="element-isbn">
    <element name="isbn">
      <data type="token" datatypeLibrary=""/>
    </element>
  </define>
  <define name="element-title">
    <element name="title">
      <ref name="attribute-xml-lang"/>
      <data type="token" datatypeLibrary=""/>
    </element>
  </define>
  <define name="attribute-xml-lang">
    <attribute name="xml:lang">
      <data type="language"/>
    </attribute>
  </define>
  <define name="attribute-available">
    <attribute name="available">
      <data type="boolean"/>
    </attribute>
  </define>
  <define name="element-name">
    <element name="name">
      <data type="token" datatypeLibrary=""/>
    </element>
  </define>
  <define name="element-born">
    <element name="born">
      <data type="date"/>
    </element>
  </define>
  <define name="element-died">
    <element name="died">
      <data type="date"/>
    </element>
  </define>
  <define name="attribute-id">
    <attribute name="id">
      <data type="ID"/>
    </attribute>
  </define>
  <define name="content-person">
    <ref name="attribute-id"/>
    <ref name="element-name"/>
    <optional>
      <ref name="element-born"/>
    </optional>
  </define>
  <define name="element-qualification">
    <element name="qualification">
      <data type="token" datatypeLibrary=""/>
    </element>
  </define>
</grammar>
or, in the compact syntax, library.rnc:
start = element-library
element-library = element library { element-book* }
element-book =
  element book {
    attribute-id,
    attribute-available,
    element-isbn,
    element-title,
    element-author+,
    element-character*
  }
element-author = element author { content-person, element-died? }
element-character = element character { content-person, element-qualification }
element-isbn = element isbn { token }
element-title = element title { attribute-xml-lang, token }
attribute-xml-lang = attribute xml:lang { xsd:language }
attribute-available = attribute available { xsd:boolean }
element-name = element name { token }
element-born = element born { xsd:date }
element-died = element died { xsd:date }
attribute-id = attribute id { xsd:ID }
content-person = attribute-id, element-name, element-born?
element-qualification = element qualification { token }
This might be a good schema to use in production to validate incoming documents from a variety of sources, so you wouldn’t want to modify it. However, you might have a new application that doesn’t work at the level of a library but only at the level of a book. This application needs to validate instance documents with book root elements. Of course you wouldn’t want to copy and paste the definitions of the existing schema into another one because that would mean maintaining two different versions with similar content.
This is a case in which you would want to redefine the start pattern of our schema. To do so, use an include pattern, embedding the definitions that must be substituted for the ones from the included grammar in the include pattern itself:
<?xml version="1.0" encoding="UTF-8"?>
<grammar xmlns="http://relaxng.org/ns/structure/1.0">
  <include href="library.rng">
    <start>
      <ref name="element-book"/>
    </start>
  </include>
</grammar>
or:
include "library.rnc" { start = element-book }
Note how the new definitions are embedded directly in the include pattern; the content of the include pattern is where all the redefinitions must be written. This short schema includes all the definitions from library.rng and redefines the start pattern. It validates instance documents with a book root element. Since we are performing an inclusion instead of a copy, we will inherit any modifications made to library.rng.
We have been able to redefine the start pattern, but each named pattern can also be redefined using the same syntax. Let’s say for instance that I am not happy with the definition of the element-name pattern and want to check that the name is no longer than 80 characters. If I don’t want to (or can’t) modify the original schema, I can include it and redefine this pattern:
<?xml version="1.0" encoding="UTF-8"?>
<grammar xmlns="http://relaxng.org/ns/structure/1.0"
         datatypeLibrary="http://www.w3.org/2001/XMLSchema-datatypes">
  <include href="library.rng">
    <define name="element-name">
      <element name="name">
        <data type="token">
          <param name="maxLength">80</param>
        </data>
      </element>
    </define>
  </include>
</grammar>
or:
include "library.rnc" { element-name = element name { xsd:token{maxLength = "80"} } }
Here again, the grammar of library.rnc is merged with the grammar of the new schema (which happens to be empty), but before the merge, the definitions that are embedded in the include pattern are substituted for the original definitions.
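The order of operations (substitute first, then merge) can be pictured with plain dictionaries. This is only an illustrative model, not a validator: the functions include and merge, and the idea of storing each definition as a compact-syntax string, are assumptions made for the sketch. The two error checks reflect the rules that a redefinition must match a definition of the included grammar and that two definitions with the same name cannot coexist unless they are combined.

```python
def include(included, redefinitions=None):
    """Return the included definitions after substituting the redefinitions."""
    redefinitions = redefinitions or {}
    unknown = set(redefinitions) - set(included)
    if unknown:
        raise ValueError(f"nothing to redefine for: {sorted(unknown)}")
    return {**included, **redefinitions}


def merge(included, local):
    """Merge the (already redefined) included definitions with the local ones."""
    clashes = set(included) & set(local)
    if clashes:
        raise ValueError(f"duplicate definitions without combine: {sorted(clashes)}")
    return {**included, **local}


# A few definitions from library.rnc, written as compact-syntax strings.
library = {
    "start": "element-library",
    "element-name": "element name { token }",
    "element-author": "element author { content-person, element-died? }",
}

schema = merge(
    include(library, {"element-name": 'element name { xsd:token { maxLength = "80" } }'}),
    {},  # the including grammar defines nothing of its own here
)
print(schema["element-name"])
```

Every definition that isn’t redefined, such as start, comes through unchanged, which is why later modifications to the included schema are inherited.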
The new definition can be as different from the original one as I want. While it might not always be good practice, I can, for instance, redefine attribute-available and replace the attribute by an element:
<?xml version="1.0" encoding="UTF-8"?>
<grammar xmlns="http://relaxng.org/ns/structure/1.0"
         datatypeLibrary="http://www.w3.org/2001/XMLSchema-datatypes">
  <include href="library.rng">
    <define name="attribute-available">
      <element name="available">
        <data type="boolean"/>
      </element>
    </define>
  </include>
</grammar>
or:
include "library.rnc" { attribute-available = element available { xsd:boolean } }
This seems rather confusing (the named pattern is called attribute-available, and it’s now describing an element), but the schema is perfectly valid and describes instance documents in which the available attribute is replaced by an available element. The same approach can also remove this attribute:
<?xml version="1.0" encoding="UTF-8"?>
<grammar xmlns="http://relaxng.org/ns/structure/1.0">
  <include href="library.rng">
    <define name="attribute-available">
      <empty/>
    </define>
  </include>
</grammar>
or:
include "library.rnc" { attribute-available = empty }
Note how this uses a new pattern named empty. This pattern matches only empty content (text made only of whitespace doesn’t count), so redefining the named pattern this way has the same effect as if every reference to it had been removed from the schema.
The include patterns have the effect of merging the content of their grammar, after replacement of the redefined patterns, with the content of the current grammar. This means that these redefinitions can make references to any definition from either the including or the included grammars. If you want, for instance, to add zero or more email addresses to the author element while retaining a flat structure, write:
<?xml version="1.0" encoding="UTF-8"?>
<grammar xmlns="http://relaxng.org/ns/structure/1.0"
         datatypeLibrary="http://www.w3.org/2001/XMLSchema-datatypes">
  <include href="library.rng">
    <define name="element-author">
      <element name="author">
        <ref name="content-person"/>
        <optional>
          <ref name="element-died"/>
        </optional>
        <zeroOrMore>
          <ref name="element-email"/>
        </zeroOrMore>
      </element>
    </define>
  </include>
  <define name="element-email">
    <element name="email">
      <data type="anyURI">
        <param name="pattern">mailto:.*</param>
      </data>
    </element>
  </define>
</grammar>
or:
include "library.rnc" {
  element-author =
    element author { content-person, element-died?, element-email* }
}

element-email = element email { xsd:anyURI { pattern = "mailto:.*" } }
Here the redefinition of the element-author pattern is making three references to three named patterns. content-person and element-died are defined in library.rng (that is, the grammar that is included). The third, element-email, is defined in the top-level grammar (that is, the including grammar).
When I’ve replaced the definitions in previous examples, the original definition was completely replaced by the new one. This can make the maintenance of these schemas more complicated than it should be. In the last example, if the included schema (library.rng) were updated and the definition of element-author changed to add a new element for a telephone number, this addition would be lost unless I also added it explicitly in the including schema. As far as the element-author pattern is concerned, this redefinition is no better than a copy and paste. A mechanism more similar to inheritance would help with this.
To keep the definition from the included grammar, combine a new definition with the existing one instead of replacing it. Unlike redefinition, the combination of start and named patterns doesn’t take place in the include pattern itself but rather is done at the level of the including grammar. It isn’t even necessary to include a grammar to combine definitions, but the main interest of combining definitions is to combine new definitions with existing ones from included grammars.
There are two options for combining definitions: choice and interleave.
When definitions are combined by choice, the result is similar to using a choice pattern between the content of the definitions. A use case for this combination would be to define a schema accepting either a library or a book element from the schema used in the previous section. In the XML syntax, combining by choice is done through a combine attribute:
<?xml version="1.0" encoding="UTF-8"?>
<grammar xmlns="http://relaxng.org/ns/structure/1.0">
  <include href="library.rng"/>
  <start combine="choice">
    <ref name="element-book"/>
  </start>
</grammar>
In the compact syntax, combining by choice uses the |= operator (instead of =) in the definition:
include "library.rnc"

start |= element-book
Note that in both cases, the combination is done outside the inclusion. Its effect is to add a choice between the contents of the two start patterns. The definition becomes equivalent to:
<start>
  <choice>
    <ref name="element-library"/>
    <ref name="element-book"/>
  </choice>
</start>
or:
start = element-library | element-book
The logic behind this combination is to allow the content model corresponding to the original pattern while also allowing different content to appear. This is different from the logic behind pattern redefinitions, in which the original pattern is replaced by a new one.
Named patterns can also be combined. If you want to accept either an available attribute or element, you can write:
<?xml version="1.0" encoding="UTF-8"?>
<grammar xmlns="http://relaxng.org/ns/structure/1.0"
         datatypeLibrary="http://www.w3.org/2001/XMLSchema-datatypes">
  <include href="library.rng"/>
  <define name="attribute-available" combine="choice">
    <element name="available">
      <data type="boolean"/>
    </element>
  </define>
</grammar>
or:
include "library.rnc"

attribute-available |= element available { xsd:boolean }
Another interesting and common case involves making this attribute optional, by combining this pattern by choice with an empty pattern:
<?xml version="1.0" encoding="UTF-8"?>
<grammar xmlns="http://relaxng.org/ns/structure/1.0">
  <include href="library.rng"/>
  <define name="attribute-available" combine="choice">
    <empty/>
  </define>
</grammar>
or:
include "library.rnc"

attribute-available |= empty
Adding a choice between a defined component and nothingness may seem like a roundabout way to make the component optional, but it works with a minimum need to modify included schemas.
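To see why a choice with empty amounts to making the component optional, it can help to model each pattern by the set of contents it accepts. The Python sketch below does this under loud simplifications: a content is a tuple of item names, the attribute is written "@available" purely as a naming convention of this sketch, and data types are ignored. None of these names comes from RELAX NG itself.

```python
# A pattern's language is modelled as a set of tuples of item names.
EMPTY = {()}  # the `empty` pattern: accepts only empty content


def choice(a, b):
    """Language of `a | b`: whatever either operand accepts."""
    return a | b


def optional(a):
    """Language of `a?`."""
    return choice(a, EMPTY)


# Original definition in the included grammar (attribute modelled as "@available").
attribute_available = {("@available",)}

# attribute-available |= element available { xsd:boolean }
either_form = choice(attribute_available, {("available",)})

# attribute-available |= empty
made_optional = choice(attribute_available, EMPTY)

print(sorted(either_form))
print(made_optional == optional(attribute_available))
```

In this model, combining by choice can only enlarge the set of accepted contents, which matches the logic described above: the original content model stays allowed.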
You have seen how an “old” pattern can be replaced by a new one using pattern redefinition and also how to specify a choice between an old definition and a new one using a combination by choice. The last option is to combine by interleave. The logic here is to allow pieces to be added to the original content model and to let these pieces be interleaved—i.e., added anywhere before, after, and between the subpatterns of the original pattern.
Earlier, I added an email element to the content of the author element using a redefinition. You can also use a combination by interleave to add this email pattern to the content-person pattern:
<?xml version="1.0" encoding="UTF-8"?>
<grammar xmlns="http://relaxng.org/ns/structure/1.0"
         datatypeLibrary="http://www.w3.org/2001/XMLSchema-datatypes">
  <include href="library.rng"/>
  <define name="content-person" combine="interleave">
    <zeroOrMore>
      <ref name="element-email"/>
    </zeroOrMore>
  </define>
  <define name="element-email">
    <element name="email">
      <data type="anyURI">
        <param name="pattern">mailto:.*</param>
      </data>
    </element>
  </define>
</grammar>
or, in the compact syntax:
include "library.rnc"

content-person &= element-email*
element-email = element email { xsd:anyURI { pattern = "mailto:.*" } }
The effect of this combination by interleave is that the content-person pattern is now equivalent to an interleave pattern embedding both the original and the new definition:
<define name="content-person">
  <interleave>
    <group>
      <ref name="attribute-id"/>
      <ref name="element-name"/>
      <optional>
        <ref name="element-born"/>
      </optional>
    </group>
    <zeroOrMore>
      <ref name="element-email"/>
    </zeroOrMore>
  </interleave>
</define>
or:
content-person = (attribute-id, element-name, element-born?) & element-email *
This definition allows any number of email elements before the name element, between the name element and the born element, and after the born element.
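The set of orderings this allows can be enumerated with a small Python sketch. It models only child elements (the id attribute is left out, since attributes are unordered anyway), and it caps the number of email elements at two so that the enumeration stays finite. The helper names shuffles and interleave are assumptions of this sketch, not part of any RELAX NG library.

```python
def shuffles(x, y):
    """All interleavings of tuples x and y that keep each tuple's own order."""
    if not x:
        return {y}
    if not y:
        return {x}
    return ({(x[0],) + rest for rest in shuffles(x[1:], y)}
            | {(y[0],) + rest for rest in shuffles(x, y[1:])})


def interleave(a, b):
    """Language of `a & b` for finite languages a and b."""
    return {merged for x in a for y in b for merged in shuffles(x, y)}


# Child elements of the original content-person: name, born?
original = {("name",), ("name", "born")}
# element-email*, truncated to at most two emails so the language stays finite
emails = {("email",) * n for n in range(3)}

content_person = interleave(original, emails)

print(("email", "name", "email", "born") in content_person)
print(("born", "name") in content_person)
```

The relative order of name and born is preserved in every accepted sequence; only the email elements are free to move.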
The logic here is to allow extension by adding new content anywhere in the original definition. This is neat and safe if the applications that read the documents are coded to ignore what they don’t know. In our example, if I design an application to read the original content model, this application will be just fine with the new content model if it ignores the email elements that have been added.
You’ve seen how a combination by choice can make a pattern optional. Combination by interleave can’t reverse the process, but it can make a pattern forbidden. If you don’t want to end up with a schema that won’t validate any instance document, you must be careful to do this only with a pattern to which every reference is optional, such as the element-died pattern:
<?xml version="1.0" encoding="UTF-8"?>
<grammar xmlns="http://relaxng.org/ns/structure/1.0">
  <include href="library.rng"/>
  <define name="element-died" combine="interleave">
    <notAllowed/>
  </define>
</grammar>
or:
include "library.rnc"

element-died &= notAllowed
Here, we interleave a new pattern, notAllowed, with the content of the named pattern element-died. The effect of this operation is that this pattern will no longer match any content model. This is OK because the reference to the element-died in the definition of the author element is optional. The effect is that a document can be valid per the resulting schema only if there is no died element.
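The difference between an optional and a mandatory reference to a forbidden pattern can be checked with a small Python model in which a pattern is the set of child-element sequences it accepts. The model is a deliberate simplification (attributes and data types are ignored), and all helper names are assumptions of this sketch.

```python
EMPTY = {()}         # `empty`: accepts only empty content
NOT_ALLOWED = set()  # `notAllowed`: accepts nothing at all


def shuffles(x, y):
    """All interleavings of tuples x and y that keep each tuple's own order."""
    if not x:
        return {y}
    if not y:
        return {x}
    return ({(x[0],) + rest for rest in shuffles(x[1:], y)}
            | {(y[0],) + rest for rest in shuffles(x, y[1:])})


def interleave(a, b):
    return {merged for x in a for y in b for merged in shuffles(x, y)}


def group(a, b):
    return {x + y for x in a for y in b}


def optional(a):
    return a | EMPTY


person = {("name",), ("name", "born")}
died = {("died",)}

# element-died &= notAllowed
died_forbidden = interleave(died, NOT_ALLOWED)

# content-person, element-died?   (the reference is optional)
author = group(person, optional(died_forbidden))

# content-person, element-died    (what would happen with a mandatory reference)
author_if_mandatory = group(person, died_forbidden)

print(author == person, author_if_mandatory == NOT_ALLOWED)
```

With the optional reference, the author content is simply the person content with no died element; with a mandatory reference, nothing would be accepted at all, which is the trap described above.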
What about combining start patterns by interleave? This may seem weird or even illegal because you’ve seen start patterns in a context in which they define the root element of XML documents. A well-formed XML document can have only one root element, but schemas can permit a variety of different root elements in their models.
Another case in which combining by interleave is handy, and very widely used, is adding attributes to a named pattern. In this case, the unordered nature of interleave doesn’t make any difference because attributes are always unordered.
You have seen how to combine definitions by interleave and choice, and because group is the third compositor, you might be tempted to combine definitions by group. Unfortunately, definitions of named patterns are declarations. Since the relative order of these declarations isn’t considered significant, combining definitions by group wouldn’t give reliable results and has thus been forbidden. This issue doesn’t arise with choice and interleave compositors, because the relative order of their child elements isn’t significant for a schema.
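The asymmetry can be made concrete with the same kind of toy model, in which a pattern is the set of child-element sequences it accepts. The Python sketch below is only an illustration under that assumption; the helper names are not taken from any RELAX NG tool. It checks whether swapping the two definitions changes the result for each compositor.

```python
def shuffles(x, y):
    """All interleavings of tuples x and y that keep each tuple's own order."""
    if not x:
        return {y}
    if not y:
        return {x}
    return ({(x[0],) + rest for rest in shuffles(x[1:], y)}
            | {(y[0],) + rest for rest in shuffles(x, y[1:])})


def choice(a, b):
    return a | b


def interleave(a, b):
    return {merged for x in a for y in b for merged in shuffles(x, y)}


def group(a, b):
    return {x + y for x in a for y in b}


original = {("name",)}   # one definition of a named pattern
addition = {("email",)}  # a second definition to be combined with it

# True when the order in which the two definitions are met doesn't matter.
order_independent = {
    "choice": choice(original, addition) == choice(addition, original),
    "interleave": interleave(original, addition) == interleave(addition, original),
    "group": group(original, addition) == group(addition, original),
}
print(order_independent)
```

Only group depends on which definition comes first, so a combination by group would change meaning with the order of declarations.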