External references offer a powerful but simple mechanism for including a
pattern contained in an external document at any location in a schema. This
feature works through raw inclusion of the referenced external document: the
externalRef pattern is replaced by the content of that document. The document
may be a complete RELAX NG schema, but it doesn’t have to be; it only needs to
contain a valid pattern.
You
may want to reuse existing
schemas as a whole, without modifying any of their definitions.
Imagine, for instance, that we have defined two grammars in two
schemas to describe our author
and
character
elements. First, create a RELAX NG
schema, author.rng, to describe our authors:
<?xml version="1.0" encoding="UTF-8"?>
<element name="author" xmlns="http://relaxng.org/ns/structure/1.0"
         datatypeLibrary="http://www.w3.org/2001/XMLSchema-datatypes">
  <attribute name="id"><data type="ID"/></attribute>
  <element name="name"><data type="token" datatypeLibrary=""/></element>
  <optional>
    <element name="born"><data type="date"/></element>
  </optional>
  <optional>
    <element name="died"><data type="date"/></element>
  </optional>
</element>
or, in the compact syntax, author.rnc:
element author {
  attribute id { xsd:ID },
  element name { token },
  element born { xsd:date }?,
  element died { xsd:date }?
}
Then create a second schema, character.rng, to describe our characters:
<?xml version="1.0" encoding="UTF-8"?>
<element name="character" xmlns="http://relaxng.org/ns/structure/1.0"
         datatypeLibrary="http://www.w3.org/2001/XMLSchema-datatypes">
  <attribute name="id"><data type="ID"/></attribute>
  <element name="name"><data type="token" datatypeLibrary=""/></element>
  <optional>
    <element name="born"><data type="date"/></element>
  </optional>
  <element name="qualification"><data type="token" datatypeLibrary=""/></element>
</element>
or, in the compact syntax, character.rnc:
element character {
  attribute id { xsd:ID },
  element name { token },
  element born { xsd:date }?,
  element qualification { token }
}
To combine these components into a schema describing our library, use
externalRef
patterns:
<?xml version="1.0" encoding="UTF-8"?>
<element name="library" xmlns="http://relaxng.org/ns/structure/1.0"
         datatypeLibrary="http://www.w3.org/2001/XMLSchema-datatypes">
  <oneOrMore>
    <element name="book">
      <attribute name="id"><data type="ID"/></attribute>
      <attribute name="available"><data type="boolean"/></attribute>
      <element name="isbn"><data type="token" datatypeLibrary=""/></element>
      <element name="title">
        <attribute name="xml:lang"><data type="language"/></attribute>
        <data type="token" datatypeLibrary=""/>
      </element>
      <oneOrMore>
        <externalRef href="author.rng"/>
      </oneOrMore>
      <zeroOrMore>
        <externalRef href="character.rng"/>
      </zeroOrMore>
    </element>
  </oneOrMore>
</element>
In the compact syntax, externalRef patterns are represented using the keyword
external:
element library {
  element book {
    attribute id { xsd:ID },
    attribute available { xsd:boolean },
    element isbn { token },
    element title {
      attribute xml:lang { xsd:language },
      token
    },
    external "author.rnc"+,
    external "character.rnc"*
  }+
}
The externalRef
pattern performs direct inclusion: when a RELAX NG processor reads a
schema, it replaces externalRef
with the contents
of the referred document.
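As an illustration, a document along the following lines should validate against the combined library schema above. The identifiers and values are sample data chosen for this sketch; they are not dictated by the schemas themselves:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sample instance document (illustrative data only) -->
<library>
  <book id="b0836217462" available="true">
    <isbn>0836217462</isbn>
    <title xml:lang="en">Being a Dog Is a Full-Time Job</title>
    <!-- Matched by the pattern pulled in from author.rng -->
    <author id="CMS">
      <name>Charles M Schulz</name>
      <born>1922-11-26</born>
      <died>2000-02-12</died>
    </author>
    <!-- Matched by the pattern pulled in from character.rng -->
    <character id="PP">
      <name>Peppermint Patty</name>
      <born>1966-08-22</born>
      <qualification>bold, brash and tomboyish</qualification>
    </character>
  </book>
</library>
```

Removing the author element would make the book invalid, since the schema requires at least one, whereas the character element may be omitted or repeated.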
The
previous example used
externalRef
to include the content of Russian doll
schemas, but this also works with flat schemas. For instance, we
might change our author schema, author.rng, to read:
<?xml version="1.0" encoding="UTF-8"?>
<grammar xmlns="http://relaxng.org/ns/structure/1.0"
         datatypeLibrary="http://www.w3.org/2001/XMLSchema-datatypes">
  <start>
    <ref name="element-author"/>
  </start>
  <define name="element-author">
    <element name="author">
      <attribute name="id"><data type="ID"/></attribute>
      <ref name="element-name"/>
      <optional><ref name="element-born"/></optional>
      <optional><ref name="element-died"/></optional>
    </element>
  </define>
  <define name="element-name">
    <element name="name"><data type="token" datatypeLibrary=""/></element>
  </define>
  <define name="element-born">
    <element name="born"><data type="date"/></element>
  </define>
  <define name="element-died">
    <element name="died"><data type="date"/></element>
  </define>
</grammar>
or, in the compact syntax, author.rnc, to:
start = element-author
element-author =
  element author {
    attribute id { xsd:ID },
    element-name,
    element-born?,
    element-died?
  }
element-name = element name { token }
element-born = element born { xsd:date }
element-died = element died { xsd:date }
And our character schema, character.rng, to:
<?xml version="1.0" encoding="UTF-8"?>
<grammar xmlns="http://relaxng.org/ns/structure/1.0"
         datatypeLibrary="http://www.w3.org/2001/XMLSchema-datatypes">
  <start>
    <ref name="element-character"/>
  </start>
  <define name="element-character">
    <element name="character">
      <attribute name="id"><data type="ID"/></attribute>
      <ref name="element-name"/>
      <optional><ref name="element-born"/></optional>
      <ref name="element-qualification"/>
    </element>
  </define>
  <define name="element-name">
    <element name="name"><data type="token" datatypeLibrary=""/></element>
  </define>
  <define name="element-born">
    <element name="born"><data type="date"/></element>
  </define>
  <define name="element-qualification">
    <element name="qualification"><data type="token" datatypeLibrary=""/></element>
  </define>
</grammar>
or, in the compact syntax, character.rnc:
start = element-character
element-character =
  element character {
    attribute id { xsd:ID },
    element-name,
    element-born?,
    element-qualification
  }
element-name = element name { token }
element-born = element born { xsd:date }
element-qualification = element qualification { token }
The schema using externalRef
and
external
in the previous section will have no
difficulty using these flat schemas in place of the Russian doll
versions.
This seems
straightforward and logical, but why
does this approach work? How come there is no collision between the
named patterns element-name
and
element-born
defined in both
author.rng and
character.rng? Why is it that the
start
patterns defined in
author.rng and
character.rng don’t apply to
the schema for our library?
This works because of a RELAX NG feature called embedded grammars. As I have
already mentioned, externalRef patterns perform strict inclusion of the
referred schema. Using our last example, this means that our resulting schema
is:
<?xml version="1.0" encoding="UTF-8"?>
<element name="library" xmlns="http://relaxng.org/ns/structure/1.0"
         datatypeLibrary="http://www.w3.org/2001/XMLSchema-datatypes">
  <oneOrMore>
    <element name="book">
      <attribute name="id"><data type="ID"/></attribute>
      <attribute name="available"><data type="boolean"/></attribute>
      <element name="isbn"><data type="token" datatypeLibrary=""/></element>
      <element name="title">
        <attribute name="xml:lang"><data type="language"/></attribute>
        <data type="token" datatypeLibrary=""/>
      </element>
      <oneOrMore>
        <grammar>
          <start>
            <ref name="element-author"/>
          </start>
          <define name="element-author">
            <element name="author">
              <attribute name="id"><data type="ID"/></attribute>
              <ref name="element-name"/>
              <optional><ref name="element-born"/></optional>
              <optional><ref name="element-died"/></optional>
            </element>
          </define>
          <define name="element-name">
            <element name="name"><data type="token" datatypeLibrary=""/></element>
          </define>
          <define name="element-born">
            <element name="born"><data type="date"/></element>
          </define>
          <define name="element-died">
            <element name="died"><data type="date"/></element>
          </define>
        </grammar>
      </oneOrMore>
      <zeroOrMore>
        <grammar>
          <start>
            <ref name="element-character"/>
          </start>
          <define name="element-character">
            <element name="character">
              <attribute name="id"><data type="ID"/></attribute>
              <ref name="element-name"/>
              <optional><ref name="element-born"/></optional>
              <ref name="element-qualification"/>
            </element>
          </define>
          <define name="element-name">
            <element name="name"><data type="token" datatypeLibrary=""/></element>
          </define>
          <define name="element-born">
            <element name="born"><data type="date"/></element>
          </define>
          <define name="element-qualification">
            <element name="qualification"><data type="token" datatypeLibrary=""/></element>
          </define>
        </grammar>
      </zeroOrMore>
    </element>
  </oneOrMore>
</element>
or, in the compact syntax:
element library {
  element book {
    attribute id { xsd:ID },
    attribute available { xsd:boolean },
    element isbn { token },
    element title {
      attribute xml:lang { xsd:language },
      token
    },
    grammar {
      start = element-author
      element-author =
        element author {
          attribute id { xsd:ID },
          element-name,
          element-born?,
          element-died?
        }
      element-name = element name { token }
      element-born = element born { xsd:date }
      element-died = element died { xsd:date }
    }+,
    grammar {
      start = element-character
      element-character =
        element character {
          attribute id { xsd:ID },
          element-name,
          element-born?,
          element-qualification
        }
      element-name = element name { token }
      element-born = element born { xsd:date }
      element-qualification = element qualification { token }
    }*
  }+
}
We are thus embedding grammars within our schema, where they behave as
patterns. In fact, RELAX NG goes further: grammars are patterns. Such a
pattern has a twofold meaning:
As far as validation is concerned, an embedded grammar is equivalent to its
start pattern: the grammar describing the character element, for instance,
matches the instance nodes that match its start pattern (i.e., the pattern
element-character), which is what we expected.
Grammars also set the scope of their definitions: the start and named patterns
defined in a grammar are visible only in that grammar. Their scope (the
locations from which they can be referred to) is strictly limited to the
grammar in which they are defined.
Applied to our example, the strict scoping of start and named patterns means
that:
The element-born pattern of the grammar describing the character element
can’t be seen from its parent grammar (i.e., the grammar describing the
library and book elements). Nor can it be seen from its sibling grammar (i.e.,
the grammar describing the author element). The same applies to start
patterns.
Unlike common usage among programming languages, the scopes of start and named
patterns don’t include embedded grammars: start and named patterns defined in
the grammar describing the library and book elements aren’t visible in the
embedded grammars.
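The second point can be illustrated with a deliberately broken schema. This is a minimal sketch, not one of the chapter's schemas: the inner grammar tries to reach a definition of the outer grammar with an ordinary ref, which a conforming processor should reject because ref looks no further than the enclosing grammar:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Intentionally INVALID schema, for illustration of grammar scoping only -->
<grammar xmlns="http://relaxng.org/ns/structure/1.0">
  <start>
    <element name="book">
      <grammar>
        <start>
          <element name="author">
            <!-- Error: element-name is defined only in the outer grammar,
                 so this reference cannot be resolved from here. -->
            <ref name="element-name"/>
          </element>
        </start>
      </grammar>
    </element>
  </start>
  <!-- Visible to the outer grammar only -->
  <define name="element-name">
    <element name="name">
      <text/>
    </element>
  </define>
</grammar>
```

Moving the define inside the inner grammar would make the reference resolvable again, at the cost of the definition no longer being visible to the outer grammar.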
This
strict isolation of
start
and named patterns in their grammars is
usually convenient when you create references to external grammars.
It means that external grammars can be written independently without
risk of collision or incompatibility. You can safely take any RELAX
NG schema, drop it into a new schema, and see it as a single pattern
without any risk of collision.
On the other hand, this approach doesn’t let you modify what you include (you
will see how to do so in the next section), nor does it let you leverage a set
of common named patterns. In our example, since there are already two
definitions of element-name and element-born, it’s a good thing that they are
both isolated in their grammars. If you were designing the same building
blocks from scratch, however, you’d probably want to have only one definition
of each of these two elements, shared by the author and character elements.
In fact, if you followed the principle “if
it’s written more than once, make it
common,” you’d also want to share
the definition of the id
attribute.
Parent references let you make an explicit reference to a pattern from the parent grammar (i.e., the grammar embedding the current one). In this case, you need to add the definitions that you want to share to the top-level schema, even if that schema doesn’t use all of them itself:
<?xml version="1.0" encoding="UTF-8"?>
<grammar xmlns="http://relaxng.org/ns/structure/1.0"
         datatypeLibrary="http://www.w3.org/2001/XMLSchema-datatypes">
  <start>
    <element name="library">
      <oneOrMore>
        <element name="book">
          <ref name="attribute-id"/>
          <attribute name="available"><data type="boolean"/></attribute>
          <element name="isbn"><data type="token" datatypeLibrary=""/></element>
          <element name="title">
            <attribute name="xml:lang"><data type="language"/></attribute>
            <data type="token" datatypeLibrary=""/>
          </element>
          <oneOrMore>
            <externalRef href="author.rng"/>
          </oneOrMore>
          <zeroOrMore>
            <externalRef href="character.rng"/>
          </zeroOrMore>
        </element>
      </oneOrMore>
    </element>
  </start>
  <define name="element-name">
    <element name="name"><data type="token" datatypeLibrary=""/></element>
  </define>
  <define name="element-born">
    <element name="born"><data type="date"/></element>
  </define>
  <define name="attribute-id">
    <attribute name="id"><data type="ID"/></attribute>
  </define>
</grammar>
or, in the compact syntax:
start =
  element library {
    element book {
      attribute-id,
      attribute available { xsd:boolean },
      element isbn { token },
      element title {
        attribute xml:lang { xsd:language },
        token
      },
      external "author.rnc"+,
      external "character.rnc"*
    }+
  }
element-name = element name { token }
element-born = element born { xsd:date }
attribute-id = attribute id { xsd:ID }
Now, to make a reference to the named patterns element-name, element-born, and
attribute-id in the embedded grammars, use a pattern called parentRef. This
pattern makes author.rng look like:
<?xml version="1.0" encoding="UTF-8"?>
<grammar xmlns="http://relaxng.org/ns/structure/1.0"
         datatypeLibrary="http://www.w3.org/2001/XMLSchema-datatypes">
  <start>
    <ref name="element-author"/>
  </start>
  <define name="element-author">
    <element name="author">
      <attribute name="id"><data type="ID"/></attribute>
      <parentRef name="element-name"/>
      <optional><parentRef name="element-born"/></optional>
      <optional><ref name="element-died"/></optional>
    </element>
  </define>
  <define name="element-died">
    <element name="died"><data type="date"/></element>
  </define>
</grammar>
and the character.rng schema now looks like:
<?xml version="1.0" encoding="UTF-8"?>
<grammar xmlns="http://relaxng.org/ns/structure/1.0"
         datatypeLibrary="http://www.w3.org/2001/XMLSchema-datatypes">
  <start>
    <ref name="element-character"/>
  </start>
  <define name="element-character">
    <element name="character">
      <attribute name="id"><data type="ID"/></attribute>
      <parentRef name="element-name"/>
      <optional><parentRef name="element-born"/></optional>
      <ref name="element-qualification"/>
    </element>
  </define>
  <define name="element-qualification">
    <element name="qualification"><data type="token" datatypeLibrary=""/></element>
  </define>
</grammar>
The parentRef
pattern is translated to a
parent
keyword in the compact syntax. The
author.rnc schema looks like:
start = element-author
element-author =
  element author {
    attribute id { xsd:ID },
    parent element-name,
    parent element-born?,
    element-died?
  }
element-died = element died { xsd:date }
while the character.rnc schema looks like:
start = element-character
element-character =
  element character {
    attribute id { xsd:ID },
    parent element-name,
    parent element-born?,
    element-qualification
  }
element-qualification = element qualification { token }
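The listings above still declare the id attribute inline in each embedded grammar. Assuming the top-level grammar defines attribute-id as shown earlier, a variant of author.rng that shares this definition as well might look like the following sketch (it is a possible refinement, not the version used in the rest of this section):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Variant of author.rng: the id attribute also comes from the parent grammar -->
<grammar xmlns="http://relaxng.org/ns/structure/1.0"
         datatypeLibrary="http://www.w3.org/2001/XMLSchema-datatypes">
  <start>
    <ref name="element-author"/>
  </start>
  <define name="element-author">
    <element name="author">
      <!-- Resolved against the grammar that embeds this one -->
      <parentRef name="attribute-id"/>
      <parentRef name="element-name"/>
      <optional><parentRef name="element-born"/></optional>
      <!-- Still local to this grammar -->
      <optional><ref name="element-died"/></optional>
    </element>
  </define>
  <define name="element-died">
    <element name="died"><data type="date"/></element>
  </define>
</grammar>
```

The same substitution would apply to character.rng.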
You are using these features in the context of multiple schema documents, but
the semantics of the externalRef pattern itself remain the same. This schema
is equivalent to the same schema with its externalRef patterns expanded into a
single monolithic schema with two embedded grammars:
<?xml version="1.0" encoding="UTF-8"?>
<grammar xmlns="http://relaxng.org/ns/structure/1.0"
         datatypeLibrary="http://www.w3.org/2001/XMLSchema-datatypes">
  <start>
    <element name="library">
      <oneOrMore>
        <element name="book">
          <ref name="attribute-id"/>
          <attribute name="available"><data type="boolean"/></attribute>
          <element name="isbn"><data type="token" datatypeLibrary=""/></element>
          <element name="title">
            <attribute name="xml:lang"><data type="language"/></attribute>
            <data type="token" datatypeLibrary=""/>
          </element>
          <oneOrMore>
            <grammar>
              <start>
                <ref name="element-author"/>
              </start>
              <define name="element-author">
                <element name="author">
                  <attribute name="id"><data type="ID"/></attribute>
                  <parentRef name="element-name"/>
                  <optional><parentRef name="element-born"/></optional>
                  <optional><ref name="element-died"/></optional>
                </element>
              </define>
              <define name="element-died">
                <element name="died"><data type="date"/></element>
              </define>
            </grammar>
          </oneOrMore>
          <zeroOrMore>
            <grammar>
              <start>
                <ref name="element-character"/>
              </start>
              <define name="element-character">
                <element name="character">
                  <attribute name="id"><data type="ID"/></attribute>
                  <parentRef name="element-name"/>
                  <optional><parentRef name="element-born"/></optional>
                  <ref name="element-qualification"/>
                </element>
              </define>
              <define name="element-qualification">
                <element name="qualification"><data type="token" datatypeLibrary=""/></element>
              </define>
            </grammar>
          </zeroOrMore>
        </element>
      </oneOrMore>
    </element>
  </start>
  <define name="element-name">
    <element name="name"><data type="token" datatypeLibrary=""/></element>
  </define>
  <define name="element-born">
    <element name="born"><data type="date"/></element>
  </define>
  <define name="attribute-id">
    <attribute name="id"><data type="ID"/></attribute>
  </define>
</grammar>
or, in the compact syntax:
start =
  element library {
    element book {
      attribute-id,
      attribute available { xsd:boolean },
      element isbn { token },
      element title {
        attribute xml:lang { xsd:language },
        token
      },
      grammar {
        start = element-author
        element-author =
          element author {
            attribute id { xsd:ID },
            parent element-name,
            parent element-born?,
            element-died?
          }
        element-died = element died { xsd:date }
      }+,
      grammar {
        start = element-character
        element-character =
          element character {
            attribute id { xsd:ID },
            parent element-name,
            parent element-born?,
            element-qualification
          }
        element-qualification = element qualification { token }
      }*
    }+
  }
element-name = element name { token }
element-born = element born { xsd:date }
attribute-id = attribute id { xsd:ID }
You can see how start
and named patterns have been
defined in each of the three grammars composing this schema:
element-died
is defined in the grammar defining
the author
element and can be used only in this
grammar.
Similarly, element-qualification
is defined in the
grammar defining the character
element and can be
used only there.
element-name, element-born, and attribute-id are defined in the top-level
grammar. They can be used in this grammar through normal references (i.e., ref
patterns) and can also be used in its child grammars (i.e., the grammars that
are directly embedded into this one) through parentRef patterns.
There are two more things to note about the
parentRef
pattern:
If the depth of nesting of grammars is higher than two, you may run into
trouble because you can make a reference only to your immediate parent
grammar, not to more distant ancestor grammars. The RELAX NG working group has
considered this issue but hasn’t found any real-world use case for
generalizing parentRef patterns to greater depths of nesting. If you find one,
they will probably welcome an email on the subject! In practice, if you need
to do so, you can work around the limitation by defining named patterns in the
intermediary grammars that act as proxies.
Now that we’ve added the
parentRef
patterns to our two schemas,
author.rng and
character.rng can’t be used as
standalone schemas for validating documents with
author
or character
root
elements. To be complete and operational, they must now be embedded in a
grammar that provides the definitions for the named patterns they reference.
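One way to restore standalone validation is a small wrapper grammar that supplies the missing definitions. The sketch below assumes that author.rng is the parentRef version shown above; the file name author-standalone.rng is hypothetical:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- author-standalone.rng (hypothetical file name) -->
<grammar xmlns="http://relaxng.org/ns/structure/1.0"
         datatypeLibrary="http://www.w3.org/2001/XMLSchema-datatypes">
  <start>
    <!-- The parentRef patterns in author.rng resolve against this grammar -->
    <externalRef href="author.rng"/>
  </start>
  <!-- Definitions that author.rng expects from its parent grammar -->
  <define name="element-name">
    <element name="name"><data type="token" datatypeLibrary=""/></element>
  </define>
  <define name="element-born">
    <element name="born"><data type="date"/></element>
  </define>
</grammar>
```

Validating a document whose root element is author against this wrapper should then behave like the original standalone author.rng, while author.rng itself stays embeddable in the library schema.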