The previous couple of schemas can validate instance documents independently of the prefixes being used. They meet the first goal of namespaces: disambiguating elements in multinamespace documents. However, they will fail to validate the instance document where we’ve added the dc:publisher element. You can easily update the schema to explicitly add this element to the content model of our book element, but that won’t make it an open schema that accepts the addition of elements from any other namespace.
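As a reminder of the kind of document at stake, here is a minimal sketch of such an instance. The attribute values, the element content, and the binding of the dc prefix to the Dublin Core namespace are assumptions made for illustration; only the library namespace URI is taken from the schemas in this chapter.

```xml
<!-- Illustrative instance: values and the dc prefix binding are assumptions -->
<book xmlns="http://eric.van-der-vlist.com/ns/library"
      xmlns:dc="http://purl.org/dc/elements/1.1/"
      id="b0836217462" available="true">
  <isbn>0836217462</isbn>
  <title>Being a Dog Is a Full-Time Job</title>
  <!-- Foreign element: it belongs to neither of our own namespaces -->
  <dc:publisher>Andrews McMeel Publishing</dc:publisher>
  <!-- author and character elements would follow here -->
</book>
```

A schema that lists only our own elements in the content model of book rejects this document because of the dc:publisher element.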
Instead of some magic feature that could have been quite rigid, RELAX NG introduced a flexible and clever feature that lets you define your own level of “openness.” The idea is to let you define your own wildcard, and, once you have it, you can include it wherever you want in your content model.
Before we start, let’s define what we are trying to achieve. We want a named pattern allowing any element or attribute that doesn’t belong to our lib or hr namespaces. We probably also want to exclude attributes and elements that have no namespace: attributes, because our own attributes have no namespace and we might want to tell them apart; and elements, because allowing elements without a namespace in a document that uses namespaces defeats the purpose of disambiguating content. The content model of the elements we accept can be anything.
Let’s start with the inner content of the wildcard and define what we want our “anything” to be. “Anything” in terms of patterns is any number of elements (themselves containing “anything”), attributes, and text, in any order. This is a good candidate for a recursive named pattern:
<define name="anything">
  <zeroOrMore>
    <choice>
      <element>
        <anyName/>
        <ref name="anything"/>
      </element>
      <attribute>
        <anyName/>
      </attribute>
      <text/>
    </choice>
  </zeroOrMore>
</define>
or:
anything = ( element * { anything } | attribute * { text } | text )*
The only things new here are the anyName element (in the XML syntax) and the * operator (in the compact syntax), which replace the name of an element or attribute. This is your first example of a name class (a class of names). You’ll see that there are many ways to restrict this name class. Now that we have a named pattern to express what “anything” is, we can use it to define what “foreign” elements mean:
<define name="foreign-elements">
  <zeroOrMore>
    <element>
      <anyName>
        <except>
          <nsName ns=""/>
          <nsName ns="http://eric.van-der-vlist.com/ns/library"/>
          <nsName ns="http://eric.van-der-vlist.com/ns/person"/>
        </except>
      </anyName>
      <ref name="anything"/>
    </element>
  </zeroOrMore>
</define>
or:
default namespace lib = "http://eric.van-der-vlist.com/ns/library"
namespace local = ""
namespace hr = "http://eric.van-der-vlist.com/ns/person"
...
foreign-elements = element * - (local:* | lib:* | hr:*) { anything }*
To achieve our purpose, we’ve introduced two new elements embedded in the anyName name class:
except (- in the compact syntax) has the same meaning it does with enumerations.
nsName (xxx:* in the compact syntax) means “any name from the specified namespace.”
When using the XML syntax, nsName uses an ns attribute, while prefixes are employed when using the compact syntax. This usage of prefixes in the compact syntax implies that declarations are added to define prefixes not only for the lib (which is also the default namespace) and hr namespaces, but also for “no namespace” (here I have used the prefix local).
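To make the effect of this name class concrete, the following sketch lists a few element names and notes whether each one matches the foreign-elements name class. The sample wrapper, the element names, and the dc prefix binding are hypothetical and serve only to illustrate the three exclusions.

```xml
<!-- Hypothetical names, used only to show what the name class accepts -->
<sample xmlns:lib="http://eric.van-der-vlist.com/ns/library"
        xmlns:hr="http://eric.van-der-vlist.com/ns/person"
        xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:publisher/>  <!-- matches: the namespace is neither lib, hr, nor empty -->
  <lib:title/>     <!-- rejected: excluded by the lib namespace entry -->
  <hr:name/>       <!-- rejected: excluded by the hr namespace entry -->
  <note/>          <!-- rejected: no namespace, excluded by the empty ns entry -->
</sample>
```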
Note that name classes aren’t patterns; they are a separate set of constructs with their own purpose. One consequence is that a name class can’t be placed in a named pattern to be reused, which is why we have to repeat the same name class for both elements and attributes.
The same can be done to define foreign attributes:
<define name="foreign-attributes">
  <zeroOrMore>
    <attribute>
      <anyName>
        <except>
          <nsName ns=""/>
          <nsName ns="http://eric.van-der-vlist.com/ns/library"/>
          <nsName ns="http://eric.van-der-vlist.com/ns/person"/>
        </except>
      </anyName>
    </attribute>
  </zeroOrMore>
</define>
or:
foreign-attributes = attribute * - (local:* | lib:* | hr:*) { text }*
For convenience, we can also define foreign nodes by combining foreign elements and attributes:
<define name="foreign-nodes">
  <zeroOrMore>
    <choice>
      <ref name="foreign-attributes"/>
      <ref name="foreign-elements"/>
    </choice>
  </zeroOrMore>
</define>
or:
foreign-nodes = ( foreign-attributes | foreign-elements )*
Now that we have defined what the foreign-nodes wildcard is, we can use it to make our schema more extensible. To allow foreign nodes where we’ve added the dc:publisher element, between the title and author elements, we can write the following, switching to a “flatter” style to make it more readable:
<element name="book">
  <attribute name="id"/>
  <attribute name="available"/>
  <ref name="isbn-element"/>
  <ref name="title-element"/>
  <ref name="foreign-nodes"/>
  <zeroOrMore>
    <ref name="author-element"/>
  </zeroOrMore>
  <zeroOrMore>
    <ref name="character-element"/>
  </zeroOrMore>
</element>
or:
book-element = element book {
  attribute id { text },
  attribute available { text },
  isbn-element,
  title-element,
  foreign-nodes,
  author-element*,
  character-element*
}
This does the trick for the instance document shown earlier, but it wouldn’t validate a document where foreign nodes were added in any other place, for instance, between the isbn and title elements. We could insert a reference to the foreign-nodes pattern between all the elements, but that method would be very verbose. If you think about it, what we really want to do is interleave these foreign nodes between the content defined for the book element. This is a good opportunity to use the interleave pattern:
<element name="book">
  <interleave>
    <group>
      <attribute name="id"/>
      <attribute name="available"/>
      <ref name="isbn-element"/>
      <ref name="title-element"/>
      <zeroOrMore>
        <ref name="author-element"/>
      </zeroOrMore>
      <zeroOrMore>
        <ref name="character-element"/>
      </zeroOrMore>
    </group>
    <ref name="foreign-nodes"/>
  </interleave>
</element>
or:
element book {
  (
    attribute id { text },
    attribute available { text },
    isbn-element,
    title-element,
    author-element*,
    character-element*
  )
  & foreign-nodes
}
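The following instance is a sketch of what the interleaved definition is meant to accept: a foreign attribute on book and foreign elements at several positions among its children. The x prefix, its namespace URI, the attribute, and all values are hypothetical, and the dc binding is assumed to be the Dublin Core namespace.

```xml
<!-- Hypothetical extension names and values, for illustration only -->
<book xmlns="http://eric.van-der-vlist.com/ns/library"
      xmlns:dc="http://purl.org/dc/elements/1.1/"
      xmlns:x="http://example.org/ns/ext"
      id="b0836217462" available="true" x:shelf="A3">
  <isbn>0836217462</isbn>
  <!-- Foreign element between isbn and title -->
  <x:note>Second printing</x:note>
  <title>Being a Dog Is a Full-Time Job</title>
  <!-- Foreign element after title -->
  <dc:publisher>Andrews McMeel Publishing</dc:publisher>
</book>
```

Because the group keeps isbn and title in their relative order while foreign-nodes is interleaved with it, our own elements must still appear in sequence; only the foreign nodes are free to float.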
We may be tempted to allow foreign nodes everywhere in our document. However, while the extensibility gained is often acceptable in elements such as book that already have child elements, it’s often considered bad practice to do the same in elements that contain only text or data. An example would be the isbn element, where this practice would transform a text-content model into a mixed-content model. The reason this trick is considered bad practice is the weak support for mixed-content models, as mentioned in Chapter 6, where I discussed the limitations of the mixed pattern. A consequence of allowing foreign elements in isbn elements would be that the content of this element could no longer be considered data. Neither datatypes nor restrictions could be applied.
Beyond this limitation of RELAX NG, applications would have to concatenate text nodes spread over the foreign elements. This concatenation can be verbose to express with tools such as XPath and XSLT.
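To picture the problem, consider this sketch of an isbn element whose text has been split by a foreign element. The x prefix, its namespace URI, and the x:check element are hypothetical.

```xml
<!-- Hypothetical foreign element splitting the text of isbn -->
<isbn xmlns="http://eric.van-der-vlist.com/ns/library"
      xmlns:x="http://example.org/ns/ext">0836<x:check>2</x:check>17462</isbn>
```

The isbn element now has two text node children, “0836” and “17462”, plus whatever text the foreign element carries. An application has to decide whether the text inside x:check belongs to the value and then reassemble the pieces itself. In XPath, the string value of isbn includes the text of all its descendants, so obtaining only the text that is a direct child of isbn requires a more elaborate expression.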
One compromise on this issue is to allow only foreign attributes in text-content models. That’s not a problem here because our foreign-attributes pattern is ready for this purpose:
<element name="isbn">
  <ref name="foreign-attributes"/>
  <text/>
</element>
or:
element isbn { foreign-attributes, text }
This way, the isbn element is extensible but only with attributes from foreign namespaces.
Although most of the time wildcard use is straightforward, there are some situations in which wildcards may lead to unexpected schema errors—especially with attributes, whose usage is subject to restrictions.
The first of the traps is related to the limitation that the definition of attributes can’t be duplicated in a schema. The following definition is invalid:
element title { attribute xml:space { text }, attribute xml:space { text }, text } # this is invalid
This seems to be pretty sensible, since duplicate attributes are forbidden in the instance document. Unfortunately, the attribute xml:space is allowed by our foreign-attributes named pattern. We will get an error as well if we unthinkingly extend the definition of our title element and write:
element title { foreign-attributes, attribute xml:space { text }, text } # also invalid
To fix this error, we need to remove either the xml:space attribute from the name class of our foreign attributes or the explicit mention of xml:space in our definition and just write:
element title { foreign-attributes, text }
Of course, this doesn’t remove the possibility of including an xml:space attribute in the title element because this attribute is a foreign attribute as defined in our named pattern.
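The other fix mentioned above, removing xml:space from the name class, could look like the following sketch. It is one possible variant and not the definition used in the rest of this chapter: a name entry is added to the except clause so that the wildcard no longer overlaps with an explicit xml:space declaration.

```xml
<define name="foreign-attributes">
  <zeroOrMore>
    <attribute>
      <anyName>
        <except>
          <nsName ns=""/>
          <nsName ns="http://eric.van-der-vlist.com/ns/library"/>
          <nsName ns="http://eric.van-der-vlist.com/ns/person"/>
          <!-- Assumed addition: take xml:space out of the wildcard -->
          <name>xml:space</name>
        </except>
      </anyName>
    </attribute>
  </zeroOrMore>
</define>
```

With this variant, the xml:space attribute is accepted only in elements that declare it explicitly, which is the trade-off for being able to constrain its content where it is declared.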
The second trap operates at a higher level but along the same lines. It’s specific to the DTD compatibility ID feature. In Chapter 8, when you saw this datatype, it was used to define the book element:
<element name="book">
  <attribute name="id">
    <data datatypeLibrary="http://relaxng.org/ns/compatibility/datatypes/1.0" type="ID"/>
  </attribute>
  ...
</element>
or:
element book { attribute id {dtd:ID}, ... }
Once again, an error will be generated if we add our foreign nodes. This feature emulates DTDs in all respects, including the requirement that if an element book is defined with an id attribute of type ID, every other definition of an id attribute hosted by a book element must have the same type ID. The problem here is that, hidden in the definition of anything, there can be a book element having an id attribute of type text. This situation results in an error.
There is a way to work around this problem. If we want to use the DTD type ID, we have to remove the problematic possibility from the named pattern anything. A fast solution would be to exclude our own namespaces from the name classes in anything. A better solution will be introduced using features shown in Section 12.3 of the next chapter.
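The fast solution could be sketched as follows. This is an assumed variant of the anything pattern in which the element name class excludes our two namespaces, so that anything can no longer contain a book element from the library namespace with an id attribute of type text.

```xml
<define name="anything">
  <zeroOrMore>
    <choice>
      <element>
        <anyName>
          <except>
            <!-- Assumed exclusions: our own namespaces -->
            <nsName ns="http://eric.van-der-vlist.com/ns/library"/>
            <nsName ns="http://eric.van-der-vlist.com/ns/person"/>
          </except>
        </anyName>
        <ref name="anything"/>
      </element>
      <attribute>
        <anyName/>
      </attribute>
      <text/>
    </choice>
  </zeroOrMore>
</define>
```

The price of this shortcut is that foreign elements can no longer embed elements from our own vocabulary.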
In adding our foreign nodes, we have transformed:
<element name="book">
  <attribute name="id"/>
  <attribute name="available"/>
  <ref name="isbn-element"/>
  <ref name="title-element"/>
  <zeroOrMore>
    <ref name="author-element"/>
  </zeroOrMore>
  <zeroOrMore>
    <ref name="character-element"/>
  </zeroOrMore>
</element>
or:
element book {
  attribute id { text },
  attribute available { text },
  isbn-element,
  title-element,
  author-element*,
  character-element*
}
into:
<element name="book">
  <interleave>
    <group>
      <attribute name="id"/>
      <attribute name="available"/>
      <ref name="isbn-element"/>
      <ref name="title-element"/>
      <zeroOrMore>
        <ref name="author-element"/>
      </zeroOrMore>
      <zeroOrMore>
        <ref name="character-element"/>
      </zeroOrMore>
    </group>
    <ref name="foreign-nodes"/>
  </interleave>
</element>
or:
element book {
  (
    attribute id { text },
    attribute available { text },
    isbn-element,
    title-element,
    author-element*,
    character-element*
  )
  & foreign-nodes
}
This operation can instead be accomplished as a pattern combination using interleave if the content of the element book is described as a named pattern:
<define name="book-content">
  <attribute name="id"/>
  <attribute name="available"/>
  <ref name="isbn-element"/>
  <ref name="title-element"/>
  <zeroOrMore>
    <ref name="author-element"/>
  </zeroOrMore>
  <zeroOrMore>
    <ref name="character-element"/>
  </zeroOrMore>
</define>
or:
book-content =
  attribute id { text },
  attribute available { text },
  isbn-element,
  title-element,
  author-element*,
  character-element*
This pattern can then easily be extended as:
<define name="book-content" combine="interleave">
  <ref name="foreign-nodes"/>
</define>
or:
book-content &= foreign-nodes
and used to define the book element:
<element name="book">
  <ref name="book-content"/>
</element>
or:
element book { book-content }
This combination can be done in a single document, but the mechanism can also extend a vocabulary by merging a grammar containing only these combinations. The exact same approach also works for appending foreign attributes to the elements that have text-based content models.
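As a sketch of the merging approach, the grammar below includes a base schema and contributes only the combination. The file name library.rng is hypothetical, and the sketch assumes that this file defines book-content without a combine attribute and also provides the foreign-nodes pattern.

```xml
<grammar xmlns="http://relaxng.org/ns/structure/1.0">
  <!-- Hypothetical file holding the base vocabulary and foreign-nodes -->
  <include href="library.rng"/>
  <!-- Interleaved with the book-content definition coming from library.rng -->
  <define name="book-content" combine="interleave">
    <ref name="foreign-nodes"/>
  </define>
</grammar>
```

Keeping the combination in its own grammar leaves the base schema untouched, so the closed and open versions of the vocabulary can coexist.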