Alternative modelling languages

A number of alternatives to the DTD have arisen since the release of the XML standard. They differ in capability and scope, but on two things they all agree: name and syntax.

'Schema' languages

The term schema is used to describe all of the alternative modelling languages. In the IT field, this name has its roots in database technologies. It is used to describe the tables and fields in a database, including constraints on the values of particular fields, and the relationships between tables (to those familiar with these concepts, the similarity to document modelling should be obvious). More generally, the term simply means a representation of something using a diagram, a plan or an outline.

Note that the plural of the word 'schema' is often given as 'schemas', but 'schemata' can be used as well.

XML syntax

In every case, these languages adopt standard XML document syntax. This approach has a number of benefits, as outlined above. The only disadvantages of this approach are that the syntax is quite verbose (hindering legibility, and slowing down data transfer), and that, due to the rather obvious choice of element and attribute names, explanatory texts (including this chapter) cannot easily avoid such clumsy phrases as 'the Element element defines an element'.

Multiple languages

A single, universally supported standard is generally considered to be desirable. Yet, in recent times, the following proposals have all been in contention:

  • RELAX

  • TREX

  • RELAX NG

  • Schematron

  • XML Schema.

For the moment, the DTD modelling language remains dominant, simply because it has a significant head-start on the competition. At the time of writing, it is the only universally supported language. But almost all of the alternatives are superior, and it is widely expected that DTDs will eventually fade into obscurity. There will no doubt be a shake-out of these proposals (and this is already happening, as the third language listed above is a consolidation of the first two), though it is too early to say whether there will be a single survivor or a small number of survivors with different strengths. While it is possible that more than one will succeed, the only one that is almost guaranteed to do so is XML Schema, if only because it is backed both by the W3C and the major software vendors. But, while the XML Schema standard deserves particular attention, the others cannot be ignored.

RELAX

RELAX was initially developed by INSTAC (Information Technology Research and Standardization Center), and a small number of volunteers contributed to its design. The Core standard has been approved as an ISO Technical Report (ISO TR 22250-1). RELAX has been described by its originators as 'a specification for describing XML-based languages'. RELAX grammars use XML document syntax, and have adopted the data types developed for XML Schema (see Chapter 15). This proposed standard aimed to provide eighty percent of the functionality of the XML Schema standard, for only twenty percent of the complexity, and many have agreed that early drafts of this language met this objective.

Based on the theory of tree automata, RELAX has a modular design, is namespace-aware, and has a number of powerful features. For example, grammars can be compared for upward compatibility, by computing the difference between versions, and the functionality of SGML exclusions and inclusions can be supported (see Chapter 32), without the processing complexity.

However, work on RELAX has now ceased (for the reasons given below).

TREX

The TREX (Tree Regular Expressions for XML) proposal was developed by James Clark (see jclark.com). The following example shows how this standard uses XML document syntax, and an intuitive structure for modelling XML document structures:

<element name="book">
  <optional>
    <element name="intro">
      <element name="title"><anyString/></element>
      <element name="author"><anyString/></element>
    </element>
  </optional>
  <zeroOrMore>
    <element name="chapter">
      <element name="title"><anyString/></element>
    </element>
  </zeroOrMore>
</element>
<!-- DTD EQUIVALENT:
<!ELEMENT book     (intro?, chapter*)>
<!ELEMENT intro    (title, author)>
<!ELEMENT title    (#PCDATA)>
<!ELEMENT author   (#PCDATA)>
<!ELEMENT chapter  (title, ...)>
-->

However, work on TREX has now ceased (for the reasons given next).

RELAX and TREX combined

Because RELAX Core and TREX are very similar, they are going to be unified by OASIS (see www.oasis-open.org/committees/relax-ng). The unified language is to be called RELAX NG (RELAX Next Generation), pronounced 'relaxing', and there are plans to submit the same proposal to the ISO. See relaxng.org for details.
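As a rough illustration of how close the languages are, the TREX example shown earlier might carry over to RELAX NG as follows. This is only a sketch: it assumes the namespace and element names of the RELAX NG 1.0 specification as later published by OASIS, in which a 'text' pattern takes the place of the TREX 'anyString' pattern. Drafts current at the time of writing may differ.

```xml
<!-- Sketch only: assumes RELAX NG 1.0 syntax and namespace -->
<element name="book" xmlns="http://relaxng.org/ns/structure/1.0">
  <!-- intro may appear once or not at all -->
  <optional>
    <element name="intro">
      <element name="title"><text/></element>
      <element name="author"><text/></element>
    </element>
  </optional>
  <!-- any number of chapters, including none -->
  <zeroOrMore>
    <element name="chapter">
      <element name="title"><text/></element>
    </element>
  </zeroOrMore>
</element>
```

Apart from the namespace and the 'text' pattern, the structure is unchanged from the TREX version.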

Schematron

Schematron (now at version 1.5) is described as a 'simple XML-based assertion language using patterns in trees'. It can be used for validation, for automated link generation, and for triggering actions based on very complex criteria.

Unlike the other languages covered here, the Schematron language is not grammar-based. Instead, it is 'rule-based', allowing validation to include the kind of checks that grammars cannot achieve, such as ensuring that a particular element is present when another element has a particular attribute value. This language uses the XPath expression language (see Chapter 13) to interrogate a document, with each expression testing whether or not a pattern can be matched in the document. Any 'false' return value means that the document is invalid.
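The kind of check described above might be expressed as in the following minimal sketch, which assumes Schematron 1.5 element names. The 'book' and 'reviewer' elements and the 'status' attribute are hypothetical, invented purely for illustration.

```xml
<schema xmlns="http://www.ascc.net/xml/schematron">
  <pattern name="Draft books need a reviewer">
    <!-- The context is an XPath pattern selecting the nodes to check:
         here, only book elements whose status attribute is 'draft' -->
    <rule context="book[@status='draft']">
      <!-- The test is an XPath expression evaluated at each context node;
           if it returns false, the message is reported and the document
           is considered invalid -->
      <assert test="reviewer">A draft book must contain a reviewer element.</assert>
    </rule>
  </pattern>
</schema>
```

A grammar could state that the reviewer element is optional or required, but not that its presence depends on the value of an attribute elsewhere.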

Recent changes have added 'phases', a way of grouping patterns together to allow dynamic validation (different rules and assertions are tested according to the active phase), and 'abstract rules', which permit more convenient declarations and type extensions. See www.ascc.net/xml/resource/schematron for the latest details.
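The two features might be combined as in the sketch below, which again assumes Schematron 1.5 syntax. The phase, pattern and rule identifiers, and the 'chapter', 'title' and 'author' elements, are hypothetical names chosen for illustration.

```xml
<schema xmlns="http://www.ascc.net/xml/schematron">
  <!-- Each phase lists the patterns that are active when it is selected -->
  <phase id="drafting">
    <active pattern="structure"/>
  </phase>
  <phase id="publishing">
    <active pattern="structure"/>
    <active pattern="credits"/>
  </phase>

  <pattern name="Basic structure" id="structure">
    <!-- An abstract rule has no context of its own; it only
         supplies assertions for other rules to reuse -->
    <rule abstract="true" id="titled">
      <assert test="title">This element must contain a title.</assert>
    </rule>
    <rule context="chapter">
      <extends rule="titled"/>
    </rule>
  </pattern>

  <pattern name="Credits" id="credits">
    <rule context="intro">
      <assert test="author">An intro must name an author.</assert>
    </rule>
  </pattern>
</schema>
```

Under these assumptions, selecting the 'drafting' phase would apply only the structural checks, while 'publishing' would apply both patterns.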

This language is recognized by the namespace 'http://www.ascc.net/xml/schematron'. It is envisaged that Schematron validation instructions could be embedded within other documents, and would typically form an extension to another, grammar-based document validation scheme. One suggestion makes use of the XML Schema AppInfo element for this purpose (see below).
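Such an embedding might look like the following sketch. It assumes the XML Schema 1.0 namespace and the Schematron namespace given above; note that the element is spelled 'appinfo' in XML Schema itself. The 'book' and 'reviewer' elements and the 'status' attribute are hypothetical.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:sch="http://www.ascc.net/xml/schematron">
  <xs:element name="book">
    <xs:annotation>
      <!-- appinfo carries application-specific content, so a schema
           processor unaware of Schematron can simply ignore it -->
      <xs:appinfo>
        <sch:pattern name="Draft books need a reviewer">
          <sch:rule context="book[@status='draft']">
            <sch:assert test="reviewer">A draft book must contain a reviewer element.</sch:assert>
          </sch:rule>
        </sch:pattern>
      </xs:appinfo>
    </xs:annotation>
    <!-- The grammar-based part of the model -->
    <xs:complexType>
      <xs:sequence>
        <xs:element name="reviewer" type="xs:string" minOccurs="0"/>
      </xs:sequence>
      <xs:attribute name="status" type="xs:string"/>
    </xs:complexType>
  </xs:element>
</xs:schema>
```

In this arrangement the grammar describes the permitted structure, and the embedded rule adds the conditional constraint that the grammar cannot express. A separate tool would be needed to extract and apply the Schematron rules.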
