Chapter 23. Versioning

As business and technical requirements change over time, you will need to define new versions of your schemas. Defining new versions is a special case of extension and restriction. You may be both adding and removing components, with the intention of replacing the older version.

When you create a new version intended to replace a previous one, you should create a completely new schema rather than attempt to extend or restrict the existing one. Otherwise, as time goes on and additional versions are created, the definitions could become unnecessarily complicated and difficult to process. If you are not using the restriction and extension mechanisms of XML Schema, though, you need to take extra care to make the new definitions compatible with the old ones.

23.1. Schema compatibility

In many cases, you will want to maintain some compatibility between versions. You might want to allow instances to be validated against either schema, or be processed by an application that supports either version. This is especially true if your instances persist for a long period of time. If your instances are short-lived messages between applications, compatibility is less of an issue. However, you should still try to be as consistent as possible to reduce learning curves and minimize the changes in the applications that process the instances.

There are two kinds of compatibility:

1. Backward compatibility, where all instances that conform to the previous version of the schema are also valid according to the new version

2. Forward compatibility, where all instances that conform to the new version are also valid according to the previous version of the schema

23.1.1. Backward compatibility

Ideally, you should have backward compatibility of the schemas from one version to the next. That is, instances that were created to conform to version 2.0 of the schema should also be valid according to version 2.1. This is possible if you are only adding optional new components and/or reducing restrictiveness. To accomplish this, the previous version must allow a subset of what is allowed by the new version.

The following changes to a schema are backward-compatible:

• Adding optional elements and attributes.

• Making required elements and attributes optional.

• Making occurrence constraints less restrictive—for example, allowing more than one color element where only one was allowed before.

• Turning specific element declarations into choice groups. For example, where color was allowed, now it can be color or size or weight. Similarly, you can declare new substitution groups. For example, where the content model allowed color, now size and weight are valid substitutes.

• Making simple types less restrictive by making bounds facets and length facets less restrictive, adding enumeration values, or making patterns less restrictive.

• Turning a simple type into a union of that simple type and one or more other simple types.

• Turning a simple type into a list type that allows multiple values of the original type.

• Adding optional wildcards or open content.

• Replacing element or attribute declarations with wildcards.

• Turning a sequence group into an all group or a repeating choice group.
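
As a rough sketch of the simple-type changes in the list above, the following hypothetical version 2.1 schema loosens a type in two ways. The names SizeCodeType and SizeType and the values S, M, L, and XL are invented for illustration; assume that in version 2.0 the size element had the type SizeCodeType and that XL was not yet an allowed value.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           version="2.1">
  <!-- Hypothetical type: version 2.0 allowed only S, M, and L -->
  <xs:simpleType name="SizeCodeType">
    <xs:restriction base="xs:token">
      <xs:enumeration value="S"/>
      <xs:enumeration value="M"/>
      <xs:enumeration value="L"/>
      <!-- Enumeration value added in version 2.1 -->
      <xs:enumeration value="XL"/>
    </xs:restriction>
  </xs:simpleType>

  <!-- Union of the original type and one other simple type -->
  <xs:simpleType name="SizeType">
    <xs:union memberTypes="SizeCodeType xs:integer"/>
  </xs:simpleType>

  <!-- In version 2.0 this element had the type SizeCodeType -->
  <xs:element name="size" type="SizeType"/>
</xs:schema>
```

A version 2.0 instance such as <size>M</size> remains valid, because M is still allowed by the first member type of the union.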

The following changes to a schema are not backward-compatible:

• Changing the order of elements or imposing an order where none was imposed previously.

• Changing the structure of elements, for example adding more levels of elements.

• Removing any element or attribute declarations.

• Removing wildcards or open content, or making them more restrictive in terms of what namespaces they allow or how strictly replacement elements are validated.

• Changing the names of any elements or attributes.

• Changing the target namespace of the schema.

• Adding any required elements or attributes.

• Making optional elements or attributes required.

• Making occurrence constraints more restrictive—for example, allowing only one color element where more than one was allowed before.

• Making simple types more restrictive by making bounds facets and length facets more restrictive, removing enumeration values, or making patterns more restrictive.

For example, suppose you have the complex type definition shown in Example 23–1. Its version number is 2.0.

Example 23–1. Version 2.0 of a complex type


<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           version="2.0">
  <xs:element name="product" type="ProductType"/>
  <xs:complexType name="ProductType">
    <xs:sequence>
      <xs:element name="number" type="xs:integer" minOccurs="0"/>
      <xs:element name="name" type="xs:string"/>
      <xs:element name="size" type="SizeType"
                  maxOccurs="unbounded"/>
    </xs:sequence>
  </xs:complexType>
  <xs:complexType name="SizeType">
    <xs:simpleContent>
      <xs:extension base="xs:integer">
        <xs:attribute name="system" type="xs:token"/>
      </xs:extension>
    </xs:simpleContent>
  </xs:complexType>
</xs:schema>


Example 23–2 shows a backward-incompatible definition for a new version, 2.1. It is backward-incompatible for a number of reasons.

• The order of the element declarations changed; name is now after size.

• The number element was removed, which is incompatible even though it was optional.

• A required description element was added.

• The optional system attribute was made required.

• The occurrence constraints on size were made more restrictive.

• The contents of SizeType were made more restrictive, allowing only positive integers instead of all integers.

Example 23–2. Backward-incompatible definition


<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           version="2.1">
  <xs:element name="product" type="ProductType"/>
  <xs:complexType name="ProductType">
    <xs:sequence>
      <xs:element name="size" type="SizeType" maxOccurs="3"/>
      <xs:element name="name" type="xs:string"/>
      <xs:element name="description" type="xs:string"/>
    </xs:sequence>
  </xs:complexType>
  <xs:complexType name="SizeType">
    <xs:simpleContent>
      <xs:extension base="xs:positiveInteger">
        <xs:attribute name="system" type="xs:token"
                      use="required"/>
      </xs:extension>
    </xs:simpleContent>
  </xs:complexType>
</xs:schema>

As a result of all these changes, an instance that conformed to version 2.0 may not be valid according to version 2.1. Example 23–3 shows such an instance.

Example 23–3. Backward-incompatible instance


<product>
  <number>557</number>
  <name>Short-Sleeved Linen Blouse</name>
  <size>0</size>
  <size>2</size>
  <size>4</size>
  <size>6</size>
</product>


On the other hand, Example 23–4 shows a definition that is backward-compatible.

Example 23–4. Backward-compatible definition


<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           version="2.1">
  <xs:element name="product" type="ProductType"/>
  <xs:complexType name="ProductType">
    <xs:sequence>
      <xs:element name="number" type="xs:integer"
                  minOccurs="0" maxOccurs="unbounded"/>
      <xs:element name="name" type="xs:string" minOccurs="0"/>
      <xs:element name="size" type="SizeType"
                  maxOccurs="unbounded"/>
      <xs:element name="desc" type="xs:string" minOccurs="0"/>
    </xs:sequence>
  </xs:complexType>
  <xs:complexType name="SizeType">
    <xs:simpleContent>
      <xs:extension base="xs:decimal">
        <xs:attribute name="system" type="xs:token"/>
        <xs:attribute name="units" type="xs:token"/>
      </xs:extension>
    </xs:simpleContent>
  </xs:complexType>
</xs:schema>


All of the changes in this example were backward-compatible because they do not affect the validity of version 2.0 instances. For example:

• No element or attribute declarations were removed or reordered.

• Only declarations for optional elements and attributes (desc and units) were added.

• The required name element was made optional.

• The number element was made repeating, which is less restrictive.

• The contents of SizeType were made less restrictive, allowing any decimal number instead of only an integer.

23.1.2. Forward compatibility

Some schema designers take their versioning strategy a step further: They make their schemas forward-compatible, so that a version 2.1 instance is valid according to the version 2.0 schema. This requires some careful planning when developing the 2.0 schema. An area needs to be set aside for the elements that might be added in version 2.1. This area needs to be allowed to contain unspecified content in the 2.0 schema, but be more specifically defined (by adding new element declarations) in the 2.1 schema.

This is typically done by defining wildcards in the original schema. In Example 23–5, both element and attribute wildcards are used in the version 2.0 schema. The processContents attribute is set to skip so that the processor does not look for declarations that do not exist in this version of the schema.

Example 23–5. Version 2.0 of a forward-compatible complex type


<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           version="2.0">
  <xs:element name="product" type="ProductType"/>
  <xs:complexType name="ProductType">
    <xs:sequence>
      <xs:element name="number" type="xs:integer" minOccurs="0"/>
      <xs:element name="name" type="xs:string"/>
      <xs:element name="size" type="xs:integer"
                  maxOccurs="unbounded"/>
      <xs:any minOccurs="0" maxOccurs="unbounded"
              processContents="skip"/>
    </xs:sequence>
    <xs:anyAttribute processContents="skip"/>
  </xs:complexType>
</xs:schema>


Example 23–6 shows version 2.1 of the schema, with a new element desc and a new attribute dept. This version of the schema also includes element and attribute wildcards to allow it to be forward-compatible with version 2.2 of the schema.

Example 23–6. Version 2.1 of a forward-compatible complex type


<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           version="2.1">
  <xs:element name="product" type="ProductType"/>
  <xs:complexType name="ProductType">
    <xs:sequence>
      <xs:element name="number" type="xs:integer" minOccurs="0"/>
      <xs:element name="name" type="xs:string"/>
      <xs:element name="size" type="xs:integer"
                  maxOccurs="unbounded"/>
      <xs:element name="desc" type="xs:string"/>
      <xs:any minOccurs="0" maxOccurs="unbounded"
              processContents="skip"/>
    </xs:sequence>
    <xs:attribute name="dept" type="xs:token"/>
    <xs:anyAttribute processContents="skip"/>
  </xs:complexType>
</xs:schema>


Example 23–7 shows an instance that is valid according to version 2.1, but is also allowed by version 2.0 because of the wildcards.

Example 23–7. Forward-compatible instance


<product dept="WMN">
  <number>557</number>
  <name>Short-Sleeved Linen Blouse</name>
  <size>0</size>
  <desc>Our best-selling shirt</desc>
</product>



The method shown in Examples 23–5 and 23–6 works fine, but only because size is required. In version 1.0 of XML Schema, if size were optional, this complex type would violate the Unique Particle Attribution rule. A processor, upon encountering a size element, would not know whether to use the size element declaration or the wildcard to validate it. In version 1.1, this constraint has been relaxed: the element declaration is always used instead of the wildcard when both might apply.

In version 1.0, this problem can be avoided either by inserting a dummy required element at the end of the defined content model, or by putting the wildcard inside a child element, for example one called extension. Unfortunately, neither is a great option, because for every minor version there needs to be a new dummy element or extension child, cluttering up the content model. Instead, it is highly recommended that you upgrade to XML Schema 1.1 if you need forward compatibility.
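
The second workaround might look like the following sketch for XML Schema 1.0, in which size has been made optional. The element name extension comes from the text above; the type name ExtensionType is an assumption made for illustration.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           version="2.0">
  <xs:element name="product" type="ProductType"/>
  <xs:complexType name="ProductType">
    <xs:sequence>
      <xs:element name="number" type="xs:integer" minOccurs="0"/>
      <xs:element name="name" type="xs:string"/>
      <xs:element name="size" type="xs:integer"
                  minOccurs="0" maxOccurs="unbounded"/>
      <!-- The wildcard is moved into a named child element -->
      <xs:element name="extension" type="ExtensionType"
                  minOccurs="0"/>
    </xs:sequence>
  </xs:complexType>
  <!-- ExtensionType is a hypothetical name -->
  <xs:complexType name="ExtensionType">
    <xs:sequence>
      <xs:any minOccurs="0" maxOccurs="unbounded"
              processContents="skip"/>
    </xs:sequence>
  </xs:complexType>
</xs:schema>
```

Because every particle in the ProductType content model is now an element declaration with a distinct name, a processor never has to choose between the size declaration and the wildcard.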

One disadvantage of forward compatibility is that having an open-ended wildcard on every type makes the schemas very flexible. The wildcard in Example 23–6 will allow any replacement elements, including ones that are already declared in that version 2.1 schema. In version 1.1 of XML Schema, this can be mitigated by putting a notQName="##defined" attribute on the wildcard. This means that a replacement element cannot be one that is already declared in the schema.
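
A sketch of this mitigation, applied to the element wildcard of Example 23–6, is shown below. It assumes an XML Schema 1.1 processor. In the 1.1 specification, ##defined excludes names that have a global declaration, while the related keyword ##definedSibling excludes names declared in the same content model; since desc is declared locally in Example 23–6, this sketch lists both keywords.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           version="2.1">
  <xs:element name="product" type="ProductType"/>
  <xs:complexType name="ProductType">
    <xs:sequence>
      <xs:element name="number" type="xs:integer" minOccurs="0"/>
      <xs:element name="name" type="xs:string"/>
      <xs:element name="size" type="xs:integer"
                  maxOccurs="unbounded"/>
      <xs:element name="desc" type="xs:string"/>
      <!-- XML Schema 1.1 only: replacement elements may not reuse
           names already declared globally or in this content model -->
      <xs:any minOccurs="0" maxOccurs="unbounded"
              processContents="skip"
              notQName="##defined ##definedSibling"/>
    </xs:sequence>
    <xs:attribute name="dept" type="xs:token"/>
    <xs:anyAttribute processContents="skip"/>
  </xs:complexType>
</xs:schema>
```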


In fact, version 1.1 offers a number of features to make forward compatibility easier, including:

• Open content, where wildcards can be automatically interleaved everywhere in a type. This removes the need to specify wildcards between each pair of adjacent elements when maximum future flexibility is needed. Open content is covered in Section 12.7.2 on p. 292.

• Negative wildcards, where you can specify names and namespaces that are not allowed as replacement elements, thus limiting excessive flexibility of wildcards. Negative wildcards are covered in Section 12.7.1.3 on p. 289.

• Looser restrictions on all groups, which means that it is easier to create types where the order of child elements doesn’t matter. This makes it easier to insert elements that can be interleaved in future versions without requiring that all new content comes at the end. This is covered in Section 12.5.4 on p. 276.
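
As an illustration of the first feature in the list above, the following sketch rewrites Example 23–5 with open content instead of a trailing element wildcard. It assumes an XML Schema 1.1 processor; the declarations themselves are taken from Example 23–5.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           version="2.0">
  <xs:element name="product" type="ProductType"/>
  <xs:complexType name="ProductType">
    <!-- XML Schema 1.1 only: replacement elements may appear
         before, between, or after the declared elements -->
    <xs:openContent mode="interleave">
      <xs:any processContents="skip"/>
    </xs:openContent>
    <xs:sequence>
      <xs:element name="number" type="xs:integer" minOccurs="0"/>
      <xs:element name="name" type="xs:string"/>
      <xs:element name="size" type="xs:integer"
                  maxOccurs="unbounded"/>
    </xs:sequence>
    <xs:anyAttribute processContents="skip"/>
  </xs:complexType>
</xs:schema>
```

With this definition, a later version could insert a new element anywhere in the sequence, not only at the end, and its instances would still be accepted by the version 2.0 schema.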


Forward compatibility is harder to achieve and therefore less common. However, it is a worthy goal, especially where older application code (designed to process prior versions) is likely to persist unchanged for long periods of time.

Note that forward compatibility does not automatically include backward compatibility. It is possible to introduce backward-incompatible changes to a forward-compatible schema. In fact, Example 23–6 is not backward-compatible because the desc element is required. It is possible to have a 2.0 instance without a desc element, in which case it is invalid in version 2.1. If both forward compatibility and backward compatibility are desired, both must be considered when designing schemas.
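
One way to make Example 23–6 backward-compatible as well is to declare desc as optional, as in the following sketch. It assumes an XML Schema 1.1 processor: in version 1.0, an optional desc immediately followed by the wildcard would run into the Unique Particle Attribution problem described earlier.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           version="2.1">
  <xs:element name="product" type="ProductType"/>
  <xs:complexType name="ProductType">
    <xs:sequence>
      <xs:element name="number" type="xs:integer" minOccurs="0"/>
      <xs:element name="name" type="xs:string"/>
      <xs:element name="size" type="xs:integer"
                  maxOccurs="unbounded"/>
      <!-- Optional, so version 2.0 instances without desc stay valid -->
      <xs:element name="desc" type="xs:string" minOccurs="0"/>
      <xs:any minOccurs="0" maxOccurs="unbounded"
              processContents="skip"/>
    </xs:sequence>
    <xs:attribute name="dept" type="xs:token"/>
    <xs:anyAttribute processContents="skip"/>
  </xs:complexType>
</xs:schema>
```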

23.2. Using version numbers

23.2.1. Major and minor versions

The version numbers used in this chapter have the format of two numbers separated by a period, for example, 2.1. It is implied that “2” represents the major version number and “1” represents the minor version number. There is no requirement for version numbers to have this format in XML Schema. In fact, the version attribute will accept any string. However, it is common practice to use numeric version numbers because they make it easy to see the order over time.

It is also typical to use both major and minor version numbers. A change in the minor version number only indicates a minor release—one that has little impact in terms of the number or extensiveness of changes. A change in the major version number indicates a major release, which tends to be more disruptive and involve more changes. Many designers of XML vocabularies make this definition more formal: They use minor versions for releases that are backward-compatible and major versions for releases that are backward-incompatible.

Figure 23–1 depicts this approach. There is backward compatibility among the 2.x releases, and backward compatibility among the 3.x releases, but not between the two major releases. Within a particular major version, there is backward compatibility from one release to the next. Version 2.1 is obviously designed to be backward-compatible with 2.0. Version 2.2 should be built on version 2.1, including all of the new (optional) elements and attributes and other changes made in version 2.1, so it is backward-compatible with both versions 2.1 and 2.0.

Figure 23–1. Major and minor versions

When version 3.0 is released, it doesn’t have to be backward-compatible with version 2.3 or any of the 2.x versions. This is a chance to make significant changes. It may be useful during a major release to consider making some of the optional elements and attributes added in minor versions required. They may have been added as optional simply to achieve backward compatibility in minor releases, even though it was actually preferable to make them required. It is also an opportunity to remove any elements or attributes that were deprecated in previous releases.

23.2.2. Placement of version numbers

Every schema should have an associated version number. There are at least four possible places to indicate the version number of a schema, none of which is actually required by XML Schema. They are discussed in this section.

23.2.2.1. Version numbers in schema documents

The version attribute of schema is an arbitrary string that represents the version of the vocabulary being described by the schema document. Note that it is not intended to convey whether you are using version 1.0 or 1.1 of the XML Schema language itself; there is no need to indicate this in your schema document. The version attribute is strictly for documentation; an XML Schema processor does not use it. It is optional, but its use is encouraged.

Example 23–8 shows a schema that uses the version attribute to indicate that it is version 2.1 of the schema, along with an instance that conforms to it. The instance in this example is not doing anything special to indicate the version of the schema to which it conforms.

Example 23–8. Using a version number in a schema


Schema (prod.xsd):

<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           elementFormDefault="qualified"
           xmlns="http://datypic.com/prod"
           targetNamespace="http://datypic.com/prod"
           version="2.1">
  <xs:element name="product" type="ProductType"/>
  <xs:complexType name="ProductType">
    <xs:sequence>
      <xs:element name="number" type="xs:integer" minOccurs="0"/>
      <xs:element name="name" type="xs:string"/>
    </xs:sequence>
  </xs:complexType>
</xs:schema>

Instance:

<product xmlns="http://datypic.com/prod"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://datypic.com/prod prod.xsd">
  <number>557</number>
  <name>Short-Sleeved Linen Blouse</name>
</product>


The version attribute is intended to apply to the schema document itself and all the components defined within it. It is also possible to use non-native attributes or annotations to indicate version numbers for individual components in a schema. Example 23–9 shows a schema document that uses a non-native doc:version attribute to indicate that the catalog element declaration and its type are at version 2.1, while the product element declaration and its type are at version 2.0.

Example 23–9. Using a version number on individual schema components


<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           elementFormDefault="qualified"
           xmlns="http://datypic.com/prod"
           targetNamespace="http://datypic.com/prod"
           xmlns:doc="http://datypic.com/doc"
           version="2.1">
  <xs:element name="product" type="ProductType" doc:version="2.0"/>
  <xs:complexType name="ProductType" doc:version="2.0">
    <xs:sequence>
      <xs:element name="number" type="xs:integer" minOccurs="0"/>
      <xs:element name="name" type="xs:string"/>
    </xs:sequence>
  </xs:complexType>
  <xs:element name="catalog" type="CatalogType" doc:version="2.1"/>
  <xs:complexType name="CatalogType" doc:version="2.1">
    <xs:sequence>
      <xs:element name="catalog_id" type="xs:string"/>
      <xs:element ref="product" maxOccurs="unbounded"/>
    </xs:sequence>
  </xs:complexType>
</xs:schema>


This may be useful as a way to clearly delineate which components have changed over multiple versions. However, it does require some extra management of the components.

23.2.2.2. Versions in schema locations

The filename or URL of the schema document can also contain the version number. For example, the new version may have a filename of prod_2.1.xsd or be located in a directory structure that indicates the version number, for example 2.1/prod.xsd or http://datypic.com/prod/2.1/prod.xsd. Changing the URL makes it easier for other schema documents that may include or import your schema document to continue to use the previous version until they can upgrade. Example 23–10 shows a schema whose filename contains its version number.

Example 23–10. Using a version number in the schema location


Schema (prod_2.1.xsd):

<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           elementFormDefault="qualified"
           xmlns="http://datypic.com/prod"
           targetNamespace="http://datypic.com/prod">
  <xs:element name="product" type="ProductType"/>
  <xs:complexType name="ProductType">
    <xs:sequence>
      <xs:element name="number" type="xs:integer" minOccurs="0"/>
      <xs:element name="name" type="xs:string"/>
    </xs:sequence>
  </xs:complexType>
</xs:schema>

Instance:

<product xmlns="http://datypic.com/prod"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://datypic.com/prod prod_2.1.xsd">
  <number>557</number>
  <name>Short-Sleeved Linen Blouse</name>
</product>


23.2.2.3. Versions in instances

It may be worthwhile to have instances identify the schema version to which they conform. This will allow an application to process them accordingly. For example, in XSLT, the stylesheet element has a required attribute named version. This is a signal to processors that the stylesheet instance conforms to, for example, version 2.0 of XSLT.

Typically a version attribute in an instance appears on the root element. Example 23–11 shows a schema and related instance where the version number 2.1 is indicated on the root element. Note that the version attribute has to be declared in the schema; it is not a special attribute that can appear without a declaration.

Some schemas put a fixed value on the version declaration to ensure that an instance can only be validated by a particular version of a schema. For example, if fixed="2.1" were added to the attribute declaration in Example 23–11, the version number would have to be 2.1 (or the attribute would have to be absent) in the instance for it to be valid according to this schema. However, this is not recommended, at least for minor versions, because it breaks backward compatibility.

Example 23–11. Using a version number in the instance


Schema (prod.xsd):

<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           elementFormDefault="qualified"
           xmlns="http://datypic.com/prod"
           targetNamespace="http://datypic.com/prod">
  <xs:element name="product" type="ProductType"/>
  <xs:complexType name="ProductType">
    <xs:sequence>
      <xs:element name="number" type="xs:integer" minOccurs="0"/>
      <xs:element name="name" type="xs:string"/>
    </xs:sequence>
    <xs:attribute name="version"/>
  </xs:complexType>
</xs:schema>

Instance:

<product xmlns="http://datypic.com/prod"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://datypic.com/prod prod.xsd"
         version="2.1">
  <number>557</number>
  <name>Short-Sleeved Linen Blouse</name>
</product>


23.2.2.4. Versions in namespace names

Many vocabularies also indicate their version number in the namespace name. When you change a namespace name, it is as if you were completely renaming the components in that namespace. This instantly breaks backward compatibility between schema versions, as the names have essentially changed. It also frequently requires applications that process the instances to change, since many XML technologies (such as XPath, XQuery, and XSLT) are namespace-aware.

That may be desirable in the case of a major release where there is no intention of backward compatibility and the instances change so much that it is necessary for applications to change the way they process the instances. It is definitely not appropriate for minor releases intended to be backward-compatible. Therefore, when a version number is included in a namespace name, it is frequently only the major version number.

Example 23–12 shows a schema that uses the major version number (“2”) in the namespace name.

Example 23–12. Using a version number in the namespace name


Schema (prod.xsd):

<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           elementFormDefault="qualified"
           xmlns="http://datypic.com/prod/2"
           targetNamespace="http://datypic.com/prod/2">
  <xs:element name="product" type="ProductType"/>
  <xs:complexType name="ProductType">
    <xs:sequence>
      <xs:element name="number" type="xs:integer" minOccurs="0"/>
      <xs:element name="name" type="xs:string"/>
    </xs:sequence>
  </xs:complexType>
</xs:schema>

Instance:

<product xmlns="http://datypic.com/prod/2"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://datypic.com/prod/2 prod.xsd">
  <number>557</number>
  <name>Short-Sleeved Linen Blouse</name>
</product>


23.2.2.5. A combination strategy

It is likely that you will use a combination of some, or all, of the four version number locations. Example 23–13 shows a schema that uses all four methods.

Example 23–13. Using multiple methods to indicate version number


Schema (schemas/prod/2.1/prod.xsd):

<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           elementFormDefault="qualified"
           xmlns="http://datypic.com/prod/2"
           targetNamespace="http://datypic.com/prod/2"
           version="2.1">
  <xs:element name="product" type="ProductType"/>
  <xs:complexType name="ProductType">
    <xs:sequence>
      <xs:element name="number" type="xs:integer" minOccurs="0"/>
      <xs:element name="name" type="xs:string"/>
    </xs:sequence>
    <xs:attribute name="version" type="xs:decimal"/>
  </xs:complexType>
</xs:schema>

Instance:

<product xmlns="http://datypic.com/prod/2"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://datypic.com/prod/2
                             schemas/prod/2.1/prod.xsd"
         version="2.1">
  <number>557</number>
  <name>Short-Sleeved Linen Blouse</name>
</product>


23.3. Application compatibility

Whether or not you can achieve schema compatibility, it is also worthwhile to try to achieve application compatibility. Well-designed applications that were written to process the previous version should be able to process instances of the new version without crashing.

Likewise, applications that process the new version can be made to support both versions. If the new version only contains optional additions, the application can use the same logic for both versions of instances. Alternatively, the application can check the version number (as described in Section 23.2) and process each version differently.
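
The version check might be sketched in XSLT 1.0 as follows, using the namespace and version attribute of Example 23–11. Treating a missing version attribute as version 2.0 is an assumption made for this sketch, as are the output messages.

```xml
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:prod="http://datypic.com/prod">
  <xsl:output method="text"/>

  <xsl:template match="/prod:product">
    <xsl:choose>
      <!-- Assumption: no version attribute means version 2.0 -->
      <xsl:when test="not(@version) or @version = '2.0'">
        <xsl:text>Processing as version 2.0: </xsl:text>
      </xsl:when>
      <xsl:when test="@version = '2.1'">
        <xsl:text>Processing as version 2.1: </xsl:text>
      </xsl:when>
      <xsl:otherwise>
        <!-- Stop rather than guess at an unknown version -->
        <xsl:message terminate="yes">Unsupported version</xsl:message>
      </xsl:otherwise>
    </xsl:choose>
    <xsl:value-of select="prod:name"/>
  </xsl:template>
</xsl:stylesheet>
```

The version numbers are compared as strings rather than as numbers, so that a value such as 2.10 is not mistaken for 2.1.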

It is impossible to predict how people will modify or extend a schema over time, but several simple practices can help handle changes more gracefully.

Ignore irrelevant elements or attributes. The application should process the elements and attributes it is expecting, without generating errors if additional elements or attributes appear. This is especially true if they are in a different namespace. The application should treat every content model as if it had both attribute and element wildcards, even if it does not.

Avoid overdependence on the document structure. Minimize the amount of structural checking you do in the application code. If you are using a SAX parser, process the element you are interested in by name, but do not necessarily keep track of its parent or grandparent. In XSLT, consider using more of a “push” model instead of “pull,” creating templates for individual elements such as product/number rather than hard-coding entire paths like catalog/product/number. This will allow the XSLT to still work even if a department element is added between catalog and product.
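
A minimal XSLT 1.0 sketch of these two practices is shown below. It assumes the namespace from Example 23–8; the output text is invented for illustration.

```xml
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:prod="http://datypic.com/prod">
  <xsl:output method="text"/>

  <!-- Matches number inside product wherever product appears,
       without naming any ancestors of product -->
  <xsl:template match="prod:product/prod:number">
    <xsl:text>Product number: </xsl:text>
    <xsl:value-of select="."/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>

  <!-- Elements without a template are traversed by the built-in
       rules; this template keeps their text out of the output -->
  <xsl:template match="text()"/>
</xsl:stylesheet>
```

Because the stylesheet names no ancestors of product, it does not need to change if a department element is later added between catalog and product, and elements it does not recognize are passed over without raising errors.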

Avoid overdependence on namespaces. A change in a namespace name, for example to include a new version number, is disruptive to your code. While you may need to write entirely new code for the new version, it is ideal if you can reuse some of the code from the previous version. Avoiding the use of namespace names when referring to element names, or at least parameterizing the namespace names instead of hard-coding them throughout your code, can make the upgrade easier and promote reuse.

23.4. Lessening the impact of versioning

A few best practices can ease the pain of versioning for the implementers of your XML vocabulary. They are discussed in this section.

23.4.1. Define a versioning strategy

If you are defining a complex vocabulary, one that changes frequently or one that is used by a variety of implementers, it is helpful to clearly define a versioning strategy. That way, implementers know what to expect when a new version is released. A versioning strategy should specify the following information:

• How will version numbers be formatted and ordered?

• Where will version numbers be indicated in the schemas? In the version attribute? In the schema document URL? In the instance? In the namespace? Using some combination of these?

• Are minor releases backward-compatible? Are they forward-compatible?

• Are major releases backward-compatible? Are they forward-compatible?

• How will deprecated components be indicated?

• How will changes be documented?

• Are implementers expected to support multiple versions?

• Are implementers expected to upgrade to the newest version in a particular time frame?

23.4.2. Make only necessary changes

When developing a new version that is not required to be backward-compatible, it is tempting to make small fixes—change names that are not as descriptive as they could be, reorder elements to be more intuitive, or change cardinalities to be slightly more constrained. Sometimes there are good reasons to make these changes, for example because the schema is not conformant to a particular NDR specification or is genuinely confusing. But if there is no good reason for that, don’t give in to the temptation. The changes may seem small, but they can add up and cause confusion, software bugs, and incompatibilities, placing a significant burden on implementers.

23.4.3. Document all changes

All changes to a schema in a new version should be clearly documented in a set of release notes or a formal change log. Each entry in the change log should have the following information:

• Description of the change

• Reason for the change

• Whether the change is backward-compatible

• Notes on upgrading or downgrading—for example, if a required element is added, how should that value be determined when upgrading instances to the new version?

If there are a lot of changes, consider creating a side-by-side mapping document that shows all the differences, like the one shown in Table 23–1. The first two columns contain the element names used in new and old instances, indented to show the hierarchy of elements in each version. The third column describes the change, and the fourth column indicates whether the change is backward-compatible.

Table 23–1. Sample change log showing mapping


23.4.4. Deprecate components before deleting them

To ease the transition from one version to the next, it is possible to indicate that certain components are deprecated—that is, they are still in the schema but are not recommended for use, and are likely to be deleted in a future version of the schema.

There is no formal way to deprecate components in XML Schema, but deprecation can be indicated in non-native attributes or annotations. If the deprecated element is intended to be replaced by another element, the two can be put together in a choice group or substitution group during the deprecation period, so that either is allowed. It is also useful to provide human-readable descriptive information that includes its replacement, if any.

Example 23–14 shows one approach to deprecation. A deprecated element with a value true is inserted into appinfo to formally indicate that it is deprecated, and a human-readable description is also provided in documentation. The deprecated element, color, is placed in a choice group with its intended replacement, colorList. In the next version, color will be deleted, leaving colorList as the only choice.

Example 23–14. One approach to deprecation


<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:doc="http://datypic.com/doc"
           version="2.1">
  <xs:element name="product" type="ProductType"/>
  <xs:complexType name="ProductType">
    <xs:sequence>
      <xs:element name="number" type="xs:integer"/>
      <xs:element name="name" type="xs:string"/>
      <xs:choice>
        <xs:element name="color" type="xs:NMTOKEN">
          <xs:annotation>
            <xs:documentation>Deprecated in
                favor of colorList.</xs:documentation>
            <xs:appinfo>
                <doc:deprecated>true</doc:deprecated>
            </xs:appinfo>
          </xs:annotation>
        </xs:element>
        <xs:element name="colorList" type="xs:NMTOKENS"/>
      </xs:choice>
    </xs:sequence>
  </xs:complexType>
</xs:schema>
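For comparison, the follow-on version in which color has been deleted might look like the sketch below. The version number 3.0 is an assumption made for illustration; the text only says that color is removed in the next version.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           version="3.0"> <!-- assumed version number -->
  <xs:element name="product" type="ProductType"/>
  <xs:complexType name="ProductType">
    <xs:sequence>
      <xs:element name="number" type="xs:integer"/>
      <xs:element name="name" type="xs:string"/>
      <!-- color has been deleted; colorList is the only option left -->
      <xs:element name="colorList" type="xs:NMTOKENS"/>
    </xs:sequence>
  </xs:complexType>
</xs:schema>
```

Instances that already use colorList remain valid against this version, while instances that still use color do not, which is why a deprecation period gives instance authors time to switch.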


23.4.5. Provide a conversion capability

If a new version is not backward-compatible, you should provide a clear upgrade path from the old version to the new version. A good way to do this is to provide an XSLT stylesheet that upgrades instances, which an application can then apply automatically.

Such a conversion needs to handle two changes carefully:

1. If required elements or attributes are added in the new version, the XSLT should insert them, ideally with a default value if one can be determined or calculated. Otherwise, an empty or nil value may be appropriate.

2. If elements or attributes are deleted, the XSLT should provide messages to the user that it is deleting information. It could also insert the deleted data as comments in the output, for a human user who may be reviewing the converted documents.
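The two cases above can be sketched in a small XSLT 1.0 stylesheet. This is a minimal illustration rather than a complete upgrade tool: the element names status (assumed to be newly required, and declared last in product) and legacyCode (assumed to be deleted in the new version), as well as the default value unknown, are hypothetical.

```xml
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Identity template: copy anything not handled by a rule below -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Case 1: insert the (hypothetical) new required element
       with a default value when it is not already present -->
  <xsl:template match="product">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
      <xsl:if test="not(status)">
        <status>unknown</status>
      </xsl:if>
    </xsl:copy>
  </xsl:template>

  <!-- Case 2: drop the (hypothetical) deleted element, tell the user,
       and keep its value as a comment for human review.
       Note: a value containing two consecutive hyphens cannot be
       written into a comment and would need extra handling. -->
  <xsl:template match="product/legacyCode">
    <xsl:message>Deleting legacyCode: <xsl:value-of select="."/></xsl:message>
    <xsl:comment> deleted legacyCode: <xsl:value-of select="."/> </xsl:comment>
  </xsl:template>

</xsl:stylesheet>
```

Building on the identity template keeps the stylesheet short: only the components that changed between versions need their own rules.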

It may also be worthwhile to write an opposite conversion—one that downgrades instances from the newer version to the older one. This makes sense if older implementations that only support the previous version are likely to persist for some time, and there is no forward compatibility. The considerations listed above when adding and deleting components also apply to the downgrade conversion.


23.5. Versions of the XML Schema language

In addition to having multiple versions of your XML vocabulary, you may be dealing with multiple versions of the XML Schema language itself. This book describes two different versions of XML Schema: 1.0 and 1.1. Depending on which processor you are using, you may be required to use one version or the other. Unlike some other XML vocabularies, there is no way to indicate in your schema which version of XML Schema you are using. Instead, this might be a setting that you pass to your XML Schema processor, or the processor may only support one of the versions.

23.5.1. New features in version 1.1

Version 1.1 of XML Schema introduces a number of useful new features, including:

• Assertions (XPath constraints) on types (Sections 14.1.1 on p. 353 and 14.1.2 on p. 365)

• Conditional type assignment for elements (Section 14.2 on p. 375)

• Open content for complex types (Section 12.7.2 on p. 292)

• Relaxed constraints on all groups (Section 12.5.4 on p. 276)

• More powerful namespace constraints for wildcards (Section 12.7.1.3 on p. 289)

• Multiple inheritance for substitution groups (Section 16.5 on p. 413)

• Default attributes (Section 15.3.3 on p. 399)

• Inheritable attributes (Section 7.6 on p. 126)

• Overrides, as a replacement for redefines (Section 18.2 on p. 459)

• A new explicitTimezone facet (Section 8.4.7 on p. 150)

• Three new built-in simple types: yearMonthDuration (Section 11.4.11 on p. 231), dayTimeDuration (Section 11.4.12 on p. 232), and dateTimeStamp (Section 11.4.4 on p. 224)

• Support for implementation-defined facets and types (Section 8.6 on p. 154)

• Simplification of restrictions through relaxed rules for valid restrictions (Section 13.5.2 on p. 318), the ability to reuse identity constraints (Section 17.10 on p. 442), and the ability to restrict element and attribute declarations in a different target namespace (Section 13.5.7.1 on p. 339)

These new features required the introduction of new elements and attributes into the XML Schema language. Version 1.1 of XML Schema is backward-compatible with version 1.0, so any 1.0 schema will also work with a 1.1 processor and have the same meaning. However, there is no forward compatibility between the two versions, so a 1.0 processor will not be able to handle a 1.1 schema if it uses any of the 1.1 elements or attributes.
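As a small illustration of this one-way compatibility, the sketch below adds a single 1.1 construct, an assertion, to an otherwise ordinary schema. The minQty and maxQty attributes are hypothetical names introduced only for this example. Without the xs:assert element the schema would be usable by both 1.0 and 1.1 processors; with it, a 1.0 processor cannot handle the schema.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="product" type="ProductType"/>
  <xs:complexType name="ProductType">
    <xs:sequence>
      <xs:element name="number" type="xs:integer"/>
      <xs:element name="name" type="xs:string"/>
    </xs:sequence>
    <!-- hypothetical attributes used only to give the assertion something to test -->
    <xs:attribute name="minQty" type="xs:integer" use="required"/>
    <xs:attribute name="maxQty" type="xs:integer" use="required"/>
    <!-- xs:assert exists only in version 1.1 -->
    <xs:assert test="@minQty le @maxQty"/>
  </xs:complexType>
</xs:schema>
```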

23.5.2. Forward compatibility of XML Schema 1.1

Version 1.1 of XML Schema has some new capabilities to accommodate the fact that there may be new versions of the XML Schema language in the future. Specifically, it provides a mechanism for indicating that a particular XML Schema component applies only to certain versions of the XML Schema language. These constructs use the minVersion and/or maxVersion attributes, which are in the Version Control Namespace, http://www.w3.org/2007/XMLSchema-versioning.

In Example 23–15, the first declaration for product indicates that it should only be honored by processors using version 1.3 or higher. Presumably, it makes use of special version 1.3 constructs that are unknown to version 1.1. If an XML Schema 1.1 processor parses this schema, it will ignore the first declaration and all of its descendants. The second product declaration indicates that it should be honored by processors using versions from 1.1 up to, but not including, version 1.3. A processor will only be using one version of XML Schema during any given validation.

Example 23–15. Using minVersion and maxVersion


<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:vc="http://www.w3.org/2007/XMLSchema-versioning">
  <xs:element name="product" vc:minVersion="1.3">
    <!-- a declaration that uses XML Schema 1.3 constructs -->
  </xs:element>
  <xs:element name="product" vc:minVersion="1.1"
                             vc:maxVersion="1.3">
    <!-- a declaration conformant to versions 1.1 and 1.2 -->
  </xs:element>
</xs:schema>


This example may seem to violate one of the basic rules of XML Schema—namely, it has two global element declarations with the same name. However, the version control attributes have a special power, signaling to the processor that it should preprocess the schema (using a process called conditional inclusion) to strip out all the declarations that don’t apply to the version it is using. It is the output of this preprocessing that must follow all the rules of XML Schema. In Example 23–15, there will never be more than one product declaration in the schema after preprocessing. However, care must be taken not to use overlapping values for minVersion and/or maxVersion, lest duplicate declarations remain after preprocessing.
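The overlap problem can be seen in the following sketch, which reuses the hypothetical future versions 1.2 and 1.3 from the discussion above. The type xs:string is a placeholder chosen only to keep the declarations short.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:vc="http://www.w3.org/2007/XMLSchema-versioning">
  <!-- Retained by processors using version 1.2 or higher -->
  <xs:element name="product" type="xs:string"
              vc:minVersion="1.2"/>
  <!-- Retained by processors using version 1.1 up to,
       but not including, 1.3 -->
  <xs:element name="product" type="xs:string"
              vc:minVersion="1.1" vc:maxVersion="1.3"/>
</xs:schema>
```

A processor using version 1.2 falls into both ranges, so both product declarations survive preprocessing and the resulting schema is invalid. Changing maxVersion on the second declaration to 1.2 would remove the overlap.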

Unfortunately, this mechanism does not help with the transition from XML Schema 1.0 to 1.1, because a typical 1.0 processor will not honor or even know about the minVersion and/or maxVersion attributes.

23.5.3. Portability of implementation-defined types and facets

Another aspect of handling variations in the XML Schema language involves support for implementation-defined types and facets. Section 8.6 on p. 154 introduced the concept, providing examples of type definitions and element declarations that depend on type names and facets that may only be supported by specific implementations.

Example 23–16 provides a recap, showing a simple type definition based on a hypothetical implementation-defined type (ext:ordinalDate), as well as a simple type definition that uses an implementation-defined facet (saxon:preprocess) which is currently implemented in Saxon.

Example 23–16. Using implementation-defined types and facets


<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:ext="http://example.org/extensions"
           xmlns:saxon="http://saxon.sf.net/">
  <xs:element name="anyOrdinalDate" type="ext:ordinalDate"/>
  <xs:element name="recentOrdinalDate" type="OrdinalDateIn2011"/>
  <xs:simpleType name="OrdinalDateIn2011">
    <xs:restriction base="ext:ordinalDate">
      <xs:minInclusive value="2011-001"/>
      <xs:maxInclusive value="2011-365"/>
    </xs:restriction>
  </xs:simpleType>

  <xs:element name="size" type="SMLXSizeType"/>
  <xs:simpleType name="SMLXSizeType">
    <xs:restriction base="xs:token">
      <saxon:preprocess action="upper-case($value)"/>
      <xs:enumeration value="SMALL"/>
      <xs:enumeration value="MEDIUM"/>
      <xs:enumeration value="LARGE"/>
      <xs:enumeration value="EXTRA LARGE"/>
    </xs:restriction>
  </xs:simpleType>
</xs:schema>


While implementation-defined types and facets can be useful, they do affect the portability of your schema. In fact, if a processor encounters a reference to any implementation-defined type or facet that it does not understand, the entire component, and any other components that depend on it, is considered “unknown” and excluded from the schema used for validation. It is not technically an error in the schema, but if one of these dependent elements or attributes is used in an instance it will fail validation. In Example 23–16, that means that a recentOrdinalDate or anyOrdinalDate element could never be valid if the processor does not understand ext:ordinalDate, and a size element could never be valid if the processor does not understand saxon:preprocess.

It is possible to take special measures to ensure that implementation-defined types and facets are only used by processors that can understand them. This is accomplished through four attributes in the Version Control Namespace: typeAvailable, typeUnavailable, facetAvailable, and facetUnavailable. These attributes can be used on any element in a schema document, and their value is a qualified name or a space-separated list of qualified names.

23.5.3.1. Using typeAvailable and typeUnavailable

The typeAvailable attribute is used to test whether the named type(s) are known to the processor. If all of the listed types are known, the schema element on which it appears is retained; if any of them is not known, that element and all of its descendants are ignored. The typeUnavailable attribute has the opposite effect, and the two are often used in conjunction with each other.

Example 23–17 shows a more portable schema that uses ordinalDate: There are two separate anyOrdinalDate declarations, one with the typeAvailable attribute and one with the typeUnavailable attribute. If ordinalDate is known to the processor, the first declaration is used, and if it is not, the second declaration is used.

Likewise, there are two separate definitions of the OrdinalDateIn2011 type. If ordinalDate is known to the processor, the first type definition is used, and if it is not, the second one is used. This means that while validation is less strict if a different processor is used, at least it will not fail unnecessarily.

Example 23–17. Using vc:typeAvailable and vc:typeUnavailable


<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:ext="http://example.org/extensions"
           xmlns:vc="http://www.w3.org/2007/XMLSchema-versioning">
  <xs:element name="anyOrdinalDate" type="ext:ordinalDate"
              vc:typeAvailable="ext:ordinalDate"/>
  <xs:element name="anyOrdinalDate" type="xs:string"
              vc:typeUnavailable="ext:ordinalDate"/>

  <xs:element name="recentOrdinalDate" type="OrdinalDateIn2011"/>
  <xs:simpleType name="OrdinalDateIn2011"
                 vc:typeAvailable="ext:ordinalDate">
    <xs:restriction base="ext:ordinalDate">
      <xs:minInclusive value="2011-001"/>
      <xs:maxInclusive value="2011-365"/>
    </xs:restriction>
  </xs:simpleType>
  <xs:simpleType name="OrdinalDateIn2011"
                 vc:typeUnavailable="ext:ordinalDate">
    <xs:restriction base="xs:string">
      <xs:pattern value="2011-\d{3}"/>
    </xs:restriction>
  </xs:simpleType>
</xs:schema>
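The looser validation of the fallback definition can be illustrated with a single instance. The sketch assumes that ext:ordinalDate, a hypothetical type, orders its values by day of the year, and that the fallback pattern is meant to accept 2011 followed by a hyphen and any three digits.

```xml
<!-- Accepted under the fallback definition (xs:string with a pattern).
     A processor that knows the hypothetical ext:ordinalDate type would
     be expected to reject it, because it is beyond the maxInclusive
     bound of 2011-365. -->
<recentOrdinalDate>2011-400</recentOrdinalDate>
```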


23.5.3.2. Using facetAvailable and facetUnavailable

The facetAvailable and facetUnavailable attributes work similarly. Example 23–18 is a schema that contains two type definitions: The first is used if the saxon:preprocess facet is known, and the second is used if it is unknown.

As with the minVersion and maxVersion attributes, these attributes do not have to be on top-level components; they can appear on any element in the schema to indicate that it should be included only under the specified conditions. Example 23–19 shows a schema that uses the facetAvailable attribute on the ext:maxLengthWithoutWhitespace facet itself to instruct the processor to not read it if it does not understand it.

Example 23–18. Using vc:facetAvailable and vc:facetUnavailable


<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:saxon="http://saxon.sf.net/"
           xmlns:vc="http://www.w3.org/2007/XMLSchema-versioning">
  <xs:element name="size" type="SMLXSizeType"/>
  <xs:simpleType name="SMLXSizeType"
                 vc:facetAvailable="saxon:preprocess">
    <xs:restriction base="xs:token">
      <saxon:preprocess action="upper-case($value)"/>
      <xs:enumeration value="SMALL"/>
      <xs:enumeration value="MEDIUM"/>
      <xs:enumeration value="LARGE"/>
      <xs:enumeration value="EXTRA LARGE"/>
    </xs:restriction>
  </xs:simpleType>
  <xs:simpleType name="SMLXSizeType"
                 vc:facetUnavailable="saxon:preprocess">
    <xs:restriction base="xs:token"/>
  </xs:simpleType>
</xs:schema>


Example 23–19. Using vc:facetAvailable


<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:ext="http://example.org/extensions"
           xmlns:vc="http://www.w3.org/2007/XMLSchema-versioning">
  <xs:element name="astring" type="ShortString"/>
  <xs:simpleType name="ShortString">
    <xs:restriction base="xs:string">
      <ext:maxLengthWithoutWhitespace value="5"
               vc:facetAvailable="ext:maxLengthWithoutWhitespace"/>
    </xs:restriction>
  </xs:simpleType>
</xs:schema>


This would not have been appropriate for a prelexical facet like saxon:preprocess, however, because if the facet is simply ignored, the instruction to turn the value to upper case before validating it would be skipped. The resulting schema would have been stricter than intended when using a processor other than Saxon, because lowercase values would not be allowed.
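The problem can be made concrete with a counterexample. The sketch below applies the technique of Example 23–19 to saxon:preprocess, which is the approach being advised against; it is shown only to make the consequence visible.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:saxon="http://saxon.sf.net/"
           xmlns:vc="http://www.w3.org/2007/XMLSchema-versioning">
  <xs:element name="size" type="SMLXSizeType"/>
  <xs:simpleType name="SMLXSizeType">
    <xs:restriction base="xs:token">
      <!-- Stripped out by any processor that does not know the facet -->
      <saxon:preprocess action="upper-case($value)"
                        vc:facetAvailable="saxon:preprocess"/>
      <!-- These remain, so such a processor accepts only
           upper-case values -->
      <xs:enumeration value="SMALL"/>
      <xs:enumeration value="MEDIUM"/>
      <xs:enumeration value="LARGE"/>
      <xs:enumeration value="EXTRA LARGE"/>
    </xs:restriction>
  </xs:simpleType>
</xs:schema>
```

With this schema, an instance such as a size element containing small would be accepted by a processor that supports the facet but rejected by one that does not. Providing two complete type definitions, as in Example 23–18, avoids that discrepancy.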
