Chapter 22. Extensibility and reuse

Sometimes we forget that the “X” in XML stands for “eXtensible.” One of the beauties of XML is that additional elements and attributes can appear in an instance without affecting the core information. Specialized requirements for particular applications, industries, or organizations can be addressed using extensions.

However, XML is only extensible if you leave avenues open for extension. If you write schemas and applications that require very strict validation of an instance, the major benefit of XML is not realized! This chapter provides detailed recommendations for developing schemas that will be reusable and extensible in the future. It also provides guidelines for extending existing schemas.

22.1. Reuse

First, let’s talk about reusing schema components exactly as they are. Later in this chapter, we will look at extending and restricting existing schema components. The benefits of reuse are numerous.

• It reduces development time, because schema developers are not reinventing the wheel. In addition, developers of stylesheets and program code to process instances can reuse their code, saving their time too.

• It reduces maintenance time, because changes only need to be made in one place. Again, this applies not just to schemas, but also to applications.

• It increases interoperability. If two systems are reusing some of the same schema components, it is easier for them to talk to each other.

• It results in better-designed schemas with fewer errors. Two heads are better than one, and reused components tend to be designed more carefully and reviewed more closely.

• It reduces the learning curve on schemas, because the reused components only need to be learned once. In addition, it encourages consistency, which also reduces learning curves.

22.1.1. Reusing schema components

When creating a schema, you should attempt to reuse components that have already been developed—either within your organization, by standards bodies, or by technology vendors. In addition, you should attempt to reuse as much within your schema as possible.

You do not have to reuse everything from a schema document that you import or include. If it was properly modularized, you should not have to include many components that you do not want to reuse. Components that can be reused include:

• Named types, both simple and complex

• Named model groups and attribute groups

• Global element and attribute declarations

• Notations

22.1.2. Creating schemas that are highly reusable

You should also consider the reusability of your components as you define them. To increase the reusability of your components:

• Use named types, because anonymous types cannot be reused.

• Use named model groups for fragments of content models that could be reused by multiple unrelated types.

• Use global element declarations, so they can participate in substitution groups.

• Use generic names when declaring elements and defining types. For example, if you are defining an address type for customers, call it AddressType rather than CustomerAddressType.

• Think about a broader applicability of your types. For example, when defining an address type, consider adding a country element declaration, even if you will only be using domestic addresses.

• Modularize your schemas into smaller documents, so that others reusing your components will not have to include or import them all.
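
As a minimal sketch of several of these recommendations, the fragment below defines one generically named address type and reuses it from two global element declarations. The element names billTo and shipTo are hypothetical and are not taken from the examples in this chapter.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- A named type with a generic name: any element declaration can refer to it -->
  <xs:complexType name="AddressType">
    <xs:sequence>
      <xs:element name="street" type="xs:string"/>
      <xs:element name="city" type="xs:string"/>
      <!-- Declared for broader applicability, even if only domestic
           addresses are used today -->
      <xs:element name="country" type="xs:string" minOccurs="0"/>
    </xs:sequence>
  </xs:complexType>

  <!-- Hypothetical global element declarations that both reuse AddressType -->
  <xs:element name="billTo" type="AddressType"/>
  <xs:element name="shipTo" type="AddressType"/>
</xs:schema>
```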

22.1.3. Developing a common components library

When designing a complex vocabulary, it is advisable to create libraries of low-level components that can be used in many contexts. These components (usually types) are sometimes referred to as “common components” or “core components.” Examples of good candidates for common components are:

• Identifiers (for example, product identifiers, customer identifiers, especially if they are made up of multiple parts)

• Code lists such as departments, product types, currencies, natural languages

• Measurement (i.e., an amount with a unit of measure)

• Price (a number with an associated currency)

• Person information, such as name, contact information, and mailing address

These are the kinds of data structures that tend to be rewritten over and over again if there is no plan in place to reuse them. Having one definition for these low-level components can save a lot of time in developing and maintaining not only the schema, but the code that processes and/or generates the messages.

If all of these common components are defined and placed in one or more separate schema documents, they are easier to reuse than if they are embedded in another context-specific schema document. Typically, they are defined as types rather than elements, so that they can be reused by many element declarations. Example 22–1 shows a simple common components library.

Example 22–1. Sample common components library


<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           targetNamespace="http://datypic.com/common"
           xmlns="http://datypic.com/common"
           elementFormDefault="qualified">
  <xs:simpleType name="ProductIDType">
    <xs:restriction base="xs:string">
      <xs:pattern value="[A-Z]{2}[0-9]{4}"/>
    </xs:restriction>
  </xs:simpleType>
  <xs:complexType name="PriceType">
    <xs:simpleContent>
      <xs:extension base="xs:decimal">
        <xs:attribute name="currency" type="CurrencyCodeType"/>
      </xs:extension>
    </xs:simpleContent>
  </xs:complexType>
  <xs:complexType name="MeasurementType">
    <xs:simpleContent>
      <xs:extension base="xs:decimal">
        <xs:attribute name="units" type="UnitsCodeType"/>
      </xs:extension>
    </xs:simpleContent>
  </xs:complexType>
  <xs:complexType name="AddressType">
    <xs:sequence>
      <xs:element name="street" type="xs:string" maxOccurs="3"/>
      <xs:element name="city" type="xs:string"/>
      <xs:element name="state" type="xs:string"/>
      <xs:element name="postalCode" type="xs:string"/>
      <xs:element name="country" type="xs:string"/>
    </xs:sequence>
  </xs:complexType>
  <!--...-->
</xs:schema>
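
A schema in another namespace could pick up these types with an import, roughly as sketched here. The file name common.xsd, the com prefix, and the price and shipTo element declarations are assumptions made for illustration only.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:com="http://datypic.com/common"
           xmlns="http://datypic.com/ord"
           targetNamespace="http://datypic.com/ord"
           elementFormDefault="qualified">
  <!-- common.xsd is a hypothetical file containing the library in Example 22–1 -->
  <xs:import namespace="http://datypic.com/common"
             schemaLocation="common.xsd"/>
  <!-- Context-specific elements reuse the common types by qualified name -->
  <xs:element name="price" type="com:PriceType"/>
  <xs:element name="shipTo" type="com:AddressType"/>
</xs:schema>
```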


22.2. Extending schemas

In some cases, you want to reuse existing schema components, but you have specific extensions you need to add to make them useful to you. Creating a completely new schema that copies the original definitions is tempting because it is easy and flexible. You do not have to worry about basing the new definitions on the original ones. However, there are some important drawbacks.

• Your new instances could be completely incompatible with the original ones.

• You will have duplicate definitions of the same components. This makes maintenance more difficult and discourages consistency.

• You do not have a record of the differences between the two definitions.

This section identifies several ways in which XML Schema allows extension. It describes both how to make your schemas extensible and how to extend others’ schemas. The various extension mechanisms are summarized in Table 22–1.

Table 22–1. Comparison of extension mechanisms

Most of these extension methods take some planning, or at least require the use of certain design characteristics, such as global components, when the original schemas are being created. If you are designing an XML vocabulary, particularly a complex one or one that you intend for other organizations to use and extend, you should choose one or more of these extension methods and design your schemas accordingly. It is a good idea to document the extension method you have in mind, with examples, in your Naming and Design Rules document.

If you are in the position of extending another schema over which you have no control, you may not be able to use all of these methods, depending on how the original schema was designed.

22.2.1. Wildcards

Wildcards are the most straightforward way to define extensible types. They can be used to allow additional elements and attributes in your instances. Of the methods of extension discussed in this chapter, wildcards and open content are the only ones that allow an instance with extensions to validate against the original schema. All the other methods require defining a new schema for the extensions.

Example 22–2 shows a complex type definition that contains both an element wildcard (the any element) and an attribute wildcard (the anyAttribute element). For a complete discussion of element and attribute wildcards, see Sections 12.7.1 on p. 285 and 12.7.3 on p. 298, respectively.

Example 22–2. Original type using wildcards


<xs:complexType name="ProductType">
  <xs:sequence>
    <xs:element name="number" type="xs:integer"/>
    <xs:element name="name" type="xs:string"/>
    <xs:element name="size" type="xs:integer" minOccurs="0"/>
    <xs:any minOccurs="0" maxOccurs="unbounded"
            namespace="##other" processContents="lax"/>
  </xs:sequence>
  <xs:anyAttribute namespace="##other" processContents="skip"/>
</xs:complexType>


There are several things to note about the definition of ProductType.

• The namespace constraint is set to ##other. This prevents erroneous content from passing validation, such as a product element that contains two color elements. It also avoids nondeterministic content models that violate the Unique Particle Attribution rule, as described at the end of this section.

• The value of processContents is lax. This allows the instance author to provide hints as to where to find the declarations for the additional elements or attributes. If they do not provide hints, or the particular processor ignores the hints, it is not a problem; no errors will be raised. However, if the declarations can be found, they will be validated.

• The values of minOccurs and maxOccurs are 0 and unbounded, respectively. This allows zero, one, or many replacement elements to appear. The values of these two attributes default to 1, which is generally not the intention of the schema author.

• The wildcard appears at the end of the complex type definition. This allows replacement elements only after the defined content model. This is similar to the way extension works. You are permitted to put wildcards anywhere in the content model, but it might make processing the instance more difficult. With a wildcard at the end, the application can process what it is expecting and ignore the rest.

Suppose some additional features have been added to the ordering process, such as a points system to reward regular customers and a gift wrap capability. The instance shown in Example 22–3 takes advantage of the wildcards in the ProductType definition to add an spc:giftWrap element to the end of the content, as well as an spc:points attribute.

Example 22–3. Instance with extensions


<order xmlns="http://datypic.com/ord"
       xmlns:spc="http://datypic.com/spc">
  <product spc:points="100">
    <number>557</number>
    <name>Short-Sleeved Linen Blouse</name>
    <size>10</size>
    <spc:giftWrap>ADULT BDAY</spc:giftWrap>
  </product>
</order>


Since processContents was set to lax, the instance shown would be valid according to the original schema, without specifying any declarations for the new attribute and element. If you want to validate the new attribute and element, you can create a schema that contains their declarations, as shown in Example 22–4.

Example 22–4. Schema for extensions


<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns="http://datypic.com/spc"
           targetNamespace="http://datypic.com/spc">
  <xs:element name="giftWrap" type="xs:string"/>
  <xs:attribute name="points" type="xs:nonNegativeInteger"/>
</xs:schema>


Note that the element and attribute declarations are global. This is necessary so that the processor can find the declarations.
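
The hints mentioned earlier are typically supplied with the xsi:schemaLocation attribute. The sketch below repeats the instance from Example 22–3 with such hints added. The file names ord.xsd and spc.xsd are hypothetical, and a processor is free to ignore the hints.

```xml
<order xmlns="http://datypic.com/ord"
       xmlns:spc="http://datypic.com/spc"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xsi:schemaLocation="http://datypic.com/ord ord.xsd
                           http://datypic.com/spc spc.xsd">
  <product spc:points="100">
    <number>557</number>
    <name>Short-Sleeved Linen Blouse</name>
    <size>10</size>
    <!-- Validated against the giftWrap declaration if spc.xsd is found,
         because the element wildcard uses processContents="lax" -->
    <spc:giftWrap>ADULT BDAY</spc:giftWrap>
  </product>
</order>
```

Note that the attribute wildcard in Example 22–2 uses processContents="skip", so spc:points is accepted without being checked against its declaration. Changing that value to lax would make the attribute subject to the same treatment as the element.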

Another approach for “extending” complex types with wildcards is actually to restrict them. You could define a complex type that restricts ProductType and includes the declarations of giftWrap and points. For more information, see Section 13.5.2.3 on p. 322.
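
A rough sketch of such a restriction follows, assuming the ProductType of Example 22–2 as the base and the declarations of Example 22–4. The name RestrictedProductType is hypothetical, and the fragment assumes that the spc prefix is declared and its namespace imported in the enclosing schema.

```xml
<xs:complexType name="RestrictedProductType">
  <xs:complexContent>
    <xs:restriction base="ProductType">
      <xs:sequence>
        <!-- The original element declarations are repeated unchanged -->
        <xs:element name="number" type="xs:integer"/>
        <xs:element name="name" type="xs:string"/>
        <xs:element name="size" type="xs:integer" minOccurs="0"/>
        <!-- The element wildcard is narrowed to one specific element -->
        <xs:element ref="spc:giftWrap" minOccurs="0"/>
      </xs:sequence>
      <!-- The attribute wildcard is narrowed to one specific attribute -->
      <xs:attribute ref="spc:points"/>
    </xs:restriction>
  </xs:complexContent>
</xs:complexType>
```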

The advantage of using wildcards to make types extensible is their flexibility: The instance author is not required to have a schema that declares the replacement elements and attributes. However, in some cases this flexibility may be a little too forgiving, as it can obscure real errors.

One challenge of using wildcards in version 1.0 is that if the wildcards allow extensions in the same namespace, i.e., the target namespace of the schema, you can run into Unique Particle Attribution violations. If the wildcard is preceded by a declaration for an optional element, the processor does not know whether to use the element declaration or the wildcard to validate an element whose name matches the declaration. Fortunately, this is alleviated in version 1.1, and the processor will always choose the element declaration.
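
The situation can be sketched as follows. This variant of ProductType is an illustration rather than one of the chapter's examples, and it assumes elementFormDefault="qualified" so that the locally declared size element is in the target namespace.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns="http://datypic.com/ord"
           targetNamespace="http://datypic.com/ord"
           elementFormDefault="qualified">
  <xs:complexType name="ProductType">
    <xs:sequence>
      <xs:element name="number" type="xs:integer"/>
      <xs:element name="name" type="xs:string"/>
      <!-- An optional element directly before the wildcard -->
      <xs:element name="size" type="xs:integer" minOccurs="0"/>
      <!-- A size element matches both the declaration above and this
           wildcard: a violation in version 1.0, resolved in favor of
           the element declaration in version 1.1 -->
      <xs:any namespace="##targetNamespace" processContents="lax"
              minOccurs="0" maxOccurs="unbounded"/>
    </xs:sequence>
  </xs:complexType>
</xs:schema>
```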

22.2.2. Open content

Open content is an even more flexible form of wildcards available starting in version 1.1. Complex types that have open content can allow replacement elements to appear anywhere within their content, not just in places designated by element wildcards.

Example 22–5 shows a complex type definition that has open content. Note that openContent doesn’t apply to attributes, so it is necessary to include an attribute wildcard to support any attribute extensions. For a complete discussion of open content, see Section 12.7.2 on p. 292.

Example 22–5. Original type using open content


<xs:complexType name="ProductType">
  <xs:openContent>
    <xs:any namespace="##other" processContents="lax"/>
  </xs:openContent>
  <xs:sequence>
    <xs:element name="number" type="xs:integer"/>
    <xs:element name="name" type="xs:string"/>
    <xs:element name="size" type="xs:integer" minOccurs="0"/>
  </xs:sequence>
  <xs:anyAttribute namespace="##other" processContents="skip"/>
</xs:complexType>


The use of the openContent element means that the extension elements can appear interleaved anywhere in the content. To allow them to only appear at the end, you can use a mode="suffix" attribute on openContent. The instance shown in Example 22–6 takes advantage of the open content in the ProductType definition to add an spc:giftWrap element into the middle of the content.

Example 22–6. Instance with open content extensions


<order xmlns="http://datypic.com/ord"
       xmlns:spc="http://datypic.com/spc">
  <product spc:points="100">
    <number>557</number>
    <spc:giftWrap>ADULT BDAY</spc:giftWrap>
    <name>Short-Sleeved Linen Blouse</name>
    <size>10</size>
  </product>
</order>


As with the wildcard example, since processContents was set to lax, the instance shown would be valid according to the original schema, without specifying any declarations for the new attribute and element. If you want to validate the new attribute and element, you can create a schema that declares them globally, as was shown in Example 22–4.
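
For comparison, here is a sketch of the same type with the mode="suffix" setting mentioned above. With this variant, the instance in Example 22–6 would no longer be valid, because spc:giftWrap would have to follow size, as it does in Example 22–3.

```xml
<xs:complexType name="ProductType">
  <!-- suffix: extension elements are allowed only after the declared content -->
  <xs:openContent mode="suffix">
    <xs:any namespace="##other" processContents="lax"/>
  </xs:openContent>
  <xs:sequence>
    <xs:element name="number" type="xs:integer"/>
    <xs:element name="name" type="xs:string"/>
    <xs:element name="size" type="xs:integer" minOccurs="0"/>
  </xs:sequence>
  <xs:anyAttribute namespace="##other" processContents="skip"/>
</xs:complexType>
```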

22.2.3. Type substitution

Deriving new types from the existing types is another possibility. You can create a new schema whose types extend the original types. You would then have to indicate the new types in the instance, using the xsi:type attribute. Unlike the wildcard approach, instances that contain extensions would not be valid according to the original schema. If you want to use the extended instance as a replacement for the original instance, you should first check to make sure that your application can handle the new extended instance.

This approach is appropriate when you want to extend a schema over which you have no control. Example 22–7 shows a complex type that you might want to extend.

Example 22–7. Original type


<xs:complexType name="ProductType">
  <xs:sequence>
    <xs:element name="number" type="xs:integer"/>
    <xs:element name="name" type="xs:string"/>
    <xs:element name="size" type="xs:integer" minOccurs="0"/>
  </xs:sequence>
</xs:complexType>


There are several things to note about the definition of ProductType.

• It is a named complex type. Anonymous complex types cannot be extended.

• There are no block or final attributes to prohibit type derivation or substitution.

• A sequence group is used. Extension does not work well for choice groups, as described in the next section. For all groups, extension is forbidden in version 1.0 but permitted (and useful) in version 1.1.
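
To make the second point concrete, the sketch below shows what the prohibiting attributes would look like if they were present. The type names FinalProductType and BlockedProductType are hypothetical and are used only so that both variants can appear side by side.

```xml
<!-- final="extension": no type may be derived from this type by extension -->
<xs:complexType name="FinalProductType" final="extension">
  <xs:sequence>
    <xs:element name="number" type="xs:integer"/>
    <xs:element name="name" type="xs:string"/>
  </xs:sequence>
</xs:complexType>

<!-- block="extension": types may still be derived by extension, but they
     cannot be substituted for this type in an instance via xsi:type -->
<xs:complexType name="BlockedProductType" block="extension">
  <xs:sequence>
    <xs:element name="number" type="xs:integer"/>
    <xs:element name="name" type="xs:string"/>
  </xs:sequence>
</xs:complexType>
```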

Example 22–8 shows an extension of the original ProductType. For more information on complex content extension, see Section 13.4.2 on p. 307.

The instance shown in Example 22–9 conforms to the extended type definition, but not the base type definition. It is identical to the instance using wildcards shown in Example 22–3, except that the xsi:type attribute appears in the product tag. For more information on type substitution, see Section 13.6 on p. 341.

Example 22–8. Extended type


<xs:complexType name="ExtendedProductType">
  <xs:complexContent>
    <xs:extension base="ProductType">
      <xs:sequence>
        <xs:element ref="spc:giftWrap" minOccurs="0"/>
      </xs:sequence>
      <xs:attribute ref="spc:points"/>
    </xs:extension>
  </xs:complexContent>
</xs:complexType>


Example 22–9. Instance using extended type


<order xmlns="http://datypic.com/ord"
       xmlns:spc="http://datypic.com/spc"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <product spc:points="100" xsi:type="ExtendedProductType">
    <number>557</number>
    <name>Short-Sleeved Linen Blouse</name>
    <size>10</size>
    <spc:giftWrap>ADULT BDAY</spc:giftWrap>
  </product>
</order>
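
When the original schema is outside your control, you may prefer to define the extended type in a namespace of your own. The following is a sketch under several assumptions: the namespace http://datypic.com/ext and the file name spc.xsd are hypothetical, and original.xsd is assumed to hold the ProductType of Example 22–7 in the http://datypic.com/ord namespace.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:ord="http://datypic.com/ord"
           xmlns:spc="http://datypic.com/spc"
           xmlns="http://datypic.com/ext"
           targetNamespace="http://datypic.com/ext">
  <!-- The original schema is imported unchanged -->
  <xs:import namespace="http://datypic.com/ord"
             schemaLocation="original.xsd"/>
  <xs:import namespace="http://datypic.com/spc"
             schemaLocation="spc.xsd"/>
  <!-- The derived type lives in the extender's own namespace -->
  <xs:complexType name="ExtendedProductType">
    <xs:complexContent>
      <xs:extension base="ord:ProductType">
        <xs:sequence>
          <xs:element ref="spc:giftWrap" minOccurs="0"/>
        </xs:sequence>
        <xs:attribute ref="spc:points"/>
      </xs:extension>
    </xs:complexContent>
  </xs:complexType>
</xs:schema>
```

An instance would then declare a prefix for the new namespace and refer to the type with a qualified name, for example xsi:type="ext:ExtendedProductType".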


22.2.4. Substitution groups

As we saw in Section 13.4.2.1 on p. 309, extending a content model which contains a choice group can have unexpected results. Example 22–10 shows a type ExpandedItemsType that extends ItemsType to add new product types. Intuitively, you may think that the two additional element declarations, sweater and suit, are added to the choice group, allowing a choice among the five elements. In fact, the effective content model of ExpandedItemsType is a sequence group that contains two choice groups. As a result, ExpandedItemsType will require any of the shirt, hat, and umbrella elements to appear before any of the sweater or suit elements.

Example 22–10. choice group extension


<xs:complexType name="ItemsType">
  <xs:choice maxOccurs="unbounded">
    <xs:element ref="shirt"/>
    <xs:element ref="hat"/>
    <xs:element ref="umbrella"/>
  </xs:choice>
</xs:complexType>
<xs:complexType name="ExpandedItemsType">
  <xs:complexContent>
    <xs:extension base="ItemsType">
      <xs:choice maxOccurs="unbounded">
        <xs:element ref="sweater"/>
        <xs:element ref="suit"/>
      </xs:choice>
    </xs:extension>
  </xs:complexContent>
</xs:complexType>
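
Written out as a single definition, the effective content model of ExpandedItemsType is equivalent to the following sketch. It is shown only to illustrate the result of the extension and is not a definition you would write yourself.

```xml
<xs:complexType name="ExpandedItemsType">
  <xs:sequence>
    <!-- Inherited from ItemsType: one or more of these must come first -->
    <xs:choice maxOccurs="unbounded">
      <xs:element ref="shirt"/>
      <xs:element ref="hat"/>
      <xs:element ref="umbrella"/>
    </xs:choice>
    <!-- Added by the extension: one or more of these must come last -->
    <xs:choice maxOccurs="unbounded">
      <xs:element ref="sweater"/>
      <xs:element ref="suit"/>
    </xs:choice>
  </xs:sequence>
</xs:complexType>
```

A shirt followed by a sweater is therefore valid, but a sweater followed by a shirt is not.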


Substitution groups are a better way to extend choice groups. If you add another element declaration, otherProduct, to the choice group in ItemsType, it can serve as the head of a substitution group. This makes extending the choice much easier. The element declarations for sweater and suit can be supplied in another schema document, even in another namespace.

In Example 22–11, the otherProduct element declaration is added to act as the head of the substitution group. It would also have been legal to simply make umbrella the head of the substitution group, but this would be less intuitive and would prevent you from ever allowing umbrella without also allowing sweater and suit in its place.

Example 22–11. Original type with an abstract element declaration


<xs:complexType name="ItemsType">
  <xs:choice maxOccurs="unbounded">
    <xs:element ref="shirt"/>
    <xs:element ref="hat"/>
    <xs:element ref="umbrella"/>
    <xs:element ref="otherProduct"/>
  </xs:choice>
</xs:complexType>

<xs:element name="otherProduct" type="ProductType"
            abstract="true"/>


Example 22–12 shows the two element declarations that are substitutable for otherProduct.

Example 22–12. Extension using substitution groups


<xs:element name="sweater" substitutionGroup="otherProduct"/>
<xs:element name="suit" substitutionGroup="otherProduct"/>


Example 22–13 shows a valid instance. As you can see, the child elements can appear in any order. In this case, they are all in the same namespace. It is also possible for substitution element declarations to be in different namespaces.

Example 22–13. Instance using extension via substitution groups


<items>
  <shirt>...</shirt>
  <sweater>...</sweater>
  <shirt>...</shirt>
  <suit>...</suit>
</items>
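
As a sketch of the cross-namespace case, the substitutable element declarations could be supplied in a separate schema document like the one below. The namespace http://datypic.com/ext is hypothetical, and the sketch assumes that original.xsd declares otherProduct globally in the http://datypic.com/ord namespace.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:ord="http://datypic.com/ord"
           xmlns="http://datypic.com/ext"
           targetNamespace="http://datypic.com/ext">
  <xs:import namespace="http://datypic.com/ord"
             schemaLocation="original.xsd"/>
  <!-- No type is specified, so each element takes the type of the head,
       ord:otherProduct -->
  <xs:element name="sweater" substitutionGroup="ord:otherProduct"/>
  <xs:element name="suit" substitutionGroup="ord:otherProduct"/>
</xs:schema>
```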


It would have also been valid to put an element wildcard in the choice group. However, the substitution group approach is more controlled, because you can specifically designate the substitutable element declarations. For complete coverage of substitution groups, see Chapter 16.
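
For comparison, the wildcard alternative might look like the following sketch. It accepts any element from another namespace, not just designated ones, which is why it is less controlled.

```xml
<xs:complexType name="ItemsType">
  <xs:choice maxOccurs="unbounded">
    <xs:element ref="shirt"/>
    <xs:element ref="hat"/>
    <xs:element ref="umbrella"/>
    <!-- Any element from a namespace other than the target namespace -->
    <xs:any namespace="##other" processContents="lax"/>
  </xs:choice>
</xs:complexType>
```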

22.2.5. Type redefinition

Redefinition, unlike type substitution, does not require the use of the xsi:type attribute in instances. The redefined components have the same name as they had in the original definition. However, redefinition can only be done within the same namespace, so it is not appropriate for altering schemas over which you have no control. In addition, redefinition has some risks associated with it, as detailed in Section 18.3 on p. 468.

The original type might look exactly like the one shown in Example 22–7, with similar constraints. It must be named, and it should use a sequence group. Example 22–14 shows a redefinition of ProductType to add a new element declaration and attribute declaration. It is similar to the definition of the derived type shown in Example 22–8, with two important differences.

1. It is defined entirely within the redefine element.

2. The extended type and the original type have the same name.

For more information on type redefinition, see Section 18.1.4 on p. 453.

Again, a valid instance would look like Example 22–3.

Example 22–14. Redefined type


<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:spc="http://datypic.com/spc"
           xmlns="http://datypic.com/ord"
           targetNamespace="http://datypic.com/ord">
  <xs:import namespace="http://datypic.com/spc"/>
  <xs:redefine schemaLocation="original.xsd">
    <xs:complexType name="ProductType">
      <xs:complexContent>
        <xs:extension base="ProductType">
          <xs:sequence>
            <xs:element ref="spc:giftWrap" minOccurs="0"/>
          </xs:sequence>
          <xs:attribute ref="spc:points"/>
        </xs:extension>
      </xs:complexContent>
    </xs:complexType>
  </xs:redefine>
</xs:schema>


Although a redefinition of the type must take place in the same namespace, the extended element and attribute declarations are not required to be in that namespace. In our example, they are not.

22.2.6. Named group redefinition

Another alternative is to define named model groups and attribute groups, and redefine these groups. This is less rigid than redefining types because the extensions do not have to be at the end of the content models.

Example 22–15 shows the original ProductType definition, this time using a named model group and an attribute group. The entire content model of the type is contained in the group ProductPropertyGroup.

Example 22–15. Original type


<xs:complexType name="ProductType">
  <xs:group ref="ProductPropertyGroup"/>
  <xs:attributeGroup ref="ExtensionGroup"/>
</xs:complexType>

<xs:group name="ProductPropertyGroup">
  <xs:sequence>
    <xs:element name="number" type="xs:integer"/>
    <xs:element name="name" type="xs:string"/>
    <xs:element name="size" type="xs:integer" minOccurs="0"/>
  </xs:sequence>
</xs:group>

<xs:attributeGroup name="ExtensionGroup"/>


Example 22–16 shows a redefinition of the named model group and attribute group. Redefining the groups affects all the complex types that reference those groups.

Example 22–16. Redefined named model group and attribute group


<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:spc="http://datypic.com/spc"
           xmlns="http://datypic.com/ord"
           targetNamespace="http://datypic.com/ord">
  <xs:import namespace="http://datypic.com/spc"/>
  <xs:redefine schemaLocation="original.xsd">
    <xs:group name="ProductPropertyGroup">
      <xs:sequence>
        <xs:element ref="spc:giftWrap"/>
        <xs:group ref="ProductPropertyGroup"/>
      </xs:sequence>
    </xs:group>
    <xs:attributeGroup name="ExtensionGroup">
      <xs:attributeGroup ref="ExtensionGroup"/>
      <xs:attribute ref="spc:points"/>
    </xs:attributeGroup>
  </xs:redefine>
</xs:schema>


A valid instance would look like the one shown in Example 22–17. In this case, giftWrap appears as the first child of product.

Example 22–17. Instance using redefined named model group and attribute group


<order xmlns="http://datypic.com/ord"
       xmlns:spc="http://datypic.com/spc">
  <product spc:points="100">
    <spc:giftWrap>ADULT BDAY</spc:giftWrap>
    <number>557</number>
    <name>Short-Sleeved Linen Blouse</name>
    <size>10</size>
  </product>
</order>


22.2.7. Overrides

Starting in version 1.1, overrides can be used instead of redefines. In fact, they are preferred because redefines are deprecated. Overrides work similarly to redefines, but have the advantage of being more flexible. The new definition does not have to relate to the original definition in any way.

Example 22–18 shows a schema similar to Example 22–14, but with an override instead of a redefine. In our case, we chose to modify it in a similar way: add the spc:giftWrap element declaration at the end and add the spc:points attribute. However, the spc:giftWrap element declaration could have appeared anywhere in the content model; in fact, the original element declarations could have been removed or reordered.

Example 22–18. Overridden type


<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:spc="http://datypic.com/spc"
           xmlns="http://datypic.com/ord"
           targetNamespace="http://datypic.com/ord"
           elementFormDefault="qualified">
  <xs:import namespace="http://datypic.com/spc"/>
  <xs:override schemaLocation="original.xsd">
    <xs:complexType name="ProductType">
      <xs:sequence>
        <xs:element name="number" type="xs:integer"/>
        <xs:element name="name" type="xs:string"/>
        <xs:element name="size" type="xs:integer" minOccurs="0"/>
        <xs:element ref="spc:giftWrap" minOccurs="0"/>
      </xs:sequence>
      <xs:attribute ref="spc:points"/>
    </xs:complexType>
  </xs:override>
</xs:schema>
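
To illustrate that flexibility, here is a sketch of an override that reorders the original element declarations, places spc:giftWrap in the middle, and drops size altogether. It is a hypothetical variation on Example 22–18; instances that are valid against the original ProductType would generally not be valid against this one.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:spc="http://datypic.com/spc"
           xmlns="http://datypic.com/ord"
           targetNamespace="http://datypic.com/ord"
           elementFormDefault="qualified">
  <xs:import namespace="http://datypic.com/spc"/>
  <xs:override schemaLocation="original.xsd">
    <xs:complexType name="ProductType">
      <xs:sequence>
        <!-- name now comes before number -->
        <xs:element name="name" type="xs:string"/>
        <!-- the new element sits in the middle of the content model -->
        <xs:element ref="spc:giftWrap" minOccurs="0"/>
        <xs:element name="number" type="xs:integer"/>
        <!-- the size element declaration has been removed -->
      </xs:sequence>
      <xs:attribute ref="spc:points"/>
    </xs:complexType>
  </xs:override>
</xs:schema>
```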


Overrides can also be used on named groups. Example 22–19 shows a schema similar to Example 22–16, again replacing the redefine with override.

A valid instance would look like the one shown in Example 22–20. In this case, giftWrap appears as the first child of product.

Example 22–19. Overridden named model group and attribute group


<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:spc="http://datypic.com/spc"
           xmlns="http://datypic.com/ord"
           targetNamespace="http://datypic.com/ord"
           elementFormDefault="qualified">
  <xs:import namespace="http://datypic.com/spc"/>
  <xs:override schemaLocation="original.xsd">
    <xs:group name="ProductPropertyGroup">
      <xs:sequence>
        <xs:element ref="spc:giftWrap"/>
        <xs:element name="number" type="xs:integer"/>
        <xs:element name="name" type="xs:string"/>
        <xs:element name="size" type="xs:integer" minOccurs="0"/>
      </xs:sequence>
    </xs:group>
    <xs:attributeGroup name="ExtensionGroup">
      <xs:attribute ref="spc:points"/>
    </xs:attributeGroup>
  </xs:override>
</xs:schema>


Example 22–20. Instance using overridden named model group and attribute group


<order xmlns="http://datypic.com/ord"
       xmlns:spc="http://datypic.com/spc">
  <product spc:points="100">
    <spc:giftWrap>ADULT BDAY</spc:giftWrap>
    <number>557</number>
    <name>Short-Sleeved Linen Blouse</name>
    <size>10</size>
  </product>
</order>

