Chapter 17. Identity constraints

Identity constraints allow you to uniquely identify nodes in a document and ensure the integrity of references between them. This chapter explains how to define and use identity constraints.

17.1. Identity constraint categories

There are three categories of identity constraints.

Uniqueness constraints enforce that a value (or combination of values) is unique within a specified scope. For example, all product numbers must be unique within a catalog.

Key constraints also enforce uniqueness, and additionally require that all values be present. For example, every product must have a number and it must be unique within a catalog.

Key references enforce that a value (or combination of values) corresponds to a value represented by a key or uniqueness constraint. For example, for every product number that appears as an item in a purchase order, there must be a corresponding product number in the product description section.

17.2. Design hint: Should I use ID / IDREF or key / keyref?

The identity constraints described in this chapter are much more powerful than using attributes of types ID and IDREF. Limitations of ID and IDREF include:

• They are recommended for use only for attributes, not elements.

• They are scoped to the entire document only.

• They are based on one value, as opposed to multifield keys.

• They require ID or IDREF to be the type of the attribute, precluding data validation of that attribute.

• They are based on string equality, as opposed to value equality.

• They require that the values be XML names without colons, meaning they must start with a letter or underscore and can only contain letters, digits, and a few punctuation marks.

However, if ID and IDREF fulfill your requirements, there is no reason not to use them, particularly when representing simple cross-references in narrative documents or converting DTDs that are already in use.
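
For comparison, a simple cross-reference of the kind just mentioned might be declared as in the following sketch. The section and xref elements and their id and target attributes are hypothetical names chosen for illustration; they do not appear in this chapter's examples.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- Hypothetical narrative-document elements, for illustration only -->
  <xs:element name="section">
    <xs:complexType mixed="true">
      <xs:sequence>
        <xs:element name="xref" minOccurs="0" maxOccurs="unbounded">
          <xs:complexType>
            <!-- Must match the value of some xs:ID attribute in the same document -->
            <xs:attribute name="target" type="xs:IDREF" use="required"/>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
      <!-- Must be unique among all xs:ID values in the entire document -->
      <xs:attribute name="id" type="xs:ID"/>
    </xs:complexType>
  </xs:element>
</xs:schema>
```

Note that the scope here is always the whole document and the attribute types are fixed as ID and IDREF, which are exactly the limitations listed above.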

17.3. Structure of an identity constraint

The three categories of identity constraints are similar in their definitions and associated rules. This section describes the basic structure of identity constraints. Example 17–1 shows an instance that contains product catalog information.

Example 17–1. Product catalog information


<catalog>
  <department number="021">
    <product>
      <number>557</number>
      <name>Short-Sleeved Linen Blouse</name>
      <price currency="USD">29.99</price>
    </product>
    <product>
      <number>563</number>
      <name>Ten-Gallon Hat</name>
      <price currency="USD">69.99</price>
    </product>
    <product>
      <number>443</number>
      <name>Deluxe Golf Umbrella</name>
      <price currency="USD">49.99</price>
    </product>
  </department>
</catalog>


Example 17–2 shows the definition of a uniqueness constraint that might be applied to the instance in Example 17–1.

Example 17–2. A uniqueness constraint


<xs:element name="catalog" type="CatalogType">
  <xs:unique name="prodNumKey">
    <xs:selector xpath="*/product"/>
    <xs:field xpath="number"/>
  </xs:unique>
</xs:element>


All three categories of identity constraints are defined entirely within an element declaration. It can be either a global or local element declaration, but it cannot be an element reference. Identity constraints must be defined at the end of the element declaration, after any simpleType or complexType child. There can be multiple identity constraints in a single element declaration.

Every identity constraint has a name, which takes on the target namespace of the schema document. The qualified name must be unique among all identity constraints of all categories within the entire schema. For example, it would be illegal to have a key constraint named customerNumber and a uniqueness constraint named customerNumber in the same schema, even if they were scoped to different elements.

There are three parts to an identity constraint definition.

1. The scope is an element whose declaration contains the constraint. In our example, a catalog element is the scope. It is perfectly valid to have two products with the same number if they are contained in two different catalog elements.

2. The selector serves to select all the nodes to which the constraint applies. In our example, the selector value is */product, which selects all the product grandchildren of catalog.

3. The one or more fields are the element and attribute values whose combination must be unique among the selected nodes. There can be only one instance of the field per selected node. In our example, there is one field specified: the number child of each product element.
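
To see how scope, selector, and field interact, consider an instance that would violate the constraint in Example 17–2. The sketch below assumes that CatalogType allows more than one department; the second department (number 022) is invented for illustration.

```xml
<catalog>
  <department number="021">
    <product>
      <number>557</number>
      <name>Short-Sleeved Linen Blouse</name>
      <price currency="USD">29.99</price>
    </product>
  </department>
  <department number="022">
    <product>
      <!-- Duplicates 557 above. Both products are selected by */product
           within the same catalog, so prodNumKey is violated. -->
      <number>557</number>
      <name>Ten-Gallon Hat</name>
      <price currency="USD">69.99</price>
    </product>
  </department>
</catalog>
```

The two products are in different departments, but that does not matter: the scope is catalog, so uniqueness is checked across all of its product grandchildren.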

17.4. Uniqueness constraints

A uniqueness constraint is used to validate that the values of certain elements or attributes are unique within a particular scope. This is represented by a unique element, whose syntax is shown in Table 17–1.

Table 17–1. XSD Syntax: uniqueness constraint


In Example 17–2, we used a uniqueness constraint to ensure that all the product numbers in the catalog are unique. It is also possible to ensure uniqueness of a combination of multiple fields. In the instance shown in Example 17–3, each product may have an effective date.

Example 17–3. Product catalog information, revisited


<catalog>
  <department number="021">
    <product effDate="2000-02-27">
      <number>557</number>
      <name>Short-Sleeved Linen Blouse</name>
      <price currency="USD">29.99</price>
    </product>
    <product effDate="2001-04-12">
      <number>557</number>
      <name>Short-Sleeved Linen Blouse</name>
      <price currency="USD">39.99</price>
    </product>
    <product effDate="2001-04-12">
      <number>563</number>
      <name>Ten-Gallon Hat</name>
      <price currency="USD">69.99</price>
    </product>
    <product>
      <number>443</number>
      <name>Deluxe Golf Umbrella</name>
      <price currency="USD">49.99</price>
    </product>
  </department>
</catalog>


It is valid for two products to have the same number, as long as they have different effective dates. In other words, we want to validate that the combinations of number and effDate are unique. Example 17–4 shows the uniqueness constraint that accomplishes this.

Example 17–4. Constraining uniqueness of two combined fields


<xs:element name="catalog" type="CatalogType">
  <xs:unique name="dateAndProdNumKey">
    <xs:selector xpath="department/product"/>
    <xs:field xpath="number"/>
    <xs:field xpath="@effDate"/>
  </xs:unique>
</xs:element>


Note that this example works because both number and effDate are subordinate to the product elements. Using the instance in Example 17–3, it would be invalid to define a multifield uniqueness constraint on the department number and the product number. If you defined the selector to select all departments, the product/number field would yield more than one field node per selected node, which is not permitted. If you defined the selector to select all products, you would have to access an ancestor node to get the department number, which is not permitted.

You can get around this by defining two uniqueness constraints: one in the scope of catalog to ensure that all department numbers are unique within a catalog, and another in the scope of department to ensure that all product numbers are unique within a department.
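
The workaround just described might be sketched as follows. This is an illustration under assumptions rather than part of the chapter's schema: the constraint names deptNumKey and prodNumInDeptKey are hypothetical, department is assumed to be declared locally inside CatalogType, and DepartmentType is assumed to be defined elsewhere.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="catalog" type="CatalogType">
    <!-- Scope: catalog. Department numbers are unique within a catalog. -->
    <xs:unique name="deptNumKey">
      <xs:selector xpath="department"/>
      <xs:field xpath="@number"/>
    </xs:unique>
  </xs:element>

  <xs:complexType name="CatalogType">
    <xs:sequence>
      <xs:element name="department" type="DepartmentType"
                  maxOccurs="unbounded">
        <!-- Scope: department. Product numbers are unique within a department. -->
        <xs:unique name="prodNumInDeptKey">
          <xs:selector xpath="product"/>
          <xs:field xpath="number"/>
        </xs:unique>
      </xs:element>
    </xs:sequence>
  </xs:complexType>

  <!-- DepartmentType is assumed to be defined elsewhere -->
</xs:schema>
```

Each constraint has a single field that is directly subordinate to its selected node, so neither runs into the restrictions described above.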

17.5. Key constraints

A key constraint is similar to a uniqueness constraint in that the combined fields in the key must be unique. Key constraints have an additional requirement that all of the field values must be present in the document. Therefore, you should not define keys on elements or attributes that are optional. In addition, the fields on which the key is defined cannot be nillable.

Key constraints are represented by key elements, whose syntax is shown in Table 17–2. It is identical to that of the unique elements.

Table 17–2. XSD Syntax: key constraint


Example 17–5 changes Example 17–2 to be a key constraint instead of a uniqueness constraint. In this case, every product element in the instance would be required to have a number child, regardless of whether the complex type of product requires it. The values of those number children have to be unique within the scope of catalog.

Example 17–5. Defining a key on product number


<xs:element name="catalog" type="CatalogType">
  <xs:key name="prodNumKey">
    <xs:selector xpath="*/product"/>
    <xs:field xpath="number"/>
  </xs:key>
</xs:element>
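
The practical difference between Example 17–2 and Example 17–5 can be seen in an instance such as the following sketch, which assumes that the complex type of product declares number as optional.

```xml
<catalog>
  <department number="021">
    <product>
      <number>557</number>
      <name>Short-Sleeved Linen Blouse</name>
      <price currency="USD">29.99</price>
    </product>
    <product>
      <!-- No number child. This product is simply not checked by the
           uniqueness constraint in Example 17–2, but it violates the
           key constraint in Example 17–5. -->
      <name>Ten-Gallon Hat</name>
      <price currency="USD">69.99</price>
    </product>
  </department>
</catalog>
```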


17.6. Key references

Key references are used to ensure that there is a match between two sets of values in an instance. They are similar to foreign keys in databases. Key references are represented by keyref elements, whose syntax is shown in Table 17–3.

Table 17–3. XSD Syntax: key reference


The refer attribute is used to reference a key or uniqueness constraint by its qualified name. If the constraint is defined in a schema document with a target namespace, the refer attribute must reference a name that is either prefixed or in the scope of a default namespace declaration.

Suppose we have an order for three items: two shirts and one hat, as shown in Example 17–6. The two shirts are the same except for their color, so they both have the same product number. All the descriptive product information appears at the end of the order. We want a way to ensure that every item in the order has a corresponding product description in the document.

Example 17–6. Key references


<order>
  <number>123ABBCC123</number>
  <items>
    <shirt number="557">
      <quantity>1</quantity>
      <color value="blue"/>
    </shirt>
    <shirt number="557">
      <quantity>1</quantity>
      <color value="sage"/>
    </shirt>
    <hat number="563">
      <quantity>1</quantity>
    </hat>
  </items>
  <products>
    <product>
      <number>557</number>
      <name>Short-Sleeved Linen Blouse</name>
      <price currency="USD">29.99</price>
    </product>
    <product>
      <number>563</number>
      <name>Ten-Gallon Hat</name>
      <price currency="USD">69.99</price>
    </product>
  </products>
</order>


Example 17–7 shows the definition of a key reference and its associated key. In this example, the number attribute of any child of items must match a number child of a product element. The meaning of the XPath syntax will be described in detail later in this chapter.

Note that the key reference field values are not required to be unique; that is not their purpose. It is valid to have duplicate shirt numbers in the items section.

As with key and uniqueness constraints, key references can be on multiple fields. A key reference must have the same number of fields as the key or uniqueness constraint that it references. The fields are matched in the same order, and they must have related types.

Example 17–7. Defining a key reference on product number


<xs:element name="order" type="OrderType">
  <xs:keyref name="prodNumKeyRef" refer="prodNumKey">
    <xs:selector xpath="items/*"/>
    <xs:field xpath="@number"/>
  </xs:keyref>
  <xs:key name="prodNumKey">
    <xs:selector xpath=".//product"/>
    <xs:field xpath="number"/>
  </xs:key>
</xs:element>
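
As an illustration of what the key reference in Example 17–7 rejects, the following sketch is a variation on Example 17–6 in which one item refers to a product that is not described. The product number 999 is invented for this purpose.

```xml
<order>
  <number>123ABBCC123</number>
  <items>
    <shirt number="557">
      <quantity>1</quantity>
      <color value="blue"/>
    </shirt>
    <!-- 999 matches no product/number value below, so prodNumKeyRef
         is violated -->
    <hat number="999">
      <quantity>1</quantity>
    </hat>
  </items>
  <products>
    <product>
      <number>557</number>
      <name>Short-Sleeved Linen Blouse</name>
      <price currency="USD">29.99</price>
    </product>
  </products>
</order>
```

The reverse situation is not an error: a product that is described but never ordered does not violate the key reference.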


17.6.1. Key references and scope

There is an additional constraint on the scope of key references and key constraints. The key referenced by a keyref must be defined in the same element declaration or in a declaration of one of its descendants. It is not possible for a keyref to reference a key that is defined in a sibling or ancestor element declaration. In our example, the key and keyref were both defined in the declaration of order. It would also have been valid if the key had been defined in the products declaration. However, it would have been invalid if the keyref had been defined in the items declaration while the key remained in order, because order is an ancestor of items.
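
The valid alternative mentioned above, with the key moved to the products declaration, might look like the following sketch. It assumes that products is declared locally inside OrderType and that ItemsType and ProductType are defined elsewhere; only the placement of the constraints is the point here.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="order" type="OrderType">
    <!-- The keyref stays in order; the key it refers to is defined in
         the declaration of a descendant (products) -->
    <xs:keyref name="prodNumKeyRef" refer="prodNumKey">
      <xs:selector xpath="items/*"/>
      <xs:field xpath="@number"/>
    </xs:keyref>
  </xs:element>

  <xs:complexType name="OrderType">
    <xs:sequence>
      <xs:element name="number" type="xs:string"/>
      <xs:element name="items" type="ItemsType"/>
      <xs:element name="products">
        <xs:complexType>
          <xs:sequence>
            <xs:element name="product" type="ProductType"
                        maxOccurs="unbounded"/>
          </xs:sequence>
        </xs:complexType>
        <!-- The selector is now relative to products, not order -->
        <xs:key name="prodNumKey">
          <xs:selector xpath="product"/>
          <xs:field xpath="number"/>
        </xs:key>
      </xs:element>
    </xs:sequence>
  </xs:complexType>

  <!-- ItemsType and ProductType are assumed to be defined elsewhere -->
</xs:schema>
```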

17.6.2. Key references and type equality

When defining key references, it is important to understand XML Schema’s concept of equality. When determining whether two values are equal, their type is taken into account. Values with unrelated types will never be considered equal. For example, a value 2 of type string is not equal to a value 2 of type integer. However, if two types are related by restriction, such as integer and positiveInteger, they can have equal values. When you define a key reference, make sure that the types of its fields are related to the types of the fields in the referenced key or uniqueness constraint. In Example 17–7, if the number attribute of shirt were declared as an integer and the number child of product were declared as a string, there would have been no matches. For more information on type equality, see Section 11.7 on p. 253.
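
As a minimal sketch of related types, the two declarations below use positiveInteger for the key reference field and integer for the key field. They are shown as global declarations only to keep the sketch short; how the number attribute and number element are actually declared in the order schema is an assumption.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- Key reference field: the number attribute of shirt, hat, and so on -->
  <xs:attribute name="number" type="xs:positiveInteger"/>

  <!-- Key field: the number child of product -->
  <xs:element name="number" type="xs:integer"/>
</xs:schema>
```

Because positiveInteger is derived from integer by restriction, a number attribute of 557 can match a number element of 557. If either declaration used xs:string instead, no values would match.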

17.7. Selectors and fields

All three categories of identity constraints are specified in terms of a selector and one or more fields. This section explains selectors and fields in more detail.

17.7.1. Selectors

The purpose of a selector is to identify the set of nodes to which the constraint applies. The selector is relative to the scoping element. In Example 17–2, our selector was */product. This selects all the product grandchildren of catalog. There may be other grandchildren of catalog, or other product elements elsewhere in the document, but the constraint does not apply to them.

The selector is represented by a selector element, whose syntax is shown in Table 17–4.

Table 17–4. XSD Syntax: constraint selector


17.7.2. Fields

Each field must identify a single node relative to each node selected by the selector. The key reference in Example 17–7 works because there can only ever be one number attribute per selected node. In the instance in Example 17–6, the selector selects three nodes (the three children of items), and there is only one number attribute per node.

You might have been tempted to define a uniqueness constraint as shown in Example 17–8. This would not work because the selector would select one node (the single department element) and there would be three product/number nodes relative to it.

Example 17–8. Illegal uniqueness constraint


<xs:element name="catalog" type="CatalogType">
  <xs:unique name="prodNumKey">
    <xs:selector xpath="department"/>
    <xs:field xpath="product/number"/>
  </xs:unique>
</xs:element>


The elements or attributes that are used as fields must have simple content and cannot be declared nillable.

Fields are represented by field elements, whose syntax is shown in Table 17–5.

Table 17–5. XSD Syntax: constraint field


17.8. XPath subset for identity constraints

All values of the xpath attribute in the selector and field elements must be legal XPath expressions. However, they must also conform to a subset of XPath that is defined specifically for identity constraints.

XPath expressions are made up of paths, separated by vertical bars. For example, the XPath expression department/product/name | department/product/price uses two paths to select all the nodes that are either name or price children of product elements whose parent is department.

Each path may begin with the .// literal, which means that the matching nodes may appear anywhere among the descendants of the current scoping element. If it is not included, the first step of the path is matched only against direct children of the scoping element.

Each path is made up of steps, separated by forward slashes. For example, the path department/product/name is made up of three steps: department, product, and name. Table 17–6 lists the types of steps that may appear in the identity constraint XPath subset.

Table 17–6. XPath subset steps


The context node of the selector expression is the element in whose declaration the identity constraint is defined. The field expression is evaluated once for each node selected by the selector expression, with that selected node as its context node.
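
The following sketch illustrates a few of these constructs in the scope of catalog. The constraint names and the discontinuedProduct element are hypothetical, and CatalogType is assumed to be defined elsewhere.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="catalog" type="CatalogType">
    <!-- Leading .// selects product elements at any depth below catalog -->
    <xs:unique name="anyDepthProdKey">
      <xs:selector xpath=".//product"/>
      <xs:field xpath="number"/>
    </xs:unique>

    <!-- Two paths joined by a vertical bar; * matches a child element
         with any name -->
    <xs:unique name="unionProdKey">
      <xs:selector xpath="department/product|*/discontinuedProduct"/>
      <!-- Field paths are relative to each selected node -->
      <xs:field xpath="number"/>
      <xs:field xpath="@effDate"/>
    </xs:unique>

    <!-- Not in the subset, although legal XPath: ../department (parent
         step), product[1] (predicate), /catalog/department (absolute
         path) -->
  </xs:element>

  <!-- CatalogType is assumed to be defined elsewhere -->
</xs:schema>
```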

Table 17–7 shows some legal XPath expressions for selectors and fields. They assume that the scope of the identity constraint is the catalog element, as shown in Example 17–3.

Table 17–7. XPath subset expressions in the scope of catalog


Technically, any of the XPath expressions in Table 17–7 is legal for a field. However, since the field XPath can only identify a node that appears once relative to the selected node, most of the expressions that contain wildcards to select multiple nodes are inappropriate for fields. The field XPath will usually consist of a single child element or a single attribute.

Table 17–8 shows some expressions that, while they are legal XPath, are not in the identity constraint XPath subset.

Table 17–8. Illegal XPath subset expressions


17.9. Identity constraints and namespaces

Special consideration must be given to namespaces when defining identity constraints. By default, qualified element names and attribute names used in the XPath expressions must be prefixed in order to be legal. Let’s take another look at our uniqueness constraint from Example 17–4. That definition assumed that the schema document had no target namespace. If we add a target namespace, it looks like Example 17–9.

Each of the element names in the XPath is prefixed with prod, mapping it to the http://datypic.com/prod namespace. In our example, all element declarations (department, product, and number) are global, and therefore their names must be prefixed. Let’s assume that the attribute effDate is locally declared and unqualified, so its name is not prefixed in the XPath expression.

Example 17–9. Prefixing names in the XPath expression


<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:prod="http://datypic.com/prod"
           targetNamespace="http://datypic.com/prod">
  <xs:element name="catalog" type="prod:CatalogType">
    <xs:unique name="dateAndProdNumKey">
      <xs:selector xpath="prod:department/prod:product"/>
      <xs:field xpath="prod:number"/>
      <xs:field xpath="@effDate"/>
    </xs:unique>
  </xs:element>
  <xs:element name="department" type="prod:DepartmentType"/>
  <xs:element name="product" type="prod:ProductType"/>
  <xs:element name="number" type="xs:integer"/>
  <!--...-->
</xs:schema>
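
An instance corresponding to Example 17–9 might look like the following sketch. It assumes, as the text does, that effDate is a local, unqualified attribute, and it makes the same assumption about the number attribute of department. Other children of product are omitted because their declarations are not shown.

```xml
<prod:catalog xmlns:prod="http://datypic.com/prod">
  <prod:department number="021">
    <!-- Elements are in the target namespace; effDate is in no namespace -->
    <prod:product effDate="2000-02-27">
      <prod:number>557</prod:number>
    </prod:product>
    <prod:product effDate="2001-04-12">
      <prod:number>557</prod:number>
    </prod:product>
  </prod:department>
</prod:catalog>
```

The element names are qualified in the instance, so they are prefixed in the XPath expressions; the effDate attribute is unqualified in both places.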


The names that must be qualified in an XPath expression are those that must be qualified in an instance, namely:

• All element names and attribute names in global declarations

• Element names and attribute names in local declarations whose form is qualified, either directly, using the form attribute, or indirectly through elementFormDefault or attributeFormDefault

Note that the target namespace is mapped to a prefix, rather than being the default namespace. This is because XPath expressions are not affected by default namespace declarations. Unprefixed names in XPath expressions are assumed to be in no namespace, even if a default namespace declaration is in scope.

Therefore, if you want to use identity constraints in a schema document that has a target namespace, you must map the target namespace to a prefix. Example 17–10 uses unprefixed names in the XPath expressions, assuming that these names take on the default namespace. This is not the case; in fact, these elements will not be found because the processor will be looking for elements with unqualified names when evaluating the XPath expressions.

Example 17–10. Illegal attempt to apply default namespace to XPath


<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns="http://datypic.com/prod"
           targetNamespace="http://datypic.com/prod">
  <xs:element name="catalog" type="CatalogType">
    <xs:unique name="dateAndProdNumKey">
      <xs:selector xpath="department/product"/>
      <xs:field xpath="number"/>
      <xs:field xpath="@effDate"/>
    </xs:unique>
  </xs:element>
  <xs:element name="department" type="DepartmentType"/>
  <xs:element name="product" type="ProductType"/>
  <xs:element name="number" type="xs:integer"/>
  <!--...-->
</xs:schema>



17.9.1. Using xpathDefaultNamespace

In version 1.1, this problem is alleviated somewhat because you can specify an xpathDefaultNamespace attribute, which designates the default namespace for all unprefixed element names that are used in the XPath. It does not affect attribute names.

Example 17–11 uses the xpathDefaultNamespace attribute on the schema element. This means that the element names department, product, and number used in the selector and field XPaths are interpreted as being in the http://datypic.com/prod namespace.

Instead of specifying a namespace name, the xpathDefaultNamespace attribute can contain one of three special keywords: ##targetNamespace, ##defaultNamespace, or ##local. These are described in detail in Section 14.1.3.1 on p. 373.

Example 17–11. Using xpathDefaultNamespace


<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns="http://datypic.com/prod"
           targetNamespace="http://datypic.com/prod"
           xpathDefaultNamespace="http://datypic.com/prod">
  <xs:element name="catalog" type="CatalogType">
    <xs:unique name="dateAndProdNumKey">
      <xs:selector xpath="department/product"/>
      <xs:field xpath="number"/>
      <xs:field xpath="@effDate"/>
    </xs:unique>
  </xs:element>
  <xs:element name="department" type="DepartmentType"/>
  <xs:element name="product" type="ProductType"/>
  <xs:element name="number" type="xs:integer"/>
  <!--...-->
</xs:schema>
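
As a variation on Example 17–11, the keyword form might be used instead of repeating the namespace name. This sketch assumes a version 1.1 processor; CatalogType and the other declarations are elided as in the examples above.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns="http://datypic.com/prod"
           targetNamespace="http://datypic.com/prod"
           xpathDefaultNamespace="##targetNamespace">
  <xs:element name="catalog" type="CatalogType">
    <xs:unique name="dateAndProdNumKey">
      <!-- Unprefixed element names are looked up in the target namespace -->
      <xs:selector xpath="department/product"/>
      <xs:field xpath="number"/>
      <!-- Attribute names are not affected by xpathDefaultNamespace -->
      <xs:field xpath="@effDate"/>
    </xs:unique>
  </xs:element>
  <!--...-->
</xs:schema>
```

With the keyword, the XPath default namespace follows the target namespace automatically if the latter is ever changed.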


17.10. Referencing identity constraints

In version 1.1, identity constraints can be defined once and referenced from multiple elements. This is true for all three kinds of identity constraints: uniqueness constraints, key constraints, and key references. This is useful if you have the same constraints in multiple scopes and want to reuse the code.

The syntax for referencing an identity constraint is shown in Table 17–9. It is the same for all three kinds of identity constraints. Instead of a name attribute, it has a ref attribute that references the identity constraint by its qualified name. References to identity constraints do not contain selector or field elements; they take their definition from the constraint they reference.

Table 17–9. XSD Syntax: identity constraint reference


Example 17–12 shows a new element declaration discontinuedProductList that has the same uniqueness constraint as catalog. To indicate this, it contains a unique element, but with a ref attribute instead of a name. Note that the two element declarations specify the same type; this is not a requirement, but it is common since most identity constraints would only be shared among elements that contain a similar structure.

Example 17–12. Referencing an identity constraint


<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns="http://datypic.com/prod"
           targetNamespace="http://datypic.com/prod"
           xpathDefaultNamespace="http://datypic.com/prod">
  <xs:element name="catalog" type="CatalogType">
    <xs:unique name="dateAndProdNumKey">
      <xs:selector xpath="department/product"/>
      <xs:field xpath="number"/>
      <xs:field xpath="@effDate"/>
    </xs:unique>
  </xs:element>

  <xs:element name="discontinuedProductList" type="CatalogType">
    <xs:unique ref="dateAndProdNumKey"/>
  </xs:element>

  <!--...-->
</xs:schema>


Being able to reference identity constraints is also useful when restricting types. In version 1.0, if you used a local element declaration that contained an identity constraint, it was impossible to restrict the complex type that contained it because there was no formal definition of a valid restriction of an identity constraint. Now that it can be named and referenced, there is a formal way of indicating that an identity constraint is the same as the identity constraint in the base type. This is shown in Example 17–13, where the catalog element declaration in the base type has an identity constraint, and the catalog element declaration in the derived type references that identity constraint.

Example 17–13. Referencing an identity constraint in a restriction


<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           targetNamespace="http://datypic.com/prod"
           xmlns="http://datypic.com/prod"
           xpathDefaultNamespace="http://datypic.com/prod">
  <xs:complexType name="CatalogListType">
    <xs:sequence>
      <xs:element name="catalog" type="CatalogType"
                  maxOccurs="unbounded">
        <xs:unique name="dateAndProdNumKey">
          <xs:selector xpath="department/product"/>
          <xs:field xpath="number"/>
          <xs:field xpath="@effDate"/>
        </xs:unique>
      </xs:element>
    </xs:sequence>
  </xs:complexType>

  <xs:complexType name="RestrictedCatalogListType">
    <xs:complexContent>
      <xs:restriction base="CatalogListType">
        <xs:sequence>
          <xs:element name="catalog" type="CatalogType"
                      maxOccurs="1">
            <xs:unique ref="dateAndProdNumKey"/>
          </xs:element>
        </xs:sequence>
      </xs:restriction>
    </xs:complexContent>
  </xs:complexType>

  <!--...-->
</xs:schema>

