9.6. Subtyping

Both UML and ORM support subtyping, using substitutability (“is-a”) semantics, where each instance of a subtype is also an instance of its supertype(s). For example, declaring Woman to be a subtype of Person entails that each woman is a person, and hence Woman inherits all the properties of Person. Given two object types, A and B, we say that A is a subtype of B if, for each database state, the population of A is included in the population of B. For data modeling, the only subtypes of interest are proper subtypes. We say that A is a proper subtype of B if and only if A is a subtype of B, and there is a possible state where the population of B includes an instance not in A. From now on, we’ll use “subtype” as shorthand for “proper subtype”.
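
The two definitions above can be sketched as set comparisons. This is only an illustration: the definitions quantify over every possible database state, whereas the hypothetical `states` list below enumerates just two sample states, and the function names are assumptions introduced here.

```python
# Each state maps an object type name to its population in that state.
# These sample states are hypothetical.
states = [
    {"Person": {"Ann", "Bob"}, "Woman": {"Ann"}},
    {"Person": {"Ann", "Eve"}, "Woman": {"Ann", "Eve"}},
]

def is_subtype(states, a, b):
    """A is a subtype of B if pop(A) is included in pop(B) in every state."""
    return all(state[a] <= state[b] for state in states)

def is_proper_subtype(states, a, b):
    """A is a proper subtype of B if A is a subtype of B and some state
    has an instance of B that is not an instance of A."""
    return is_subtype(states, a, b) and any(state[b] - state[a] for state in states)

print(is_subtype(states, "Woman", "Person"))
print(is_proper_subtype(states, "Woman", "Person"))
print(is_proper_subtype(states, "Person", "Person"))
```

Under these sample states, Woman is a proper subtype of Person because the first state contains a person (Bob) who is not a woman, while Person is a subtype of itself but not a proper one.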

In both UML and ORM, specialization is the process of introducing subtypes, and generalization is the inverse procedure of introducing a supertype. Both UML and ORM allow single inheritance, as well as multiple inheritance (where a subtype has more than one direct supertype). For example, AsianWoman may be a subtype of both AsianPerson and Woman. In UML, “subclass” and “superclass” are synonyms of “subtype” and “supertype”, respectively, and generalization may also be applied to things other than classes (e.g., interfaces, use case actors, and packages). This section restricts its attention to subtyping between object types (classes).
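
As a rough illustration of multiple inheritance at the code level, the following Python sketch declares a class with two direct superclasses. It assumes that AsianPerson and Woman are both subtypes of Person, which the names suggest but the example above does not state.

```python
class Person:
    pass

class AsianPerson(Person):   # assumed to be a subtype of Person
    pass

class Woman(Person):
    pass

class AsianWoman(AsianPerson, Woman):   # two direct supertypes
    pass

# Direct supertypes, followed by the full linearized ancestry.
direct = [c.__name__ for c in AsianWoman.__bases__]
ancestry = [c.__name__ for c in AsianWoman.__mro__]
print(direct)
print(ancestry)
```

Substitutability holds here in the sense that every AsianWoman instance is also an instance of AsianPerson, Woman, and Person.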

In ORM, a subtype inherits all the roles of its supertypes. In UML, a subclass inherits all the attributes, associations, and operations/methods of its supertype(s). An operation implements a service and has a signature (name and formal parameters) and visibility, but may be realized in different ways. A method is an implementation of an operation, and hence includes both a signature and a body detailing an executable algorithm to perform the operation. In an inheritance graph, there may be many methods for the same operation (polymorphism), and scoping rules are used to determine which method is actually used for a given class. If a subclass has a method with the same signature as a method of one of its supertypes, this is used instead for that subclass (overriding). For example, if Rectangle and Triangle are subclasses of Shape, all three classes may have different methods for display(). This section focuses on data modeling, not behavior modeling, covering inheritance of static properties (attributes and associations), but ignoring inheritance of operations or methods.
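
Although this section sets behavior aside, a brief sketch may help fix the terminology of overriding. In the Python below, the return strings and the extra Circle class are hypothetical additions used only to contrast an overriding subclass with one that simply inherits.

```python
class Shape:
    def display(self):
        return "generic shape"

class Rectangle(Shape):
    def display(self):          # overrides Shape.display
        return "rectangle"

class Triangle(Shape):
    def display(self):          # overrides Shape.display
        return "triangle"

class Circle(Shape):            # hypothetical: no override, inherits Shape.display
    pass

# The same operation is invoked on each object; the method actually used
# depends on the class of the object (polymorphism).
labels = [s.display() for s in (Shape(), Rectangle(), Triangle(), Circle())]
print(labels)
```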

Subtypes are used in data modeling to assert typing constraints, encourage reuse of model components, and show a classification scheme (taxonomy). In this context, typing constraints ensure that subtype-specific roles are played only by the relevant subtype.

Since a subtype inherits the properties of its supertype(s), only its specific roles need to be declared when it is introduced. Apart from reducing code duplication, the more generic supertypes are likely to find reuse in other applications. At the coding level, inheritance of operations/methods augments the reuse gained by inheritance of attributes and association roles. Using subtypes to show taxonomy is of limited use, since taxonomy is often more efficiently captured by predicates. For example, the fact type Person is of Gender {male, female} implicitly provides the taxonomy for the subtypes MalePerson and FemalePerson.

Both UML and ORM display subtyping using directed acyclic graphs. A directed graph is a graph of nodes with directed connections, and “acyclic” means that there are no cycles (a consequence of proper subtyping). Figure 9.35 shows a subtype pattern in UML and ORM. An arrow from one node to another shows that the first is a subtype of the second. UML uses a thin arrow shaft with an open arrowhead, while ORM uses a solid shaft and arrowhead. As an alternative notation, UML also allows separate shafts to merge into one, with one arrowhead acting for all (e.g., E and F are subtypes of C). Since subtypehood is transitive, indirect connections are omitted (e.g., since E is a subtype of C, and C is a subtype of A, it follows that E is a subtype of A, so there is no need to display this implied connection).
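
The implied connections can be recovered from the displayed ones by following the arrows. The sketch below records only the direct links mentioned in the text (E and F to C, and C to A); the rest of Figure 9.35 is not reproduced, and the function name is an assumption.

```python
# Direct (displayed) subtype links: subtype -> set of direct supertypes.
direct_supertypes = {"C": {"A"}, "E": {"C"}, "F": {"C"}}

def all_supertypes(node, direct):
    """Return every direct or indirect supertype of node."""
    found = set()
    pending = list(direct.get(node, ()))
    while pending:
        current = pending.pop()
        if current not in found:
            found.add(current)
            pending.extend(direct.get(current, ()))
    return found

def is_acyclic(direct):
    """A subtype graph is acyclic if no type is its own supertype."""
    return all(node not in all_supertypes(node, direct) for node in direct)

print(sorted(all_supertypes("E", direct_supertypes)))
print(is_acyclic(direct_supertypes))
```

The link from E to A never has to be stored or drawn, because it follows from the two direct links.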

Figure 9.35. Subtyping displayed by directed acyclic graphs in (a) UML and (b) ORM.


UML includes four predefined constraints to indicate whether subtypes are exclusive or exhaustive. If subtype connections are shown with separate arrowheads, the constraints are placed in braces next to a dotted line connecting the subtype links, as in Figure 9.35(a) (top). We assume that this line may include elbows, as shown for the disjoint constraint, to enable such cases to be specified. If the subtype connections are shared, the constraints are placed near the shared arrowhead, as in Figure 9.35(a) (bottom). The {overlapping} and {disjoint} options, respectively, indicate that the subtypes overlap or are mutually exclusive. Originally {complete} simply meant that all subtypes were shown, but this was redefined to mean exhaustive (i.e., the supertype equals the union of its subtypes). The {incomplete} option means that the supertype is more than the union of its subtypes. The default is {disjoint, incomplete}. Users may add other keywords.

By default, ORM subtypes may overlap, and subtypes need not collectively exhaust their supertype. ORM allows graphic constraints to indicate that subtypes are mutually exclusive (a circled “X” connected to the relevant subtype links), collectively exhaustive (a circled dot), or both (a circled, crossed dot), as shown in Figure 9.35(b). ORM’s approach is that exclusion and totality constraints are enforced on populations, not types. An overlapping “constraint” does not mean that the populations must overlap, just that they may overlap. Hence from an ORM viewpoint, this is not really a constraint at all, so there is no need to depict it. In ORM, subtype exclusion and totality constraints are often implied by other constraints in conjunction with formal subtype definitions.
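
Because exclusion and totality are constraints on populations, they can be checked by plain set operations on a given state. The following is a minimal sketch with hypothetical populations. It checks one state only, whereas a constraint must hold in every state, and it takes for granted that each subtype population is already included in the supertype population.

```python
from itertools import combinations

def is_exclusive(subtype_pops):
    """True if no instance belongs to two of the given subtype populations."""
    return all(a.isdisjoint(b) for a, b in combinations(subtype_pops, 2))

def is_exhaustive(supertype_pop, subtype_pops):
    """True if the supertype population equals the union of the subtype populations."""
    return set().union(*subtype_pops) == supertype_pop

def is_partition(supertype_pop, subtype_pops):
    """Exclusion and totality together."""
    return is_exclusive(subtype_pops) and is_exhaustive(supertype_pop, subtype_pops)

# Hypothetical populations for a single database state.
a = {1, 2, 3, 4}
b, c, d = {1, 2}, {2, 3}, {4}

print(is_exclusive([b, c]))          # b and c share instance 2
print(is_exclusive([b, d]))
print(is_exhaustive(a, [b, c, d]))
print(is_partition(a, [b, c, d]))
```

No check corresponds to “overlapping”, which matches the ORM view that permission to overlap is not a constraint.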

For any subtype graph, the top supertype is called the root, and the bottom subtypes (those with no descendants) are called leaves. In UML this can be made explicit by adding “{root}” or “{leaf}” below the relevant class name. If we know the whole subtype graph is shown, there is little point in doing this, but if only part of a subtype graph is displayed, this notation makes it clear whether the local top and bottom nodes are also the top and bottom nodes in the global schema. For example, from Figure 9.36 we know that globally Party has no supertype and that MalePerson and FemalePerson have no subtypes. Since Party is not marked as a leaf node, it may have other subtypes not shown here.

Figure 9.36. Party may have other subtypes not shown here.


UML also allows an ellipsis “...” in place of a subclass to indicate that at least one subclass of the parent exists in the global schema, but its display is suppressed on the diagram. Currently ORM does not include a root/leaf notation or an ellipsis notation for subtypes. Such notations could be a useful extension to ORM diagrams.

UML distinguishes between abstract and concrete classes. An abstract class cannot have any direct instances and is shown by writing its name in italics or by adding “{abstract}” below the class name. Abstract classes are realized only through their descendants. Concrete classes may be directly instantiated. This distinction seems to have little relevance at the conceptual level and is not depicted explicitly in ORM. For code design, however, the distinction is important (e.g., abstract classes provide one way of declaring interfaces, and in C++ abstract operations correspond to pure virtual operations, while leaf operations map to nonvirtual operations). For further discussion of this topic, see Fowler (1997, pp. 85–88) and Booch et al. (1999, pp. 125–126).
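
At the code level, the abstract/concrete distinction might look as follows in Python, which corresponds only loosely to the UML notion. The `describe` operation, the `name` attribute, and the `try_instantiate` helper are hypothetical, and treating Party as abstract is an assumption made for the example.

```python
from abc import ABC, abstractmethod

class Party(ABC):
    """Abstract: realized only through its descendants."""
    def __init__(self, name):
        self.name = name

    @abstractmethod
    def describe(self):
        """Abstract operation; each concrete descendant supplies a method."""

class Person(Party):
    """Concrete: may be directly instantiated."""
    def describe(self):
        return f"person {self.name}"

def try_instantiate(cls, *args):
    """Return an instance of cls, or None if Python refuses to create one."""
    try:
        return cls(*args)
    except TypeError:
        return None

print(try_instantiate(Party, "Acme"))            # abstract: no direct instance
print(try_instantiate(Person, "Ann").describe())
```

Python refuses the direct instantiation only because Party declares at least one abstract method; an `ABC` subclass with no abstract methods can still be instantiated.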

Like other ER notations, UML provides only weak support for defining subtypes. A discriminator label may be placed near a subtype arrow to indicate the basis for the classification. For example, Figure 9.37 includes a “gender” discriminator to specialize Patient into MalePatient and FemalePatient.

Figure 9.37. Gender is used as a discriminator to partition Patient.


The UML specification says that the discriminator names “a partition of the subtypes of the superclass”. In formal work, the term “partition” usually implies the division is both exclusive and exhaustive. In UML, the use of a discriminator does not imply that the subtypes are exhaustive or complete, but at least some authors argue that they must be exclusive (Fowler 1997, p. 78). If that is the case, there does not appear to be any way in UML of declaring a discriminator for a set of overlapping subtypes.

The same discriminator name may be repeated for multiple subclass arrows to show that each subclass belongs to the same classification scheme. This repetition can be avoided by merging the arrow shafts to end in a single arrowhead, as in Figure 9.37.

In Figure 9.37, the gender attribute of Patient is used as a discriminator. This attribute is based on the enumerated type Gendercode, which is defined using the stereotype «enumeration», and listing its values as attributes. The notes at the bottom are needed to ensure that instances populating these subtypes have the correct gender. For example, without these notes there is nothing to stop us populating MalePatient with patients that have the value ‘f’ for their gendercode.

As discussed in Section 6.5, ORM overcomes this problem by requiring that if a taxonomy is captured both by subtyping and a classifying fact type, these two representations must be synchronized, either by deriving the subtypes from formal subtype definitions, or by deriving the classification fact type from asserted subtypes. For example, the populated ORM schema in Figure 9.38 adopts the first approach. The ORM partition (exclusion and totality) constraint is now implied by the combination of the subtype definitions and the three constraints on the fact type Patient is of Gender.
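
The first approach, deriving the subtypes from the classifying fact type, can be sketched as below. Several assumptions are made: the gender codes are taken to be 'm' and 'f', the patient numbers are invented, and the three constraints are taken to be a mandatory role and a uniqueness constraint on Patient together with a value constraint on Gender. The dictionary models the first two (each patient is a key with exactly one value), and the explicit check models the third.

```python
GENDER_CODES = {"m", "f"}   # assumed value constraint on Gender

# Hypothetical population of the fact type Patient is of Gender.
patient_gender = {101: "m", 102: "f", 103: "f"}

def derive_subtypes(patient_gender):
    """Derive MalePatient and FemalePatient from the gender fact type."""
    invalid = {p for p, g in patient_gender.items() if g not in GENDER_CODES}
    if invalid:
        raise ValueError(f"patients with invalid gender code: {sorted(invalid)}")
    male = {p for p, g in patient_gender.items() if g == "m"}
    female = {p for p, g in patient_gender.items() if g == "f"}
    return male, female

male_patient, female_patient = derive_subtypes(patient_gender)

# Exclusion and totality are implied; they never need to be asserted.
print(male_patient.isdisjoint(female_patient))
print(male_patient | female_patient == set(patient_gender))
```

Because the subtype populations are computed rather than asserted, a patient with gender code 'f' cannot end up in MalePatient.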

Figure 9.38. With formal subtype definitions, subtype constraints are implied.


While the subtype definitions in Figure 9.38 are trivial, in practice more complicated subtype definitions are sometimes required. As a basic example, consider a schema with the fact types City is in Country and City has Population, and define LargeUScity as follows: Each LargeUScity is a City that is in Country ‘US’ and has Population > 1000000. There does not seem to be any convenient way of doing this in UML, at least not with discriminators. We could perhaps add a derived Boolean isLarge attribute, with an associated derivation rule in OCL, and then add a final subtype definition in OCL, but this would be less readable than the ORM definition just given. For a more detailed discussion of subtyping in ORM, including the notion of context-dependent reference, see Sections 6.5 and 6.6. Mapping of subtypes is discussed in Chapter 11.
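
To make the LargeUScity definition concrete, it can be read as a predicate over cities. The Python sketch below is an illustration only, not an ORM or OCL rendering; the City record, its field names, and the sample cities with their population figures are all hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class City:
    name: str
    country: str      # models the fact type City is in Country
    population: int   # models the fact type City has Population

def is_large_us_city(city):
    """Each LargeUScity is a City that is in Country 'US' and has Population > 1000000."""
    return city.country == "US" and city.population > 1_000_000

# Hypothetical cities with invented populations.
cities = [
    City("CityA", "US", 2_500_000),
    City("CityB", "US", 1_000_000),   # not strictly greater than 1000000
    City("CityC", "AU", 4_000_000),   # large, but not in the US
]

large_us_cities = [c.name for c in cities if is_large_us_city(c)]
print(large_us_cities)
```

The subtype population is derived from the two fact types, so it stays synchronized with them whenever a city's country or population changes.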
