9.5. Set-Comparison Constraints

Set-comparison constraints declare a subset, equality, or exclusion relationship between the populations of role sequences. This section compares support for these constraints in UML and ORM. A detailed discussion of these constraints in ORM can be found in Section 6.4.

As an extension mechanism, UML allows subset constraints to be specified between whole associations by attaching the constraint label “{subset}” next to a dashed arrow between the associations. For example, the subset constraint in Figure 9.21(a) indicates that any person who chairs a committee must be a member of that committee. Figure 9.21(b) shows the same example in ORM.

Figure 9.21. A subset constraint in (a) UML and (b) ORM.
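
As a rough illustration of what the subset constraint means at the population level, the following Python sketch treats each association as a set of (person, committee) pairs and reports any chair facts that lack a matching membership fact. The function name and the sample data are assumptions made for illustration only; they do not come from the figure.

```python
def subset_violations(sub_population, super_population):
    """Return the tuples of sub_population that are missing from super_population."""
    return set(sub_population) - set(super_population)


# Hypothetical populations of (person, committee) pairs.
member_of = {("Ann", "Budget"), ("Bob", "Budget"), ("Ann", "Ethics")}
chairs = {("Ann", "Budget")}
bad_chairs = {("Bob", "Ethics")}  # Bob is not a member of Ethics

print(subset_violations(chairs, member_of))      # no violations expected
print(subset_violations(bad_chairs, member_of))  # reports the offending pair
```

The constraint is satisfied exactly when the returned set is empty.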


ORM has a mature formalization, including a rigorous theory of schema consistency, equivalence, and implication. Since formal guidelines for working with UML are somewhat immature, care is needed to avoid logical problems. As a simple example, consider the modified version of our committee example shown in Figure 9.22(a), which comes directly from an earlier version of the UML specification, with reference schemes added. Do you spot anything confusing about the constraints?

Figure 9.22. (a) A misleading UML diagram and (b) a misleading ORM diagram.


You probably noticed the problem. The multiplicity constraint of 1 on the chair association indicates that each committee must have at least one chair. The subset constraint tells us that a chair of a committee must also be a member of that committee. Taken together, these constraints imply that each committee must have a member. Hence we would expect to see a multiplicity constraint of “1..*” (one or more) on the Person end of the membership association. However, we see a constraint of “*” (zero or more) instead, which at best is misleading. An equivalent, misleading ORM schema is shown in Figure 9.22(b), where the upper role played by Committee appears optional when in fact it is mandatory.

One might argue that it’s okay to leave these schemas unchanged, as the constraint that each committee includes at least one person is implied by other constraints. However, while display options for implied constraints may sometimes be a matter of taste, practical experience has shown that in cases like this it is better to show implied constraints explicitly, as in Figure 9.23, rather than expect modelers or domain experts to figure them out for themselves.

Figure 9.23. All constraints are now shown explicitly in (a) UML and (b) ORM.


Some ORM tools can detect the misleading nature of constraint patterns like that of Figure 9.22(b) and ask you to resolve the problem. Human interaction is the best policy here, since there is more than one possible mistake (e.g., is the subset constraint correct, leading to Figure 9.23, or is the optional role correct, resulting in Figure 9.21?).

If a schema in Figure 9.23 is mapped to a relational database, it generates a referential cycle, since the mandatory fact types for Committee map to different tables (so each committee must appear in both tables). The relational schema is shown in Figure 9.24 (arrows show the foreign key references, one simple and one composite, which correspond to the subset constraints).

Figure 9.24. The relational schema mapped from Figure 9.23.
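
To make the referential cycle concrete, here is a minimal sketch using Python's standard sqlite3 module. Since the figure is not reproduced here, the table and column names (Committee, Membership, committeeName, chairName, personName) are assumptions, and declaring the foreign keys as deferred is just one possible way of coping with the cycle, not necessarily the approach the mapping procedure prescribes.

```python
import sqlite3


def build_schema():
    """Create two tables that reference each other (a referential cycle)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
    conn.execute("""
        CREATE TABLE Committee (
            committeeName TEXT PRIMARY KEY,
            chairName     TEXT NOT NULL,
            -- composite reference: the chair must be a member of that committee
            FOREIGN KEY (committeeName, chairName)
                REFERENCES Membership (committeeName, personName)
                DEFERRABLE INITIALLY DEFERRED
        )""")
    conn.execute("""
        CREATE TABLE Membership (
            -- simple reference: membership is of a known committee
            committeeName TEXT NOT NULL
                REFERENCES Committee (committeeName)
                DEFERRABLE INITIALLY DEFERRED,
            personName    TEXT NOT NULL,
            PRIMARY KEY (committeeName, personName)
        )""")
    return conn


def add_committee(conn, committee, chair, with_membership=True):
    """Insert a committee (and optionally its chair's membership) in one transaction.

    Returns True if the commit succeeds, False if a foreign key check fails.
    """
    try:
        conn.execute("INSERT INTO Committee VALUES (?, ?)", (committee, chair))
        if with_membership:
            conn.execute("INSERT INTO Membership VALUES (?, ?)", (committee, chair))
        conn.commit()  # deferred foreign keys are checked here
        return True
    except sqlite3.IntegrityError:
        conn.rollback()
        return False
```

Because each table refers to the other, neither row can be committed on its own; both must be inserted within the same transaction.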


Although referential cycles are sometimes unavoidable, they are awkward to implement. In this case, the cycle arose from applying a mandatory role constraint to a nonfunctional role. Unless the business requires it, this should be avoided at the conceptual level (e.g., by leaving the upper role of Committee optional, as in Figure 9.21). Relational mapping is covered in detail in Chapter 11.

Since UML does not allow unary relationships, subset constraints between ORM unaries need to be captured textually, using a note to specify an equivalent constraint between Boolean attributes. For example, the ORM subset constraint in Figure 9.25(b), which verbalizes as Each Patient who smokes is cancer prone, may be captured textually in UML by the note in Figure 9.25(a).

Figure 9.25. (a) UML note for (b) ORM subset constraint between unaries.


UML 2 introduced a subsets property to indicate that the population (extension) of an attribute or association role must be a subset of the population of another compatible attribute or association role respectively. For example, adorning the citizen role in Figure 9.26(a) with {subsets resident} means that all citizens are residents (not necessarily of the same country). Figure 9.26(b) shows the equivalent ORM schema.

Figure 9.26. A single role subset constraint in (a) UML and (b) ORM.


However, there are still many subset constraint cases in ORM that cannot be represented graphically as a subset constraint in UML. For example, the subset constraint in Figure 9.27(b) that each student with a second given name must have a first given name is captured as a note in Figure 9.27(a) because the relevant ORM fact types are modeled as attributes in UML, and the required subset constraint applies between student sets, not name sets. The subset constraint in Figure 9.25(b) is another example.

Figure 9.27. (a) UML model capturing (b) some subset constraints in ORM.


Moreover, UML does not support subset constraints over arguments that are just parts of relationships, such as the subset constraint in Figure 9.27(b) that students may pass tests in a course only if they enrolled in that course. Figure 9.27(a) models this constraint in UML by transforming the ternary into a binary association class (Enrollment) that has a binary association to Test. Although in this situation an association class provides a good way to cater for a compound subset constraint, sometimes this nesting transformation leads to a very unnatural view of the world. Ideally the modeler should be able to view the world naturally, while having any optimization transformations that reduce the clarity of the conceptual schema performed under the covers.
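
The compound subset constraint can be pictured as a check on projections: the (student, course) part of every pass fact must also appear as an enrollment fact. The following Python sketch assumes hypothetical populations and identifiers; it is meant only to show why the constraint's argument is a pair of roles rather than a whole relationship.

```python
def unenrolled_passes(passed, enrolled_in):
    """Return pass facts whose (student, course) projection is not an enrollment."""
    return {fact for fact in passed if fact[:2] not in enrolled_in}


# Hypothetical populations: (student, course) and (student, course, test).
enrolled_in = {("s1", "CS100"), ("s2", "CS100"), ("s1", "MA101")}
passed = {("s1", "CS100", "test1"), ("s2", "CS100", "test1")}
bad_passed = passed | {("s2", "MA101", "test2")}  # s2 never enrolled in MA101

print(unenrolled_passes(passed, enrolled_in))
print(unenrolled_passes(bad_passed, enrolled_in))
```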

As another constraint example in UML, consider Figure 9.28, which is the UML version of an OMT diagram used in Blaha and Premerlani (1998, p. 68) to illustrate a subset constraint (if a column is a primary key column of a table, it must belong to that table). Can you spot any problems with the constraints?

Figure 9.28. Spot anything wrong?


One obvious problem is that the “1” on the primary key association should be “0..1” (not all columns belong to primary keys), as in Figure 9.29(a). If we allow tables to have no columns (e.g., the schema is to cater for cases where the table is under construction), then the “*” on the define association is fine; otherwise it should be “1..*”. Assuming that tables and columns are identified by oids or artificial identifiers, the subset constraint makes sense, but the model is arguably sub-optimal since the primary key association and subset constraint could be replaced by a Boolean is-aPKfield attribute on Column.

Figure 9.29. A corrected UML schema (a) remodeled in (b) ORM.


From an ORM perspective, heuristics lead us to initially model the situation using natural reference schemes as shown in Figure 9.29(b). Here ColumnName denotes the local name of the column in the table. We’ve simplified reality by assuming that tables may be identified just by their name. The external uniqueness constraints suggest two natural reference schemes for Column: name plus table, or position plus table. We chose the first of these as preferred, but could have introduced an artificial identifier. The unary predicate indicates whether a column is, or is part of, a primary key. If desired, we could derive the association Column is a primary key field of Table from the path: Column is in Table and Column is a primary key column (the subset constraint in the UML model is then implied).

What is interesting about this example is the difference in modeling approaches. Most UML modelers seem to assume that oids will be used as identifiers in their initial modeling, whereas ORM modelers like to expose natural reference schemes right from the start and populate their fact types accordingly. These different approaches often lead to different solutions.

The main thing is to first come up with a solution that is natural and understandable to the domain expert, because here is where the most critical phase of model validation should take place. Once a correct model has been determined, optimization guidelines can be used to enhance it.

One other feature of the example is worth mentioning. The UML solution in Figure 9.29(a) uses the annotation “{ordered, unique}” to indicate that a table is composed of an ordered set (i.e., a sequence with no duplicates) of columns. UML 2 allows the unique property to be specified with or without the ordered property. By default, ordered = false and unique = true. So either of the settings {ordered} or {ordered, unique} may be used to indicate an ordered set. Either no setting, or the single setting {unique}, indicates a set (the default). If {nonunique} is allowed in this context (this is unclear in the UML specification), one could specify a bag or sequence with the settings {nonunique} or {ordered, nonunique}, respectively.
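
The four combinations of the ordered and unique settings can be summarized in a small lookup. The Python sketch below simply restates the defaults described above; the function names are hypothetical, and the second helper shows one way an ordered set might be approximated with built-in types.

```python
def collection_kind(ordered: bool = False, unique: bool = True) -> str:
    """Name the collection type implied by the ordered/unique settings."""
    if unique:
        return "ordered set" if ordered else "set"
    return "sequence" if ordered else "bag"


def as_ordered_set(items):
    """Drop duplicates while keeping first-occurrence order."""
    return list(dict.fromkeys(items))


print(collection_kind())                             # defaults: unordered, unique
print(collection_kind(ordered=True))                 # {ordered}
print(collection_kind(unique=False))                 # {nonunique}
print(collection_kind(ordered=True, unique=False))   # {ordered, nonunique}
print(as_ordered_set(["id", "name", "id"]))
```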

In the ORM community, a debate has been going on for many years regarding the best way to deal with constructors for collection types (e.g., set, ordered set, bag, sequence) at the conceptual level. Our view is that such constructors should not appear in the base conceptual model. Hence the use of Position in Figure 9.29(b) to convey column order (the uniqueness of the order is conveyed by the uniqueness constraint on Column has Position). Keeping fact types elementary has so many advantages (e.g., validation, constraint expression, flexibility, and simplicity) that it seems best to relegate constructors to derived views. We have more to say about collection types in Chapter 10.

In ORM, an equality constraint between two compatible role sequences is shorthand for two subset constraints (one in either direction) and is shown as a circled “=”. Such a constraint indicates that the populations of the role sequences must always be equal. If two roles played by an object type are mandatory, then an equality constraint between them is implied (and hence not shown). UML has no graphic notation for equality constraints. For whole associations we could use two separate subset constraints, but this would be very messy. In general, equality constraints in UML may be specified as textual constraints in notes.

As a simple example, the equality constraint in Figure 9.30(b) indicates that if a patient’s systolic blood pressure is measured, so is his/her diastolic blood pressure (and vice versa). In other words, either both measurements are taken, or neither.

Figure 9.30. A simple equality constraint in (a) UML and (b) ORM.


This kind of constraint is fairly common. Figure 9.30(a) models it in UML as a textual constraint between two attributes for blood pressure readings. Less common are equality constraints between sequences of two or more roles.
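
In attribute terms, the equality constraint says that the two optional readings are either both recorded or both absent. A minimal Python sketch follows, assuming a hypothetical Patient class whose attribute names are chosen for illustration and are not taken from the figure.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Patient:
    patient_nr: int
    systolic_bp: Optional[int] = None   # None means "not measured"
    diastolic_bp: Optional[int] = None


def satisfies_equality(patient: Patient) -> bool:
    """True if both readings are recorded or neither is."""
    return (patient.systolic_bp is None) == (patient.diastolic_bp is None)


print(satisfies_equality(Patient(1, 120, 80)))   # both recorded
print(satisfies_equality(Patient(2)))            # neither recorded
print(satisfies_equality(Patient(3, 120, None))) # only one recorded
```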

Subset and equality constraints enable various classes of schema transformations to be stated in their most general form, and ORM’s more general support for these constraints allows more transformations to be easily visualized (e.g., see the equivalence theorem PSG2 in Section 14.2).

Although UML does not include a graphic notation for a pure exclusion constraint, it does include an exclusive-or constraint to indicate that each instance of a class plays exactly one association role from a specified set of alternatives. To indicate the constraint, “{xor}” is placed beside a dashed line connecting the relevant associations. Figure 9.31(a), which is based on an example from the UML specification, indicates that each account is used by a person or corporation but not both. For simplicity, reference schemes and other constraints are omitted.

Figure 9.31. Exclusive-or: each account is used by a person or corporation but not both.


Prior to version 1.3 of UML, “{or}” was used for this constraint, which was misleading since “or” is typically interpreted in the inclusive sense. The equivalent ORM model is shown in Figure 9.31(b), where the exclusive-or constraint is simply an orthogonal combination of a disjunctive mandatory role (inclusive-or) constraint (circled dot) and an exclusion constraint (circled “X”).
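
The composite nature of the exclusive-or constraint can be sketched by checking its two components separately. In the Python sketch below, the function name and the account identifiers are assumptions for illustration; each role population is modeled as a set of the accounts that play that role.

```python
def xor_violations(accounts, used_by_person, used_by_corporation):
    """Check the two components of an exclusive-or constraint separately."""
    # Inclusive-or (disjunctive mandatory): every account plays at least one role.
    plays_neither = accounts - (used_by_person | used_by_corporation)
    # Exclusion: no account plays both roles.
    plays_both = used_by_person & used_by_corporation
    return {"inclusive_or": plays_neither, "exclusion": plays_both}


# Hypothetical populations of account identifiers.
accounts = {"A1", "A2", "A3", "A4"}
used_by_person = {"A1", "A2"}
used_by_corporation = {"A2", "A3"}

print(xor_violations(accounts, used_by_person, used_by_corporation))
```

Here A4 would break the inclusive-or component and A2 the exclusion component; the xor constraint holds only when both result sets are empty.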

Although the current UML specification describes the xor constraint as applying to a set of associations, we need to apply the constraint to a set of roles (association-ends) to avoid ambiguity in cases with multiple common classes. Visually this could be shown by attaching the dashed line near the relevant ends of the associations, as shown in Figure 9.32(a). Unfortunately, UML attaches no significance to such positioning, so the xor constraint could be misinterpreted to mean that each company must lease or purchase some vehicle rather than the intended constraint that each vehicle is either leased or purchased, a constraint captured unambiguously by the ORM schema in Figure 9.32(b).

Figure 9.32. The exclusive-or constraint should apply between association roles.


UML has no symbols for exclusion or inclusive-or constraints. If UML symbols for these constraints are ever considered, then “{x}” and “{or}” respectively seem appropriate; this choice also exposes the composite nature of “{xor}”.

UML xor-constraints are intended to apply between single roles. The current UML specification seems to imply that these roles must belong to different associations. If so, UML cannot use an xor-constraint between roles of a ring fact type (e.g., between the husband and the wife roles of a marriage association). ORM exclusion constraints cover this case, as well as many other cases not expressible in UML graphic notation. As a trivial example, consider the difference between the following two constraints: no person both wrote a book and reviewed a book, and no person wrote and reviewed the same book. ORM clearly distinguishes these by noting the precise arguments of the constraint (compare Figure 9.33(a) with Figure 9.33(b)).

Figure 9.33. (a) Nobody wrote and reviewed a book; (b) nobody wrote and reviewed the same book; (c) UML version of (b).
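
The difference between the two exclusion constraints comes down to which projections are compared. The following Python sketch uses hypothetical populations of (person, book) pairs; the function names are assumptions for illustration.

```python
def role_exclusion_violations(wrote, reviewed):
    """Persons who both wrote some book and reviewed some book (constraint a)."""
    return {p for p, _ in wrote} & {p for p, _ in reviewed}


def pair_exclusion_violations(wrote, reviewed):
    """(person, book) pairs that appear in both fact types (constraint b)."""
    return wrote & reviewed


# Hypothetical populations: Ann wrote B1 and reviewed a different book, B2.
wrote = {("Ann", "B1"), ("Bob", "B2")}
reviewed = {("Ann", "B2"), ("Cy", "B1")}

print(role_exclusion_violations(wrote, reviewed))  # Ann violates constraint (a)
print(pair_exclusion_violations(wrote, reviewed))  # nobody violates constraint (b)
```

The same population can therefore satisfy the pair-exclusion constraint while violating the stronger exclusion constraint between the single roles.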


The pair-exclusion constraint in Figure 9.33(b) can be expressed in UML by a note connected by dotted lines to the two associations, as shown in Figure 9.33(c). Alternatively, one could attach a textual constraint to either the Person class (e.g., “bookAuthored and bookReviewed are disjoint sets”) or the Book class (e.g., “author and reviewer are disjoint sets”), but the choice of class would be arbitrary.

UML has no graphic notation for exclusion between attributes, or between attributes and association roles. An exclusion constraint in such cases may often be captured as a textual constraint. For example, in Figure 9.34(a), the exclusion constraint that each employee is either tenured or is contracted until some date may be captured by the textual constraint shown.

Figure 9.34. An exclusion constraint modeled in (a) UML and (b) ORM.


Here the constraint is specified in OCL. The expressions “->isEmpty()” and “->notEmpty()” are equivalent to “is null” and “is not null” in SQL. Figure 9.34(b) depicts the exclusion constraint graphically in ORM. There are other ways to model this case in UML (e.g., using subtypes) that offer more chance to capture the constraints graphically.
