9.7. Other Constraints and Derivation Rules

A value constraint restricts the population of a value type to a finite set of values specified either in full (enumeration), by start and end values (range), or some combination of both (mixture). The values themselves are primitive data values, typically character strings or numbers.

In UML, enumeration types may be modeled as classes, stereotyped as enumerations, with their values listed (somewhat unintuitively) as attributes. Ranges and mixtures may be specified by declaring a textual constraint in braces, using any formal or informal language. For example, see Figure 9.39(a).

Figure 9.39. Data value restrictions declared as enumerations or textual constraints.


Figure 9.39(b) depicts the same value constraints in ORM. Value constraints other than enumeration, range, and mixture may be declared in UML or ORM as textual constraints, for example, {committeeSize must be an odd number}. For further UML examples, see Rumbaugh et al. (1999, pp. 236, 268).
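
The three kinds of value constraint can also be sketched in code. The following is only an illustration, not part of UML or ORM: the class name ValueConstraint, the function is_odd_committee_size, and the sample values are all hypothetical, since the contents of Figure 9.39 are not reproduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ValueConstraint:
    """Allowed values given in full, by inclusive ranges, or by a mixture of both."""
    enumerated: frozenset = frozenset()
    ranges: tuple = ()  # inclusive (start, end) pairs

    def allows(self, value) -> bool:
        if value in self.enumerated:
            return True
        return any(start <= value <= end for start, end in self.ranges)


# Hypothetical examples of each kind of value constraint.
sex_code = ValueConstraint(enumerated=frozenset({"M", "F"}))            # enumeration
rating = ValueConstraint(ranges=((1, 7),))                              # range
mixed = ValueConstraint(enumerated=frozenset({0}), ranges=((10, 20),))  # mixture


def is_odd_committee_size(size: int) -> bool:
    """A constraint that is none of the three kinds, written as a plain predicate."""
    return size % 2 == 1
```

A constraint such as the odd committee size does not fit the enumeration, range, or mixture pattern, which is why it is stated here as a separate predicate, mirroring the textual constraint in the diagram.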

A ring fact type has at least two roles played by the same object type (either directly, or indirectly via a supertype). A ring constraint applies a logical restriction on the role pair. For example, the association Person is a parent of Person might be declared acyclic and intransitive.

UML does not provide ring constraints built in, so the modeler needs to specify these as a textual constraint in some chosen language or as a note. In UML, if a textual constraint applies to just one model element (e.g., an association), it may be added in braces next to that element, as in Figure 9.40(a). Here the {acyclic, intransitive} notation is nonstandard but is assumed to be user supported.

Figure 9.40. Ring constraints expressed in (a) UML, (b) UML, and (c) ORM.


It is the responsibility of the modeling tool to ensure that the constraint is linked internally to the relevant model element and to interpret any textual constraint expressions. If the tool cannot interpret the constraint, it should be placed inside a note (dogeared rectangle), without braces, showing that it is merely a comment, and explicitly linked to the relevant model element(s), as shown in Figure 9.40(b). Figure 9.40(c) displays the ring constraints graphically in ORM (see Section 7.3 for a detailed discussion of ring constraints in ORM).
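
To make the meaning of the two ring constraints concrete, the following sketch checks a population of the parenthood fact type for acyclicity and intransitivity. It is an illustration under stated assumptions only: the function names and the sample people are hypothetical, and a real tool would enforce such constraints quite differently.

```python
def is_intransitive(pairs):
    """True if there are no x, y, z with xRy, yRz and also xRz."""
    rel = set(pairs)
    return not any(
        (x, z) in rel
        for (x, y) in rel
        for (y2, z) in rel
        if y == y2
    )


def is_acyclic(pairs):
    """True if the directed graph of pairs has no cycle (self-loops count as cycles)."""
    successors = {}
    for x, y in pairs:
        successors.setdefault(x, set()).add(y)
    visiting, done = set(), set()

    def has_cycle_from(node):
        if node in done:
            return False
        if node in visiting:
            return True  # reached a node that is still on the current path
        visiting.add(node)
        found = any(has_cycle_from(n) for n in successors.get(node, ()))
        visiting.discard(node)
        done.add(node)
        return found

    return not any(has_cycle_from(n) for n in list(successors))


# Hypothetical population of: Person is a parent of Person.
parent_of = {("Ann", "Bob"), ("Bob", "Cal")}
```

Adding the pair (Ann, Cal) to this population would violate intransitivity, and adding (Cal, Ann) would violate acyclicity.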

A join constraint applies to one or more role sequences, at least one of which is projected from a path from one predicate through an object type to another predicate. The act of passing from one role through an object type to another role invokes a conceptual join, since the same object instance is asserted to play both the roles. Although join constraints arise frequently in real applications, UML has no graphic symbol for them. To declare them on a UML diagram, write a constraint or comment in a note attached to the model elements involved.

For example, Figure 9.41 links a comment to three associations. This example is based on a room scheduling application at a university with built-in facilities in various lecture and tutorial rooms. Example facility codes are PA = Personal Address system, DP = Data Projection facility, INT = Internet access.

Figure 9.41. Join constraint specified as a comment in UML.


As discussed in Section 10.1, ORM provides deep support for join constraints. Role sequences featuring as arguments in set comparison constraints may arise from projections over a join path.

For example, in Figure 9.42, the subset constraint runs from the Room-Facility role-pair projected from the path: Room at an HourSlot is booked for an Activity that requires a Facility. This path includes a conceptual join on Activity. The constraint may be formally verbalized as: If a Room at an HourSlot is booked for an Activity that requires a Facility then that Room provides that Facility. Figure 9.42 includes a satisfying population for the three fact types. This again illustrates how ORM facilitates validation of constraints via sample populations. The UML associations in Figure 9.41 are not so easily populated on the diagram. Other join constraint examples are discussed in Section 10.1.

Figure 9.42. A join-subset constraint in ORM.
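
The verbalization of the join-subset constraint translates directly into a check over sample populations. The sketch below is a hedged illustration: the function name join_subset_violations and the room, hour slot, and activity codes are hypothetical (the population in Figure 9.42 is not reproduced here); only the facility codes PA, DP, and INT come from the example above.

```python
def join_subset_violations(bookings, requirements, provisions):
    """Return (room, facility) pairs needed by a booked activity but not provided.

    bookings:     (room, hour_slot, activity) facts
    requirements: (activity, facility) facts
    provisions:   (room, facility) facts
    """
    # Conceptual join on Activity, then projection on Room and Facility.
    projected = {
        (room, facility)
        for (room, _slot, activity) in bookings
        for (act, facility) in requirements
        if act == activity
    }
    # The subset constraint holds when nothing is left over.
    return projected - set(provisions)


# Hypothetical sample populations.
bookings = {("20", "Mon 9 a.m.", "VMC"), ("33", "Mon 9 a.m.", "AQD")}
requirements = {("VMC", "DP"), ("VMC", "PA"), ("AQD", "INT")}
provisions = {("20", "DP"), ("20", "PA"), ("33", "DP")}

violations = join_subset_violations(bookings, requirements, provisions)
```

With these populations the constraint is violated once: room 33 is booked for an activity that requires Internet access, which that room does not provide.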


In UML, the term “aggregation” is used to describe a whole/part relationship. For example, a team of people is an aggregate of its members, so this membership may be modeled as an aggregation association between Team and Person. Several different forms of aggregation might be distinguished in real world cases. For example, Odell and Bock (Odell 1998, pp. 137-165) discuss six varieties of aggregation (component-integral, material-object, portion-object, place-area, member-bunch, and member-partnership), and Henderson-Sellers (Barbier et al., 2003) also distinguishes several kinds of aggregation.

UML 2 associations are classified into one of three kinds: ordinary association (no aggregation), shared (or simple) aggregation, or composite (or strong) aggregation. Hence UML recognizes only two varieties of aggregation: shared and composite. Some versions of ER include an aggregation symbol (typically only one kind). ORM and popular ER approaches currently include no special symbols for aggregation.

These different stances with respect to aggregation are somewhat reminiscent of the different modeling positions with respect to null values. Although over 20 kinds of null have been distinguished in the literature, the relational model recognizes only 1 kind of null. Codd’s version 2 of the relational model includes 2 kinds of null, and ORM argues that nulls have no place in base conceptual models (because all its asserted facts are atomic). But let’s return to the topic at hand.

Shared aggregation is denoted in UML as a binary association, with a hollow diamond at the “whole” or “aggregate” end of the association. Composition (composite aggregation) is depicted with a filled diamond. For example, Figure 9.43(a) depicts a composition association from Club to Team and a shared aggregation association from Team to Person.

Figure 9.43. Composition (composite aggregation) and shared aggregation in UML.


In ORM, which currently has no special notation for aggregation, this situation would be modeled as shown in Figure 9.43(b). Does Figure 9.43(a) convey any extra semantics, not captured in Figure 9.43(b)? At the conceptual level, it is doubtful whether there is any additional useful semantics. At the implementation level, however, there is additional semantics. Let’s discuss this in more detail.

The UML specification declares that “both kinds of aggregation define a transitive ... relationship”. The use of “transitive” here is somewhat misleading, since it refers to indirect aggregation associations rather than base aggregation associations. For example, if Club is an aggregate of Team, and Team is an aggregate of Person, it follows that Club is an aggregate of Person.

However, if we wanted to discuss this result, it should be exposed as a derived association. In UML, derived associations are indicated by prefixing their names with a slash “/”. The derivation rule can be expressed as a constraint, either connected to the association by a dependency arrow or simply placed beside the association as in Figure 9.44(a).

Figure 9.44. A derived aggregation in (a) UML and (b) ORM.


In ORM, derived fact types are marked with a trailing asterisk, with their derivation rules specified in an ORM textual language (see Figure 9.44(b)). In many cases, derivation rules may also be diagrammed as a join-subset or join-equality constraint. As this example illustrates, the derived transitivity of aggregations can be captured in ORM without needing a special notation for aggregation.
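
The derived association between Club and Person amounts to a join of the two base associations on Team. The following sketch assumes a rule of roughly that form (the exact wording of the rule in Figure 9.44 is not reproduced here); the function name derive_club_members and the sample clubs, teams, and people are hypothetical.

```python
def derive_club_members(club_of_team, team_members):
    """Derive (person, club) pairs: a person is in a club via some team of that club.

    club_of_team: maps each team to its single club (composition: one whole per part)
    team_members: (team, person) facts (shared aggregation: a person may be in many teams)
    """
    return {
        (person, club_of_team[team])
        for (team, person) in team_members
        if team in club_of_team
    }


# Hypothetical sample populations.
club_of_team = {"T1": "Chess Club", "T2": "Chess Club", "T3": "Go Club"}
team_members = {("T1", "Ann"), ("T2", "Ann"), ("T3", "Bob")}

club_members = derive_club_members(club_of_team, team_members)
```

Because the result is computed from the base facts, it never needs to be asserted separately, which is the point of marking the association as derived.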

The UML specification declares that “both kinds of aggregation define a transitive, antisymmetric relationship (i.e., the instances form a directed, noncyclic graph)”. Recall that a relation R is antisymmetric if and only if, for all x and y, if x is not equal to y then xRy implies that not yRx. It would have been better to simply state that paths of aggregations must be acyclic.

At any rate, this rule is designed to stop errors such as that shown in Figure 9.45. If a person is part of a team, and a team is part of a club, it doesn’t make sense to say that a club is part of a person. Since ORM does not specify whether an association is an aggregation, illegal diagrams like this can’t occur in ORM.

Figure 9.45. Illegal UML model. Aggregations should not form a cycle.


Of course, it is possible for an ORM modeler to make a silly mistake by adding an association such as Club is part of Person, where “is part of” is informally understood in the aggregation sense, and this would not be formally detectable. But avoidance of such a bizarre occurrence doesn’t seem to be a compelling reason to add aggregation to ORM’s formal notation. There are plenty of associations between Club and Person that do make sense, and plenty that don’t. In some cases, however, it is important to assert constraints such as acyclicity, and this is handled in ORM by ring constraints. That said, there have been some recent proposals to add formal semantics for various forms of the part-of relationship to ORM. For example, Keet (2006) proposes adding several different mereological part-of predicates as well as four kinds of meronymic relations.

Composition does add some important semantics to shared aggregation. To begin with, it requires that each part belongs to at most one whole at a time. In ORM, this is captured by adding a uniqueness constraint to the role played by the part (e.g., see the role played by Team in Figure 9.43(b)). In UML, the multiplicity at the whole end of the association must be 1 or 0..1. If the multiplicity is 1, as in Figure 9.43(a), the role played by the part is both unique and mandatory, as in Figure 9.43(b).

As an example where the multiplicity is 0..1 (i.e., where a part optionally belongs to a whole), consider the ring fact type of Figure 9.46: Package contains Package. Here “contains” is used in the sense of “directly contains”. The UML specification notes that “composition instances form a strict tree (or rather a forest)”. This strengthening from directed acyclic graph to tree is an immediate consequence of the functional nature of the association (each part belongs to at most one whole), and hence ORM requires no additional notation for this. In this example, the ORM schema explicitly includes an acyclic constraint. This direct containment association is intransitive by implication (acyclicity implies irreflexivity, and any functional, irreflexive association is intransitive).

Figure 9.46. Direct containment modeled in (a) UML and (b) ORM.
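
The step from directed acyclic graph to forest can be seen by storing direct containment as a mapping from each package to its single container, so that the functional nature of the association is built into the data structure. This is a sketch under stated assumptions: the function name is_forest and the package names are hypothetical, and None is assumed never to be used as a package name.

```python
def is_forest(container_of):
    """True if following containers upward from any package never revisits a package.

    container_of maps each contained package to its one direct container, so the
    uniqueness constraint (at most one whole per part) holds by construction.
    """
    for start in container_of:
        seen = {start}
        node = container_of.get(start)
        while node is not None:
            if node in seen:
                return False  # a cycle, so the acyclic constraint is violated
            seen.add(node)
            node = container_of.get(node)
    return True


# Hypothetical population of: Package contains Package.
container_of = {"B": "A", "C": "A", "D": "B"}
```

Here package A has no container, which corresponds to the 0..1 multiplicity at the whole end.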


UML allows some alternative notations for aggregation. If a class is an aggregate of more than one class, the association lines may be shown joined to a single diamond, as in Figure 9.47(a). For composition, the part classes may be shown nested inside the whole by using role names, and multiplicities of components may be shown in the top right-hand corners, as in Figure 9.47(b).

Figure 9.47. Alternative UML notations for aggregation.


Some authors list kinds of association that are easily confused with aggregation but should not be modeled as such (e.g., topological inclusion, classification inclusion, attribution, attachment, and ownership (Martin and Odell 1998; Odell 1998)).

For example, Finger belongs to Hand is an aggregation, but Ring belongs to Finger is not. There is some disagreement among authors about what should be included on this list. For example, some treat attribution as a special case of aggregation, namely, a composition between a class and the classes of its attributes (Rumbaugh et al. 1999).

For conceptual modeling purposes, agonizing over such distinctions might not be worth the trouble. Obviously there are different stances that you could take about how, if at all, aggregation should be included in the conceptual modeling phase. You can decide what’s best for you. The chapter notes provide further discussion on this issue.

Let’s now look at the notion of initial values. The basic syntax of an attribute specification in UML includes six components as shown. Square and curly brackets are used literally here as delimiters (not as BNF symbols to indicate optional components).

visibility name [multiplicity] : type-expression = initial-value {property string}

If an attribute is displayed at all, its name is the only thing that must be shown. The visibility marker (+, #, –, ~ denote public, protected, private, package respectively) is an implementation concern and will be ignored in our discussion. Multiplicity has been discussed earlier and is specified for attributes in square brackets, e.g., [1..*].

For attributes, the default multiplicity is 1, that is, [1..1]. The type expression indicates the domain on which the attribute is based (e.g., String, Date). Initial-value and property string declarations may optionally be declared. Property strings may be used to specify aspects such as changeability (see later).

An attribute may be assigned an initial value by including the value in the attribute declaration after “=” (e.g., diskSize = 9; country = USA; priority = normal). The language in which the value is written is an implementation concern.

In Figure 9.48(a), the nrColors attribute is based on a simple domain (e.g., PositiveInteger) and has been given an initial value of 1. The resolution attribute is based on a composite domain (e.g., PixelArea) and has been assigned an initial value of (640,480).

Figure 9.48. Attributes assigned initial values in (a) UML and (b) ORM extension.


Unless overridden by another initialization procedure (e.g., a constructor), declared initial values are assigned when an object of that class is created. This is similar to the database notion of default values, where during the insertion of a tuple an attribute may be assigned a predeclared default value if a value is not supplied by the user.
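
In program code, declared initial values behave much like defaulted fields that a constructor call may override. The sketch below illustrates this with the ClipArt example; the Python field names and the identifying attribute picture_nr are assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass
class ClipArt:
    picture_nr: int                 # hypothetical identifying attribute
    nr_colors: int = 1              # initial value on a simple domain
    resolution: tuple = (640, 480)  # initial value on a composite domain


plain = ClipArt(1)                   # both initial values are assigned on creation
rich = ClipArt(2, nr_colors=256)     # the constructor call overrides one of them
```

The second object keeps the declared initial resolution, since only the number of colors was supplied explicitly.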

However, UML uses the term “default value” in other contexts only (e.g., template and operation parameters), and some authors claim that default values are not part of UML models (Rumbaugh et al. 1999, p. 249).

The SQL standard treats null as a special instance of a default value, and this is supported in UML, since the specification notes that “a multiplicity of 0..1 provides for the possibility of null values: the absence of a value”. So an optional attribute in UML can be used to model a feature that will appear as a column with the default value of null, when mapped to a relational database. Presumably a multiplicity of [0..*] or [0..n] for any n > 1 also allows nulls for multivalued attributes, even though an empty collection could be used instead.

Currently, ORM has no explicit support for initial/default values. However, UML initial values and relational default values could be supported by allowing default values to be specified for ORM roles. At the meta-level, we add the fact type: Role has default- Value. At the external level, instances of this could be specified on a predicate properties sheet, or entered on the diagram (e.g., by attaching an annotation such as d: value to the role, and preferably allowing this display to be toggled on/off). For example, the role played by NrColors in Figure 9.48(b) is allocated a default value of 1. When mapped to SQL, this should add the declaration “default 1” to the column definition for ClipArt.nrColors.

To support the composite initial values allowed in UML, composite default values could be specified for ORM roles played by compositely identified object types (co-referenced or nested). When coreferencing involves at least two roles played by the same or compatible object types, an order is needed to disambiguate the meaning of the composite value. For example, in Figure 9.48(b) the role played by Resolution is assigned a default composite value of (640,480). To ensure that the 640 applies to the horizontal pixelcount and the 480 applies to the vertical pixelcount (rather than the other way round), this ordering needs to be applied to the defining roles of the external uniqueness constraint. ORM tools often determine this ordering from the order in which the roles are selected when entering this constraint.
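
The relational mapping of such defaults can be tried out with an in-memory SQLite database. This is a minimal sketch, assuming that Resolution maps to two columns and that the key column is called pictureNr; apart from ClipArt.nrColors, the column names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE ClipArt (
        pictureNr   INTEGER PRIMARY KEY,          -- hypothetical key column
        nrColors    INTEGER NOT NULL DEFAULT 1,
        hPixelCount INTEGER NOT NULL DEFAULT 640, -- first component of (640,480)
        vPixelCount INTEGER NOT NULL DEFAULT 480  -- second component of (640,480)
    )
    """
)

# No values supplied for the defaulted columns in the first row.
conn.execute("INSERT INTO ClipArt (pictureNr) VALUES (1)")
# An explicitly supplied value takes precedence over the default.
conn.execute("INSERT INTO ClipArt (pictureNr, nrColors) VALUES (2, 256)")

rows = conn.execute(
    "SELECT pictureNr, nrColors, hPixelCount, vPixelCount "
    "FROM ClipArt ORDER BY pictureNr"
).fetchall()
```

The order of the two pixel-count columns reflects the ordering issue just discussed: the composite default is only unambiguous once the order of its components is fixed.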

If all or most roles played by an object type have the same default, it may be useful to allow a default value to be specified for the object type itself. This could be supported in ORM by adding the meta fact type ObjectType has default- Value and providing some notation for instantiating it (e.g., by an entry in an Object Type Properties sheet or by annotating the object type shape with d: value). This corresponds to the default clause permitted in a create-domain statement in the SQL standard. Note that an object-type default can always be expressed instead by role-based defaults, but not conversely (since the default may vary with the role).

Specification of default values does not cover all the cases that can arise with regard to default information in general. A proposal for providing greater support for default information in ORM is discussed in Halpin and Vermeir (1997), but this goes beyond the built-in support for defaults in either UML or SQL. Default information can be modeled informally by using a predicate to convey this intention to a human. For example, we might specify default medium (e.g., ‘CD’, ‘DVD’) preferences for delivery of soft products (e.g., music, video, software) using the 1:n fact type: Medium is default preference for SoftProduct.

In cases like this where default values overlap with actual values, we may also wish to classify instances of relevant fact types as actual or default (e.g., Shipment used Medium). For the typical case where the uniqueness constraint on the fact type spans n - 1 roles, this can be achieved by including fact types to indicate the default status (e.g., Shipment was based on Choice {actual, default}), resulting in extra columns in the database to record the status. While this approach is generic, it requires the modeler and user to take full responsibility for distinguishing between actual and default values.

In UML, restrictions may be placed on the changeability of attributes, as well as the roles (ends) of binary associations. It is unclear whether changeability may be applied to the ends of n-ary associations. UML 2 recognizes the following four values for changeability, only one of which can apply at a given time:

  • unrestricted

  • readOnly

  • addOnly

  • removeOnly

The default changeability is “unrestricted” (any change is permitted). The value “unrestricted” was formerly called “changeable”, which itself was formerly called “none”. The other settings may be explicitly declared in braces. For an attribute, the braces are placed at the end of the attribute declaration. For an association, the braces are placed at the opposite end of the association from the object instance to which the constraint applies.

Recall that in UML a “link” is an instance of an association. The term “readOnly” (formerly called “frozen”) means that once an attribute value or link has been inserted, it cannot be updated or deleted, and no additional values/links may be added to the attribute/association (for the constrained object instance).

The term “addOnly” means that although the original value/link cannot be deleted or updated, other values/links may be added to the attribute/association (for the constrained object instance). Clearly, addOnly is only meaningful if the maximum multiplicity of the attribute/association-role exceeds its minimum multiplicity. The term “removeOnly” means that the only change permitted for an existing attribute value or link is to delete it.
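
The four changeability settings can be read as rules about which updates are accepted for the values held by one object instance. The following sketch is an illustration only: the class Slot, the exception ChangeabilityError, and the helper attempt are hypothetical names, multiplicity is not enforced, and an update of an existing value is treated simply as a removal followed by an addition.

```python
class ChangeabilityError(Exception):
    """Raised when a change violates a changeability setting."""


class Slot:
    """Values of one attribute or association role for one object instance."""

    def __init__(self, initial, changeability="unrestricted"):
        self.values = set(initial)
        self.changeability = changeability

    def add(self, value):
        if self.changeability in ("readOnly", "removeOnly"):
            raise ChangeabilityError(f"{self.changeability}: cannot add {value!r}")
        self.values.add(value)

    def remove(self, value):
        if self.changeability in ("readOnly", "addOnly"):
            raise ChangeabilityError(f"{self.changeability}: cannot remove {value!r}")
        self.values.discard(value)


def attempt(action, value):
    """Return True if the change is permitted, False if it is rejected."""
    try:
        action(value)
        return True
    except ChangeabilityError:
        return False


languages = Slot({"Latin", "Japanese"}, "addOnly")
birth_country = Slot({"Australia"}, "readOnly")
```

With these settings, a language may be added but never removed, and the birth country accepts no change at all.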

As a simple if unrealistic example, see Figure 9.49. Here employee number, birthdate, and country of birth are readOnly for Employee, so they cannot be changed from their original value. For instance, if we assign an employee the employee number 007, and enter his/her birthdate as 02/15/1946 and birth country as ‘Australia’, then we can never make any changes or additions to that.

Figure 9.49. Changeability of attributes and association roles.


Notice also that for a given employee, the set of languages and the set of countries visited are addOnly. Suppose that when facts about employee 007 are initially entered, we set his/her languages to {Latin, Japanese} and countries visited to {Japan}. As long as employee 007 is referenced in the database, these facts may never be deleted. However, we may add to these (e.g., later we might add the facts that employee 007 speaks German and visited India).

By default, the other properties are changeable. For example, employee 007 might legally change his name from ‘Terry Hagar’ to ‘Hari Seldon’, and the countries he wants to visit might change over time from {Ireland, USA} to {Greece, Ireland}.

Some traditional data modeling approaches also note some restrictions on changeability. As discussed in the previous chapter, the Barker ER notation includes a diamond to mark a relationship as nontransferable (once an instance of an entity type plays a role with an object, it cannot ever play this role with another object). Although changeability restrictions can be useful, in practice their application in database settings is limited.

One reason for this is that we almost always want to allow facts entered into a database to be changed. With snapshot data, this is the norm, but even with historical data changes can occur. The most common occurrence of this is to allow for corrections of mistakes, which might be because we were told the wrong information originally or because we carelessly made a misspelling or typo when entering the data.

In exceptional cases, we might require that mistakes of a certain kind be retained in the database (e.g., for auditing purposes) but be corrected by entering later facts to compensate for the error. This kind of approach makes sense for bank transactions (see Figure 9.50). For example, if a deposit transaction for $100 was mistakenly entered as $1000, the record of this error is kept, but once the error is detected it can be compensated for by a bank withdrawal of $900. As a minor point, the balance is both derived and stored, and its readOnly status is typically implied by the readOnly settings on the base attributes, together with a rule for deriving balance.

Figure 9.50. All attributes of Transaction are read only.
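
The compensation approach can be sketched as an append-only ledger of immutable transaction records, with the balance derived from them. This is a hedged illustration: the field names and the function balance are assumptions, since the attributes shown in Figure 9.50 are not reproduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)  # frozen: attribute values cannot be reassigned after creation
class Transaction:
    tran_nr: int
    kind: str    # 'deposit' or 'withdrawal'
    amount: int  # whole dollars


def balance(transactions):
    """Derive the balance from the recorded transactions."""
    return sum(t.amount if t.kind == "deposit" else -t.amount for t in transactions)


# A deposit of $100 is mistakenly entered as $1000; the record is kept.
ledger = (Transaction(1, "deposit", 1000),)
# Once the error is detected, a compensating withdrawal of $900 is added.
ledger += (Transaction(2, "withdrawal", 900),)
```

The erroneous record remains available for auditing, while the derived balance reflects the intended $100.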


Some authors allow changeability to be specified for a class, as an abbreviation for declaring this for all its attributes and opposite association ends (Booch et al. 1999, p. 184). For instance, all the {readOnly} constraints in Figure 9.50 might be replaced by a single {readOnly} constraint below the name “Transaction”. While this notation is neater, it would be rarely used. Even in this example, we would probably want to allow for the possibility of adding nonfrozen information later (e.g., a transaction might be audited by zero or more auditors).

Changeability settings are useful in the design of program code. Although changeability settings are not currently supported in ORM, which focuses on static constraints, they are being considered in extensions to support dynamic constraints (see next chapter). In the wider picture, being able to completely model security issues (e.g., who has the authority to change what) would provide extra value.

As discussed earlier, UML allows {ordered} and {unique} properties to be specified for multivalued attributes and association ends. Since {unique} is true by default, the use of {ordered} alone indicates an ordered set (a sequence with no duplicates). For example, Figure 9.51(a) shows one way of modeling authorship of papers in UML. Each paper has a list or sequence of authors, each of whom may appear at most once on the list.

Figure 9.51. An ordered set modeled in (a) UML and (b) ORM.


This may be modeled in flat ORM by introducing a Position object type to store the sequential position of any author on the list, as shown in Figure 9.51(b). The uniqueness constraint on the first two roles declares that for each paper an author occupies at most one position; the constraint covering the first and third roles indicates that for any paper, each position is occupied by at most one author. The textual constraint indicates that the list positions are numbered sequentially from 1.
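
The two uniqueness constraints and the textual constraint can be checked against a sample population of the ternary fact type. The sketch below is an illustration only; the function names and the sample papers and authors are hypothetical.

```python
from collections import defaultdict


def is_valid_author_list(facts):
    """Check (paper, author, position) facts against the three constraints."""
    facts = set(facts)
    # For each paper, an author occupies at most one position.
    if len({(paper, author) for paper, author, _ in facts}) != len(facts):
        return False
    # For each paper, each position is occupied by at most one author.
    if len({(paper, pos) for paper, _, pos in facts}) != len(facts):
        return False
    # Positions for each paper are numbered sequentially from 1.
    positions = defaultdict(set)
    for paper, _, pos in facts:
        positions[paper].add(pos)
    return all(ps == set(range(1, len(ps) + 1)) for ps in positions.values())


def author_at(facts, paper, position):
    """Return the author at the given position of a paper, or None."""
    for p, author, pos in facts:
        if p == paper and pos == position:
            return author
    return None


# Hypothetical population.
authorship = {(21, "Ann", 1), (21, "Bob", 2), (22, "Bob", 1)}
```

A query such as the second author of paper 21 is then a simple lookup on the position role.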

Although this ternary representation may appear awkward, it is easy to populate and it facilitates any discussion involving position (e.g., who is the second author for paper 21?). From an implementation perspective, an ordered set structure could still be chosen.

An ordered set is an example of a collection type. As discussed in Chapter 10, some versions of ORM allow collections to be specified as mapping annotations in a similar way to UML, and some ORM versions allow collections to be modeled directly as first class objects.

UML 2 introduced the notion of association redefinition. This concept is complex and applies to generalizations as well as associations. One main use of it is to specify stronger constraints on an association role that specializes a role played by a supertype. For example, in Figure 9.52(a) the executiveCar role redefines the assignedCar role, applying a stronger multiplicity constraint on it that applies only to executives. Effectively, the association Executive is assigned CompanyCar is treated as a specialization of the Employee is assigned CompanyCar association. Although some versions of ORM support a similar notion, most ORM versions require the stronger multiplicity to be asserted in a textual constraint, as shown in Figure 9.52(b).

Figure 9.52. Association redefinition in (a) UML and (b) ORM.


Now let’s consider derived data. In UML, derived elements (e.g., attributes, associations, or association-roles) are indicated by prefixing their names with “/”. Optionally, a derivation rule may be specified as well. The derivation rule can be expressed as a constraint or note, connected to the derived element by a dashed line. This line is actually shorthand for a dependency arrow, optionally annotated with the stereotype name «derive». Since a constraint or note is involved, the arrow-tip may be omitted (the constraint or note is assumed to be the source). For example, Figure 9.53(a) includes area as a derived attribute. Figure 9.53(b) shows the ORM schema.

Figure 9.53. Derivation of area in (a) UML and (b) ORM.


The UML dependency line may also be omitted entirely, with the constraint shown in braces next to the derived element (in this case, it is the modeling tool’s responsibility to maintain the graphical linkage implicitly). A club-membership example of this was included earlier.

As another example, Figure 9.54(a) expresses uncle information as a derived association. For illustration purposes, role names are included for all association ends. The corresponding ORM schema is shown in Figure 9.54(b), where the derivation rule is specified in relational style.

Figure 9.54. Derived uncle association in (a) UML and (b) ORM.


Although precise role names are not always elegant, the use of role names in derivation rules involving a path projection can facilitate concise expression of rules, as shown here in the UML model. By adding role names to the ORM schema, the derivation rule may be specified compactly in attribute style thus: * define uncle of Person as brother of parent of Person. More complex derivation rules can be stated informally in English or formally in a language such as OCL.
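
The attribute-style rule reads naturally as a join of the brotherhood and parenthood fact types on the parent. The following sketch shows one way to compute the derived facts; the function name derive_uncles and the sample people are hypothetical, and the rule covers only uncles who are brothers of a parent, exactly as stated above.

```python
def derive_uncles(parent_of, brother_of):
    """Derive (uncle, person) pairs: uncle of Person is brother of parent of Person.

    parent_of:  (parent, child) facts
    brother_of: (brother, sibling) facts
    """
    return {
        (brother, child)
        for (parent, child) in parent_of
        for (brother, sibling) in brother_of
        if sibling == parent
    }


# Hypothetical sample populations.
parent_of = {("Ann", "Cal"), ("Bob", "Cal"), ("Ann", "Dee")}
brother_of = {("Tom", "Ann"), ("Jim", "Bob")}

uncle_of = derive_uncles(parent_of, brother_of)
```

Because the rule is expressed over roles rather than attributes, it would be unaffected if one of the base fact types were later remodeled.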

One advantage of ORM’s approach to derivation rules is that it is more stable, since it is not impacted by schema changes such as attributes being later remodeled as associations.
