5.2. Mandatory and Optional Roles

Consider the output report of Table 5.1. The question mark “?” denotes a null, indicating that an actual value is not recorded. For instance, patient 002 may have a phone but this information is not recorded, or he/she may simply have no phone.

Table 5.1. Details about hospital patients.

Patient Nr   Patient Name   Phone
001          Adams C        2057642
002          Brown S        ?
003          Collins T      8853020


Figure 5.4. A populated schema for Table 5.1.


Patients are identified by a patient number. Different patients may have the same name and even the same phone number, so the population of Table 5.1 is not significant. We must record each patient’s name, but it is optional whether we record a phone number. Figure 5.4 shows a preliminary conceptual model for this situation. Conceptual facts are elementary, so they cannot contain nulls. The null in Table 5.1 is catered for by the absence of a fact for patient 002 in the fact type Patient has PhoneNr.

In step 5 of the design procedure, each role is classified as mandatory or optional. A role is mandatory if and only if, for all states of the database, the role must be played by every member of the population of its object type; otherwise the role is optional. A mandatory role is also called a total role, since it is played by the total population of its object type.

Which of the four roles in Figure 5.4 are mandatory and which are optional? If the diagram includes all the fact types for the UoD, and its sample population is significant, we can easily determine which roles are mandatory.

In practice, however, we mostly work with subschemas rather than the complete, or global, schema, and sample populations are rarely significant. In such cases, we should check with the domain expert whether the relevant information must be recorded for all instances of the object type (e.g., must we record the name of each patient?).

Consider the two fact roles played by Patient. For the database state shown, the populations of these roles are {001, 003} and {001, 002, 003}. If these are the only fact roles of Patient, then pop(Patient) = {001, 002, 003}. The role of having a name is played by all recorded patients, and the role of having a phone number is played by only some. Assuming the population is significant in this regard, the name role is mandatory and the phone number role is optional.
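The reasoning above can be sketched in Python. This is only an illustration under stated assumptions: the role labels, the variable names, and the helper `role_status` are invented for this example, and the result is meaningful only if the sample population is significant.

```python
# Populations of the two fact roles played by Patient in the sample state.
# A null in the report is modeled as the absence of a fact.
role_pops = {
    "has PatientName": {"001", "002", "003"},
    "has PhoneNr": {"001", "003"},
}

# If these are Patient's only fact roles, pop(Patient) is their union.
pop_patient = set().union(*role_pops.values())

def role_status(role_pop, object_pop):
    """A role looks mandatory for this state if every recorded object plays it."""
    return "mandatory" if role_pop == object_pop else "optional"

status = {role: role_status(pop, pop_patient) for role, pop in role_pops.items()}
print(status)
```

A single database state can only refute a mandatory role, never prove it, which is why the domain expert should confirm the classification.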

To indicate explicitly that a role is mandatory, we add a mandatory role dot to the line that connects the role to its object type. This dot may be placed at either end of the role line. In Figure 5.5(a), the dot is placed at the object type end. This reinforces the global nature of the constraint in applying to the object type’s population. If we add a patient instance to the population of a mandatory Patient role, we must also include this instance in the population of each other role declared mandatory for Patient.

Figure 5.5. Mandatory role constraint shown as a dot at either end of the role line.


In this sense, a mandatory role constraint can have an impact beyond its predicate. In contrast, a uniqueness constraint is local in nature, constraining just the population of its role(s), with no impact on other predicates.

In Figure 5.5(b), the mandatory dot is placed at the role end. This choice is useful when an object type’s role lines are so close together that if we added a dot at the object type end it would be unclear which role is intended. This situation can arise when an object type plays a large number of roles displayed on the same schema page.

In Figure 5.5 the first role of the top predicate is both mandatory and unique. The mandatory role constraint verbalizes as Each Patient has at least one PatientName, or equivalently, as Each Patient has some PatientName. The uniqueness constraint verbalizes as Each Patient has at most one PatientName. In combination, these constraints verbalize as Each Patient has exactly one PatientName (i.e., each recorded patient has one and only one name recorded). In general, at least one + at most one = exactly one.
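The rule "at least one + at most one = exactly one" can be checked mechanically on a population. The following is a minimal sketch, assuming facts are stored as (patient number, name) pairs; the function name `name_constraints` and the second name given to patient 001 are hypothetical and serve only to show a violation.

```python
from collections import Counter

def name_constraints(patients, name_facts):
    """Check the two constraints on the first role of Patient has PatientName.

    patients   -- set of recorded patient numbers, i.e. pop(Patient)
    name_facts -- set of (patient_nr, name) pairs
    """
    names_per_patient = Counter(patient for patient, _ in name_facts)
    at_least_one = all(names_per_patient[p] >= 1 for p in patients)  # mandatory role
    at_most_one = all(names_per_patient[p] <= 1 for p in patients)   # uniqueness
    return at_least_one, at_most_one, at_least_one and at_most_one

patients = {"001", "002", "003"}
ok = {("001", "Adams C"), ("002", "Brown S"), ("003", "Collins T")}
no_name_for_002 = ok - {("002", "Brown S")}
two_names_for_001 = ok | {("001", "Adams Charles")}  # hypothetical second name

print(name_constraints(patients, ok))                 # (True, True, True)
print(name_constraints(patients, no_name_for_002))    # (False, True, False)
print(name_constraints(patients, two_names_for_001))  # (True, False, False)
```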

In a completed schema, if two or more fact roles are played by the same object type, then individually these roles are optional unless marked mandatory. For example, in Figure 5.5 the first role of Patient has PhoneNr is optional. This means it is possible to add a patient to the database population without adding a phone number for that patient.

What about the roles played by PatientName and PhoneNr? If these are the only roles played by these object types in the global schema, then these roles are mandatory. Unless declared independent (see Section 6.3), each primitive object must play some role, and each primitive entity must play some fact role. Hence by default, if a primitive object type plays only one role, or a primitive entity type plays only one fact role (in the global schema), this role is mandatory. In such cases, the implied mandatory role dot is usually omitted. In Figure 5.5, if no other roles are played by PatientName and PhoneNr, their roles are mandatory by implication. In this case, although not recommended, these implied constraints could be marked explicitly as shown in Figure 5.6.

Figure 5.6. Implied mandatory constraints specified explicitly (not recommended).


Explicit depiction of implied mandatory role constraints has several disadvantages. First, it complicates schema evolution. For example, suppose that tomorrow we decide to add the fact types: Patient had previous- PatientName; Patient has secondary- PhoneNr. With the implicit approach, this is simply an addition to the current schema. But with the explicit approach, we need to delete the formerly implied mandatory role constraints and replace them with weaker constraints (e.g., Each PatientName is of, or was of, at least one Patient; Each PhoneNr is used by, or is a secondary number for, at least one Patient—such inclusive-or constraints are discussed shortly).

Another problem with marking implied mandatory constraints is that this de-emphasizes the mandatory role constraints that are really important, that is, the ones we need to enforce (e.g., you must record a name for each patient). If you think about it, you should see that implied mandatory constraints are enforced automatically and hence have no implementation impact.

If we demanded the explicit approach for simple mandatory constraints, we should do the same for disjunctive mandatory (inclusive-or) constraints. But if the roles involved in the disjunction occur on separate schema pages, there is no convenient way to mark the constraint between them.

Another problem with the explicit approach is that it complicates theorem specification. Later on we consider some schema equivalence theorems. In this context, the implicit approach allows us to discuss schema fragments independently of other roles played in the global schema, but the explicit approach would forbid this, leading to extra complexity.

For such reasons, the explicit specification should be avoided unless we have some special reason for drawing the attention of a human reader to the implied constraints. As discussed in the next chapter, the rule for implicit mandatory roles does not apply to subtypes: if they play just one fact role, this is optional unless marked mandatory.

If a role is mandatory, its population always equals the total population of its attached object type. Figure 5.7 depicts the general case for an object type A and an attached role r. The role r may occur in any position in a predicate of any arity, and A may participate in other predicates as well. Mandatory role constraints are enforced on populations rather than types. If the only fact stored in the database is that patient 001 is named “Adams C”, the name role played by Patient is still mandatory even though many more instances of the Patient type have yet to be added to the population.

Figure 5.7. The role r is mandatory.


Recall that the UoD (typically part of the real world) is not the same thing as the recorded world or formal model of the UoD. Like other constraints, mandatory role constraints are assertions about our model, and do not necessarily apply to the world being modeled. With our current example, it is optional whether a patient has a phone. This simply means that we do not need to record a phone number for each patient. Maybe this is because in the real world not every patient has a phone or perhaps each patient does have a phone but some patients won’t supply their phone number.

The real world schema of Figure 5.8(a) concerns applicants for an academic position. By nature, each person has exactly one gender. As a business decision, each applicant is required to have a degree. Applicants may or may not have a fax number. Figure 5.8(b) shows the model actually used. Here, applicants may choose whether to have their gender recorded, but must provide details about their degrees. For this model, a business decision has been made (perhaps to avoid gender bias) to remove a mandatory constraint that applies in the real world. This fairly common practice of relaxing some real world mandatory constraint should always be a conscious decision.

Figure 5.8. A conscious decision to make recording of an Applicant’s gender optional.


Do not read a mandatory role constraint to mean “if an object plays that role in real life then we must record it”. The information system can work only with the model we give it of the world—it cannot enforce real world constraints not expressed in this model.

The following checking procedure helps ensure that our mandatory role constraints are correct. For each mandatory role: is it mandatory in the real world? If not, make it optional. For each optional role: is it optional in the real world? If not, what reasons are there for making it optional? We refine this further when we discuss subtypes.

In Figure 5.8(b) the mandatory constraint is on a nonfunctional role (i.e., a role not covered by a simple uniqueness constraint). In this case it is acceptable, since we wouldn’t want to hire any academic without a degree, and we would be interested in all their degrees.

In general, however, you should be wary of adding mandatory constraints to nonfunctional roles. Do so only if you are sure the constraint is needed. Such cases can lead to complexities (e.g., referential cycles) in the implementation, which are best avoided if possible. Of course if your application really requires such a constraint then you should declare it, regardless of its implementation impact.

Novice modelers tend to be too heavy-handed with mandatory role constraints, automatically making a role mandatory if it’s mandatory in the real world. In practice, however, it is often best to make some of these roles optional in the model to allow for cases where for some reason we can’t obtain all the information. As a general piece of advice, make a role mandatory if and only if you need to.

When an object type plays more than one role, special care is needed in updating the database to take account of mandatory roles. Suppose we want to add some facts from Table 5.1 into a database constrained by the conceptual schema of Figure 5.5 (recall that it is mandatory to record a patient’s name but optional to record a phone number).

Assuming the database is initially empty, an untutored user might proceed as follows.

User                                          CIP
add: Patient 001 has PatientName ‘Adams C’    → accepted.
add: Patient 001 has PhoneNr ‘2057642’        → accepted.
add: Patient 003 has PhoneNr ‘8853020’        → rejected. Violates constraint: Each Patient has some PatientName.

To add the third fact into the database we must either first record the fact that patient 003 has the name ‘Collins T’ or include this with the phone fact in a compound transaction.
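The dialogue above, including the compound transaction, can be imitated with a toy fact store. This is a sketch under stated assumptions, not how any real conceptual information processor is implemented: the class `PatientStore`, its `commit` method, and the tuple layout of a fact are all invented here, and only the mandatory role constraint is checked (uniqueness is not enforced).

```python
class MandatoryRoleError(Exception):
    """Raised when a transaction would leave a patient without a name."""

class PatientStore:
    """Toy fact store enforcing: Each Patient has some PatientName."""

    def __init__(self):
        self.names = {}   # patient nr -> name
        self.phones = {}  # patient nr -> phone nr

    def commit(self, *facts):
        """Apply all facts as one transaction; check the constraint at the end."""
        names, phones = dict(self.names), dict(self.phones)
        for predicate, patient, value in facts:
            if predicate == "has PatientName":
                names[patient] = value
            elif predicate == "has PhoneNr":
                phones[patient] = value
            else:
                raise ValueError(f"unknown predicate: {predicate}")
        # Every patient that plays the phone role must also play the name role.
        unnamed = set(phones) - set(names)
        if unnamed:
            raise MandatoryRoleError(
                f"Each Patient has some PatientName (missing for {sorted(unnamed)})")
        self.names, self.phones = names, phones  # accept the whole transaction

db = PatientStore()
db.commit(("has PatientName", "001", "Adams C"))  # accepted
db.commit(("has PhoneNr", "001", "2057642"))      # accepted
try:
    db.commit(("has PhoneNr", "003", "8853020"))  # rejected
except MandatoryRoleError as err:
    print("rejected:", err)

# A compound transaction carrying both facts for patient 003 is accepted.
db.commit(("has PatientName", "003", "Collins T"),
          ("has PhoneNr", "003", "8853020"))
```

Checking the constraint only at the end of `commit` is the design choice that makes the compound transaction work: the order of the facts inside one transaction does not matter.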

Now consider the two report extracts shown in Figure 5.9. These list sample details maintained by a sporting club. Membership of this club is restricted to players and coaches. For simplicity, assume members may be identified by name. The term “D.O.B.” means date of birth. On joining the club, each person is assigned to a team, in the capacity of player or coach (possibly both).

Figure 5.9. Extracts of two reports from a sporting club.


Teams are identified by semantic codes (e.g., ‘MR-A’, ‘BS-B’, ‘WS-A’ denote the men’s rugby A team, the boys’ soccer B team, and the women’s soccer A team), but the semantics of these codes is left implicit rather than being stored explicitly in the information system. A record is kept of the total number of points scored by each team. Each team is initially assigned zero points, even if it doesn’t yet have any members.

As an exercise, you might like to model this yourself before peeking at the solution in Figure 5.10. The uniqueness constraints assert that each member coaches at most one team, plays for at most one team, was born on at most one date, joined the club on at most one date, and each team has at most one coach and scored at most one total. The reference mode “mdy” for Date indicates that for verbalization purposes date instances are in month-day-year format—of course, this conceptual choice does not exclude other formats being specified for internal or external schemas.

Figure 5.10. Inclusive-or constraint: each member coaches or plays (or both).


The mandatory role dot on Team indicates that each team has a total score (possibly 0). The other mandatory role dots on Member indicate that for each member we must record their birth date and the date they joined the club. The circled mandatory role dot is linked to two roles: this is an inclusive-or (disjunctive mandatory role) constraint, indicating that the disjunction of these two roles is mandatory for Member. That is, each member either coaches or plays (or both).

For example, Adams is a player only, Downes is a coach only, and Collins is both a player and a coach. A coach of one team may even play for the same team (although the population doesn’t illustrate this). The lack of an inclusive-or constraint over the “is coached by” and “includes” roles of Team indicates that it is possible to know about a team without knowing its coach or any of its players.
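An inclusive-or constraint can be checked by comparing the object type's population with the union of the constrained role populations. The sketch below assumes a simple dictionary per fact type; the member names come from the text, but the team assignments, the extra member Evans, and the function name `inclusive_or_violations` are hypothetical.

```python
# Member names come from the text; the team assignments are hypothetical.
plays_for = {"Adams": "MR-A", "Collins": "MR-A"}
coaches = {"Downes": "MR-A", "Collins": "BS-B"}
members = {"Adams", "Collins", "Downes"}

def inclusive_or_violations(population, *role_pops):
    """Return the objects that play none of the constrained roles."""
    return population - set().union(*role_pops)

# Each member coaches or plays (or both), so there are no violations.
print(inclusive_or_violations(members, set(plays_for), set(coaches)))

# A hypothetical member Evans with no team in either capacity violates it.
print(inclusive_or_violations(members | {"Evans"}, set(plays_for), set(coaches)))
```

Collins appears in both role populations, which the constraint permits because the "or" is inclusive.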

If Figure 5.10 includes all the roles played by Date in the global schema, then the disjunction of Date’s roles is also mandatory (each date is a birth date or a join date). This could be shown explicitly by linking these roles to an inclusive-or dot; however this constraint is implicitly understood, and the explicit constraint would need to be changed if we added another fact type for Date, so it is better to leave the figure as is.

This Date example illustrates the following generalization of the rule mentioned earlier for single mandatory roles: by default, the disjunction of roles played by a primitive object type is mandatory (this default can be overridden by declaring the object type independent; see next chapter). If the object type is an entity type, the disjunction of its fact roles is also mandatory by default.
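One way to see why this default needs no enforcement is to note that, for a non-independent object type, the population is simply whatever plays its roles. A minimal sketch, assuming Date plays only the birth date and join date roles; the member names are taken from the club example, while the dates themselves (in month-day-year format) are hypothetical.

```python
# Hypothetical date facts, keyed by member name.
born_on = {"Adams": "02/15/1975", "Collins": "10/05/1960"}
joined_on = {"Adams": "01/02/1999", "Collins": "02/15/1975"}

birth_dates = set(born_on.values())
join_dates = set(joined_on.values())

# pop(Date) is not stored separately: it is derived from the roles Date plays.
pop_date = birth_dates | join_dates

# Every date in pop(Date) therefore plays at least one role by construction.
assert all(d in birth_dates or d in join_dates for d in pop_date)
print(sorted(pop_date))
```

Because `pop_date` is built from the role populations, no update can ever violate the implied disjunction, so there is nothing to check at run time.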

Apart from highlighting the important cases where disjunctive mandatory roles need to be enforced (e.g., the coach-player disjunction), and simplifying schema evolution and theorem specification, this rule facilitates the drawing of schemas. For example, object types such as Date and MoneyAmount often have several roles that are disjunctively mandatory, and to display this constraint explicitly would be messy (or even impossible if the object type occurs on separate schema pages).

To avoid confusion, a mandatory role constraint (simple or disjunctive) should be shown explicitly if the object type it constrains is a subtype (see Chapter 6).

In the case of implied (possibly disjunctive) mandatory role constraints, it is okay to mark the constraint explicitly if it will still apply if more roles are added to the object type (where these extra roles are excluded from the constraint). For object types such as Date, this would almost never happen.

An inclusive-or (disjunctive mandatory role) constraint indicates that each instance in the population of the object type must play at least one of the constrained roles (and possibly all of them).

Figure 5.11 indicates in general how to specify explicitly that a disjunction of roles r1, r2, ..., rn is mandatory by linking the n roles to a circled dot. The roles may occur at any position in a predicate, and the predicates need not be distinct.

Figure 5.11. Inclusive-or constraint (the role disjunction is mandatory).


The inclusive “or” is used in the verbalization of the constraint. For example, the inclusive-or constraint in Figure 5.10 is formally verbalized as Each Member coaches some Team or plays for some Team. As in logic and computing, “or” is always interpreted in the inclusive sense unless we say otherwise.

Inclusive-or constraints sometimes apply to roles in the same predicate. Figure 5.12 relates to a small UoD where people are identified by their first name. A sample population is shown for the ring binary; here each person plays one (or both) of the two roles. For instance, Terry is a child of Alice and Bernie, and a parent of Selena. Since Person plays another role in the schema, the disjunctive mandatory role constraint must be depicted explicitly.

Figure 5.12. Each person is a parent or child.


As an example involving derivation, consider Table 5.2. Here students sit a test and exam, and their total score is computed by adding these two scores. The schema for this, shown in Figure 5.13, includes a derived fact type. If included on the diagram, a derived fact type must be marked with an asterisk, and its constraints should be shown explicitly (even though these constraints are typically derived). The mandatory constraint on the derived fact type means the total score must be known—this does not mean that it must be stored.

Table 5.2. Student scores.

Student Nr   Student Name   Test   Exam   Total
1001         Adams, A       15     60     75
1002         Brown, C       10     65     75
1003         Einstein, A    20     80     100

Figure 5.13. Here, constraints on the derived fact type are derivable.


The roles played by Score are named “testScore”, “examScore”, and “totalScore” to enable the derivation rule to be specified in attribute style. The uniqueness constraint on the total score fact type is derivable from the derivation rule and the other uniqueness constraints (Each student has only one test score and only one exam score and the total score is the sum of these). The mandatory role on the total score fact type is derivable from the derivation rule and the other mandatory roles (Each student has a test score and an exam score, and the rule then provides the total score).

By default, all constraints shown on a derived fact type are derivable. This default rule is discussed in more detail later. In rare cases we may want to declare a derived fact type with a nonderivable constraint. We consider some examples much later.

In principle, a derived fact type may be drawn with no constraints if these can be derived. Typically, however, a derived fact type is included on the diagram for discussion purposes, and it is illuminating to show the constraints explicitly.

The term “mandatory role” is used in the sense of “must be known” (either by storing it or deriving it). If desired, we may talk about the population of derived fact types as well as asserted fact types. For a given object type A, the population of A, pop(A), includes all members of A that play any of its roles (asserted, derived, or semiderived). Derived roles may be optional. For example, if either the test or the exam role in Figure 5.13 is optional, the total score role is also optional (although it is still unique).
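The derivation rule and its effect on the derived constraints can be sketched as follows. This assumes scores are held in dictionaries keyed by student number; the function name `total_score` and student 1004 are hypothetical, and `None` stands for "not known" rather than a stored null.

```python
from typing import Optional

# Asserted scores from Table 5.2, keyed by student number.
test_score = {1001: 15, 1002: 10, 1003: 20}
exam_score = {1001: 60, 1002: 65, 1003: 80}

def total_score(student: int) -> Optional[int]:
    """Derive totalScore = testScore + examScore; None if either is unknown."""
    test, exam = test_score.get(student), exam_score.get(student)
    if test is None or exam is None:
        return None
    return test + exam

# The total is known for every student without being stored anywhere.
totals = {student: total_score(student) for student in test_score}
print(totals)  # {1001: 75, 1002: 75, 1003: 100}

# If the exam role were optional, a hypothetical student 1004 with only a
# test score would have no total: the derived role becomes optional too.
test_score[1004] = 12
print(total_score(1004))  # None
```

Since each dictionary holds at most one score per student, `total_score` can return at most one value per student, which mirrors how the uniqueness constraint on the derived fact type follows from the asserted ones.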

With large schemas, an object type may play so many roles that it becomes awkward to connect a single shape for the object type to all its roles. To solve this problem, object types may be duplicated on a schema. In this case, the rule for implicit mandatory disjunctive roles applies to the union of all the duplicate shapes for the object type. Large schemas are typically divided into pages, and the same object type might appear on many pages and might also be duplicated on the same page.

An object type may also be imported from an external model where it is originally defined. Different ORM tools may use different notations for such cases.

To indicate that an object type or predicate is duplicated in the same model, either on the same or a different page, the NORMA tool adds a shadow to the shape. In Figure 5.14 the object type Date is duplicated on the same page of the model. The fact type Member has MemberName is duplicated on another page (not shown here).

Figure 5.14. Types may be duplicated, or imported from an external model.


At the time of writing, the NORMA tool does not yet allow an object type to be declared external (imported from another model), but this is planned for a later release. The tentative notation to depict an external object type is a circumflex “^” (pointing outside the model). Assuming that this notation is adopted, the Medication object type in Figure 5.14 is an external object type. Since it is defined in another model, its reference scheme may be suppressed.

Exercise 5.2

  1. Schematize the UoD described by the following sample output report. At the time these figures were recorded, the former Commonwealth of Independent States (CIS) was treated as a single country. Include uniqueness and mandatory role constraints.

    Country        Reserves (Gt)
                   Coal    Oil
    CIS            233     8.6
    USA            223     4.1
    China          99      2.7
    Germany        82      ?
    Australia      59      ?
    Saudi Arabia   0.1     23.0

  2. Set out the conceptual schema for the following output report. Include uniqueness and mandatory role constraints. Identify any derived fact type(s).

    Subject   Year   Enrollment   Rating   NrStudents   %       Lecturer
    CS121     2007   200          7        5            2.50    P.L. Cook
                                  6        10           5.00
                                  5        75           37.50
                                  4        80           40.00
                                  3        10           5.00
                                  2        5            2.50
    CS123     2007   150          7        4            2.67    R.V. Green
                                  6        8            5.33
                                  5        60           40.00
                                  4        70           46.67
                                  1        6            4.00
    CS121     2008   250          7        10           4.00    A.B. White
                                  6        30           12.00
                                  5        100          40.00
                                  4        80           32.00
                                  3        15           6.00

  3. A cricket fan maintains a record of boundaries scored by Australia, India and New Zealand in their competition matches. In the game of cricket a six is scored if the ball is hit over the field boundary on the full. If the ball reaches the boundary after landing on the ground a four is scored. In either case, a boundary is said to have been scored.

    Although it is possible to score a 4 or 6 by running between the wickets, such cases do not count as boundaries and are not included in the database. A sample output report from this information system is shown. Here “4s” means “number of fours” and “6s” means “number of sixes”. Schematize this UoD, including uniqueness constraints, mandatory roles and derived fact types. Use nesting.

    Year   Australia          India              New Zealand
           4s    6s   total   4s    6s   total   4s    6s   total
    1874   120   30   150     135   23   158     115   35   150
    1985   112   33   145     110   30   140     120   25   145
    1986   140   29   169     135   30   165     123   35   158

  4. Report extracts are shown from an information system used for the 1990 Australian federal election for the House of Representatives (the main body of political government in Australia). The table lists codes and titles of political parties that fielded candidates in this election or the previous election (some parties competed in only one of these elections). For this exercise, treat Independent as a party.

PartyCode   Title
ALP         Australian Labor Party
AD          Australian Democrats
GRN         Greens
GRY         Grey Power
IND         Independent
LIB         Liberal Party of Australia
NDP         Nuclear Disarmament Party
NP          National Party of Australia
...         ...

A snapshot of voting details is shown about two seats (i.e., voting regions) in the 1990 election. This snapshot was taken during the later stage of the vote counting. The number of votes for each politician, as well as the informal vote, is initially set to 0. During the election the voting figures are updated continually. The percentage of votes counted is calculated assuming all those on roll actually vote. For simplicity, assume that each politician and each seat has a unique name. Figures are maintained for all seats in all states.

An asterisk (*) preceding a politician’s name indicates a sitting member (e.g., Fife is the sitting member for the seat of Hume): express this as a binary. Sitting members are recorded only if they seek re-election. Some seats may be new (these have no results for the previous election). States are identified by codes (e.g., ‘QLD’ denotes Queensland). Each state has many seats (not shown here). Draw a conceptual schema diagram for this UoD. Include uniqueness constraints and mandatory roles. If a fact type is derived, omit it from the diagram but include a derivation rule for it (you may specify this rule informally). Do not attempt any nesting.

QLD:

FADDEN (On roll: 69110)
  CROSS (AD)         7555
  FRECKLETON (NP)    2001
  JULL (LIB)         24817
  HEYMANN (IND)      641
  WILKINSON (ALP)    21368
  Informal           1404
  % counted: 84
  Previous election:
    AD 4065; ALP 24481;
    LIB 21258; NP 9581

NSW:

HUME (On roll: 70093)
  JONES (IND)        9007
  * FIFE (LIB)       29553
  KIRKWOOD (IND)     507
  MARTIN (ALP)       20554
  ROBERTS (AD)       3889
  Informal           1309
  % counted: 92
  Previous election:
    AD 2498; ALP 24516;
    IND (total): 2086;
    LIB 33687; NDP 1105
