8.2. Barker Notation

We use the term “Barker notation” for the ER notation discussed in the classic treatment by Richard Barker (Barker, 1990a). Originating in the late 1980s at CACI in the United Kingdom, the notation was later adopted by Oracle Corporation in its CASE design tools. Oracle now supports UML as an alternative to the Barker ER notation, although for database applications, many modelers still prefer the Barker notation over UML. Recently, Embarcadero has added basic support for the Barker notation in its EA/Studio product. Although dozens of ER dialects exist, we consider the Barker notation to be the best ER notation with wide support in industry.

Dave Hay, an experienced modeler and fan of the Barker notation, argues that “UML is ... not suitable for analyzing business requirements in cooperation with business people” (Hay, 1999b). While UML class diagrams are less than ideal for data modeling, the Barker ER notation, like UML, is attribute-based. As discussed in Chapter 1, using attributes in a conceptual model adds complexity and instability, and makes it harder to validate models using verbalization and sample populations.

Attributes are great for logical design, since they allow compact diagrams that directly represent the implementation data structures (e.g., tables or classes). However, for conceptual analysis, we just want to know what the facts and rules are about the business and to communicate this information in sentences so that the model can be understood by the domain experts. Whether some fact ends up in the design as an attribute should not be a conceptual issue.

In defense of ER, it can be useful to view a conceptual schema in attribute-style to gain a compact but still high level view of the business domain. For this reason, we value ER, and the NORMA tool for ORM is currently being extended to provide ER diagrams as live views of underlying ORM schemas.

But, as Ron Ross (1998, p. 15) says, “Sponsors of business rule projects must sign off on the sentences—not on graphical data models. Most methodologies and CASE tools have this more or less backwards”. ORM allows the domain expert to inspect ORM models fully verbalized into sentences with examples, making validation much easier and safer.

Now that we’ve stated our bias up front, let’s examine the Barker notation itself. The basic conventions are illustrated in Figure 8.4. Entity types are shown as soft rectangles (rounded corners) with their name in capitals. Attributes are written below the entity type name. Some constraint information may appear before an attribute name. An octothorpe “#” indicates that the attribute is, or is a component of, the primary identifier of the entity type.

Figure 8.4. A simple ER model in the Barker notation.


A “*” or heavy dot “•” indicates that the attribute is mandatory (i.e., each instance in the database population of the entity type must have a nonnull value recorded for this attribute). A “°” indicates the attribute is optional. Some modelers also use a period “.” to indicate that the attribute is not part of the identifier.

Relationships are restricted to binaries (no unaries, ternaries, or longer relationships) and are shown as lines with a relationship name at the end from which that relationship name is to be read. This name placement overcomes the ambiguous direction problem mentioned earlier. Both forward and inverse readings may be displayed for a binary relationship, one on either end of the line. This makes the Barker notation superior to UML for verbalizing relationships.

From an ORM perspective, each end (or half) of a relationship line corresponds to a role. Like ORM, Barker treats role optionality and cardinality as distinct, orthogonal concepts instead of lumping them together into a single concept (e.g., multiplicity in UML).

A solid half-line denotes a mandatory role, and a dotted half-line indicates an optional role. For cardinality, a crow’s foot intuitively indicates “many”, by its many “toes”. The absence of a crow’s foot intuitively indicates “one”. The crow’s foot notation was invented by Dr. Gordon Everest, an ORM advocate, who originally used the term “inverted arrow” (Everest, 1976) but now calls it a “fork”. Figure 8.5 shows the basic correspondence with the ORM notation for simple mandatory and uniqueness constraints.

Figure 8.5. The ER diagram (a) is equivalent to the ORM diagram (b).


To enable the optionality and cardinality settings to be verbalized, Barker (1990a, p. 3-5) recommends the following naming discipline for relationships. Let A R B denote an infix relationship R from entity type A to entity type B. Name R in such a way that each of the following four patterns results in an English sentence:

Each A (must | may) be R (one and only one B | one or more B-plural-form)

Use “must” or “may” when the first role is mandatory or optional respectively. Use “one and only one” or “one or more” when the cardinality on the second role is one or many, respectively. For example, the optionality/cardinality settings in Figure 8.5(a) verbalize as: Each Employee must be an occupier of one and only one Room; Each Room may be occupied by one or more Employees.
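The naming discipline above is mechanical enough to sketch in code. The following Python fragment is only an illustration of the pattern, not part of any Barker or Oracle tool; the function name and its parameters are our own assumptions, and the plural form is passed in explicitly because the pattern requires one.

```python
def verbalize_barker(subject, reading, obj, obj_plural, mandatory, many):
    """Build one Barker-style sentence for one direction of a binary relationship.

    All names here are hypothetical; the sentence shape follows the pattern
    'Each A (must | may) be R (one and only one B | one or more B-plural-form)'.
    """
    modality = "must" if mandatory else "may"
    if many:
        target = f"one or more {obj_plural}"
    else:
        target = f"one and only one {obj}"
    return f"Each {subject} {modality} be {reading} {target}"


# The two readings of the relationship in Figure 8.5(a).
print(verbalize_barker("Employee", "an occupier of", "Room", "Rooms",
                       mandatory=True, many=False))
print(verbalize_barker("Room", "occupied by", "Employee", "Employees",
                       mandatory=False, many=True))
```

The two calls reproduce the sentences quoted above for Figure 8.5(a). Note that the caller has to supply the plural form, which is one of the burdens the ORM verbalization avoids.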

The constraints on the left-hand role in the equivalent ORM model in Figure 8.5(b) verbalize as: Each Employee occupies some Room; Each Employee occupies at most one Room. These constraints may be combined to verbalize as Each Employee occupies exactly one Room. The lack of a uniqueness constraint on the right-hand role is verbalized as It is possible that the same Room is occupied by more than one Employee, or (if no inverse reading exists) as It is possible that more than one Employee occupies the same Room.

Regarding the lack of an explicit mandatory role constraint on the right-hand role, we don’t want that verbalized explicitly, because it may well be unstable. If Room plays no other fact roles, the role is mandatory by implication (Room is not declared independent), so verbalization may well confuse here. If Room does play another fact role, and we decide that some rooms may be unoccupied, we could declare this explicitly either as It is possible that some Room is occupied by no Employee or as It is not necessary that each Room is occupied by some Employee. If no inverse reading is available, it may be verbalized thus: It is possible that no Employee occupies some Room.

To its credit, the Barker verbalization convention is good for basic mandatory and uniqueness constraints on infix binaries. But it is far less general than ORM’s approach, which applies to instances as well as types, for predicates of any arity, infix or mixfix, and covers many more kinds of constraint, with no need for pluralization. As a trivial example, the fact instance “Employee ‘101’ an occupier of Room 23” is not proper English, but “Employee ‘101’ occupies Room 23” is good English.

If each role in a binary association is assigned one of optional/mandatory and one of many/one, there are 16 patterns. The equivalent Barker ER and ORM diagrams for these cases are shown in Figure 8.6. The last case where both roles of a many:many relationship are mandatory is considered illegal in Barker ER.
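The count of 16 follows from simple combinatorics: each role has 2 × 2 = 4 possible settings, and there are two roles. As a rough sketch (the names below are our own, chosen for illustration), the patterns can be enumerated and the one case rejected by Barker ER picked out:

```python
from itertools import product

OPTIONALITY = ("optional", "mandatory")
CARDINALITY = ("one", "many")


def role_settings():
    """The four (optionality, cardinality) settings available to a single role."""
    return list(product(OPTIONALITY, CARDINALITY))


def all_patterns():
    """Every ordered pair of settings for the two roles of a binary association."""
    return list(product(role_settings(), repeat=2))


def is_illegal_in_barker(pattern):
    """True for the many:many case in which both roles are mandatory."""
    return all(role == ("mandatory", "many") for role in pattern)


patterns = all_patterns()
illegal = [p for p in patterns if is_illegal_in_barker(p)]
print(len(patterns), len(illegal))
```

The enumeration treats the two roles as ordered, so mirror-image patterns are counted separately.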

Figure 8.6. Equivalent constraint patterns.


Ring associations considered illegal by Barker are shown in Figure 8.7(a). Although rare, they sometimes occur in reality, so they should be allowed at the conceptual level, as permitted in ORM.

Figure 8.7. Illegal ring associations in Barker ER (a) that are allowed in ORM (b).


As an exercise, you may wish to invent satisfying populations for the ORM associations in Figure 8.7(b). Although considered illegal by Barker, at least some of these patterns are allowed in Oracle’s CASE tools.

In the Barker notation, a bar “|” across one end of a relationship indicates that the relationship is a component of the primary identifier for the entity type at that end. For example, in Figure 8.8, Employee and Building have simple identifiers, but Room is compositely identified by combining its room number and building.

Figure 8.8. Room is identified by combining its room nr and its Building relationship.


The use of identification bars provides some of the functionality afforded by external uniqueness constraints in ORM. For example, the schemas in Figure 8.9 are equivalent. Any other attributes of Room and Building would be modeled in ORM as relationships. ORM’s external uniqueness notation seems to us to convey more intuitively the idea that each RoomNr, Building combination is unique (i.e., refers to at most one room). But maybe we’re biased. At any rate, this constraint (as well as any other graphic constraint) can be automatically verbalized in natural language.
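To make the meaning of the constraint concrete, here is a minimal Python sketch that checks a sample population against it. It assumes, purely for illustration, that rooms carry some surrogate key and that the two fact types are stored as dictionaries; none of these names come from the figure.

```python
from collections import defaultdict


def external_uniqueness_violations(room_nr_of, building_of):
    """Return each (room nr, building) combination that refers to more than one room.

    room_nr_of  maps a hypothetical room surrogate to its room number.
    building_of maps the same surrogate to its building.
    """
    rooms_by_key = defaultdict(set)
    for room, nr in room_nr_of.items():
        if room in building_of:
            rooms_by_key[(nr, building_of[room])].add(room)
    return {key: rooms for key, rooms in rooms_by_key.items() if len(rooms) > 1}


# r1 and r3 share room number 205 but are in different buildings, which is fine.
room_nr_of = {"r1": 205, "r2": 301, "r3": 205}
building_of = {"r1": "A", "r2": "A", "r3": "B"}
print(external_uniqueness_violations(room_nr_of, building_of))
```

An empty result means each RoomNr, Building combination refers to at most one room in the sample population.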

Figure 8.9. Composite identification in (a) Barker ER and (b) ORM.


Some people misread the bar notation for composite identification as a “1”, since this is what the symbol means in many other ER notations. But the main problem with the bar and “#” notations is that they cannot be used to declare uniqueness constraints not used for primary reference (see later). A second problem, arising from the practice of modeling some fact types as attributes and others as relationships, is that two very different notations are used for the same fundamental concept (uniqueness).

Section 1.2 used a room scheduling example to illustrate several fundamental differences between ORM and modeling approaches such as ER and UML. You may recall that the schema shown in Figure 8.10 was used to model the example in the Barker notation. If you skipped Section 1.2, you may wish to read it now, since it discussed this example in a lot of detail. The use of attributes and the binary-only relationship restriction in this model makes it hard to verbalize and populate the schema for validation purposes. Moreover, there is at least one constraint missing.

Figure 8.10. A room schedule model in Barker notation.


A populated ORM schema for this example is reproduced in Figure 8.11 (minus the counterexample rows discussed in Section 1.2). Here the facts are naturally verbalized as a ternary and a binary, and the constraints are easily checked using verbalization and sample data. With the ER model there is no way of specifying the right-hand uniqueness constraints on either fact type, since the Barker notation doesn’t capture uniqueness constraints on attributes or relationships unless they are used for primary identification.

Figure 8.11. A populated ORM model for room scheduling.


In case it looks like we’re just bashing attribute-based approaches such as ER, let us reiterate that we find attribute-based models useful for compact overviews and for getting closer to the implementation model. However, we generate these by mapping from ORM, which we use exclusively for the initial conceptual analysis. This makes it easier to get the model right in the first place and to modify it as the underlying domain evolves.

Unlike ER (and UML for that matter), ORM was built from a linguistic basis, and its graphic notation was carefully chosen to exploit the potential of sample populations. To reap the benefits of verbalization and population for communication with and validation by domain experts, it’s better to use a language that was designed with this in mind. An added benefit of ORM is that its graphic notation can capture many more business rules than popular ER notations. Although not as rich as ORM, the Barker notation is more expressive than many ER notations. The rest of this section discusses its support for advanced constraints and subtyping.

In Barker notation, an exclusion constraint over two or more roles is shown as an “exclusive arc” connected to the roles with a small dot or circle. For example, Figure 8.12(a) includes the constraint that no employee may be allocated both a bus pass and a parking bay. In ORM this constraint is depicted by connecting “⊗” to the relevant roles by a dotted line, as shown in Figure 8.12(b).

Figure 8.12. A simple exclusion constraint in (a) Barker notation and (b) ORM.


To declare that two or more roles are mutually exclusive and disjunctively mandatory, the Barker notation uses the exclusive arc, but each role is shown as mandatory (solid line). For example, in Figure 8.13(a) each account is owned by a person or a company, but not both.

Figure 8.13. An exclusive-or constraint in (a) Barker notation and (b) ORM.


This notation is liable to mislead, since it violates the orthogonality principle in language design. Viewed by itself, the first role of the association Account owned by Person would appear to be mandatory, since a solid line is used. But the role is actually optional, since superimposing the exclusive arc changes the semantics of the solid line to mean the role belongs to a set of roles that are disjunctively mandatory.

Contrast this with the equivalent ORM model shown in Figure 8.13(b). Here an exclusion constraint ⊗ is orthogonally combined with a disjunctive mandatory (inclusive-or) constraint ⊙ to produce an exclusive-or constraint, shown here by the “lifebuoy” symbol formed by overlaying one constraint symbol on the other.

The ORM notation makes it clear that each role is individually optional and that the exclusive-or constraint is a combination of inclusive-or and exclusion constraints. Suppose we modified our business so that the same account could be owned by both a person and a company. Removing just the exclusion constraint from the model leaves us with the inclusive-or constraint ⊙ that each account is owned by a person or company. Like UML, the Barker ER notation doesn’t even have a symbol for an inclusive-or constraint, so it is unable to diagram this or the many other cases of this nature that occur in practice.
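The orthogonality argument can be illustrated with a small population check. In the sketch below (function and parameter names are our own assumptions), the exclusive-or test is literally the conjunction of two independent tests, so dropping the exclusion constraint leaves the inclusive-or constraint intact.

```python
def inclusive_or_holds(accounts, person_owned, company_owned):
    """Each account is owned by a person or a company (possibly both)."""
    return all(a in person_owned or a in company_owned for a in accounts)


def exclusion_holds(person_owned, company_owned):
    """No account is owned by both a person and a company."""
    return not (person_owned & company_owned)


def exclusive_or_holds(accounts, person_owned, company_owned):
    """Exclusive-or is simply inclusive-or combined with exclusion."""
    return (inclusive_or_holds(accounts, person_owned, company_owned)
            and exclusion_holds(person_owned, company_owned))


accounts = {"a1", "a2", "a3"}
person_owned = {"a1", "a2"}    # accounts playing the 'owned by Person' role
company_owned = {"a2", "a3"}   # accounts playing the 'owned by Company' role

# a2 is jointly owned: inclusive-or still holds, but exclusion does not.
print(inclusive_or_holds(accounts, person_owned, company_owned))
print(exclusion_holds(person_owned, company_owned))
print(exclusive_or_holds(accounts, person_owned, company_owned))
```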

In the Barker notation, a role may occur in at most one exclusive arc. ORM has no such restriction. For example, in Figure 8.14(a) no student can be both ethnic and aboriginal, and no student can be both an aboriginal and a migrant (these rules come from a student record system in Australia). Even if the Barker notation supported unaries (it doesn’t), this situation could not be handled by exclusive arcs. Like UML, Barker ER does not provide a graphic notation for exclusion constraints over role sequences. For instance, it cannot capture the ORM pair-exclusion constraint in Figure 8.14(b), which declares that no person who wrote a book may review the same book. Such rules are very common. Moreover, the Barker notation cannot express any ORM subset or equality constraints at all, even over simple roles.
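The difference between excluding single roles and excluding role pairs is easy to see on a sample population. The following Python sketch uses invented names and data to contrast the two; it is an illustration of the idea, not of any tool's behavior.

```python
def pair_exclusion_violations(wrote, reviewed):
    """(person, book) pairs that occur in both fact populations."""
    return set(wrote) & set(reviewed)


def role_exclusion_violations(wrote, reviewed):
    """Persons who play both the author role and the reviewer role."""
    return {person for person, _ in wrote} & {person for person, _ in reviewed}


wrote = {("Ann", "Book1"), ("Bob", "Book2")}
reviewed = {("Ann", "Book2"), ("Bob", "Book1")}

# Nobody reviews a book they wrote, so the pair-exclusion constraint is satisfied...
print(pair_exclusion_violations(wrote, reviewed))
# ...even though both people are authors as well as reviewers.
print(role_exclusion_violations(wrote, reviewed))
```

A simple exclusion constraint over the person roles would wrongly reject this population, which is why the constraint has to span the whole role sequence.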

Figure 8.14. Some ORM exclusion constraints not handled by Barker’s exclusive arcs.


The Barker notation for ER allows simple frequency constraints to be specified. For any positive integer n, a constraint of the form = n, < n, ≤ n, > n, ≥ n may be written next to a single role to indicate the number of instances that may be associated with an instance playing the other role. For example, the frequency constraint “≤ 2” in Figure 8.15 indicates that each person is a child of at most two parents. In the Barker notation, this constraint is placed on the parent role, making it easy to read the constraint as a sentence starting at the other role.

Figure 8.15. A simple frequency constraint in (a) Barker notation and (b) ORM.


In ORM the constraint is placed on the child role, making it easy to see the impact of the constraint on the population (each person appears at most twice in the child role population). Unlike the Barker notation, ORM allows frequency constraints to include ranges (e.g., 2..5) and to apply to role sequences.
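Reading the constraint against the population, as ORM encourages, amounts to counting how often each object appears in the constrained role. A minimal Python sketch of that check follows; the function name, the tuple layout, and the sample data are assumptions made for illustration.

```python
from collections import Counter


def frequency_violations(facts, role_index, min_count=1, max_count=None):
    """Objects whose number of appearances in the given role falls outside the range.

    Only objects that actually play the role are counted, so the check says
    nothing about objects that are absent from the population.
    """
    counts = Counter(fact[role_index] for fact in facts)
    return {obj: n for obj, n in counts.items()
            if n < min_count or (max_count is not None and n > max_count)}


# (child, parent) pairs; the constraint "<= 2" applies to the child role (index 0).
is_child_of = [("Ann", "Bob"), ("Ann", "Cat"), ("Ann", "Dan"), ("Eve", "Bob")]
print(frequency_violations(is_child_of, role_index=0, max_count=2))
```

Supplying both min_count and max_count gives a range such as 2..5, which the Barker notation cannot express.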

In Barker notation, subtyping is depicted with a version of Euler diagrams. In effect, only partitions (exclusive and exhaustive) can be displayed. For example, Figure 8.16(a) indicates that each patient is a male patient or female patient but not both. As discussed in Section 6.5, ORM displays subtyping using directed acyclic graphs (DAGs), and may use an exclusive-or constraint symbol to display a partition constraint (typically implied by subtype definitions and other constraints).

Figure 8.16. A subtype partition in (a) Barker ER and (b) ORM.


Euler diagrams are good for simple cases, since they intuitively show the subtype inside its supertype. However, unlike DAGs, they are hopeless for complex cases (e.g., many overlapping subtypes), and they make it inconvenient to attach details to the subtypes. For the latter reason, attributes are sometimes omitted from subtypes when the Barker notation is used. Another problem with Euler diagrams is in displaying multiple partitions on a single diagram (e.g., try partitioning Patient into not just MalePatient and FemalePatient, but also InPatient and OutPatient).

In the Barker notation, if the original subtype list is not exhaustive, an “Other” subtype is added to make it so, even if it plays no specific role. For example, in Figure 8.17 a vehicle is a car or truck or possibly something else, and a car is a sedan or wagon or possibly something else. In ORM, there is no need to introduce subtypes for OtherCar or OtherVehicle unless they play specific roles.

Figure 8.17. Nonexhaustive, exclusive subtypes in (a) Barker ER and (b) ORM.


A major problem with the Barker notation for subtyping is that it does not depict overlapping subtypes (e.g., Manager and FemaleEmployee as subtypes of Employee) or multiple inheritance (e.g., FemaleManager as a subtype of FemaleEmployee and Manager). While it is possible to implement multiple inheritance in single inheritance systems (e.g., Java) by using some low level tricks, for conceptual modeling purposes multiple inheritance should be simply modeled directly. As a final comparison point about subtyping, Barker ER lacks ORM’s capability for formal subtype definitions and context-dependent identification schemes.

In addition to its static constraints, Barker ER includes a dynamic “changeability constraint” for marking “nontransferable relationships”. This constraint declares that once an instance of an entity type plays a role with an object, it cannot ever play this role with another object. This is indicated by an open diamond on the constrained role. For example, Figure 8.18(a) declares that the birth country of a person is non-transferable.

Figure 8.18. Nontransferability declared in Barker, but not ORM.


As indicated in Figure 8.18(b), ORM does not currently include a graphic notation for this constraint. At the time of writing, adding formal support for this and other dynamic constraints in ORM is an active research area. In practice, specification of nontransferable constraints needs to ensure that the implemented model is still open to error corrections by duly authorized users. For example, if an Australian’s birth country is mistakenly entered as Austria, it should be possible to change this to Australia. We discuss this issue further in the next chapter when examining changeability properties in UML.
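One way to picture the tension between nontransferability and error correction is the following Python sketch. The class, the exception, and the authorized_correction flag are all hypothetical; a real system would tie the flag to its own authorization mechanism.

```python
class NontransferableError(Exception):
    """Raised when a nontransferable relationship would be changed."""


class BirthCountryRegister:
    """Records each person's birth country as a nontransferable relationship."""

    def __init__(self):
        self._country_of = {}

    def record(self, person, country, authorized_correction=False):
        current = self._country_of.get(person)
        if current is not None and current != country and not authorized_correction:
            raise NontransferableError(
                f"{person} is already recorded as born in {current}")
        self._country_of[person] = country

    def country_of(self, person):
        return self._country_of[person]


register = BirthCountryRegister()
register.record("Terry", "Austria")            # data entry mistake
try:
    register.record("Terry", "Australia")      # an ordinary update is rejected
except NontransferableError as error:
    print(error)
register.record("Terry", "Australia", authorized_correction=True)
print(register.country_of("Terry"))
```

The ordinary update is blocked, but a duly authorized correction still goes through.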

Well, that pretty well covers the Barker notation for ER. As we’ve seen, it does a good job of expressing simple mandatory, uniqueness, exclusion, and frequency constraints, simple subtyping, and also nontransferable relationships. However, if a feature is modeled as an attribute instead of as a relationship, very few of these constraints can be specified for it.

Unlike ORM, the Barker notation does not support unary, n-ary, or objectified associations (nesting). Moreover it lacks support for most of the advanced ORM constraints (e.g., subset, multi-role exclusion, ring constraints, and join constraints). It does not include a formal textual language for specifying queries, other constraints and derivation rules at the conceptual level. Nevertheless it is better than many other ER notations and is still widely used. If you ever need to specify a model in Barker ER notation, we suggest that you first do the model in ORM, map it to the Barker notation, and make a note of any rules that can’t be expressed there diagrammatically.

Rather than giving you some exercises on the Barker notation at this point, we’ll wait until the end of the chapter, when we’ve covered the main ER notations in use as well as some techniques for mapping from ORM to ER. You can then decide which notation(s) you would like to have some practice with.
