8.1. Overview of ER

The Entity Relationship (ER) modeling approach views a business domain in terms of entities that have attributes and participate in relationships. For example, the fact that an employee was born on a date is modeled by a birthdate attribute of the Employee entity type, whereas the fact that an employee works for a department is modeled as a relationship between them. This world view is quite intuitive, and despite the recent rise of UML for modeling object-oriented applications, ER is still the most popular data modeling approach for database applications.

In Chapter 1, we argued that ORM is better than ER for conceptual analysis. However, ER is widely used, and its diagrams are good for compact summaries, so you should become familiar with at least some of the mainstream ER notations. This is the main purpose of this chapter. A second purpose is to have a closer look at how ORM compares to ER. To save some explanation, we assume that you have already studied the basics of ORM in earlier chapters so that we can examine ER from an ORM perspective.

The ER approach was originally proposed by Dr. Peter Chen in the very first issue of an influential journal (Chen 1976). Figure 8.1 is based on examples from this journal paper. Chen’s original notation uses rectangles for entity types and diamonds for relationships (binary or longer). Attributes may be defined, but are excluded from the ER diagram. As in ORM, roles are defined as parts played in relationships. Role names may optionally be shown at relationship ends (e.g., Employee plays the worker role in the Proj-Work relationship).

Figure 8.1. The original ER notation used by Chen.


Chen formalized relationships in terms of ordered tuples of entities, allowing the order to be dropped if role names are used (as with attribute names in the relational model). Although not displayed on the ER diagram, relationships may have attributes, but cannot play roles in other relationships. So objectified associations are not fully supported.
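The difference between ordered tuples and role-named instances can be sketched in a few lines of Python. This is only an illustration, not Chen's formalism: the helper `as_role_instance`, the sample identifiers, and the role name "project" are assumptions ("worker" is the role name used in Figure 8.1).

```python
def as_role_instance(role_names, ordered_tuple):
    """Pair each entity in an ordered tuple with its role name (hypothetical helper)."""
    return dict(zip(role_names, ordered_tuple))

# Ordered form: meaning depends on the agreed position of each entity.
ordered_1 = ("emp007", "projA")
ordered_2 = ("projA", "emp007")
tuples_equal = ordered_1 == ordered_2  # positions differ, so these are different tuples

# Role-named form: the order in which the roles are listed no longer matters.
named_1 = as_role_instance(("worker", "project"), ("emp007", "projA"))
named_2 = as_role_instance(("project", "worker"), ("projA", "emp007"))
named_equal = named_1 == named_2

print(tuples_equal, named_equal)
```

Once every entity is tagged with its role, the two listings describe the same relationship instance, which is the sense in which the order may be dropped.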

Roles may be annotated with a maximum cardinality of 1 or many. For example, read left to right, the Proj-Mgr relationship is one-to-many (each employee manages zero or more projects, and each project is managed by at most one employee).
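As a rough illustration, a maximum cardinality of 1 amounts to a uniqueness check on one position of the relationship's population. The sketch below assumes Proj-Mgr instances are stored as (employee, project) pairs; the function name and sample data are hypothetical.

```python
from collections import Counter

def repeated_entities(pairs, position):
    """Return the entities that occur more than once at the given tuple position."""
    counts = Counter(pair[position] for pair in pairs)
    return sorted(entity for entity, n in counts.items() if n > 1)

# Assumed sample population of Proj-Mgr as (employee, project) pairs.
proj_mgr = [("emp1", "projA"), ("emp1", "projB"), ("emp2", "projC")]

# Each project is managed by at most one employee: no project may repeat.
projects_with_many_managers = repeated_entities(proj_mgr, 1)

# An employee may manage many projects, so repeats here are allowed.
employees_managing_many = repeated_entities(proj_mgr, 0)

print(projects_with_many_managers, employees_managing_many)
```

An empty first result means the sample population respects the "at most one" annotation on the project side.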

As shown in Figure 8.1, Chen used noun phrases for relationships, eliminating natural verbalization. Even if verb phrases are used, the direction in which relationship names are to be read is formally undecided, unless we add additional marks (e.g., arrows) or rules (e.g., always read from left to right and from top to bottom). For example, does the employee manage the project or does the project manage the employee? Although we can use our background knowledge to informally disambiguate this example, it is easy to find examples of relationships whose direction can only be guessed at by anybody other than the model’s creator.

This problem is exacerbated if the verb phrase used to name the relationship is shortened to one word (e.g., “work”), unfortunately still a fairly common practice. As a simple example, is the ER diagram in Figure 8.2 meant to capture the fact type Person killed Animal, or Animal killed Person? We could disambiguate this diagram by adding role names (e.g., “victim”, “killer”), but there is no requirement to do so. Similarly, if we populate the Component relationship in Figure 8.1 with the pair (a, b), we don’t know whether this means a is a component of b, or vice versa. To disambiguate this, we need to add role names (e.g., “subpart”, “superpart”) or use a verb phrase (e.g., “is a component of”) with a defined direction.

Figure 8.2. An ambiguous ER schema.


A rectangle with a double border denotes a weak entity type. This means that the entity type’s identification scheme includes a relationship to another entity type. In Figure 8.3(a) for example, an instance of Dependant might be identified by having the name “Eve Jones” and being related via the Emp-Dep relation to employee 007. In ORM this would be modeled by a coreferenced object type as in Figure 8.3(b).

Figure 8.3. A weak entity type in ER (a) remodeled in ORM (b).
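The identification scheme of a weak entity type can be sketched as a composite identifier that includes a reference to the owning entity. The class and field names below are assumptions chosen for illustration, not part of Chen's notation or of any tool.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Employee:
    emp_nr: str  # assumed simple identifier for Employee

@dataclass(frozen=True)
class Dependant:
    # Identified by the related employee plus the dependant's own name.
    employee: Employee
    name: str

d1 = Dependant(Employee("007"), "Eve Jones")
d2 = Dependant(Employee("008"), "Eve Jones")  # same name, different employee
d3 = Dependant(Employee("007"), "Eve Jones")  # same employee and same name

distinct_dependants = len({d1, d2, d3})
print(d1 == d2, d1 == d3, distinct_dependants)
```

The name alone does not identify a dependant here; only the combination of employee and name does, which mirrors the coreferenced object type in the ORM version.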


The arrow tip at the Dependant end of the ER relationship indicates that Dependant is existence dependent on Employee (its existence depends on the existence of the other). Given that Dependant is weak, this is basically redundant.

A far better approach is to introduce the concept of a mandatory role, as in ORM and many other ER versions (e.g., Each Dependant is a dependant of at least one Employee). This ability to establish a minimum multiplicity of at least one for any given role was absent from Chen’s original notation.
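A mandatory role can be read as a simple population check: every instance of the entity type must appear in the relevant position of at least one relationship instance. The following is a minimal sketch under assumed names and sample data.

```python
def not_playing_role(population, pairs, position):
    """Return members of the population that never occur at the given position."""
    playing = {pair[position] for pair in pairs}
    return sorted(set(population) - playing)

# Assumed sample data: Emp-Dep stored as (employee, dependant) pairs.
dependants = ["Eve Jones", "Tom Jones", "Ann Smith"]
emp_dep = [("007", "Eve Jones"), ("007", "Tom Jones")]

# Each Dependant is a dependant of at least one Employee.
violations = not_playing_role(dependants, emp_dep, 1)
print(violations)
```

Any name in the result is a dependant with no related employee, which would violate the mandatory role constraint.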

Over time, many variant notations developed. Attributes were sometimes displayed as named ellipses, connected by an arrow from their entity type, with double ellipses for identifier attributes. Chen’s current ER-Designer tool uses hexagons instead of diamonds. One problem with the ER approach is that there are so many versions of it, with no single standard. In industry, the most popular versions of pure ER are the Barker and Information Engineering (IE) notations. These are discussed in the next two sections. Another popular data modeling notation is IDEF1X, which is a hybrid of ER and relational notation, so it is not a true ER representative. Nevertheless, many people talk of IDEF1X as a version of ER, so we cover it in this chapter.

The best way to develop an ER or IDEF1X model is to derive it from an ORM model; we briefly discuss mapping from ORM later in the chapter. The UML class diagram notation may be regarded as an extended version of ER, but because of the importance of UML, it is considered separately in the next chapter.
