Along with data describing entities, a database must somehow represent relationships between entities. Prior to the relational data model, databases used data structures embedded in the data to show relationships. However, the relational data model relies on it data to show relationships.
Types of Relationships
There are three types of relationships between entities that we encounter in our database environments: one-to-one, one-to-many, and many-to-many.
One-to-One Relationships
A one-to-one relationship exists between two entities when an occurrence of entity A is related to zero or one occurrences of entity B and an occurrence of entity B is related to zero or one occurrences of entity A. Although the specific occurrences involved in the relationship may change over time, there is never
more than one related occurrence at any given time. For example, a car and its engine have unique serial numbers. At any one time, an engine is installed in only one car; at the same time, a car has only one engine. The engine may be in no car or it can be moved from one car to another, but it can't be in more than one place at a time. By the same token, a car can have no engine or one engine. The specific engine may change over time but there is never more than one.
We include a relationship in an ERD by drawing a line between the rectangles for two related entities. The line ends identify the type of the relationship. In
Figure 1-5 you can see the way in which we would diagram the one-to-one relationship between a car and its engine. The |0 at the end of the line means “zero or one.”
If the relationship is required (mandatory), then the |0 at the end of the line changes to || (one and only one). We use mandatory relationships when we don't want an occurrence of an entity to be store in the database unless it is related to an occurrence of the entity at the other end of the relationship. For example, if we didn't want an engine in the database unless that engine was in a car, the end of the line next to the car entity would be ||.
One-to-many Relationships
True one-to-one relationships are very uncommon, but database environments are full of one-to-many relationships. When a one-to-many relationship exists between two entities, one occurrence of entity A is related to zero, one, or more occurrences of entity B; each occurrence of entity B is related to at most one occurrence of entity A. If, for example, we add car owners to our car database, then there will be a one-to-many
relationship between an owner and a car. At any time, a person can own zero, one, or more cars and a car belongs to zero or one owners.
In an ERD, the line between the related entities has |0 or || at one end, representing the zero, one, or more end of the relationship (or one and only one in the case of a mandatory relatioship). The end of the line at the “many” side of the relationship is marked with >0 or >|, representing zero, one, or more (or in the case of a mandatory relationship, one or more). In
Figure 1-6, the
owner entity is at the “one” end of the relationship and the
car entity is at the “many” end.
Many-to-many Relationships
The third type of relationship between entities, a many-to-many relationship, is also very common. When two entities are related in that way, one occurrence of entity A can be related to many occurrences of entity B (zero, one, or more) and one occurrence of entity B can be related to many occurrences of entity A. To demonstrate, let's add an entity for a Web site to the car database, indicating which cars are advertised on which Web sites. A car can be advertised on many Web sites and a site can advertise many cars.
The many-to-many relationship has been diagrammed in
Figure 1-7. Notice that each end of the line connecting the Web site and
Car entities has the “many” symbol, >0.
While many-to-many relationships are common, they are also a major problem: The relational data model cannot represent them directly, which means that they must be removed from the design and replaced with one-to-many relationships. In
Figure 1-8 we have introduced an entity called a
Listing. It represents one car being listed on one Web site.
The
listing entity is what we call a
composite entity. It's purpose is to represent the relationship between two other entities. Notice in
Figure 1-8 that its primary key is the concatenation of the primary key's of its parent entities (
car and
Web site).
Foreign Keys and Referential Integrity
If you look back at
Figure 1-8, you'll notice that some attributes appear in more than one entity. For example, you can see that the
engine entity contains the VIN of the car into which it is inserted. This is how a relational database shows data relationships. When
VIN in the
engine entity has a value, it represents the relationship with a specific occurrence of
car. Any attribute in an entity that is the same as the entire primary key of another entity is known as a
foreign key.
Table 1-1 lists all the foreign keys in the car database we have been developing.
Table 1-1 Foreign keys in the database design in Figure 1-8
Table containing the foreign key | Foreign key attributes | Table referenced by foreign key |
---|
Engine | VIN | Car |
Car | engine_serial_numb | Engine |
| owner_numb | Owner |
Listing | URL | Web site |
| VIN | Car |
A foreign key can be null. For example, if an engine isn't installed in a car, then the VIN attribute in the Engine entity will be null. However, foreign keys that are part of a primary key, such as the URL and VIN attributes in the listing entity, must have values to satisfy entity integrity.
When foreign keys are non-null, a matching primary key value must exist in the table referenced by the foreign key. When a car has an owner, for example, a row with the matching
owner_numb must exist in the
Owner table. Otherwise, it will be impossible to find information about that owner. This property is known as
referential integrity: Every non-null foreign
key value must reference an existing primary key value. As you will see throughout this book, much of what you do with SQL involves retrieving matching data using primary key-foreign key relationships.
Foreign keys are not limited to single columns; they can be concatenated, just like primary keys. As an example, consider a part of the database design for a very small accounting firm in
Figure 1-9. Because the firm is so small, the database designer decides that employee numbers aren't necessary and instead uses the accountants' first and last names as the primary key of the
accountant table. The
project table, used to gather data about one accountant preparing one year's tax returns for one customer, uses the tax year and the customer number as its primary key. However, it has three foreign keys. (We'll get to those in a moment.) The
form table that stores data about the forms that are part of a specific tax return uses the concatenation of the form's ID and the primary key of the project table for its primary key.
A foreign key is the same as the complete primary key of another table. Therefore, the acct_first_name attribute by itself in the project table is not a foreign key; neither is the acc_ last_name attribute. If you concatenate them, however, then they are the same as the primary key of the accountant table and, in fact, this is the unit with which referential integrity should be enforced.
Assume that “Jane Johnson” is working on customer 10100's 2014 tax return. It's not enough to ensure that “Jane” appears somewhere in the first name column in the accountant table and “Johnson” appears anywhere in the last name column in the accountant table. There could be many people named “Jane” and many with the last name of “Johnson.” What we need to ensure is that there is one person named “Jane Johnson” in the accountant table, the concatenation of the two attributes that make up the primary key.
The same holds true for the concatenated foreign key in the
form table: the tax year and the customer number. A row with a matching pair must exist in the
project table before referential integrity is satisfied.