5.1. Introduction to CSDP Step 5

So far you’ve learned how to proceed from familiar information examples to a conceptual schema diagram in which the elementary fact types are clearly set out, with the relevant uniqueness constraints marked on each. You also learned how to perform some checks on the quality of the schemas. In practice, other kinds of constraints and checks need to be considered also.

Next in importance to uniqueness constraints are mandatory role constraints. These indicate which roles must be played by every member of an object type's population and which roles are optional. Once mandatory role constraints are specified, a check is made to see if some fact types may be logically derived from others. This constitutes the next step in the design procedure.

CSDP step 5: Add mandatory role constraints, and check for logical derivations


The next two sections cover this step in detail. The rest of this section discusses some basic concepts used in our treatment of mandatory roles and later constraints. Once mandatory roles are understood, we are in a good position to examine reference schemes in depth, especially composite reference—we do this later in the chapter.

Recall that a type may be equated with the set of all its possible instances. This is true for both object types and relationship types. For a given schema, types are fixed or unchanging. For a given state of the database and a given type T, we define pop(T), the population of T, to be the set of all instances of T in that state.

Let us use “{ }” to denote the null set, or empty set (i.e., the set with no members). When the database is empty, the population of each of its types is { }. As the database is updated, the population of a given type may change. For example, suppose the ternary fact type in Figure 5.1 is used to store information about medals won by countries in the next Olympic Games. The fact type and roles are numbered for easy reference.

Figure 5.1. A fact type for Olympic Games results.


Initially the fact table for F1 is empty, since no sporting results are known, so pop(F1) = { }. Now suppose the database is to be updated after each sporting event, and in the first event the gold and bronze medals are won by the United States and the silver medal by Japan. The new state of the fact table is shown in Figure 5.2, using ‘G’, ‘S’, and ‘B’ for gold, silver, and bronze. Now the population of the fact type contains three facts. The population grows each time the results of an event are entered.

Figure 5.2. The fact type populated with results of the first event.


We may also define the population of a role. Each role of a fact type is associated with a column in its fact table. Values entered in the column refer to instances of the object type that plays that role. Given any role r and any state of the database:

pop(r) = population of role r
       = set of objects referenced in the column for r

Typically the objects referenced are entities, not values. For example, in Figure 5.2, pop(r1) = {The Country with code ‘US’, The Country with code ‘JP’}. The valuation of a role r, written val(r), is the set of values entered in its column. For example, in Figure 5.2, val(r1) = {‘US’, ‘JP’}.
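The distinction between the fact type's population and the valuation of each role can be illustrated with a minimal Python sketch. The three rows below are an assumed reconstruction of the fact table in Figure 5.2 from the description in the text (one gold and one bronze for the United States, one silver for Japan); the names f1_rows and val are hypothetical and chosen only for this illustration.

```python
# Assumed contents of the fact table for F1 after the first event.
# Each tuple is one fact; the columns correspond to roles r1, r2, r3.
f1_rows = [
    ("US", "G", 1),
    ("JP", "S", 1),
    ("US", "B", 1),
]

def val(rows, column):
    """Return the set of values entered in the given column (0-based)."""
    return {row[column] for row in rows}

val_r1 = val(f1_rows, 0)  # country codes entered in the column for r1
val_r2 = val(f1_rows, 1)  # medal kinds entered in the column for r2
val_r3 = val(f1_rows, 2)  # quantities entered in the column for r3

# pop(r1) is the set of entities that those values refer to,
# written here as object terms rather than bare constants.
pop_r1 = {f"The Country with code '{code}'" for code in val_r1}

# An empty fact table gives an empty set for every role.
empty = val([], 0)
```

Because the results are sets, the duplicate ‘US’ entry in the first column contributes only one member, which is why three facts yield just two countries.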

When there is no chance of confusion, object terms may be abbreviated to constants in listing populations. With this understanding, in Figure 5.2, pop(r1) = {US, JP}, pop(r2) = {G, S, B}, and pop(r3) = {1}. Role populations are used to determine object type populations. With our current example, if each country referenced in the database must play r1, then after the first event pop(Country) changes from { } to {US, JP}, and pop(CountryCode) changes from { } to {‘US’, ‘JP’}. Assuming the UoD is just about the Olympics, the entity type Country is the set of all countries that might possibly compete in the Olympic Games.

A predicate occurs within an elementary fact type or reference type (also known as an existential fact type). Reference types are usually abbreviated as a parenthesized reference mode. Roles in a reference type are called referential roles. Roles in an elementary fact type are called fact roles. Each entity type in a completed conceptual schema plays at least one referential role and, unless declared independent (see section 6.3), at least one fact role. In general, the population of an entity type equals the union of the populations of its roles. Unless the entity type is independent, its population is the union of the populations of its fact roles.

For example, the populated schema in Figure 5.3 indicates the size (in square kilometers) and carbon dioxide emissions (as a percentage of worldwide emissions) of some countries as measured in 2004. The Country entity type might include many countries (e.g., from Afghanistan to Zimbabwe), but for this state of the database, only three countries are listed in each fact type. Each instance in the database population of Country plays the role r1, r2, or both. So the current population of Country is the set of all the instances referenced in either the r1 column or the r2 column.

Figure 5.3. pop(Country) = pop(r1) ∪ pop(r2).


We could set this out as:

pop(Country) = pop(r1) ∪ pop(r2)
             = {AU, FR, US} ∪ {AU, CN, US}
             = {AU, CN, FR, US}

Here “∪” is the operator for set union. The union of two sets is the set of all the elements in either or both. For instance {1,2} ∪ {2, 3, 4} = {1, 2, 3, 4}. In the case just given, Australia and the United States occur in both role populations, while China and France occur in only one.
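The same calculation can be sketched with Python's built-in sets. This is only an illustration: the two role populations are the abbreviated country codes listed in the text, and the variable names are hypothetical.

```python
# Role populations for the Country roles, abbreviated to country codes.
pop_r1 = {"AU", "FR", "US"}  # countries playing r1
pop_r2 = {"AU", "CN", "US"}  # countries playing r2

# The population of Country is the union of its role populations.
pop_country = pop_r1 | pop_r2

# Countries occurring in both role populations, and in only one.
in_both = pop_r1 & pop_r2
in_only_one = pop_r1 ^ pop_r2

print(sorted(pop_country))  # ['AU', 'CN', 'FR', 'US']
print(sorted(in_both))      # ['AU', 'US']
print(sorted(in_only_one))  # ['CN', 'FR']
```

The intersection and symmetric difference are not needed to compute pop(Country); they are included only to confirm which countries occur in both role populations and which occur in just one.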
