2.2. The Conceptual Level

At the conceptual level, all communication between people and the information system is handled by the conceptual information processor. This communication may be divided into three main stages:

  1. The modeler enters the conceptual schema for the UoD into the system. The CIP accepts the schema if and only if it is consistent with the metaconceptual schema.

  2. The user updates the database by adding or deleting specific facts. The CIP accepts an update if and only if it is consistent with the conceptual schema.

  3. The user queries the system about the UoD and is answered by the CIP. The CIP can supply information about the conceptual schema or the database, provided it has stored the information or can derive it.

In these three stages, the CIP performs as a design filter, data filter, and information supplier, respectively. To process update and query requests, the CIP accesses the conceptual schema, which includes three main sections as shown in Figure 2.4.

Figure 2.4. The three main sections of the conceptual schema.


A fact (or fact instance) is a proposition that the business takes to be true. The fact types section lists the kinds of facts that may be represented in the database. Fact types indicate what types of object are permitted in the UoD (e.g., Employee, Country), how these are referenced by values in the database (e.g., SSN, CountryCode), and their relationships (e.g., was born in, lives in).

The constraints section lists the constraints or restrictions on populations of the fact types. These may be static or dynamic. Static constraints apply to every state of the database. For example, suppose a database stores information about countries and their populations. Although a country’s population may change over time, at any given time each country has at most one value recorded for its current population. Dynamic constraints determine what transitions are allowed between states of the database. For instance, a person’s age status may change from child to adult, but not vice versa.
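The difference between the two kinds of constraint can be sketched in a few lines of Python. This is only an illustration: the function names, the pair-based representation of facts, and the population figures are assumptions made for the example, not part of any schema language. A static constraint is checked against a single database state, whereas a dynamic constraint is checked against a pair of states (old and new).

```python
# Illustrative sketch only; names and data are hypothetical.

def satisfies_static(population_facts):
    """Static constraint: each country has at most one recorded population."""
    seen = set()
    for country, _population in population_facts:
        if country in seen:
            return False
        seen.add(country)
    return True


# Dynamic constraint: the only permitted change of age status is child -> adult.
ALLOWED_TRANSITIONS = {("child", "adult")}


def transition_allowed(old_status, new_status):
    """A transition is allowed if nothing changes or the change is permitted."""
    return old_status == new_status or (old_status, new_status) in ALLOWED_TRANSITIONS


print(satisfies_static([("AU", 100), ("US", 200)]))  # True
print(satisfies_static([("AU", 100), ("AU", 101)]))  # False
print(transition_allowed("child", "adult"))          # True
print(transition_allowed("adult", "child"))          # False
```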

Constraints are also known as validation rules or integrity rules. A database is said to have integrity when it is consistent with the UoD being modeled. Although most database constraints can be represented graphically on conceptual schema diagrams, some constraints may need to be represented in other ways (e.g., using conceptual sentences, formulae, tables, graphs, or program code).

The derivation rules section includes rules that may be used to derive new facts from other facts. A fact that is not a derived fact is an asserted fact (also known as a primitive or base fact). Derivation rules may involve mathematical calculation or logical inference. For example, an average score may be derived by summing individual scores and dividing by the number of scores. Many operators and functions are defined for various data types. This permits a large variety of possible queries without the need to document each derivation. Typical mathematical facilities include arithmetic operators, such as +, -, * (multiply), and / (divide); set operators, such as ∪ (union); and functions for counting, summing, and computing averages, maxima, and minima.

In addition to such generic derivation facilities, specific derived fact types known to be required may be defined in the schema by means of derivation rules. Some mathematically computed fact types and almost all logically inferred fact types fit into this category. Fact types derived by use of logical inference typically involve rules that make use of logical operators such as “and” or “if”.

Although derivable facts could be stored in the database, it is usually better to avoid this. Apart from saving storage space, the practice of omitting derived facts from the stored data usually makes it much easier to deal with updates. For instance, suppose we want to regularly obtain individual ages (in years) of students in a class and, on occasion, the average age for the class. If we stored the average age in the database, we would have to arrange for this to be recomputed every time a student was added to or deleted from the class, as well as every time the age of any individual in the class increased.

As an improvement, we might store the number of students as well as their individual ages, and have the average age derived only upon request. But this is still clumsy. Can you think of a better design?

As you probably realized, there is no need to store the number of students in a class since this can be derived using a count function. This avoids having to update the class size every time someone joins or leaves the class. Moreover, there is no need to store even the individual ages of students. Computer systems have a built-in clock that enables the current date to be accessed. If we store the birth date of the students, then we can have the system derive the age of any student upon request by using an appropriate date subtraction algorithm. This way, we don’t have to worry about updating ages of students as they become a year older.
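The design just described can be sketched as follows. This is a minimal illustration, assuming birth dates are held in a Python dictionary keyed by first name; the names `age_in_years`, `class_size`, and `average_age`, and the sample dates, are hypothetical. Only birth dates are stored; class size, individual ages, and the average age are all derived on request.

```python
from datetime import date


def age_in_years(birth_date, on_date):
    """Whole years elapsed between birth_date and on_date."""
    years = on_date.year - birth_date.year
    # Subtract one if the birthday has not yet occurred in on_date's year.
    if (on_date.month, on_date.day) < (birth_date.month, birth_date.day):
        years -= 1
    return years


def class_size(birth_dates):
    """Derived by counting, so it never needs to be stored or updated."""
    return len(birth_dates)


def average_age(birth_dates, on_date):
    """Derived from the stored birth dates; None for an empty class."""
    if not birth_dates:
        return None
    ages = [age_in_years(born, on_date) for born in birth_dates.values()]
    return sum(ages) / len(ages)


# Illustrative data only.
birth_dates = {"Ann": date(2005, 3, 14), "Bob": date(2004, 11, 2)}
reference_date = date(2024, 6, 1)  # in practice, date.today()

print(class_size(birth_dates))                   # 2
print(average_age(birth_dates, reference_date))  # 19.0
```

Passing the reference date as a parameter also covers the case where the age on some other date is wanted, without storing anything extra.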

Sometimes, what is required is not the current age but the age on a certain date (e.g., entrance to schooling, age grouping for a future sports competition). In such cases where a single, stable age is required for each person, it may be appropriate to store it.

Before considering a derivation example using logical inference, note the difference between propositions and sentences. Propositions are asserted by declarative sentences and are always true or false (but not both). The same proposition may be asserted by different sentences, for example, “Paris is the capital of France” and “The French capital is Paris”. While humans can deal with underlying meanings in a very sophisticated way, computers are completely literally minded: they deal in sentences rather than their meanings. If we want the computer to make connections that may be obvious to us, we have to explicitly provide it with the background or rules for making such connections.

Suppose that facts about brotherhood and parenthood are stored in the database. For simplicity, assume that the only objects we want to talk about are people identified by their first names. We might now set out facts as follows:

Alan is a brother of Betty.

Charles is a brother of Betty.

Betty is a parent of Fred.

You can look at these facts and readily see that both Alan and Charles are uncles of Fred. In doing so, you’re using your understanding of the term “uncle”. If we want a computer system to be able to make similar deductions, we need to provide it with a rule that expresses this understanding. For example:

Given any X, Y
        X is an uncle of Y if there is some Z such that X is a brother of Z and Z is a parent of Y.

This may be abbreviated to

X is an uncle of Y if X is a brother of Z and Z is a parent of Y.

This rule is an example of a Horn clause. The head of the clause, on the left of “if”, is derivable from the conditions stated on the right-hand side. Horn clauses are used in languages such as Prolog, and enable many derivation rules to be set out briefly. In ORM, such rules are specified using a more readable syntax, e.g.,

For each Person1 and Person2,
       Person1 is an uncle of Person2 if Person1 is a brother of some Person3
                                          who is a parent of Person2.
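To see the unclehood rule applied mechanically, here is a small Python sketch. It assumes the brotherhood and parenthood facts are held as sets of pairs; the function name `uncles` is hypothetical. The rule is evaluated by joining the two fact populations on the shared person Z.

```python
# Stored (asserted) facts from the example above.
brother_of = {("Alan", "Betty"), ("Charles", "Betty")}  # (brother, sibling)
parent_of = {("Betty", "Fred")}                         # (parent, child)


def uncles(brother_of, parent_of):
    """X is an uncle of Y if X is a brother of some Z and Z is a parent of Y."""
    return {
        (x, y)
        for (x, z) in brother_of
        for (z2, y) in parent_of
        if z == z2
    }


print(sorted(uncles(brother_of, parent_of)))
# prints [('Alan', 'Fred'), ('Charles', 'Fred')]
```

The uncle facts are never stored; they are recomputed from the base facts whenever they are requested.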

To appreciate how the CIP works, let’s look at an example. The notation is based on a textual ORM language called FORML (Formal ORM Language). A graphical version is explained in later chapters. For simplicity, some constraints are omitted. The UoD structure is set out in the conceptual schema shown.

Reference schemes:  Person(.firstname); Country(.code); Year(CE)

Base fact types:    F1  Person lives in Country.
                    F2  Person was born in Year.
                    F3  Person is a brother of Person.
                    F4  Person is a parent of/is a child of Person. Roles: parent, child.

Constraints:        C1  Each Person lives in some Country.
                    C2  Each Person lives in at most one Country.
                    C3  Each Person was born in at most one Year.
                    C4  No Person is a brother of itself.
                    C5  No Person is a parent of itself.

Derivation rules:   D1  Person1 is an uncle of Person2 if Person1 is a brother of some Person3 who is a parent of Person2.
                    D2  nrChildren of Person = count each child of that Person.

The reference schemes section declares the kinds of object of interest and how they are referenced. Objects are either entities or values. Entities are real or abstract things that are referenced by relating values (e.g., names) to them in some way. Here we have three kinds of entity: Person, Country, and Year. In this simple UoD, people are identified by their first name (rarely true in practice!). Countries are identified by codes (e.g., ‘AU’ for Australia). Years are identified by CE (Common Era) values (e.g., World War II ended in 1945 CE). Years are time segments, so are entities, not values.

The fact types section declares which kinds of facts are of interest. This indicates how object types participate in relationships. In this book, object type names in English are shown with their first letter in capitals, and role names start with a lowercase letter. Since database columns have attribute names like “birthyear”, you may feel it is better to reword “Person was born in Year” as “Person has Birthyear”. However, suppose we add another fact type “Person has Deathyear”. The formal connection between Birthyear and Deathyear is now hidden. Expressing the new fact type as “Person died in Year” reveals the semantic connection to “Person was born in Year” and makes comparisons between years of birth and death meaningful (the underlying domain is Year).

Attribute names are often used in ER, UML, and relational schemas to express facts. However, this can be unnatural (compare “Person has deathyear” with “Person died in Year”), and is incomplete unless domain names such as “Year” are added.

The constraints section declares constraints on the fact types. The examples here are all static constraints (i.e., each is true for each state of the database). Reserved words in the schema language are shown in bold. Constraint C1 means that for each person referenced in the database, we know at least one country where they live. C2 says nobody can live in more than one country (at the same time). C3 says nobody was born in more than one year; note that we might not know their year of birth. Constraint C4 says that nobody is his/her own brother, and C5 says that nobody is his/her own parent (brotherhood and parenthood are “irreflexive”).

The derivation rules section declares a logical rule for unclehood and a function for computing the number of children of any person. The unclehood rule is specified in relational style, using fact type readings. The nrChildren rule is specified in attribute style using a role name.
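As a rough illustration of rule D2, the following Python sketch derives nrChildren by counting. It assumes parenthood facts are held as (parent, child) pairs; the function name `nr_children` is hypothetical, and the sample facts are the ones used in the dialog later in this section.

```python
# Parenthood facts as (parent, child) pairs.
parent_of = {("Terry", "Linda"), ("Terry", "Selena")}


def nr_children(person, parent_of):
    """D2 sketch: count each child of the given person."""
    return len({child for (parent, child) in parent_of if parent == person})


print(nr_children("Terry", parent_of))  # prints 2
print(nr_children("Linda", parent_of))  # prints 0
```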

At the conceptual level, each fact in the database is a simple or elementary fact. Basically, this means it can’t be split up into two or more simpler facts without loss of information. We may add or insert a fact into the database, and we may delete a fact from it. However, we may not modify or change just a part of a fact. By “fact” we mean “fact instance”, not “fact type”.

The operation of adding or deleting a single fact is an elementary update or simple transaction. In our sample conceptual query language, add and delete requests start with “add:” and “del:”, and queries start with “list” or end with “?”. The CIP either accepts updates or rejects them with an indication of the violation. The CIP answers legal queries and rejects illegal ones.

To explain the conceptual notions underlying database transactions, we now discuss some examples of CIP interactions. If this seems tedious, remember that we are talking about the conceptual level, not the external level where the user actually interacts with the system. At the external level, the user typically enters, deletes, or changes values in a screen version of a form or table. Conceptually, however, we may think of such an operation being translated into the appropriate delete and add operations on elementary facts before being passed on to the CIP. Suppose we now start populating our conceptual database as follows:

User:                                                  CIP:
add: Person ‘Terry’ lives in Country ‘US’.             Accepted.

Because the CIP recognizes the type of this sentence and sees that no constraints are violated, it adds the fact to the database and issues the reply “Accepted”. If an update is inconsistent with the conceptual schema, the CIP rejects it, indicating the reason for rejection by verbalizing the constraint(s) violated. You should now be able to follow the dialog below. To save space here, we list just the number of the constraint violated rather than its verbalization.

add: Person ‘Norma’ was born in Year 1950.             Rejected. C1 violated.
add: Person ‘Norma’ lives in Country ‘US’.             Accepted.
add: Person ‘Norma’ was born in Year 1950.             Accepted.
add: Person ‘Terry’ is a brother of Person ‘Paul’.     Rejected. C1 violated.
add: Person ‘Paul’ lives in Country ‘AU’.              Accepted.
add: Person ‘Terry’ is a brother of Person ‘Paul’.     Accepted.
add: Person ‘Paul’ is a brother of Person ‘Paul’.      Rejected. C4 violated.
add: Person ‘Terry’ plays Sport ‘Judo’.                Rejected. Unknown fact type.

Now suppose that Norma moves from the United States to Australia. How would we record this change? Here are two incorrect attempts:

add: Person ‘Norma’ lives in Country ‘AU’.             Rejected. C2 violated.
del: Person ‘Norma’ lives in Country ‘US’.             Rejected. C1 violated.

Is there any way to change Norma’s country of residence? Yes. Sometimes it is convenient, or even necessary, to submit many elementary updates in a single, compound transaction. To show this, we use “begin” and “end” as brackets to open and close the transaction. With compound transactions, the CIP considers only the total or collective effect of the transaction. It is quite possible that the CIP will accept a compound transaction even if, taken individually, some of its component simple updates would be rejected. To change Norma’s location we proceed thus:

begin
     del: Person ‘Norma’ lives in Country ‘US’.
     add: Person ‘Norma’ lives in Country ‘AU’.
end                                                                                  Accepted.

The CIP accepts a transaction if and only if its overall effect is consistent with the conceptual schema. The user entered this compound transaction as a sequence of two elementary updates. It is more natural here to specify the delete before the add operation. However, the CIP treats any compound transaction as a set of updates, disregarding the order of the components, and looks only at the total effect. So even if the add operation had been specified before the delete, this would have been accepted with the same result.
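The "total effect" behavior can be sketched with a toy CIP in Python. This is a simplified illustration, not how any real system is implemented: facts are assumed to be (subject, predicate, object) triples, only constraints C1 to C5 of the sample schema are checked, and the names `violations` and `apply_transaction` are hypothetical. A transaction is given as a set of additions and a set of deletions, and the constraints are checked only against the resulting state.

```python
FACT_TYPES = {"lives in", "was born in", "is a brother of", "is a parent of"}
PERSON_TO_PERSON = {"is a brother of", "is a parent of"}


def people_in(facts):
    """Every person referenced by at least one fact."""
    people = set()
    for subject, predicate, obj in facts:
        people.add(subject)
        if predicate in PERSON_TO_PERSON:
            people.add(obj)
    return people


def violations(facts):
    """Labels of the constraints C1-C5 that a database state violates."""
    countries, years = {}, {}
    for subject, predicate, obj in facts:
        if predicate == "lives in":
            countries.setdefault(subject, set()).add(obj)
        elif predicate == "was born in":
            years.setdefault(subject, set()).add(obj)
    checks = [
        ("C1", all(person in countries for person in people_in(facts))),
        ("C2", all(len(c) <= 1 for c in countries.values())),
        ("C3", all(len(y) <= 1 for y in years.values())),
        ("C4", all(not (p == "is a brother of" and s == o) for s, p, o in facts)),
        ("C5", all(not (p == "is a parent of" and s == o) for s, p, o in facts)),
    ]
    return [label for label, satisfied in checks if not satisfied]


def apply_transaction(db, adds=(), dels=()):
    """Accept the whole transaction or none of it, judging only its total effect."""
    for _s, predicate, _o in list(adds) + list(dels):
        if predicate not in FACT_TYPES:
            return db, "Rejected. Unknown fact type."
    candidate = (db - set(dels)) | set(adds)
    broken = violations(candidate)
    if broken:
        return db, "Rejected. " + ", ".join(broken) + " violated."
    return candidate, "Accepted."


db = {("Norma", "lives in", "US"), ("Norma", "was born in", 1950)}
old_home = ("Norma", "lives in", "US")
new_home = ("Norma", "lives in", "AU")

print(apply_transaction(db, adds=[new_home])[1])   # Rejected. C2 violated.
print(apply_transaction(db, dels=[old_home])[1])   # Rejected. C1 violated.
db, reply = apply_transaction(db, adds=[new_home], dels=[old_home])
print(reply)                                       # Accepted.
```

Because the additions and deletions are combined as sets before any check is made, the order in which the user lists them has no effect on the outcome.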

Note that the CIP cannot accept just part of a compound transaction: it is a case of “all or nothing”. Here are a few more simple interactions:

add: Person ‘Linda’ lives in Country ‘AU’.                  Accepted.
add: Person ‘Selena’ lives in Country ‘AU’.                 Accepted.
add: Person ‘Terry’ is a parent of Person ‘Linda’.          Accepted.
add: Person ‘Terry’ is a parent of Person ‘Selena’.         Accepted.
nrChildren of Person ‘Terry’?                               2.
list each Person who is a brother of Person ‘Paul’.         Terry.
list each Person who lives in Country ‘AU’.                 Linda
                                                            Norma
                                                            Paul
                                                            Selena
list each Person who is an uncle of Person ‘Selena’.        No Person.
add: Person ‘Paul’ is a brother of Person ‘Terry’.          Accepted.
list each Person who is an uncle of Person ‘Selena’.        Paul.
list each Person who is an aunt of Person ‘Selena’.         Rejected. Unknown fact type.

Now suppose the UoD is changed by adding the following constraint:

C6 Each Person was born in some Year.

Our current database fails to meet this constraint. Suppose we start over with an empty database and try to add the following fact. How would the CIP respond?

add: Person ‘Bernard’ is a brother of Person ‘John’.

This update request is rejected. It violates two constraints (C1 and C6), as all people listed in the database must have both their country and birth year recorded. Chapter 5 uses the terminology “mandatory roles” to describe such constraints.

In general, the order in which constraints are listed does not matter. However, if an update request violates more than one constraint, this order may determine which constraint is reported as violated if the CIP is configured to report at most one violation. In this case, the CIP would respond thus to the previous request: “Rejected. C1 violated”. If the CIP is configured to report all constraint violations, it would instead respond “Rejected. C1, C6 violated”. As an exercise, convince yourself that with C6 added, the following update requests are processed as shown:

add: Person ‘Ken’ lives in Country ‘GB’.	                    Rejected. C6 violated.
begin
     add: Person ‘Ken’ lives in Country ‘GB’.
     add: Person ‘Ken’ was born in Year 1960.
end	                    Accepted.
add: Person ‘Ken’ was born in Year 1959.	                    Rejected. C3 violated.
begin
     del: Person ‘Ken’ was born in Year 1960.
     add: Person ‘Ken’ was born in Year 1959.
     add: Person ‘Erik’ lives in Country ‘NL’.
     add: Person ‘Erik’ was born in Year 1970.
     add: Person ‘Ken’ is a brother of Person ‘Erik’.
end                                                                               Accepted.
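The two reporting modes mentioned above (report at most one violation, or report all of them) can be sketched as follows. This is an illustrative Python fragment only: it checks just the two mandatory role constraints C1 and C6, assumes facts are (subject, predicate, object) triples, and uses hypothetical names such as `respond` and `report_all`. The constraints are held in an ordered list, so the listing order decides which violation is reported first.

```python
def persons(facts):
    """Every person referenced by at least one fact."""
    people = set()
    for subject, predicate, obj in facts:
        people.add(subject)
        if predicate == "is a brother of":
            people.add(obj)
    return people


def has_fact(facts, person, predicate):
    """True if the person plays the first role in some fact with this predicate."""
    return any(s == person and p == predicate for s, p, _o in facts)


# Ordered list of (label, check); each check returns True when satisfied.
CONSTRAINTS = [
    ("C1", lambda facts: all(has_fact(facts, x, "lives in") for x in persons(facts))),
    ("C6", lambda facts: all(has_fact(facts, x, "was born in") for x in persons(facts))),
]


def respond(facts, report_all):
    """Reply for a candidate database state under the chosen reporting mode."""
    violated = [label for label, check in CONSTRAINTS if not check(facts)]
    if not violated:
        return "Accepted."
    if not report_all:
        violated = violated[:1]
    return "Rejected. " + ", ".join(violated) + " violated."


candidate = {("Bernard", "is a brother of", "John")}
print(respond(candidate, report_all=False))  # Rejected. C1 violated.
print(respond(candidate, report_all=True))   # Rejected. C1, C6 violated.
```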

The CIP uses the conceptual schema to supervise updates of the database and to respond to questions. Think of the designer of the conceptual schema as the “law giver” and the schema itself as the “law book” containing the laws or ground rules for the UoD. The CIP is the “law enforcer”, as it ensures that these laws are adhered to whenever the user tries to update the database. Like any friendly police person, the CIP is also there to provide information on request.

Whenever we communicate with a person or an information system, we have in mind a particular universe of discourse. Typically, this concerns some small part of the real universe, such as a particular business. In rare cases, we might choose a fictional UoD (e.g., one populated by comic book characters) or perhaps a fantasy world we have invented for a novel that we are writing. Fictional worlds may or may not be (logically) possible.

You can often rely on your own intuitions as to what is logically possible. For instance, a world in which the Moon is colored green is possible, but a world in which the Moon is simultaneously green all over and red all over is not. A possible world is said to be consistent and an impossible world is inconsistent.

As humans, we carry prodigious amounts of information around in our minds. It is highly likely that somewhere in our personal web of beliefs some logical contradictions are lurking. In most cases it does not matter if our belief web is globally inconsistent. When reasoning about a particular UoD, however, consistency is essential. It is easy to show that once you accept a logical inconsistency, it is possible to deduce anything (including rubbish) from it, an extreme case of the GIGO (Garbage In Garbage Out) principle.
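One standard way to show this is the following short derivation, written here in LaTeX as a sketch; P and Q stand for arbitrary propositions and are not symbols used elsewhere in this section. From a contradiction (both P and its negation), any proposition Q follows:

```latex
\begin{align*}
1.\;& P          && \text{premise} \\
2.\;& \lnot P    && \text{premise} \\
3.\;& P \lor Q   && \text{from 1, by disjunction introduction} \\
4.\;& Q          && \text{from 2 and 3, by disjunctive syllogism}
\end{align*}
```

Since Q was arbitrary, an inconsistent set of beliefs licenses every conclusion, which is why a schema containing contradictory constraints is useless as a filter.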

There are basically two types of garbage: logical and factual. Inconsistent designs contain logical garbage. For example, we might declare two constraints that contradict one another. A good design method supported by a CASE tool that enforces metarules can help to avoid such problems. Many factual errors can be prevented by the enforcement of constraints on the database. For example, a declared constraint that “Each Person was born in at most one Country” prevents us from assigning a person more than one birth country.

Even if the schema is consistent, and the CIP checks that the database is consistent with this world design, it is still possible to add false data into the knowledge base. For example, if we tell the CIP that Einstein was born in France it might accept this even though in the actual world Einstein was born in Germany. If we want our knowledge base to remain factually correct, it is still our responsibility to ensure that all the sentences we enter into the database express propositions that are true of the actual world.

The following exercise provides practice with the concepts discussed in this section and introduces some further constraint types treated formally later.

Exercise 2.2

    1. Assuming the conceptual schema is already stored, what are the two main functions of the conceptual information processor?

    2. What are the three main components of the conceptual schema?

    3. “The CIP will reject a compound transaction if any of its component update operations is inconsistent with the conceptual schema”. True or false?

  1. Assume the following conceptual schema is stored. Constraints apply to each database state. C1 means that each person referred to in the database must have his/her fitness rating recorded there. C3 says the possible fitness values are whole numbers from 1 to 10. C4 means no person can be recorded as expert at more than one sport, and C5 says a person can be recorded as being an expert at a sport only if that person is also recorded as playing the same sport.

    Reference schemes:  Person(.firstname); Sport(.name); FitnessRating(.nr)

    Base fact types:    F1  Person has FitnessRating.
                        F2  Person plays Sport.
                        F3  Person is expert at Sport.

    Constraints:        C1  Each Person has some FitnessRating.
                        C2  Each Person has at most one FitnessRating.
                        C3  The possible values of FitnessRating are 1 to 10.
                        C4  Each Person is expert at at most one Sport.
                        C5  Each Person who is expert at some Sport also plays that Sport.

    Derivation rules:   D1  Person is a martial artist if Person plays Sport ‘judo’ or Person plays Sport ‘karatedo’.
                        D2  nrPlayers of Sport = count each Person who plays that Sport.

    The database is initially empty. The user now attempts the following sequence of updates and queries. For each update, circle the item number if the update is accepted (based on the cumulative state of the database). In cases of rejection, supply a reason (e.g., state which part of the schema is violated). For queries, supply an appropriate response from the CIP.

    1. add: Person ‘Ann’ has FitnessRating 9.

    2. add: Person ‘Fred’ plays Sport ‘tennis’.

    3. add: Person ‘Bob’ has FitnessRating 7.

    4. add: Person ‘Ann’ has FitnessRating 8.

    5. add: Person ‘Chris’ has FitnessRating 7.

    6. add: Person ‘Fred’ has FitnessRating 15.

    7. add: Person ‘Ann’ plays Sport ‘judo’.

    8. add: Person ‘Bob’ is expert at Sport ‘soccer’.

    9. add: Person ‘Ann’ is expert at Sport ‘judo’.

    10. add: Person ‘Ann’ programs in Language ‘SQL’.

    11. add: Person ‘Ann’ plays Sport ‘soccer’.

    12. add: Person ‘Chris’ plays Sport ‘karatedo’.

    13. del: Person ‘Chris’ has FitnessRating 7.

    14. begin
         add: Person ‘Bob’ has FitnessRating 8;
         del: Person ‘Bob’ has FitnessRating 7.
       end

    15. add: Person ‘Ann’ is expert at Sport ‘soccer’.

    16. add: Person ‘Bob’ plays Sport ‘soccer’.

    17. Person ‘Ann’ plays Sport ‘judo’?

    18. list each Person who plays Sport ‘karatedo’.

    19. nrPlayers of Sport ‘soccer’?

    20. list each Person who is a martial artist.

    21. list possible values of FitnessRating.

    22. what is the meaning of life?

  2. In the following schema, constraints apply to the database, not necessarily to the real world. Although each student in reality has a marital status, it is optional to record it (e.g., some students may wish to keep their marital status private). Constraint C1 combines two weaker constraints, since “exactly one” means “at least one (i.e., some), and at most one”.

    Reference schemes:  Student(.firstname); Degree(.code); MaritalStatus(.name)

    Base fact types:    F1  Student is enrolled in Degree
                        F2  Student has MaritalStatus

    Constraints:        C1  Each Student is enrolled in exactly one Degree.
                        C2  Each Student has at most one MaritalStatus.
                        C3  The possible values of MaritalStatus are ‘single’, ‘married’, ‘widowed’, ‘divorced’.
                        C4  MaritalStatus transitions: (“1” = “allowed”)

    From \ To     single   married   widowed   divorced
    single          0         1         0         0
    married         0         0         1         1
    widowed         0         1         0         0
    divorced        0         1         0         0

    The database is initially empty. The user attempts the following sequence of updates and queries. For each update, circle the item number if the update is accepted; if rejected, supply a reason. Assume questions are legal, and supply an appropriate response.

    1. add: Student ‘Fred’ is enrolled in Degree ‘BSc’.

    2. add: Student ‘Sue’ has MaritalStatus ‘single’.

    3. begin
         add: Student ‘Sue’ has MaritalStatus ‘single’.
         add: Student ‘Sue’ is enrolled in Degree ‘MA’.
       end

    4. add: Student ‘Fred’ is enrolled in Degree ‘BA’.

    5. add: Student ‘Fred’ is studying Subject ‘CS112’.

    6. list possible values of MaritalStatus.

    7. add: Student ‘Bob’ is enrolled in Degree ‘BSc’.

    8. add: Student ‘Sue’ has MaritalStatus ‘married’.

    9. begin
         del: Student ‘Sue’ has MaritalStatus ‘single’.
         add: Student ‘Sue’ has MaritalStatus ‘married’.
       end

    10. add: Student ‘Bob’ has MaritalStatus ‘single’.

    11. begin
          del: Student ‘Bob’ has MaritalStatus ‘single’.
          add: Student ‘Bob’ has MaritalStatus ‘divorced’.
        end

    12. Student ‘Sue’ is enrolled in Degree ‘BSc’?

    13. list each Student who is enrolled in Degree ‘BSc’.

    14. list each Student who is enrolled in Degree ‘MA’.

    15. add: 3 students are enrolled in Degree ‘BE’.

    What is the final state of the database?

  3. Assume the following conceptual schema.

    Reference schemes:  Person(.firstname)

    Base fact types:    F1  Person is male.
                        F2  Person is female.
                        F3  Person is a parent of Person.

    Constraints:        C1  Each Person is male or is female.
                        C2  No Person is male and is female.
                        C3  -- Each person has at most 2 parents
                            Each Person2 instance occurs at most 2 times in Person1 is a parent of Person2.
                        C4  No Person is a parent of itself.

    Derivation rules:   D1  Person1 is a grandparent of Person2 if Person1 is a parent of some Person3 who is a parent of Person2.

    Assume the database is populated with the following data. The user now attempts the following sequence of updates and queries. Indicate the CIP’s response in each case.

    Males:    David, Paul, Terry
    Females:  Alice, Chris, Linda, Norma, Selena

    1. add: Person ‘Jim’ is male.

    2. add: Person ‘Bernie’ is a parent of Person ‘Terry’.

    3. begin
         Person ‘Terry’ is a parent of Person ‘Selena’.
         Person ‘Norma’ is a parent of Person ‘Selena’.
      end

    4. add: Person ‘David’ is a parent of Person ‘David’.

    5. begin
         Person ‘Norma’ is a parent of Person ‘Paul’.
         Person ‘Alice’ is a parent of Person ‘Terry’.
      end

    6. add: Person ‘Chris’ is male.

    7. add: Person ‘Chris’ is a parent of Person ‘Selena’.

    8. what Person is a grandparent of Person ‘Selena’?

    Formulate your own derivation rules for the following:

    (i) X is a father of Y; (j) X is a daughter of Y; (k) X is a granddaughter of Y

  4. Consider the following conceptual schema:

    Ref. schemes:       Employee(.nr); Department(.name); Language(.name)

    Base fact types:    F1  Employee works for Department.
                        F2  Employee speaks Language.

    Constraints:        C1  Each Employee works for some Department.
                        C2  Each Employee works for at most one Department.
                        C3  Each Employee speaks some Language.

    1. Provide an update sequence to add the facts that employees 101 and 102, who both speak English, work for the Health department.

    2. Invent some database populations that are inconsistent with the schema.
