18. The Object-Relational Data Model
The relational data model has been a mainstay of business data processing for nearly 30 years. Nothing has superseded it in the way the relational data model superseded the simple network data model. However, a newer data model—the object oriented data model1—has come into use as an alternative for some types of navigational data processing.
1To be completely accurate, the relational data model is the only data model that has a formal specification. The hierarchical data model and the OO data model do not. The closest thing the simple network data model has is the CODASYL specifications.
This chapter presents an overview of some object-oriented concepts for readers who aren't familiar with the object-oriented paradigm. (Chapter 19 looks at the SQL standard's support for object-oriented structures.) If you have object-oriented programming experience, then you can skip over the first parts of this chapter and begin reading with the section Pure Object-Oriented Databases.
The object-oriented paradigm was the brainchild of Dr. Kristen Nygaard, a Norwegian who was attempting to write a computer program to model the behavior of ships, tides, and fjords. He found that the interactions were extremely complex and realized that it would be easier to write the program if he separated the three types of program elements and let each one model its own behavior against each of the others.
The object-oriented programming languages in use today (most notably C++, Java, and SmallTalk) are a direct outgrowth of Nygaard's early work. The way in which objects are used in databases today is an extension of object-oriented programming.
Note: This is in direct contrast to the relational data model, which was designed specifically to model data relationships, although much of its theoretical foundations are found in mathematical set theory.
Getting Started: Object-Orientation without Computing
To understand the role of objects in relational databases, you must first understand the object-oriented paradigm as it is used in object-oriented programming and pure object-oriented databases. The easiest way to do so is to begin with an example that has absolutely nothing to do with programming at all.
Assume that you have a teenage daughter (or sister, whichever is more appropriate) named Jane and that your family is going to take a long car trip. Like many teens, Jane is less than thrilled about a trip with the family and in particular with spending so much time with her 12-year-old brother. In self-defense, Jane needs something to keep her brother busy so he won't bother her as she reads while her parents are driving. She therefore decides to write up some instructions for playing solitaire games for him.
The first set of instruction is for the most common solitaire game, Klondike. As you can see in Figure 18-1, the deal involves seven piles of cards of increasing depth, with the top card turned over. The rest of the deck remains in the draw pile. Jane decides to break the written instructions into two main parts: information about the game and questions her brother might ask. She therefore produces instructions that look something like Figure 18-2. She also attached the illustration of the game's deal.
B9780123756978500182/f18-01-9780123756978.jpg is missing
Figure 18-1
The starting layout for Klondike
B9780123756978500182/f18-02-9780123756978.jpg is missing
Figure 18-2
Instructions for playing Klondike
The next game she tackles is Canfield. Like Klondike, it is played with one deck, but the deal and play are slightly different (see Figure 18-3). Jane uses the same pattern for the instructions as she did for Klondike because it cuts down the amount of writing she has to do (see Figure 18-4).
B9780123756978500182/f18-03-9780123756978.jpg is missing
Figure 18-3
The starting deal for Canfield
B9780123756978500182/f18-04-9780123756978.jpg is missing
Figure 18-4
Instructions for playing Canfield
And finally, just to make sure her brother doesn't get too bored, Jane prepares instructions for Forty Thieves (see Figure 18-5). This game uses decks of cards and plays in a very different way from the other two games (see Figure 18-6). Nonetheless, preparing the instructions for the third game is fairly easy because she has the template for the instructions down pat.
B9780123756978500182/f18-05-9780123756978.jpg is missing
Figure 18-5
The starting layout for Forty Thieves
B9780123756978500182/f18-06-9780123756978.jpg is missing
Figure 18-6
Instructions for playing Forty Thieves
After completing three sets of instructions, it becomes clear to Jane that having the template for the instructions makes the process extremely easy. Jane can use the template to organize any number of sets of instructions for playing solitaire. All she has to do is make a copy of the template and fill in the values for the information about the game.
Basic OO Concepts
The object-oriented paradigm shares some characteristics with the Entity-Relationship model used for database design. However, OO extends the idea of relationships between entities by adding actions that the entities can perform.
Objects
If someone were writing an object-oriented computer program to manage the instructions for playing solitaire, each game would be known as an object. It is a self-contained element used by the program. It has things that it knows about itself: its name, an illustration of the layout, the number of decks needed to play, how to deal, how to play, and how to determine when the game is won. In object-oriented terms, the values that an object stores about itself are known as attributes or variables or, occasionally, properties.
Each solitaire game object also has some things it knows how to do: explain how to deal, explain how to play, explain how to identify a win, and so on. In object-oriented programming terminology, actions that objects know how to perform are called methods, services, functions, procedures, or operations.
Note: It is unfortunate, but there is no single accepted terminology for the object-oriented paradigm. Each programming language or DBMS chooses which terms it will use. You therefore need to recognize all of the terms that might be used to describe the same thing.
An object is very security-minded. It typically keeps the things it knows about itself private and releases that information only through a method whose purpose is to share data values (an accessor method). For example, a user or program using one of the game objects cannot access the contents of the Dealing variable directly. Instead, the user or program must execute the How Do I Deal the Game? method to see that data.
Objects also keep private the details of the procedures for the things they know how to do, but they make it easy for someone to ask them to perform those actions. Users or programs cannot see what is inside the methods. They see only the result of the method. This characteristic of objects is known as information hiding or encapsulation.
An object presents a public interface to other objects that might use it. This provides other objects with a way to ask for data values or for actions to be performed. In the example of the solitaire games, the questions that Jane's little brother can ask are a game's public interface. The instructions below each question represent the procedure to be used to answer the question. A major benefit of data encapsulation is that as long as the object's public interface remains the same, you can change the details of the object's methods without needing to inform any other objects that might be using those methods. For example, the card game objects currently tell the user to “read” the contents of an attribute. However, there is no reason that the methods couldn't be changed to tell the user to “print” the contents of an attribute. The user would still access the method in the same way, but the way in which the method operates would be slightly different.
An object requests data or an action by sending a message to another object. For example, if you were writing a computer program to manage the instructions for solitaire games, the program (an object in its own right) could send a message to the game object asking the game object to display the instructions for dealing the game. Because the actual procedures of the method are hidden, your program would ask for the instruction display and then you would see the instructions on the screen. However you would not need to worry about the details of how the screen display was produced. That is the job of the game object rather than the object that is asking the game to do something.
An object-oriented program is made up of a collection of objects, each of which has attributes and methods. The objects interact by sending messages to one another. The trick, of course, is figuring out which objects a program needs and the attributes and methods those objects should have.
Classes
The template on which the solitaire game instructions are based is the same for each game. Without data it might be represented as in Figure 18-7. The nice thing about this template is that it provides a consistent way of organizing all the characteristics of a game. When you want to create the instructions for another game, you make a copy of the template and “fill in the blanks.” You write the data values for the attributes. The procedures that make up the answers to the questions someone might ask about the game have already been completed.
B9780123756978500182/f18-07-9780123756978.jpg is missing
Figure 18-7
The solitaire game instruction template
In object-oriented terminology, the template on which similar objects like the solitaire game objects are based is known as a class. When a program creates an object from a class, it provides data for the object's variables. The object can then use the methods that have been written for its class. All of the objects created from the same class share the same procedures for their methods. They also have the same types of data, but the values for the data may differ, for example, just as the names of the solitaire games are different.
A class is also a data type. In fact, a class is an implementation of what is known as an abstract data type, which is just another term for a user-defined data type. The implication of a class being a data type is that you can use a class as the data type of an attribute in a relation.
Suppose, for example, you were developing a class to handle data about the employees in your organization. The attributes of the class might include the employee ID, the first name, the last name, and the address. The address itself is made up of a street, city, state, and zip. Therefore, you would probably create an address class with those attributes and then, rather than duplicating those attributes in the employee class, simply indicate that an object of the employee class will include an object created from the address class to contain the employee's address.
Types of Classes
There are three major types of classes used in an object-oriented program:
Control classes: Control classes neither manage data nor have visible output. Instead, they control the operational flow of a program. For example, application classes represent the programs themselves. In most cases, each program creates only one object from an application class. The application class's job includes starting the execution of the program, detecting menu selections (or other user interface events), and executing the correct program code to satisfy the user's requests.
Entity classes: Entity classes are used to create objects that manage data. The solitaire game class, for example, is an entity class. Classes for people, tangible objects, and events (for example, business meetings) are entity classes. Most object-oriented programs have at least one entity class from which many objects are created. In fact, in its simplest sense, the object-oriented data model is built from the representation of relationships among objects created from entity classes.
Interface classes: Interface classes handle the input and output of information. For example, if you are working with a graphic user interface, then each window and menu used by the program is an object created from an interface class.
In an object-oriented program, entity classes do not do their own input and output (I/O). Keyboard input is handled by interface objects that collect data and send it to entity objects for storage and processing. Screen and printed output is formatted by interface objects that get data for display from entity objects. When entity object become part of a database, the DBMS takes care of the file I/O; the rest of the I/O is handled by application programs or DBMS utilities.
Why is it so important to keep data manipulation separate from I/O? Wouldn't it be simpler to let the entity object manage its own I/O? It might be simpler in the short run, but if you decided to change a screen layout, you would need to modify the entity class. If you keep them separate, then data manipulation procedures are independent of data display. You can change one without affecting the other. In a large program, this not only can save you a lot of time, but also can help you avoid programming errors. In a database environment, the separation of I/O and data storage becomes especially critical because you do not want to modify data storage each time you decide to modify the look and feel of a program.
Many object-oriented programs also use a fourth type of class: a container class. Container classes exist to “contain,” or manage, multiple objects created from the same class. Because they gather objects together, they are also known as aggregations. For example, if you had a program that handled the instructions for playing solitaire, then that program would probably have a container class that organized all the individual card game objects. The container class would keep the objects in some order, list them for you, and probably search through them as well. Many pure object-oriented DBMSs require container classes, known as extents, to provide access to all objects created from the same class. However, as you will see, container classes are not used when objects are integrated into a relational database.
Types of Methods
Several types of methods are common to most classes, including the following:
Constructors: A constructor is a method that has the same name as the class. It is executed whenever an object is created from the class. A constructor therefore usually contains instructions to initialize an object's variables in some way.
Destructors: A destructor is a method that is executed when an object is destroyed. Not all object-oriented programming languages support destructors, which are usually used to release system resources (for example, main memory allocated by the object). Java in particular does not use destructors.
Accessors: An accessor, also known as a get method, returns the value of a private attribute to another object. This is the typical way in which external objects gain access to encapsulated data.
Mutators: A mutator, or set method, stores a new value in an attribute. This is the typical way in which external objects can modify encapsulated data.
The remaining methods defined for a class depend on the specific type of class and the specific behaviors it needs to perform.
Method Overloading
One of the characteristics of a class is its ability to contain over-loaded methods, methods that have the same name but require different data to operate. Because the data are different, the public interfaces of the methods are distinct.
As an example, assume that a human relations program has a container class named AllEmployees that aggregates all objects created from the Employee class. Programs that use the AllEmployees class create one object from the class and then relate all employee objects to the container using some form of program data structure.
To make the container class useful, there must be some way to locate specific employee objects. You might want to search by the employee ID number, by first and last name, or by telephone number. The All Employees class therefore contains three methods named “find.” One of the three requires an integer (the employee number) as input, the second requires two strings (the first and last name), and the third requires a single string (the phone number). Although the methods have the same name, their public interfaces are different because the combination of the name and the required input data is distinct.
Many classes have overloaded constructors. One might accept interactive input, another might read input from a file, and a third might get its data by copying data from another object (a copy constructor). For example, most object-oriented environments have a Date class that supports initializing a date object with three integers (day, month, year), the current system date, another Date object, and so on.
The benefit of method overloading is that the methods present a consistent interface to the programmer. In the case of our example of the All Employees container class, whenever a programmer wants to locate an employee, he or she knows to use a method named “find.” Then the programmer just uses whichever of the three types of data he or she happens to have. The object-oriented program locates the correct method by using the entire public interface (its signature), made up of the name and the required input data.
Class Relationships
The classes in an object-oriented environment aren't always independent. The basic object-oriented paradigm has two major ways to relate objects, distinct from any logical data relationships that might be included in a pure object-oriented database: inheritance and composition.
Inheritance
As a developer or database designer is working on an object-oriented project, he or she may run into situations where there is a need for similar—but not identical—classes. If these classes are related in a general to specific manner, then the developer can take advantage of one of the major features of the object-oriented paradigm, known as inheritance.
Inheriting Attributes
To see how inheritance works, assume that you are writing a program (or developing a database) to manage a pet shop. One of the entity classes you will use is Animal, which will describe the living creatures sold by the shop. The data that describe objects created from the Animal class include the English and Latin names of the animal, the animal's age, and the animal's gender. However, the rest of the data depend on what type of animal is being represented. For example, for reptiles, you want to know the length of the animal, but for mammals, you want to know the weight. And for fish, you don't care about the weight or length, but you do want to know the color. All the animals sold by the pet shop share some data, yet have pieces of data that are specific to certain subgroups.
You could diagram the relationship as in Figure 18-8. The Animal class provides the data common to all types of animals. The subgroups—Mammals, Reptiles, and Fish—add the data specific to themselves. They don't need to repeat the common data because they inherit them from animals. In other words, Mammals, Reptiles, and Fish all include the four pieces of data that are part of Animal.
B9780123756978500182/f18-08-9780123756978.jpg is missing
Figure 18-8
The relationship of classes for an object-oriented environment for a pet shop
If you look closely at Figure 18-8, you'll notice that the lines on the arrows go from the subgroups to Animal. This is actually contrary to what is happening: The data from Animal are flowing down the lines into the subgroups. Unfortunately, the direction of the arrows is dictated by convention, even though it may seem counterintuitive.
In object-oriented terminology, the subgroups are known as subclasses or derived classes. The Animal class is a superclass or base class.
The trick to understanding inheritance is to remember that subclasses represent a more specific occurrence of their superclass. The relationships between a base class and its derived classes therefore can be expressed using the phrase “is a”:
◊ A mammal is an animal.
◊ A reptile is an animal.
◊ A fish is an animal.
If the “is a” phrasing does not make sense in a given situation, then you are not looking at inheritance. As an example, assume that you are designing an object-oriented environment for the rental of equipment at a ski rental shop. You create a class for a generic merchandise item and then subclasses for the specific types of items being rented, as in the top four rectangles in Figure 18-9. Inheritance works properly here because skis are a specific type of merchandise item, as well as boots and poles.
B9780123756978500182/f18-09-9780123756978.jpg is missing
Figure 18-9
Inheritance and no inheritance in an object-oriented environment for a ski equipment rental
However, you run into trouble when you begin to consider the specific items being rented and the customer doing the renting (the renter). Although there is a logical database-style relationship between a renter and an item being rented, inheritance does not work because the “is a” test fails. A rented item is not a renter!
The situation with merchandise items and rental inventory is more complex. The Merchandise Item, Skis, Boots, and Poles classes represent description of types of merchandise but not physical inventory. For example, the ski shop may have many pairs of one type of ski in inventory and many pairs of boots of the same type, size, and width. Therefore, what is being rented is individual inventory items, represented by the Item Rented class. A given inventory item is either skis, boots, or poles. It can only be one, not all three as shown in Figure 18-9. Therefore, an item rented is not a pair of skis, a pair of boots, or a set of poles. (You also have the problem of having no class that can store the size or length of an item.)
The solution to the problem is to create a separate rented item class for each type of merchandise, as in Figure 18-10. When you are looking at this diagram, be sure to pay attention to the direction of the arrows. The physical layout of the diagram does not correspond to the direction of the inheritance. Remember that, by convention, the arrows point from the derived class to the base class.
B9780123756978500182/f18-10-9780123756978.jpg is missing
Figure 18-10
Multiple inheritance in the data environment for a skishop
The Ski Item class inherits information about the type of item it is from the Skis class. It also inherits information about an item being rented from the Item Rented class. A ski item “is a” pair of skis; a ski item “is a” rented item as well. Now the design of the classes passes the “is a” test for appropriate inheritance. (Note that it also gives you a class that can contain information such as the length and size of a specific inventory item.) The Renter class does not participate in the inheritance hierarchy at all.
Multiple Inheritance
When a class inherits from more than one base class, you have multiple inheritance. The extent to which multiple inheritance is supported in programming languages and DBMSs varies considerably from one product to another.
Abstract Classes
Not every class in an inheritance hierarchy is necessarily used to create objects. For example, in Figure 18-10 it is unlikely that any objects are ever created from the Merchandise Item or Item Rented classes. These classes are present simply to provide the common attributes and methods that their derived classes share.
Such classes are known as abstract, or virtual, classes. In contrast, classes from which objects are created are known as concrete classes.
Note: Many computer scientists use the verb “instantiate” to mean “creating an object from a class.” For example, you could say that abstract classes are never instantiated. However, I find that term rather contrived (although not quite as bad as saying “we will now motivate the code” to mean “we will now explain the code”) and prefer to use the more direct “create an object from a class.”
Inheriting Methods: Polymorphism
In general, methods are inherited by subclasses from their superclass. A subclass can use its base class's methods as its own. However, in some cases it may not be possible to write a generic method that can be used by all subclasses. For example, assume that the ski rental shop's Merchandise Item class has a method named print Catalog Entry, the intent of which is to print a properly formatted entry for each distinct type of merchandise item. The subclasses of Merchandise Item, however, have attributes not shared by all subclasses and the print Catalog Entry method therefore must work somewhat differently for each subclass.
To solve the problem, the ski rental shop can take advantage of polymorphism, the ability to write different bodies for methods of the same name that belong to classes in the same inheritance hierarchy. The Merchandise Item class includes a prototype for the print Catalog Entry method, indicating just the method's public interface. There is no body for the method, no specifications of how the method is to perform its work (a virtual method). Each subclass then redefines the method, adding the program instructions necessary to execute the method.
The beauty of polymorphism is that a programmer can expect methods of the same name and same type of output for all the subclasses of the same base class. However, each subclass can perform the method according to its own needs. Encapsulation hides the details from all objects outside the class hierarchy.
Note: It is very easy to confuse polymorphism and overloading. Just keep in mind that overloading applies to methods of the same class that have the same name but different signatures, whereas polymor- phism applies to several subclasses of the same base class that have methods with the same signature but different implementations.
Composition
Inheritance can be described as a general-specific relationship. In contrast, composition is a whole-part relationship. It specifies that one class is a component of another and is often read as “has a.”
To help you understand how composition can be used, let's assume that the ski rental shop wants to offer packages of items for rent (skis, boots, and poles). The packages will come in three qualities—good, better, and best—based on the retail value of the items in the package.
As you can see in Figure 18-11, each package contains three types of merchandise items, so the package class “has a” boot, “has a” pole, and “has a” ski. An object created from this class would be used to indicate which types of items could be rented as a bundle. In contrast, the rented package class contains actual rental items and therefore indicates which specific inventory items have been rented together.
B9780123756978500182/f18-11-9780123756978.jpg is missing
Figure 18-11
Composition
Some pure object-oriented DBMSs take composition to the extreme. They provide simple data types such as integers, real numbers, characters, and Booleans. Everything else in the database—even strings—is built by creating classes from these simple data types and using those classes to build more complex classes, and so on.
Benefits of Object-Orientation
There are several reasons why the object-oriented paradigm has become so pervasive in programming. Among the perceived benefits are the following:
◊ An object-oriented program consists of modular units that are independent of one another. These units can therefore be reused in multiple programs, saving development time. For example, if you have a well-debugged employee class, you can use it in any of your business programs that require data about employees.
◊ As long as a class's public interface remains unchanged, the internals of the class can be modified as needed without requiring any changes to the programs that use the class. This can significantly speed up program modification. It can also make program modification more reliable, as it cuts down on many unexpected side effects of program changes.
◊ An object-oriented program separates the user interface from data handling, making it possible to modify one independent of the other.
◊ Inheritance adds logical structure to a program by relating classes in a general to specific manner, making the program easier to understand and therefore easier to maintain.
Where Objects Work Better Than Relations
There are some database environments—especially those involving a great deal of inheritance—in which object-orientation is easier to implement than a relational design. To see why this is so, let's look at a hobby environment that just happens to be one of the best examples of the situation in question that I've ever encountered.
The database catalogs Yu-Gi-Oh cards (one of those animé related trading card games). The collector for whom this database and its application were developed has thousands of cards, some of which are duplicates. They are stored in three binders. Within each binder there may be several sections; the pages are numbered within each section.
There are three major types of cards: monsters, spells, and traps. The monster card in Figure 18-12 is fairly typical. Notice that it has a title, an “attribute” at the top right, a “level” (count the circles below the name), an “edition” (first, limited, or other), a set designation, a type (and optionally two subtypes), a description, attack points, and defense points. At the bottom left, there may be a code number, which is missing from some of the early cards.
B9780123756978500182/f18-12-9780123756978.jpg is missing
Figure 18-12
A typical Yu-Gi-Oh monster card
A card with the same name may appear in many sets and the same set may have more than one card of the same name. What distinguishes them is their “rarity,” determined by how the title is printed (black or white for common cards and silver or gold for rare cards) and how the image is printed (standard color printing or holofoil printing). There are a variety of combinations of printing to generate common, rare, super rare, ultra rare, and ultimate rare cards.
Note: If you want an interesting challenge before you see the relational design for this database, try to figure out the primary key for a card!
Most cards can be used for game play, but some have been banned from specific types of games. Others have caveats (“rulings”) attached to them by the game's governing board that affect how the card can be used in a game.
Spell cards, which as you might expect can be used in the game to cast spells, share a number of attributes with the monster card, but don't have things such as type and subtypes. The third type of card, a trap, also shares some attributes with monsters, but is missing others and has a property that is unique to this type of card. Spells also have properties, but the list of possible properties differs between spells and traps.
You can find an entity-relationship diagram for the card database in Figure 18-13. As you might have guessed, there is an entity for the card, which has three subclasses, one for each specific type of card. There are also many holdings for each card.
B9780123756978500182/f18-13-9780123756978.jpg is missing
Figure 18-13
The ER diagram for the card database
To design the relational database, we create one relation for each entity, including the superclass (in this example, Card) and its subclasses (Monster card, Trap card, and Spell card). With an object-oriented DBMS, we would create objects only from the subclasses; no object would ever be created from the superclass. The subclasses “inherit” the attributes of their parent. For the relational database, we have to do that manually, using some type of primary key-foreign key relationship to connect the subclass tables to the parent table. Differences in where cards of a given name appear and how they are printed are handled by Holdings. Therefore, the design of the card database looks like this:
Card (InternalCardNUMB, Attribute, Banned?, CardDescription?, CardImage, CardName, CardNumber, CardType, Count, Limit, Ruling)
Monster card (InternalCardNumb, ATK, DEF, Level, MonsterSubtype1, MonsterSubtype2, MonsterType)
Trap card (InternalCardNumb, TrapType)
Spell card (InternalCardNumb, SpellType)
Holdings (InternalCardNumb, Code, Edition, Holofoil?, NamePrint, NumberOwned, Binder, Page, Section, Slot)
Why have both the trap and spell card relations if they have exactly the same attributes? At the current time they could certainly be maintained in one relation. However, there are several reasons to keep them separate. First, there is no way to guarantee that they will always have the same attributes. If they are separated from the start, it will be easier to add attributes to one or the other if needed at some later date.
Second, the major reason this type of design doesn't perform as well as it might is because the details about a card always need to be looked up in another relation, joining on the internal card number. If we keep spell and trap data separate, the relations will remain smaller and the joins will perform better.
Note: Here's the answer to the primary key challenge: A Holding actually represents one or more physical cards in the inventory. It has a name (represented by the internal card number) and a set designation. When cards of the same name are printed in dif- ferent ways in the same set, they have different set designations. Therefore, the concatenation of the internal card number and the set designation uniquely identifies a card (although not a physical card in the inventory, given that there may be duplicate cards in the inventory). The only other alternative is to assign unique in- ventory numbers to each physical card and to use them. For some collectors, this may make sense, given that they would want to track the condition of each and every card.
There is an alternative design for this type of database: Rather than use one relation for each entity, create only a single relation in which many attributes may be null, depending on the type of card. Such a relation might look like this:
Card (InternalCardNumb, Attribute, Banned?, CardDescription, CardImage, CardName, CardNumber, CardType, Count, Limit, Ruling, ATK, DEF, Level, MonsterSubtype1, MonsterSubtype2, MonsterType, TrapType, spellType)
The Card Type attribute indicates which of the type-specific attributes should have data. For example, if CardType contained “M” for Monster, you would expect to find data in the ATK, DEF, and level attributes but not the spell type or trap type. The supposed benefit of this design is that you avoid the joins to combine the separate relations of the earlier design. However, when a DBMS retrieves a row from this relation, it pulls in the entire row, empty fields and all. Practically, in a performance sense, you haven't gained much and you're stuck with a design that can waste disk space.
Note: Personally, I prefer the multiple relation design because it's cleaner, wastes less space, and is much more flexible as the design of the relations needs to change over time.
A pure object-oriented design for the same database would include the five entity classes, although the Card class would be an abstract class. It would also include a class to aggregate all objects created from subclasses of the Card class, letting users handle all cards, regardless of type, as a single type of object. The nature of these data—the major need for inheritance— suggests that an object-oriented database may well perform better than a relational database.
Pure Object-Oriented Databases
A pure object-oriented database management system (OOD-BMS) defines all of its data using object-oriented data structures. There are no tables, but only objects and groups of objects. There are no required primary keys nor any foreign keys.
Representing Data Relationships
An object-oriented database indicates relationships by requiring each object to contain identifiers for the objects to which it is related. To make this work, when an object is created it is given a unique internal identifier. The user (a programmer or someone working interactively) never sees this identifier. In fact, it may be created from the information needed to locate the object's data in a physical database file.
Exactly which containers an object will have depends on the types of relationships in which it is participating:
◊ 1:1: Each object will have an attribute that stores the identifier for the object to which it is related.
◊ 1:M: The object at the “one” end of the relationship will have a container for the identifiers of the objects to which it is related. If the relationship is to support inverse access (“many-to-one”), each object at the “many” end of the relationship will have an attribute to hold the identifier of its parent object.
◊ M:M: A pure object-oriented database can represent a M:M relationship directly, without the use of a relationship object (a composite entity), something a relational database cannot do. (An OO design may still need the equivalent of a composite entity if it needs to store relationship data.) Each object will have a container for the identifiers for objects at the other end of the relationship.
The objects in an OO database are not required to have unique keys (the equivalent of a primary key). However, most classes are designed with a variable to hold a unique identifier to ensure retrieval of each specific object.
Navigating the Relationships
Object-oriented databases are what we call navigational. This means that access to related objects must follow the predefined linkages created by the containers for related objects. For example, to find all the purchases made by a customer, a program in an object-oriented database environment would do the following:
1. Find the customer object, perhaps using an aggregate object that collects all the customer objects.
2. Retrieve the first related-object identifier from the customer object.
3. Use the purchase object's identifier to locate the purchase object and process it as needed.
4. Retrieve the next related-object identifier from the customer object.
5. Repeat steps 3 and 4 until all purchase objects have been processed.
Relationships in a relational database are all two-way (inverse). In other words it is possible to go from the entity at the “one” end of a relationship to entities at the “many” end. It is also possible to take a single entity at the “many” end of a relationship and find its parent entity. However, not all OODBMSs require inverse relationships. When the database developer creates the schema, he or she must indicate which relationships will be inverse and which will be one-way. One-way relationships therefore mean that a relationship can be navigated in only one direction.
The Object-Relational Data Model
The object-relational (OR) data model—which is sometimes known as post-relational—is a combination of the relational data model that we discussed in Chapter 1 and some of the object-oriented concepts that—in the opinion of some database theorists and users—make up for shortcomings in the relational data model. The purpose of this discussion is to help you understand how OR designs differ from both pure OO designs and pure relational designs. With that in hand, you will be able to understand the strengths and weaknesses of SQL's support for object-related structures that are discussed in Chapter 19.
ER Diagrams for Object-Relational Designs
The style of ER diagramming that you saw in Chapter 1 (the Information Engineering, or IE, approach) does not lend itself to the inclusion of objects because it has no way to represent a class. Therefore, when we add objects to a relational database, we have to use another ERD style.
Although there are many techniques or object-oriented ERDs, one of the most commonly used is the Unified Modeling Language (UML). When used to depict a post-relational database design, UML looks a great deal like the IE style, but indicates relationships in a different way.
An example of an ER diagram using UML can be found in Figure 18-14. This design is of a purely object-oriented database and includes some elements that therefore won't appear in a hybrid design. It has been included here to give you an overview of UML so that you can better understand the portions of the modeling tool that we will be using in Chapter 19.
B9780123756978500182/f18-14-9780123756978.jpg is missing
Figure 18-14
An object-oriented database design using UML
The basic features of UML include the following:
◊ A regular class is represented by a rectangle, divided into three parts (name, attributes, procedures).
◊ An aggregate class (a metaclass in the diagram)—a class that collects all objects of a given class—is represented by a rectangle containing its name and the rectangles of the classes whose objects it aggregates. For example, in Figure 18-14, the Product and Source classes are within their aggregate classes, All Products and All Sources, respectively.
◊ Relationships between entities are shown with lines with plain ends. The cardinality of a relationship is represented as n, n..n, or n..*. For example, if the cardinality is 1, it is simply written as 1. If the cardinality is 0 or more, it appears as 0..*; 1 or more appears as 1..*. Notice in Figure 18-14 that there are several direct many-to-many relationships, shown with 0..* at either end of the association line.
◊ Inheritance is shown by a line with an open arrow pointing toward the base class. In Figure 18-14, the Footwear and Headgear classes have such arrows pointing toward Product.
◊ What we call composite entities in a relational database are known as association classes. They are connected to the relationship to which they apply with a dashed line. As you can see in Figure 18-14, the MaterialSupplied and MaterialUse classes are each connected to at least one many-to-many relationship by the required dashed line.
In addition to the basic features shown in Figure 18-14, UML diagrams can include any of the following:
◊ An attribute can include information about its visibility (public, protected, or private), data type, default value, and domain. In Figure 18-15, for example, you can see four classes and the data types of their attributes. Keep in mind that in an object-oriented environment, data types can be other classes. Therefore, the Source class uses an object of the TelephoneNumber class for its phoneNumber attribute and an object of the Address class for its sourceAddress attribute. In turn, source, Address, and TelphoneNumber all contain attributes that are objects of the String class.
B9780123756978500182/f18-15-9780123756978.jpg is missing
Figure 18-15
UML classes showing attribute data types
◊ Procedures (officially called operations by UML) can include their complete program signature and return data type. If you look at Figure 18-15, for example, you can see each operation's name followed by the type of data it requires to perform its job (parameters). Together, the procedure's name and parameters make up the procedure's signature. If data are returned by the operation, then the operation's signature is followed by a colon and the data type of the return value, which may be an object of another class or a simple data type such as an integer.
◊ Solid arrows can be used at the end of associations to indicate the direction in a relationship can be navigated.
Note: As mentioned earlier in this chapter, pure object-oriented databases are navigational, meaning that traversal through the database is limited to following predefined relationships. Because of this characteristic, some theorists feel that the object-oriented data model is a step backwards rather than forward and that the relational data model continues to have significant advantages over any navigational data model.
There are three possible ways to use the arrows:
Use arrows on the ends of all associations where navigation is possible. If an association has a plain end, then navigation is not possible in that direction. This would indicate, for example, a relationship between two objects that is not an inverse re- lationship, where only one of the two objects in a relationship contains the object identifier of a related object.
○ Show no arrows at all, as was done in Figure 18-15. In that case, the diagram provides no information about how the database can be navigated.
○ Show no arrows on associations that can be navigated in both directions, but use arrows on associations that can be navigated in only one direction. The drawback to this approach is that you cannot differentiate associations that can be navigated in both directions from associations that cannot be navigated at all.
◊ An association that ends in a filled diamond indicates a whole-part relationship. For example, if you were representing a spreadsheet in a database, the relationship between the spreadsheet and its cells could be diagrammed as in Figure 18-16. The filled diamond can also be used to show aggregation instead of placing one object within another as was done in Figure 18-14.
B9780123756978500182/f18-16-9780123756978.jpg is missing
Figure 18-16
A UML representation of a whole-part relationship
◊ When an association is between more than two objects, UML uses a diamond to represent the relationship. If an association is present, it will be connected to the diamond, as in Figure 18-17. The four classes in the illustration represent entities from a poetry reading society's database. A “reading” occurs when a single person reads a single poem that was written by one or more poets.
B9780123756978500182/f18-17-9780123756978.jpg is missing
Figure 18-17
The UML representation of a relationship between more than two classes
The association entity indicates when and where the reading took place.
Features of the OR Data Model
There is no accepted standard for the object-relational data model. However, a commonly used model is based on the elements supported by recent SQL standards. As you will see, these features violate many of the rules applied to relational databases:
◊ A relational database should have no data structures other than tables. An OR database, however, allows an attribute to support a reference to a row in another table. These references are the internal object identifiers used by OO databases described earlier in this chapter. An OR database can use the relational concept of a primary key-foreign key relationship to indicate entity relationships. However, it is not required; references to rows can be used instead. The advantage to using row references rather than the relational method is improved performance because the joins needed to follow data relationships are unnecessary. The major drawback to using row references is that the integrity of the relationships can't be verified by the DBMS; referential integrity requires both primary and foreign key values.
◊ A relational database is limited to one value at the intersection of a single column and row. An OR database, however, can store more than one value in the same location. The values can be an array of the same type of data, a row of data (much like a table within a table), an unordered collection of data of different data types, or an entire object.
◊ Classes are implemented as user-defined data types (UDTs). A new UDT may inherit from an existing UDT, although multiple inheritance is not allowed. A UDT will have default accessor and mutator methods as well as a default constructor, each of which can be overridden by a database programmer. There is nothing in the relational data model that prohibits UDTs. However, to be used in a relational database, a custom data type must hold only a single value.
◊ UDTs may have methods defined with them. Methods may be overloaded. Polymorphism is supported. Relational databases have no concept of storing procedures with data.
..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
3.133.155.216