12.3 Database Management Systems

Search in book...
Toggle Font Controls
Create new playlist

Name your new playlist

Playlist description (optional)
Sign In

Email address

Password

Forgot Password?

or

Continue with Facebook

Continue with Google
Sign Up

Full Name

Email address

Confirm Email Address

Password

or

Continue with Facebook

Continue with Google

12.3 Database Management Systems

Almost all sophisticated data management situations rely on an underlying database and the support structure that allows the user (either a human or a program) to interact with it. A database can simply be defined as a structured set of data. A database management system (DBMS) is a combination of software and data made up of three components:

The physical database—a collection of files that contain the data
The database engine—software that supports access to and modification of the database contents
The database schema—a specification of the logical structure of the data stored in the database

The database engine software interacts with specialized database languages that allow the user to specify the structure of data; add, modify, and delete data; and query the database to retrieve specific stored data.

The database schema provides the logical view of the data in the database, independent of how it is physically stored. Assuming that the underlying physical structure of the database is implemented in an efficient way, the logical schema is the more important point of view from the database user’s perspective because it shows how the data items relate to each other.

FIGURE 12.7 depicts the relationships among the various elements of a database management system. The user interacts with the database engine software to determine and/or modify the schema for the database. The user then interacts with the engine software to access and possibly modify the contents of the database stored on disk.

A figure shows the elements of the database engine software reading schema, user, and database files. — FIGURE 12.7 The elements of a database management system

The Relational Model

Several popular database management models have been proposed, but the one that has dominated for many years is the relational model. In a relational DBMS, the data items and the relationships among them are organized into tables. A table is a collection of records. A record is a collection of related fields. Each field of a database table contains a single data value. Each record in a table contains the same fields.

A record in a database table is also called a database object or an entity. The fields of a record are sometimes called the attributes of a database object.

As an example, consider the database table shown in FIGURE 12.8, which contains information about movies. Each row in the table corresponds to a record. Each record in the table is made up of the same fields in which particular values are stored. That is, each movie record has a MovieId, a Title, a Genre, and a Rating that contains the specific data for each record. A database table is given a name, such as Movie in this case.

A database table is shown for records and fields. — FIGURE 12.8 The Movie database table, made up of records and fields

Usually, one or more fields of a table are identified as key fields. The key field(s) uniquely identifies a record among all other records in the table. That is, the value stored in the key field(s) for each record in a table must be unique. In the Movie table, the MovieId field would be the logical choice for a key. That way, two movies could have the same title. Certainly the Genre and Rating fields are not appropriate key fields in this case.

Each value in the key field MovieId must be unique. Most DBMSs allow such fields to be automatically generated to guarantee unique entries. The key values do not have to be consecutive, however. The last three entries of the table contain radically different movie identification numbers. As long as they are unique values, the MovieId field can serve as the key.

The Movie table in Figure 12.8 happens to be presented in the order of increasing MovieId value, but it could have been displayed in other ways, such as alphabetical by movie title. In this case, there is no inherent relationship among the rows of data in the table. Relational database tables present a logical view of the data and have nothing to do with the underlying physical organization (how the records are stored on disk). Ordering records becomes important only when we query the database for particular values, such as all movies that are rated PG. At that point we might want to sort the results of the query by title.

The structure of the table corresponds to the schema it represents. That is, a schema is an expression of the attributes of the records in a table. We can express the schema for this part of the database as follows:

Movie (MovieId:key, Title, Genre, Rating)

Sometimes a schema representation indicates the type of data that is stored in individual fields, such as numeric or text. It may also indicate the specific set of values that are appropriate for a given field. For instance, the schema could indicate in this example that the Rating field can be only G, PG, PG-13, R, or NC-17. The schema for an entire database is made up of the individual schema that corresponds to individual tables.

Suppose we wanted to create a movie rental business. In addition to the list of movies for rent, we must create a database table to hold information about our customers. The Customer table in FIGURE 12.9 could represent this information.

A customer database table is shown. — FIGURE 12.9 The Customer database table, containing customer data

Similar to what we did with our Movie table, the Customer table contains a CustomerId field to serve as a key. The fact that some CustomerId values correspond to some MovieId values is irrelevant. Key values must be unique only within a table.

Universal Pro duct Code

When you look on the packaging of most products, you will find a Universal Product Code (UPC) and its associated bar code, such as the one shown to the right. UPC codes were created to speed up the process of purchasing a product at a store and to help keep better track of inventory.

A UPC symbol is made up of the machine-readable bar code and the corresponding human-readable 12-digit UPC number. The first six digits of the UPC number are the manufacturer identification number. For example, General Mills has a manufacturer ID number of 016000. The next five digits are the item number. Each type of product, and each different packaging of the same product, is assigned a unique item number. For example, a 2-liter bottle of Coke has a different item number than a 2-liter bottle of Diet Coke, and a 10-ounce bottle of Heinz ketchup has a different item number than a 14-ounce bottle of Heinz ketchup.

The last digit of the UPC code is the check digit, which allows the scanner to determine whether it scanned the number correctly. After reading the number, a calculation is performed on the rest of the digits of the number to determine the check digit. The result is then verified against the actual check digit. (See Chapter 18 for more information on check digits.)

A UPC symbol

For some products, particularly small ones, a technique has been developed to create UPC numbers that can be shortened by eliminating certain digits (all zeros). In this way, the entire UPC symbol can be reduced in size.

Note that a product’s price is not stored in the UPC number. When a product is scanned at a cash register—more formally called a point of sale (POS)—the manufacturer and item numbers are used to look up that item in a database. The database might contain a great deal of product information, including its price. Keeping only basic information in the UPC number makes it easy to change other information, such as the price, without having to relabel the products. Unfortunately, this flexibility also makes it easy to create situations of “scanner fraud” in which the database price of an item does not match the price on the store shelf, whether intentionally or not.

In a real database, we would be better off subdividing the Name field of our Customer table into FirstName and LastName fields. Also, we would probably use separate fields to hold various parts of a complete address, such as City and State. For our examples we are keeping things simple.

The Movie table and the Customer table show how data can be organized as records within isolated tables. The real power of relational database management systems, though, lies in the ability to create tables that conceptually link various tables together, as discussed in the next section.

Relationships

Recall that records represent individual database objects, and that fields of a record are the attributes of these objects. We can create a record to represent a relationship among objects and include attributes about the relationship in that record. In this way, we can use a table to represent a collection of relationships among objects.

Continuing our movie rental example, we need to be able to represent the situation in which a particular customer rents a particular movie. Because “rents” is a relationship between a customer and a movie, we can represent it as a record. The date rented and the date due are attributes of the relationship that should be in the record. The Rents table in FIGURE 12.10 contains a collection of these relationship records that represents the movies that are currently rented.

A rental database table is shown. — FIGURE 12.10 The Rents database table, storing current movie rentals

The Rents table contains information about the objects in the relationship (customers and movies), as well as the attributes of the relationship. It does not hold all of the data about a customer or a movie, however. In a relational database, we avoid duplicating data as much as possible. For instance, there is no need to store the customer’s name and address in the rental table—that data is already stored in the Customer table. When we need that data, we use the CustomerId stored in the Rents table to look up the customer’s detailed data in the Customer table. Likewise, when we need data about the movie that was rented, we look it up in the Movie table using the MovieId.

Note that the CustomerId value 103 is shown in two records in the Rents table. Its two appearances indicate that the same customer rented two different movies.

Data is modified in, added to, and deleted from our various database tables as needed. When movies are added or removed from the available stock, we update the records of the Movie table. As people become new customers of our store, we add them to the Customer table. On an ongoing basis, we add and remove records from the Rents table as customers rent and return videos.

Structured Query Language

Structured Query Language (SQL) is a comprehensive database language for managing relational databases. It includes statements that specify database schemas as well as statements that add, modify, and delete database content. In addition, as its name implies, SQL provides the ability to query the database to retrieve specific data.

The original version of SQL was Sequal, developed by IBM in the early 1970s. In 1986, the American National Standards Institute (ANSI) published the SQL standard, which serves as the basis for commercial database languages for accessing relational databases.

SQL is not case sensitive, so keywords, table names, and attribute names can be uppercase, lowercase, or mixed case. Spaces are used as separators in a statement. Because this is a specific programming language, we use a monospaced code font.

Queries

Let’s first focus on simple queries. The select statement is the primary tool for this purpose. The basic select statement includes a select clause, a from clause, and a where clause:

select attribute-list from table-list where condition

The select clause determines which attributes are returned. The from clause determines which tables are used in the query. The where clause restricts the data that is returned. For example:

select Title from Movie where Rating = ‘PG’

The result of this query is a list of all titles from the Movie table that have a PG rating. The where clause can be eliminated if no special restrictions are necessary:

select Name, Address from Customer

This query returns the name and address of all customers in the Customer table. An asterisk (*) can be used in the select clause to denote that all attributes in the selected records should be returned:

select * from Movie where Genre like ‘%action%’

This query returns all attributes of records from the Movie table in which the Genre attribute contains the word “action.” The like operator in SQL performs some simple pattern matching on strings, and the % symbol matches any string.

Select statements can also dictate how the results of the query should be sorted using the order by clause:

select * from Movie where Rating = ‘R’ order by Title

This query returns all attributes of R-rated movies sorted by the movie title.

SQL supports many more variations of select statements than are shown here. Our goal is simply to introduce the database concepts—you would require much more detail to become truly proficient at SQL queries.

Modifying Database Content

SQL’s insert, update, and delete statements allow the data in a table to be changed. The insert statement adds a new record to a table. Each insert statement specifies the values of the attributes for the new record. For example:

insert into Customer values (9876, ‘John Smith’, ‘602 Greenbriar Court’, ‘2938 3212 3402 0299’)

This statement inserts a new record into the Customer table with the specified attributes.

The update statement changes the values in one or more records of a table. For example:

update Movie set Genre = ‘thriller drama’ where title = ‘Unbreakable’

This statement changes the Genre attribute of the movie Unbreakable to “thriller drama.”

The delete statement removes all records from a table matching the specified condition. For example, if we wanted to remove all R-rated movies from the Movie table, we could use the following delete statement:

delete from Movie where Rating = ‘R’

As with the select statement, there are many variations of the insert, update, and delete statements.

Database Design

A database must be carefully designed from the outset if it hopes to fulfill its role. Poor planning in the early stages can lead to a database that does not support the required relationships.

One popular technique for designing relational databases is called entity-relationship (ER) modeling. Chief among the tools used for ER modeling is the ER diagram. An ER diagram captures the important record types, attributes, and relationships in a graphical form. From an ER diagram, a database manager can define the necessary schema and create the appropriate tables to support the database specified by the diagram.

FIGURE 12.11 presents an ER diagram showing various aspects of the movie rental example. Specific shapes are used in ER diagrams to differentiate among the various parts of the database. Types of records (which can also be thought of as classes for the database objects) are shown in rectangles. Fields (or attributes) of those records are shown in attached ovals. Relationships are shown in diamonds.

A figure shows an ER diagram for movie rental database. — FIGURE 12.11 An ER diagram for the movie rental database

The positions of the various elements of an ER diagram are not particularly important, though giving some thought to them will make the diagram easier to read. Note that a relationship such as Rents can have its own associated attributes.

Also note that the relationship connectors are labeled, one side with a 1 and the other side with an M. These designations show the cardinality constraint of the relationship. A cardinality constraint puts restrictions on the number of relationships that may exist at one time. Three general cardinality relationships are possible:

One-to-one
One-to-many
Many-to-many

The relationship between a customer and a movie is one-to-many. That is, one customer is allowed to rent many movies, but a movie can be rented by only a single customer (at any given time). Cardinality constraints help the database designer convey the details of a relationship.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.

Table of Contents for 12.3 Database Management Systems

Create new playlist

Sign In

Sign Up

Table of Contents for
12.3 Database Management Systems