Working with Relational Databases

The simple spy database does the job, but it is incomplete. Any self-respecting spymaster would keep the following information on each agent:

  • CodeName

  • Specialty (each spy could have several specialties)

  • Assignment

  • Assignment description

  • Location

It might be tempting to build a slightly bigger table to hold this information. Figure 11.27 illustrates such a table.

Figure 11.27. This version of the spy database has more information, but it also introduces a number of problems.


When you carefully analyze this version of the spy database, you’ll notice a couple of problems that crop up frequently in real databases. First, many of the spies have multiple talents (my personal favorite is explosives and flower arranging). It will be difficult to write a query that finds an agent with a flower arranging skill, because the only agent with that skill also has explosives listed in the same field. (You know, flower arranging can be a deadly art in the hands of a master practitioner…) There are other problems. The description and location fields tend to be closely related to the assignment field. That makes sense, because it is supposed to be a description of the assignment, and each assignment has only one location. However, there are some inconsistencies. Does Operation Dancing Elephant take place in a circus or a zoo? Because the description of the assignment was typed in two different places in the database, there is conflicting information about the operation. Likewise, Operation Enduring Angst might be in Lower Volta, or it might be in Lower Votla. Although these examples are deliberately outlandish, the problems they point out are real. Many databases have variations of these same weaknesses. The answer to better-behaved databases is a practice called data normalization.

Improving Your Data with Normalization

Data normalization can (and does) take up entire books, but it can be summarized by a list of simple rules:

  • Break your data into multiple tables

  • No field can have a list of entries

  • Do not duplicate data

  • Each table describes only one entity

  • Each table has a single primary key field

As an example of data normalization in action, I built one more version of the spy database. It retains the ability to return all the data needed, but it avoids some of the pitfalls of the single-table database.

First, take a look at the improved version of the Agents table, featured in Figure 11.28.

Figure 11.28. The Agents table is quite a bit simpler than it was before.


You might be surprised how little information remains in the Agents table. The Assignment, Specialty, Description, and Location fields have totally disappeared from the table. (Don’t worry, they’ll reappear shortly.) The only remaining fields are AgentID, CodeName, and AssignmentID. The Assignment ID field contains only numeric values. The number in the AssignmentID field is used to look up a record in another table, illustrated in Figure 11.29.

Figure 11.29. The Assignments table describes all the information related to a specific operation.


I built the Assignments table by taking a careful look at the data in my original expanded database. In a properly normalized database, all of the information in a table describes one type of entity. On closer examination Figure 11.27 (the bad spy table) has fields that describe two different kinds of information. The AgentID, CodeName, and Specialty fields describe an Agent, but the Assignment, Description, and Location fields refer to the Operation the agent is assigned to. (Specialties are an entirely different problem, and I’ll describe them later on.) The Agents table still needs a way to determine which operation an agent is on, but it is redundant to describe each operation’s details for each agent on the assignment. For the purpose of this example, each operation has one name, one assignment, and one location. The information that pertains to the Assignment is placed in the separate Assignments table. The Assignments table also has a primary key, so each assignment has a unique key. The Agents table has an Assign mentID field, which contains only a reference to the key field of the Assignments table.

A field that contains the primary key of another table is called a foreign key reference in most database applications.

Note that the AssignmentID field appears in both the Agents table and the Assignments table. It isn’t necessary to give these two fields the same name, but it makes the next step easier.

Using a Join to Connect Two Tables

SQL Server (even the simplified version included with Visual Studio) includes a visual tool to help connect two tables. Each Database in the SQL Server list in the server explorer has a Data Diagrams option. Right-click on the Data Diagrams element and choose New Data Diagram in the same way you created new tables. You see a screen that lets you add tables. Choose both the Agents and Assignments tables. You see a graphic representation of the tables that looks like Figure 11.30.

Figure 11.30. The data diagram tool shows the tables in your database.


It’s important to indicate there is a relationship between the AssignmentID fields in the two tables. This is easily done in the data diagram window by simply dragging the mouse from the box to the left of AssignmentID in Agents, and the similar box of AssignmentID in the Assignments table. The dialog shown in Figure 11.31 appears.

Figure 11.31. You can leave the default values of the Create Relationship dialog.


The Create Relationship Dialog helps you to define the relationship between the two tables. Relationships always involve the primary key of one table and a foreign key in another table. The Create Relationship dialog usually guesses correctly which table has the indicated key as its primary key.

The two fields must have the same type, or the relationship will not be established.

After the relationship has been created, the data diagram looks something like Figure 11.32.

Figure 11.32. The solid line indicates the relationship between the Agents and Assignments tables.


Creating a View

The data has been normalized, which provides some important advantages, but the user really doesn’t care. The user doesn’t want to have to look up which assignment is assignment number 2, for example. Data management systems have a special entity called the View which enables you to generate a “virtual table” that recognizes the relationship you’ve just created. You can create a view in the server explorer just like you created a table and a data view. When you right-click on the Views item and choose Create New View, you get the opportunity to include any data tables. If you choose both tables you see a screen similar to Figure 11.33.

Figure 11.33. The view editor features a form of the data diagram. The relationship has been preserved.


The view editor has four main sections. The top section is a version of the data diagram. The next section is a Query By Example (QBE) tool. The third band displays an SQL statement, and the bottom layer shows the results of the current query.

You might wonder why the view editor has all these query tools. It’s because a view and a query are essentially the same thing. It’s called a query when it’s created on the fly in an application or by a user. A view is usually stored with the database. Often a data developer creates a number of views to reconnect any tables that have been separated by the normalization process, and any other queries that are likely to be extremely common.

The best way to understand the view editor is to look at an example. Figure 11.34 illustrates my completed Agent_Assignment View.

Figure 11.34. As you can see from the bottom of the screen, this view recombines the Agents and Assignments tables.


A view or query can be used to recombine tables. I created this view by using a tool called Query By Example (QBE). I drug all the fields I was interested in displaying from the data diagram at the top of the screen to the grid in the middle of the screen. When I did so, the center grid automatically filled with the values you see. The grid describes which fields I want to use in the query. For each field, I can assign an alias. The alias describes the name of the field in the resulting virtual table. If you do not provide an alias, the original field name is retained. You can choose whether each field is displayed. The Sort Type and Sort Order values are used to determine how the resulting data is sorted. The Criterion value enables you to set a WHERE clause for the view.

To see the results of the view, click on the exclamation point icon at the upperleft of the view editor, or choose Run from the Query menu.

The view editor is a great way to learn SQL syntax. It’s pretty easy to experiment by dragging fields to the grid and examining the resulting SQL statement. Don’t forget to periodically run the query so you can be sure that the data view you see in the bottom grid is related to the currently displayed SQL statement.

Take a careful look at the SQL statement in Figure 11.34. It can be reformatted as follows for clarity:

SELECT Agents.CodeName,
                Assignment.Name,
                Assignment.Description,
                Assignment.Location
FROM Agents INNER JOIN Assignments
ON Agents.AssignmentID = Assignments.AssignmentID

The only part of the SELECT statement that is really new is the FROM clause. The relationship between the Agents and Assignments tables is automatically represented by an INNER JOIN statement. The INNER JOIN tells the computer to display fields from the Assignment table only when the AssignmentID fields in the two tables match.

You can create SQL queries with inner joins inside your code as well as in views. The advantage of a view is the way it appears as a virtual table to the program

Referring to a View in a Program

Once you have created a view in the database, it’s very simple to add that view to your programs. Simply drag the view to your form to build a data connection and data adapter for the view, then create a data set from the adapter and connect your data grid to the data set as usual.

Figure 11.35 shows a database that features a view.

Figure 11.35. Once you’ve created a view, you can attach a grid to the view as if it were a table.


Views give you the best of both worlds. You can design your data to improve its integrity, but the user will see the data without any cryptic foreign key values.

Incorporating the Agent Specialty Attribute

The Agent Specialty field creates another problem for the data developer. The rules of normalization indicate you should never have a list of data in a field. Each agent could have a number of specialties. If you enable the user to enter all the specialties into a single text box, you have the same ambiguity problems you prevented with the Assignments table. Also, it is hard to predict how much room to allocate for skills if the skills will be a list.

The relationship between agents and specialties is called a many-to-many relationship, because each spy could have many specialties, and each specialty could belong to many different agents.

Searching for a spy with particular skills is a challenge if the skills are simply a list of text values in a field. The solution is to use two tables to manage the relationship between the Agent and his or her specialties. First, take a look at the Specialties table featured in Figure 11.36.

Figure 11.36. The Specialties table simply lists all the various specialties.


The Specialties table consists of a primary key and specialty name fields. However, because each agent can conceivably have many specialties, you cannot use the same type of join that connected agents with assignments. Instead, I added another table, displayed in Figure 11.37.

Figure 11.37. The Agent_Specialty table serves as a bridge between the Agents table and the Specialties Table.


The Agent_Specialty table is very interesting because the user will never see it. Instead, this table has joins to both the Agents table and the Specialties table. The Agent_Specialty.AgentID field is a foreign key reference to the Agents table, and the Agent_Specialty.SpecialtyID field is a foreign key reference to the Specialties table. The complete data diagram of the spy database including all four of its tables and their relationships is shown in Figure 11.38.

Figure 11.38. The Agents table connects directly with the Assignments table, but uses the Agents_Specialty table as a bridge to the Specialty table.


An intermediate table, such as Agents_Specialty, is frequently used to implement many-to-many relationships.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
18.191.93.12