Multivalue Fields

Suppose you're a company that sells food items, like Northwind. You have a Products table with all the items you sell, with one row for each item: ice cream, yogurt, nondairy creamer, and so on. Among the columns, you include a field to store all the flavors for that item: vanilla, blueberry, Irish crème (see Table 2.14).

Table 2.14. Tables That Include Multivalue Fields
Products
ProductSupplierFlavors<<other fields>>
Ice creamFrozen DelightsVanilla, Chocolate, Strawberry 
YogurtHeavenly CulturesVanilla, Blueberry, Asparagus, Strawberry 
Nondairy CreamerFoodstuffs Inc.Vanilla, Asparagus, Hazelnut 

Authors
Last NameFirst NameMajor Works<<other fields>>
DostoevskyFyodorThe Brothers Karamazov, Crime and Punishment, The Idiot, The Gambler 
TurgenevIvanFathers and Sons 
TolstoyLeoWar and Peace, Anna Karenina 

Or consider a table of Russian authors. Each row is dedicated to one author. You have fields for first name, last name, and birthday. You also have a column for the author's major works (see Table 2.14).

The Flavors field in the Products table and the Major Works field in the Authors table are known as multivalue fields. These are fields that store more than one value in each row. They're usually easy to identify because, as in the examples, they use commas to separate the various values. The fields are also usually named in the plural—flavors, characters, works, and so on.

I think multivalue fields are among the most difficult database problems to handle, both conceptually and practically. At first glance, the Flavors field in the Products table and the Major Works field in the Authors table both seem like fine, even elegant, solutions to a common problem. If you know that the number of values for a single characteristic—product flavors, authors' major works, and so on—will vary among records, why try to predict the number of columns you need? Instead of using three—or six, or eight, or who knows how many—columns to store values, wouldn't the best solution be to store all values in a single field and separate them by commas?

But you are designing your database so you can quickly and efficiently retrieve the data you need. Suppose you decide to discontinue an experimental flavor—say, asparagus—from all your products. If Asparagus is always a discrete record, it's easy to delete all occurrences of Asparagus from your database. If Asparagus is but one value in a multivalue field, however, you need to edit each of those records. That task is more difficult and, equally important, more likely to result in error. The threat to data integrity is substantial.

Or suppose you want to retrieve all records on Dostoevsky's Crime and Punishment. Surrounded as it is between The Brothers Karamazov and The Idiot, it will be inefficient to find information on that one book. Or say you want to sort all the books by title. A multivalue field will make it difficult to put all the books in alphabetical order.

Put simply, multivalue fields are difficult to edit, sort, and delete. They violate the essence of good relational database design. You need to get rid of them.

Multivalue Fields in the Classic TV Database

Earlier you saw that the Programs table had a multipart field, Character Names/Character Occupations. To resolve this multipart field, you created separate Character Names and Character Occupations fields (see Table 2.15).

Table 2.15. Programs
Multivalue Fields
Program NameCharacter NamesCharacter Occupations<<other fields>>
The Andy Griffith ShowAndy Taylor, Barney Fife, Gomer PyleSheriff, Deputy Sheriff, Gas Station Attendant 
Happy DaysRichie Cunningham, Arthur FonzarelliStudent, Motorcyclist 
BewitchedSamantha Stevens, Darrin StevensWitch, Ad Executive 
The Lucy ShowLucy Carmichael, Vivian BagleyWidow, Divorcee 
Married with ChildrenAl Bundy, Peg BundyShoe Salesman, Homemaker 

Splitting the field into two columns resolved the problem of having two different types of data in the same field. But the Character Names and Character Occupations remain multivalue fields. To repeat, they have the two telltale signs of multivalue fields: The values are separated by commas, and the field name is plural.

Resolving Multivalue Fields

So how do you fix these multivalue fields?

First, take both the Character Names and Character Occupations fields out of the Programs table and put them in a new table called Roles. Second, take the primary key from the Programs table and put it in the Roles table as well. Finally, let's split the new Character Name (now singular) field into Last Name and First Name fields to eliminate the multipart field. A separate field is assigned to the character occupation (see Table 2.16).

Table 2.16. Roles
Multivalue Fields Resolved
ProgramIDLast NameFirst NameOccupation
1 (Andy Griffith Show)TaylorAndySheriff
1 (Andy Griffith Show)FifeBarneyDeputy Sheriff
1 (Andy Griffith Show)PyleGomerFilling station attendant
2 (Happy Days)CunninghamRichieStudent
2 (Happy Days)FonzarelliArthurMotorcyclist

NOTE Next to the ProgramID, I've included in italics the name of the TV show the ID refers to. These names are not part of the table. I've (reluctantly) included them only so you can see the program the ID refers to.


This solution seems counterintuitive; it provides a complex, even convoluted, solution to what seems to be a simple problem.

But using keys to resolve multivalue fields gives you a small taste of how you are going to integrate data in your database. As you'll see, putting the primary key of the original table into the new table enables you to establish a relationship between the original and new tables. By using primary keys to represent the values in each record, you make it possible to bring together values from different tables and integrate the information in your database.

Q&A

Q1:Wait a minute: The ProgramID field has a lot of duplicates. Didn't you tell us that a primary key can't have duplicate values?
A1: ProgramID is the primary key of the Programs table; it's not the primary key in the Roles table. You haven't assigned a primary key to the Roles table yet; you do so in the next chapter.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
18.119.248.149