3.1. A Quick Database Primer

As you work with computers, you quickly realize that everything you do on a computer deals with data in one sense or another. Whether you are creating a Word document or crunching numbers with Excel, it is all data. However, not all data belongs in a database, and not all programs are meant to be used as databases, although if you look at some people's documents and worksheets, you may wonder if they are trying to use them as databases. This section explains a few things about databases and shows how you use real-world databases everyday.

If you have been using Microsoft Office products for a while, you have probably had some experience with or at least heard of databases. In fact, even if you haven't used databases on the computer, you have used them in real life.

3.1.1. Databases in the Real World

In the real world there are a never ending number of tasks and subjects that work as an example of databases. Every day from the time you get up until the time you go to bed, you are dealing with databases of one kind or another. Here are just a few examples of real-world data:

  • Mailing lists

  • School registrations

  • Checking account information and history

  • Membership lists

  • Customer information

And the list goes on and on. While some of these items look like simple topics in themselves, undoubtedly each topic could be fleshed out with additional data so that more than one topic, (what are called tables in database jargon) would be necessary.

The last entry in the preceding list is a common example of a real-world database and is worth discussing further. Customer information is stored as business records in manila folders, usually located in a filing cabinet. In the manila folders customer information is stored, with either:

  • One customer's information stored in each folder

  • All customer information sheets in one folder

Electronic databases can be analogous to either of these methods, which have been used for years in the real world.

In accessing the real-world customer database, you:

  1. Open the file cabinet

  2. Search through the cabinet for the folder you are looking for

  3. Pull out the folder

  4. Look through the folder for the information for which you are searching

  5. Take the piece, or pieces of paper, containing the information

  6. Read the data on the page

  7. Modify the data as necessary

At this point, you also could also add new information by filling out a new form or delete information by throwing away information. Of course, nowadays you would most likely shred the information for security reasons.

It should be noted that the following terms are generic as far as the various database systems are concerned. These terms will be discussed in greater details in the next section.

NOTE

Tables are used to store data in databases. Fields (columns) are used to store individual pieces of data such as customer name, address, and so on. The information supplied in all the fields makes up a record (row) in the table. So in this instance, all customer information such as name, address, city, state, and so on makes up a record (row). These terms are used interchangeably when discussing various database products such as Microsoft Access (fields, records) and Microsoft SQL Server (columns, rows).

3.1.2. Database Models

Various models of databases exist, two of which are flat file and relational databases. Nowadays, the relational model of databases is the most commonly used model for desktop and Web development. However, before going deeper into the relational database model, you'll read about the flat file model, including how these databases store data and their drawbacks.

3.1.2.1. Flat File Model Databases

Flat file model databases store information in single tables, including repeated data. For example, if a store were selling different kinds of coffee and wanted to track customers, invoices, invoice items, and suppliers, it would organize information as shown in Figure 3-1, which is a flat file style table.

Figure 3.1. Figure 3-1

NOTE

If you look at this figure closely, you may notice that it looks as if it was created in Excel, which it was. A lot of developers and users store data in Excel spreadsheets, thereby creating flat file tables and databases without realizing it.

There are a number of problems with the flat file database model. Here are just a few of the issues:

  • Redundant Data: Often Entries are repeated, taking up more space than necessary. In Figure 3-2, there is no reason to spell out the names each time, taking up more space.

  • Error Prone: When data has to be repeated, there is more of a chance of entering erroneous data into the table.

  • Limited Columns: Currently, only two products and their information can be entered using the table structure displayed.

  • Extra Work to Update: With the redundant data issue, if you want to make any updates, you will have to make sure that you parse through the other fields and update those values to match.

In addition to these issues, reporting on the data can be problematic. Now take a look at the relational database model.

3.1.3. Relational Database Model

Unlike the flat file database model, which stores all data, including related data, in a single record and table, the relational database model uses tables that are related to each other to store information. For example, instead of having your coffee invoices all stored in a single table called tblCoffeeInvoices, the information would be stored in related tables with customer information being stored in one table, invoice information being stored in another, product information in yet another, and so on. Figure 3-2 shows an example of how the flat file table in the previous section could be restructured into a relational model.

Figure 3.2. Figure 3-2

This figure was taken from Microsoft Access and is a shot of the tables in a database called Coffee.mdb, located in the Chapter 3 folder. The figure shows the structure with the data from the single table separated out into multiple tables. The process of normalizing a database is discussed later in the chapter, in the section titled "Normalizing Your Data."

Take a look at some of the benefits of using relational databases. They are pretty much the opposite of the issues found in flat file databases:

  • Nonredundant Data: Because entries are entered once, and other tables point to the data, there is less redundant data.

  • Less Error Prone: When data is entered once in lookup tables, data is then picked from lists; this process can be controlled and is less prone to error.

  • Unlimited Data: Because data is stored in records (down the table) as opposed to fields (across the table), the data is not limited to predefined structures. For example, when you want add another product to an invoice, you simply add another record to tblInvoiceDetails, where before in the flat file you would have had to add a third or fourth product column.

It may look like a lot more work to maintain a relational database when you look at Figure 3-3 because of the multiple tables, but you very quickly learn to appreciate the benefits of the relational database despite the extra work needed in the beginning. Next, you'll read about the elements that make up relational databases.

3.1.3.1. Tables: Where Data is Stored

As mentioned in a note earlier in the chapter, tables are where your data is stored. Tables have specific component: fields, primary keys, and indexes.

3.1.3.1.1. Fields

When created, table structures consist of fields that represent pieces of data. Fields have properties that give you control over the data that goes into them. Here are a few of those properties common to different database systems such as Access and SQL Server:

  • Name: Field names are what you will refer to when you want to pull information from the field or assign data to the field. You will want to assign names that are easy to understand. For example, in the tblCustomer table the two fields displayed in Figure 3-3 these would be CustomerID and CustomerName.

  • Data Types: Data types tell the database system how to handle the data placed in the field. Which types of data types there are will depend on the specific database system. Microsoft Access calls text data under 255 characters the Text datatype; in SQL Server this would be nvarchar(255).

  • Other Properties: There are a number of other properties that help control data going into the fields, and those properties will again depend on which database system you are using. Some properties, such as Default Value, are used by most systems, but then some, such as the Caption property, are used by Access but not SQL Server.

You can see an example of the table structure for tblCustomers listed in Figure 3-3 with the Customer field highlighted. The table structure is displayed in Microsoft Access.

Figure 3.3. Figure 3-3

3.1.3.1.2. Primary and Foreign Key Fields

Notice the callout for the Primary Key field in Figure 3-4. Each table should have a primary key. In the case of tblCustomers, the primary key is the field CustomerID. The primary key makes sure that each record in a table is unique, and it provides the ability to always find a specific record. How primary keys are specified will again depend on the database system you are using.

Foreign key fields are fields in a table that point to primary key fields in other tables. For example, you will see CustomerID in tblInvoice, which is used to match the primary key field CustomerID in tblCustomers. Primary and foreign key fields are especially important in the use of relations.

3.1.3.2. It's All about Relations

Relationships are how you tie (relate) data together using separate tables. Figure 3-4 is a repeat of 3-2 because it actually shows the relationships window in Access, displaying the relationships for Coffee.mdb.

Figure 3.4. Figure 3-4

There are three types of relationships found in relational databases. Because coffee.mdb mainly uses one type of relationship, other examples are listed outside that database:

  • One-to-One Relationship: Used when you want to have records in one table match up with individual records in another table based on the same primary key in each table. An example of this in a banking database is a table that stores private information that would match up directly with a table that stores information that can be viewed by anyone. This is probably the least used type of relationship, because the use of queries (Access) and views (SQL Server) can limit the data you can access in tables.

  • One-to-Many Relationship: This type of relation is used to relate a table such as tblCustomer (a customer) with one such as tblInvoices (the customer's invoices). The way you look at it is that one customer can have many invoices. Note that the primary key is in tblCustomer, and the foreign key in tblInvoices.

  • Many-to-Many Relationship: This is a pair of one-to-many relationships used with three tables. An example of this could be used in an insurance database. Insurance companies can have multiple customers, and customers can have multiple insurance companies.

Which type of relationship you use will depend on the need being met. You can use all three in the same database or just use one type of relationship throughout the database. It really comes down to the data.

3.1.3.3. Referential Integrity

One of the important aspects of relational databases is maintaining the referential integrity of the data. An example of maintaining referential integrity in the coffee database is that a record in the tblInvoices table can't be created without a related record existing in tblCustomers. Another example would be that a record in the tblProducts table could not be added if a record were already in the tblSuppliers table.

Depending on the database system, you can set referential integrity up to also help maintain data once it is in the database. For example, you can specify that a record can't be deleted in one table, such as tblCustomers, if there are records in tblInvoices related to it.

Another use for referential integrity with current data is to have records deleted in related tables, such as tblInvoices, when a record is deleted in the table that contains the primary key, in this case, tblCustomers.

3.1.3.4. Normalizing Your Data

Normalizing data is the steps taken to take non-normalized data (a flat file) and shape it into what is called normal (relational) form. Refer again to Figure 3-1.

Here are the steps and the tasks you for each step:

First normal form (1NF):

  1. Remove duplicate columns from the table.

  2. Create separate tables for each group of related data, identifying each row with a unique column or set of columns. This unique column or set of columns would be the primary key.

In the case of the table displayed in Figure 3-1, the Product1, Product2, and product-specific information is removed from the main table and broken out into separate rows.

Second normal form (2NF):

  1. Remove subsets of data that apply to multiple rows of a table and place them in separate tables.

  2. Create relationships between these new tables and their predecessors through the use of foreign keys.

In this case, you would remove the customer information and store it into a separate table, then create a relationship between the new customer table and the table containing the invoice information.

Third normal form (3NF):

  • Remove columns not dependent upon the primary key.

Invoice detail is broken out into separate tables at this point and each item is given its own ID, with a foreign key pointing to the invoice header record. At this point, the tables and relations would look as shown in Figure 3-5.

Fourth normal form (4NF):

  • A relation is in 4NF if it has no multivalued dependencies.

There are additional forms possible, depending on how far you want to take the normalization. The majority of databases are in used in third or fourth normal form.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
3.144.9.179