Search in book...
Toggle Font Controls
Create new playlist

Name your new playlist

Playlist description (optional)
Sign In

Email address

Password

Forgot Password?

or

Continue with Facebook

Continue with Google
Sign Up

Full Name

Email address

Confirm Email Address

Password

or

Continue with Facebook

Continue with Google

27
Building Databases with SQL Scripts

The previous chapter provided an introduction to using SQL to create and manage databases. That chapter also hinted at techniques for using SQL scripts to make database maintenance easier.

This chapter goes a little further. It discusses some of the details that you need to take into account when you use scripts to manage a database.

In this chapter, you learn how to:

Know when scripts can be useful.
Build tables in a valid order.
Insert data into tables in a valid order.
Drop tables in a valid order.

WHY BOTHER WITH SCRIPTS?

SQL statements let you create, populate, modify, and delete the tables in a database. In many database products, SQL statements even let you create and destroy the database itself. For example, MySQL's CREATE DATABASE and DROP DATABASE statements create and destroy databases.

If you put these SQL commands in a script, you can rerun that script whenever it's necessary. You can easily rebuild the database if it gets corrupted, make copies of the database on other computers, fill the tables with data to use when running tests, and reinitialize the data after the tests are finished.

Being able to reinitialize the data to a known state can also be very helpful in tracking down bugs. It's extremely hard to find a bug if it just pops up occasionally and then disappears again. If you can reinitialize the database and then make a bug happen by following a series of predictable steps, it's much easier to find and fix the problem.

SCRIPT CATEGORIES

Scripts that are useful for managing databases fall into at least four categories, described in the following subsections.

Database Creation Scripts

Database creation scripts build the database's structure. They build the tables, primary keys, indexes, foreign key constraints, field and table check constraints, and all of the other structure that doesn't change as the data is modified.

Basic Initialization Scripts

Basic initialization scripts initialize basic data that is needed before the database can be useful. This includes system parameter tables, lookup tables, and other tables that hold data that changes only rarely when the database is in use.

For example, you might use one of these scripts to initialize a list of allowed states or regions, physical constants (the speed of light, Avogadro's number, Finagle's Variable Constant), or define data type conversion constants (how many centimeters in an inch, how many millimeters in an attoparsec, how many seconds in a microfortnight).

Data Initialization Scripts

These scripts place data in tables. These range from small scripts that initialize a few values to huge monster scripts that insert thousands of records into the database.

Often, it's useful to have a separate subcategory for test scripts that fill the tables with data for use in specific tests. You would run a script to prepare the data, and then run the test. If the test can be executed by SQL statements, the script might perform the test too. Sometimes, it may be useful to have a separate test initialization script for every use case defined by your requirements documents.

It's also often useful to have separate scripts to initialize different parts of the database. For example, you might have a script that creates users, another that creates orders, and a third that creates invoice and payment data.

You can build scripts that invoke smaller scripts to perform larger tasks. For example, you might make a test initialization script that calls the standard user initialization script, and then inserts or updates specific records in other tables to prepare for the test that you are about to perform.

For example, the following MySQL script invokes three others. It creates a database, selects it, and then calls three other scripts that create a table, insert some data, and select the data. It then drops the database.

CREATE DATABASE MultipleScripts;
 
USE MultipleScripts;
 
SOURCE C:RodDB DesignMultiScript1.sql
SOURCE C:RodDB DesignMultiScript2.sql
SOURCE C:RodDB DesignMultiScript3.sql
 
DROP DATABASE MultipleScripts;

It may not always be necessary to break the database scripts into little pieces, but on some projects it's even useful to have two separate scripts to create and initialize each table. Then if you change a table, it's easy to find the creation and initialization code for it. Higher-level scripts can then call those scripts to build and initialize the database.

Similarly, if you have a bug in one table (perhaps it's missing a field or foreign key constraint), you can fix its creation code and rerun the scripts.

Cleanup Scripts

Often it's easier to simply drop the database and re-create it than it is to clean up the mess left by a test, but occasionally it's useful to truncate or drop only some of the tables. For example, if the database contains millions of records, it may be easier and faster to repair changes made by tests than to rebuild the whole thing from scratch.

It's not always easy to undo changes made by a complex series of tests, particularly if you later make changes to the tests. In fact, it's often hard to even tell if you've successfully undone the changes. For those reasons, I usually prefer to rebuild the database from scratch when possible.

Saving Scripts

Just as any other piece of software does, scripts change during development, testing, and use. Also as is the case for other types of software, it's often useful to look back at previous versions of scripts. To ensure that those versions are available, always keep the old versions of scripts. Later if you discover a problem, you can compare the current and older versions to see what's changed.

One way to keep old scripts is to use version control software. Programs such as Git (see https://github.com), CVS (Concurrent Versions System, see www.nongnu.org/cvs) and Apache Subversion (abbreviated SVN and found at https://subversion.apache.org) keep track of different versions of files. You can store your scripts in one of those systems, and then update the files whenever you create a new version. Then you can always go back and see the older versions if you have a reason. (Search online for version control software to see many other tools.)

If you don't feel like using a formal version control system, you can invent your own in one of several ways. For example, you can put a version number in the script filenames. You might make a script named MakeUsers.sql that fills the Users table. The file MakeUsers.sql would always contain the most current version and MakeUsers001.sql, MakeUsers002.sql, and so forth would contain older versions.

Another approach is to email scripts to yourself when you revise them. Later, you can search through the emails sorted by date to see the older versions. To keep your normal email account uncluttered so you can easily find those offers for mortgage debt elimination, jobs as a rebate processor, and pleas for help in getting $10 million out of Nigeria (you get to keep $3 million for your trouble), create a separate email account to hold the scripts.

ORDERING SQL COMMANDS

One issue that you should consider when building scripts is that some commands must be executed in a particular order. For example, suppose the Races table for your cheese-rolling database (search online or see www.cheeseprofessor.com/blog/british-cheese-rolling) has a WinnerId field that refers to the Racers table's RacerId field as a foreign key. In that case, you must create the Racers table before you create the Races table. Clearly, the Races table cannot refer to a field in a table that doesn't yet exist.

Usually, you can create the tables in some order so none refers to another table that doesn't yet exist. (Tip: build lookup tables first.) If for some bizarre reason there is no such ordering, you can use an ALTER TABLE ADD FOREIGN KEY statement (or a similar statement in whatever version of SQL you are using) to create the foreign key constraints after you build all of the tables.

You may also be able to tell the database to turn off constraint checking while you build the tables. For example, the following MySQL script builds three tables that are mutually dependent. TableA refers to TableB, TableB refers to TableC, and TableC refers to TableA.

 SET @OLD_UNIQUE_CHECKS=@@UNIQUE_CHECKS, UNIQUE_CHECKS=0;
 SET @OLD_FOREIGN_KEY_CHECKS=@@FOREIGN_KEY_CHECKS, FOREIGN_KEY_CHECKS=0;
 SET @OLD_SQL_MODE=@@SQL_MODE, SQL_MODE='TRADITIONAL';
 
 CREATE DATABASE CycleDb;
 USE CycleDb;
 
 CREATE TABLE TableC (
   CType    VARCHAR(10)    NOT NULL,
   AType    VARCHAR(10)    NULL,
 
   PRIMARY KEY (CType),
 
   INDEX FK_CrefA (AType ASC),
 
   CONSTRAINT FK_CrefA
     FOREIGN KEY (AType)
     REFERENCES TableA (AType)
     ON DELETE NO ACTION
     ON UPDATE NO ACTION
 );
 
 CREATE TABLE TableB (
   BType    VARCHAR(10)    NOT NULL,
   CType    VARCHAR(10)    NULL,
 
   PRIMARY KEY (BType),
 
   INDEX FK_BrefC (CType ASC),
 
   CONSTRAINT FK_BrefC
     FOREIGN KEY (CType)
     REFERENCES TableC (CType)
     ON DELETE NO ACTION
     ON UPDATE NO ACTION
 );
 
 CREATE TABLE TableA (
   AType    VARCHAR(10)    NOT NULL,
   BType    VARCHAR(10)    NULL,
 
   PRIMARY KEY (AType),
 
   INDEX FK_ArefB (BType ASC),
 
   CONSTRAINT FK_ArefB
     FOREIGN KEY (BType)
     REFERENCES TableB (BType)
     ON DELETE NO ACTION
     ON UPDATE NO ACTION
 );
 
 INSERT INTO TableA VALUES("A value", "B value");
 INSERT INTO TableB VALUES("B value", "C value");
 INSERT INTO TableC VALUES("C value", "A value");
 
 SET SQL_MODE=@OLD_SQL_MODE;
 SET FOREIGN_KEY_CHECKS=@OLD_FOREIGN_KEY_CHECKS;
 SET UNIQUE_CHECKS=@OLD_UNIQUE_CHECKS;
 
 SELECT * FROM TableA;
 SELECT * FROM TableB;
 SELECT * FROM TableC;
 
 DROP DATABASE CycleDb;

The first three statements tell the database not to check for unique key values, not to check foreign key constraints, and to use traditional SQL behavior (in short, give errors instead of warnings).

The script then creates the three interdependent tables and inserts some values in them. It then restores the original values for the SQL mode, foreign key checking, and unique key checking.

The script finishes by performing some queries and dropping the database.

Just as you may need to create tables in a particular order, you may need to insert values into the tables in a particular order. For example, you'll need to create the Racers record for Abby Lampe before you create the Races record for the 2022 women's race because Abby won that race.

If you cannot find a legal ordering for the INSERT statements, you may be able to disable the database's checks just as you can while creating tables. The preceding script inserts records that depend on each other, so there is no valid way to enter those values without disabling the error checking.

Finally, if you delete records or entire tables, you may need to do so in a particular order. After you've built your cheese-rolling database, you cannot remove Abby Lampe's record from the Racers table before you remove the Races record that refers to her.

ORDERING TABLES

I haven't actually seen a real database with mutually dependent tables so I have always been able to put them in a valid order. To hone your ordering skills, make a list showing a valid order for creating the tables in the design shown in Figure 27.1:

FIGURE 27.1

A list where some of the objects must come before others but not all of the relationships are known is called a partial ordering. Creating a full ordering that satisfies a partial ordering is called extending the partial ordering.

Define a partial ordering by making a list of the tables. Next to each table, make a predecessor list showing all the other tables that must be defined before that one.
Use this list to build an output list giving the tables in a valid order.
For this example, you might be able to intuit a valid order, but it's a complicated design so it could be tricky. Fortunately, there's a relatively simple algorithm that lets you extend partial orderings.
1. Look through the list you created in step 1 and find any tables that have no predecessors.
2. Put those tables in the output list.
3. Remove (or cross out) those tables from any other table's predecessor list.
4. Remove (or cross out) the outputted tables' rows from the list.
5. Repeat this process until every table is in the output list or you find a group of mutually dependent tables.

How It Works

Define a partial ordering by making a list of the tables.

If two tables are related by a one-to-many relationship, the table on the “many” side of the relationship depends on the table on the “one” side. For example, a CourseOfferings record refers to a Locations record (each course offering occurs at some location), so you must create the Locations table before you can create the CourseOfferings table.

If two tables are related by a one-to-one relationship, you need to think a bit harder about which depends on the other. Normally, such a relationship involves the primary key of only one of the tables. In that case, the non-primary key table depends on the primary key table. For example, the Addresses and Persons tables in Figure 27.1 have a one-to-one relationship. The relationship connects the Addresses table's primary key AddressId with the Persons table's non-primary key AddressId field, so the Persons table (non-primary key) depends on the Addresses table (primary key).

Another way to think of that is to ask, “Which table provides detail for the other table?” In this case, you would normally look up a Persons record, and then search for that person's address. You probably wouldn't look up an address, and then search for the person who lives there.

Table 27.1 shows the database's tables and their predecessors.

TABLE 27.1: Initial predecessors

TABLE	PREDECESSORS
Addresses	CityStateZips, States
AnimalTypes
CityStateZips
CourseOfferingDates	CourseOfferings
CourseOfferings	Locations, Courses, Employees
Courses	AnimalTypes
CustomerCourses	CourseOfferings, Customers
Customers	Persons
Employees	Persons
EmployeeSpecialties	Specialties, Employees
InventoryItems	StockLocations, ShelfLocations, Vendors
Locations
OrderItems	Orders, InventoryItems
Orders	Customers, Employees, Addresses
Persons	Addresses
Pets	Customers, AnimalTypes
Phones	PhoneTypes, Persons
PhoneTypes
ShelfLocations
Shifts	Employees
Specialties
States
StockLocations
TimeEntries	Employees
Vendors	Persons

(Give yourself bonus points if you cringed a bit at this table and said to yourself, “Hey, that's not in 1NF because the Predecessors column doesn't hold a single value!”)

Use this list to build an output list giving the tables in a valid order.

During the first pass through this list, the AnimalTypes, CityStateZips, Locations, PhoneTypes, ShelfLocations, Specialties, States, and StockLocations tables have no predecessors. You can immediately output those tables (so you can build them first in the database creation script) and remove them from the list. (You may have been able to guess that just by looking at Figure 27.1 because these are all lookup tables. Lookup tables generally don't rely on other tables.)

Table 27.2 shows the revised list with the lookup tables removed.

TABLE 27.2: Table predecessors after one round

TABLE	PREDECESSORS
Addresses
CourseOfferingDates	CourseOfferings
CourseOfferings	Courses, Employees
Courses
CustomerCourses	CourseOfferings, Customers
Customers	Persons
Employees	Persons
EmployeeSpecialties	Employees
InventoryItems	Vendors
OrderItems	Orders, InventoryItems
Orders	Customers, Employees, Addresses
Persons	Addresses
Pets	Customers
Phones	Persons
Shifts	Employees
TimeEntries	Employees
Vendors	Persons

After removing the lookup tables, the revised list has two tables with no predecessors: Addresses and Courses. Output them and build them next in the database creation script. Table 27.3 shows the revised list.

TABLE 27.3: Table predecessors after two rounds

TABLE	PREDECESSORS
CourseOfferingDates	CourseOfferings
CourseOfferings	Employees
CustomerCourses	CourseOfferings, Customers
Customers	Persons
Employees	Persons
EmployeeSpecialties	Employees
InventoryItems	Vendors
OrderItems	Orders, InventoryItems
Orders	Customers, Employees
Persons
Pets	Customers
Phones	Persons
Shifts	Employees
TimeEntries	Employees
Vendors	Persons

Now the list has only one table with no predecessors, Persons, and you might wonder if we'll soon reach a state where we're stuck with no tables without predecessors. Cross your fingers, output the Persons table, and update the list so it looks like Table 27.4.

TABLE 27.4: Table predecessors after three rounds

TABLE	PREDECESSORS
CourseOfferingDates	CourseOfferings
CourseOfferings	Employees
CustomerCourses	CourseOfferings, Customers
Customers
Employees
EmployeeSpecialties	Employees
InventoryItems	Vendors
OrderItems	Orders, InventoryItems
Orders	Customers, Employees
Pets	Customers
Phones
Shifts	Employees
TimeEntries	Employees
Vendors

You can breathe a sigh of relief when you see that the new list contains several tables without predecessors: Customers, Employees, Phones, and Vendors. Output them and update the list to get Table 27.5.

TABLE 27.5: Table predecessors after four rounds

TABLE	PREDECESSORS
CourseOfferingDates	CourseOfferings
CourseOfferings
CustomerCourses	CourseOfferings
EmployeeSpecialties
InventoryItems
OrderItems	Orders, InventoryItems
Orders
Pets
Shifts
TimeEntries

At this point, most of the tables have no predecessors, so we may be home free. Output the CourseOfferings, EmployeeSpecialties, Orders, Pets, Shifts, and TimeEntries tables and update the table to get the list in Table 27.6.

TABLE 27.6: Table predecessors after five rounds

TABLE	PREDECESSORS
CourseOfferingDates
CustomerCourses
OrderItems

Now, the three remaining tables have no predecessors: CourseOfferingDates, CustomerCourses, and OrderItems. You can build those tables last when you create the database.

A complete ordering of the tables is: AnimalTypes, CityStateZips, Locations, PhoneTypes, ShelfLocations, Specialties, States, StockLocations, Addresses, Courses, Persons, Customers, Employees, Phones, Vendors, CourseOfferings, EmployeeSpecialties, InventoryItems, Orders, Pets, Shifts, TimeEntries, CourseOfferingDates, CustomerCourses, OrderItems.

Note that this is not the only possible complete ordering for these tables. Each time a group of tables had no predecessors, you could have created them in any order.

In some cases you can also mix tables from different “rounds.” For example, in Table 27.1, the AnimalTypes table has no predecessors and the Courses table depends only on that table. After you output the AnimalTypes table, you can output the Courses table without waiting for all of the other lookup tables.

SUMMARY

SQL scripts can make building and maintaining a database much easier than working manually with database tools such as MySQL or Access. They are particularly useful for repeatedly performing tasks such as initializing the database before performing a test.

This chapter explained:

Why scripts are useful
Different categories of useful scripts such as database creation, basic initialization, data initialization, and cleanup
How to save different versions of scripts
How to create tables, insert data, remove data, and delete tables in a valid order

Scripts are useful for maintaining databases. The following chapter discusses some of the typical maintenance chores that you should perform to keep a database in working order. Before you move on to Chapter 28, however, use the following exercises to test your understanding of the material covered in this chapter. You can find the solutions to these exercises in Appendix A.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.

Table of Contents for 27 Building Databases with SQL Scripts

Create new playlist

Sign In

Sign Up