9. Schemas and Tables
As a complete data manipulation language, SQL contains tatements that let you create, modify, and delete structural elements in a database. In this chapter we will begin the discussion of a database's structural elements by looking at schemas and the permanent base tables that you create within them. This discussion will be concluded in Chapter 10, which covers additional structural elements such as views, temporary tables, and indexes.
The actual file structure of a database is implementation dependent, as is the procedure needed to create database files. Therefore, the discussion in this chapter assumes that the necessary database files are already in place.
Database Object Hierarchy
The objects in a database maintained using SQL are arranged in a hierarchy diagrammed in Figure 9-1. 1 The smallest units with which a database works—the columns and rows—appear in the center. These in turn are grouped into tables and views.
1Some DBMSs support a “create database” capabiity, which provides an overall named unit for all the elements in a database. However, a “database” isn't a structural element in the SQL standard.
B9780123756978500091/f09-01-9780123756978.jpg is missing
Figure 9-1
The SQL database object hierarchy
The tables and views that constitute a single logical database are collected into a schema. Multiple schemas are grouped into catalogs, which can then be grouped into clusters. A catalog usually contains information describing all the schemas handled by one DBMS. Catalog creation is implementation dependent and therefore not part of the SQL standard.
Prior to SQL-92, clusters often represented database files, and the clustering of objects into files was a way to increase database performance. The current concept of a cluster, however, is a group of catalogs that are accessible using the same connection to a database server. None of the groupings of database objects in the SQL standard are related to physical storage structures. If you are working with a centralized mainframe DBMS, you may find multiple catalogs stored in the same database file. However, on smaller or distributed systems, you are just as likely to find one catalog or schema per database file or to find a catalog or schema split between multiple files.
Clusters, catalogs, and schemas are not required elements of a database environment. In a small installation where there is one collection of tables serving a single purpose, for example, it may not even be necessary to create a schema to hold them.
Naming and Identifying Objects
The way in which you name and identify database objects is in some measure dictated by the object hierarchy:
◊ Column names must be unique within the table.
◊ Table names must be unique within the schema.
◊ Schema names must be unique within their catalog.
◊ Catalog names must be unique within their cluster.
As you saw when you were reading about data retrieval, when a column name appears in more than one table in a query, you must specify the table from which a column should be taken (even if it makes no difference which table is used). The general form for specifying duplicate names is
table_name.column_name
If an installation has more than one schema, then you must also indicate the schema in which a table resides:
schema_name.table_name.column_name
This naming convention means that two different schemas can include tables with the same name.
By the same token, if an installation has multiple catalogs, you will need to indicate the catalog from which an object comes
catalog_name.schema_name.table_name.column_name
Note: The SQL standard refers to element names that use the dot notation as “identifier chains.”
The names that you assign to database elements can include the following:
◊ Letters
◊ Numbers
◊ Underscores (_)
Names can be up to 128 characters long. They are not case sensitive. (In fact, many SQL command processors convert names to all upper- or lowercase characters before submitting a SQL statement to a DBMS for processing.)
Note: Some DBMSs also allow pound signs (#) and dollar signs ($) in element names, but neither is recognized by SQL queries so their use should be avoided.
Schemas
To a database designer, a schema represents the overall, logical design of a complete database. As far as SQL is concerned, however, a schema is nothing more than a container for tables, views, and other structural elements. It is up to the database designer to place a meaningful group of elements within each schema.
A schema is not required to create tables and views. In fact, if you are installing a database for an environment in which there is likely to be only one logical database, then you can just as easily do without one. However, if more than one database will be sharing the same DBMS and the same server, organizing database elements into schemas can greatly simplify the maintenance of the individual databases.
To create a schema, you use the CREATE SCHEMA statement. In its simplest form, it has the syntax
Creating a Schema
CREATE SCHEMA schema_name
as in
CREATE SCHEMA rare_books;
By default, a schema belongs to the user who created it (the user ID under which the schema was created). The owner of the schema is the only user ID that can modify the schema unless the owner grants that ability to other users.
To assign a different owner to a schema, you add an AUTHORIZATION clause
CREATE SCHEMA schema_name AUTHORIZATION owner_user_ID
For example, to assign the rare book store schema to the user ID DBA, someone could use
CREATE SCHEMA rare_books AUTHORIZATION dba;
When creating a schema, you can also create additional database elements at the same time. To do so, you use braces to group the CREATE statements for the other elements, as in
CREATE SCHEMA schema_name AUTHORIZATION owner_user_ID
{
other CREATE statements go here
}
This automatically assigns the elements within the braces to the schema.
Identifying the Schema You Want to Use
One of the nicest things about a relational database is that you can add or delete database structure elements at any time. There must therefore be a way to specify a current schema for new database elements after the schema has been created initially with the CREATE SCHEMA statement.
SET SCHEMA schema_name
To use SET SCHEMA, the user ID under which you are working must have authorization to work with that schema.
Alternatively, you can qualify the name of a database element with the name of the schema. For example, if you are creating a table, then you would use something like:
CREATE TABLE schema_name.table_name
For DBMSs that do not support SET SCHEMA, this is the only way to attach new database elements to a schema after the schema has been created.
Domains
A domain is an expression of the permitted values for a column in a relation. When you define a table, you assign each column a data type (for example, character or integer) that provides a broad domain. A DBMS will not store data that violate that constraint.
The SQL-92 standard introduced the concept of user-defined domains, which can be viewed as user-defined data types that can be applied to columns in tables. (This means that you have to create a domain before you can assign it to a column!)
Domains can be created as part of a CREATE SCHEMA statement or, if your DBMS supports SET SCHEMA, at any time after a schema has been defined.
To create a domain, you use the CREATE DOMAIN statement, which has the following general syntax:
CREATE DOMAIN domain_name data_type
CHECK constraint_name (expression_to_validate_values)
The CHECK clause is actually a generic way to express a condition that data must meet. It can include a SELECT to validate data against other data stored in the database or it can include a logical expression. In that expression, the keyword VALUE represents the data being checked. Naming the constraint is optional, but doing so makes it possible to access the constraint if you want to remove it at some time in the future.
For example, if the rare book store database should validate the price of a book, someone might create the following domain:
CREATE DOMAIN price NUMERIC (7,2)
CHECK price_check (VALUE >= 15);
After creating this domain, a column in a table can be given the data type of PRICE. The DBMS will then check to be certain that the value in that column is always greater than or equal to 15. (We will leave a discussion of the data type used in the preceding SQL statement until we cover creating tables in the next section of this chapter.)
The domain mechanism is very flexible. Assume, for example, that you want to ensure that telephone numbers are always stored in the format XXX-XXX-XXXX. A domain to validate that format might be created as
CREATE DOMAIN telephone CHAR (12)
CHECK phone_format
(SUBSTRING FROM 4 FOR 1 = ‘-’) AND
SUBSTRING (VALUE FROM 8 FOR 1 = ‘ ’);
You can also use the CREATE DOMAIN statement to give a column a default value. For example, the following statement sets up a domain that holds either Y or N and defaults to Y.
CREATE DOMAIN char_boolean CHAR (1)
DEFAULT ‘Y’
CHECK (UPPER(VALUE) = ‘Y’
OR UPPER(VALUE) = ‘N’);
Tables
The most important structure within a relational database is the table. Tables contain just about everything, including business data and the data dictionary.
SQL divides its tables into three categories:
◊ Permanent base tables: Permanent base tables are tables whose contents are stored in the database and remain permanently in the database unless they are explicitly deleted.
◊ Global temporary base tables: Global temporary base tables are tables used for working storage that are destroyed at the end of a SQL session. The definitions of the tables are stored in the data dictionary, but their data are not. The tables must be loaded with data each time they are going to be used. Global temporary tables can be used only by the current user, but they are visible to an entire SQL session (either an application program or a user working with an interactive facility.)
◊ Local temporary base tables: Local temporary base tables are similar to global temporary tables. However, they are visible only to the specific program module in which they are created.
Note: Temporary base tables are subtly different from views, which assemble their data by executing a SQL query. You will read more about this difference and how temporary tables are created and used inChapter 10.
Most of the tables you will use will be permanent base tables. You create them with the CREATE TABLE statement:
CREATE TABLE table_name
(column1_name column1_data_type, column1_constraints,
column2_name column2_data_type, column2_constraints, … table_constraints)
The constraints on a table include declarations of primary and foreign keys. The constraints on a column include whether values in are mandatory as well as other constraints you may decide to include in a CHECK clause.
Column Data Types
Each column in a table must be given a data type. Although data types are somewhat implementation dependent, you can expect to find most of the following:
◊ INTEGER (abbreviated INT): A positive or negative whole number. The number of bits occupied by the value is implementation dependent. On today's desktop computers, an integer is either 32 or 64 bits. Large computers may use up to 128 bits for integers.
◊ SMALLINT: A positive or negative whole number. A small integer is usually half the size of a standard integer. Using small integers when you know you will need to store only small values can save space in the database.
◊ NUMERIC (or occasionally, NUMBER): A fixed-point positive or negative number. A numeric value has a whole number portion and a fractional portion. When you create it, you must specify the total length of the number (including the decimal point) and how many of those digits will be to the right of the decimal point (its precision). For example,
NUMERIC (6,2)
creates a number in the format XXX.XX. The DBMS will store exactly two digits to the right of the decimal point.
◊ DECIMAL: A fixed-point positive or negative number. A decimal is similar to a numeric value. However, the DBMS may store more digits to the right of the decimal than you specify. Although there is no guarantee that you will get the extra precision, its presence can provide more accurate results in computations.
◊ REAL: A “single precision” floating point value. A floating point number is expressed in the format
±X.XXXXX * 10YY
where YY is the power to which 10 is raised. Because of the way in which computers store floating point numbers, a real number will never be an exact representation of a value, but only a close approximation. The range of values that can be stored is implementation dependent, although a common range is ±1038. You therefore cannot specify a size for a real number column.
◊ DOUBLE PRECISION (abbreviated DOUBLE): A “double precision” floating point number. The range and precision of double precision values are implementation dependent, but generally will be greater than with single precision real numbers. For example, if the single precision range is ±1038, then a typical double precision range is ±10308.
◊ FLOAT: A floating point number for which you can specify the precision. The DBMS will maintain at least the precision that you specify. (It may be more.)
◊ BOOLEAN: A logical value that can take only the values true and false.
◊ BIT: Storage for a fixed number of individual bits. You must indicate the number of bits, as in
BIT (n)
where n is the number of bits. (If you do not include the number of bits, you will have room for only one bit.)
◊ DATE: A date.
◊ TIME: A time.
◊ TIMESTAMP: The combination of a date and a time.
◊ CHARACTER (abbreviated CHAR): A fixed-length space to hold a string of characters. When declaring a CHAR column, you need to indicate the width of the column:
CHAR (n)
where n is the amount of space that will be allocated for the column in every row. Even if you store less than n characters, the column will always take up n bytes and the column will be padded with blanks to fill up empty space. The maximum number of characters allowed is implementation dependent.
◊ CHARACTER VARYING (abbreviated VARCHAR): A variable length space to hold a string of characters. You must indicate the maximum width of the column—
VARCHAR (n)
—but the DBMS stores only as many characters as you insert, up to the maximum n. The overall maximum number of characters allowed is implementation dependent.
◊ INTERVAL: A date or time interval. An interval data type is followed by a qualifier that specifies the unit of the interval and optionally the number of digits. For example,
INTERVAL YEAR
INTERVAL YEAR (n)
INTERVAL MONTH
INTERVAL MONTH (n)
INTERVAL YEAR TO MONTH
INTERVAL YEAR (n) TO MONTH
INTERVAL DAY
INTERVAL DAY (n)
INTERVAL DAY TO HOUR
INTERVAL DAY (n) TO HOUR
INTERVAL DAY TO MINUTE
INTERVAL DAY (n) TO MINUTE
INTERVAL MINUTE
INTERVAL MINUTE (n)
In the preceding examples, n specifies the number of digits. When the interval covers more than one date/time unit, such as YEAR TO MONTH, you can specify a size for only the first unit. Year/month intervals can include years, months, or both. Time intervals can include days, hours, minutes, and/or seconds.
◊ BLOB (Binary Large Object): Although not universal, the BLOB data type is supported by many current DBMSs. It can be used to store elements such as graphics. Unlike other data types, however, BLOB columns cannot be searched because the contents are an undifferentiated group of binary data.
In Figure 9-2 you will find the bare bones CREATE TABLE statements for the rare book store database. These statements include only column names and data types. SQL will create tables from statements in this format, but because the tables have no primary keys, some DBMSs will not let you enter data.
B9780123756978500091/f09-02a-9780123756978.jpg is missing
B9780123756978500091/f09-02b-9780123756978.jpg is missing
Figure 9-2
Initial CREATE TABLE statements for the rare book store database
Default Values
As you are defining columns, you can designate a default value for individual columns. To indicate a default value, you add a DEFAULT keyword to the column definition, followed by the default value. For example, in the sale relation, it makes sense to assign the current date to the sale_date column as a default. The column declaration is therefore written
sale_date DATE DEFAULT CURRENT_DATE
Notice that this particular declaration is using the SQL value CURRENT_DATE. However, you can place any value after DEFAULT that is a valid instance of the column's domain.
NOT NULL Constraints
The values in primary key columns must be unique and not null. In addition, there may be columns for which you want to require a value. You can specify such columns by adding NOT NULL after the column declaration. Since the staff of the rare book store wants to ensure that an order date is always entered, the complete declaration for the column in the sale table is
sale_date DATE NOT NULL DEFAULT CURRENT_DATE
Primary Keys
To specify a table's primary key, you add a PRIMARY KEY clause to a CREATE TABLE statement. The keywords PRI-MARY KEY are followed by the names of the primary key column or columns, surrounded by parentheses. In the case of a concatenated primary key, place all columns that are part of the primary key within the parentheses.
In Figure 9-3 you will find the CREATE TABLE statements for the rare book store database including primary key declarations.
B9780123756978500091/f09-03a-9780123756978.jpg is missing
B9780123756978500091/f09-03b-9780123756978.jpg is missing
Figure 9-3
CREATE TABLE statements for the rare book store database including primary key declarations
Foreign Keys
As you know, a foreign key is a column (or concatenation of columns) that is exactly the same as the primary key of another table. When a foreign key value matches a primary key value, we know that there is a logical relationship between the database objects represented by the matching rows.
One of the major constraints on a relation is referential integrity, which states that every nonnull foreign key must reference an existing primary key value. Early implementations of SQL and early versions of the SQL standard did not include support for foreign keys. Validation of referential integrity was left up to application programmers. However, it is far better to have foreign keys identified in the data dictionary and referential integrity enforced directly by a DBMS. Referential integrity was therefore added to the SQL-89 standard.
Listing Table Structure
Although not part of the SQL standard, many DBMSs support a DESCRIBE command that displays the structure of a table. (The standard SQL DESCRIBE returns information about a prepared embedded SQL statement.) To use it, follow the keyword DESCRIBE with the name of the table, as in
DESCRIBE customer
The result is a table showing the structure of the named table in a format similar to the following:
Table “enterprisedb.customer”
ColumnTypeModifiers
customer_numbintegernot null
first_namecharacter varying(30)
last_namecharacter varying(30)
streetcharacter varying(50)
citycharacter varying(30)
state_provincecharacter(2)
zip_postcodecharacter(10)
contact_phonecharacter(12)
Indexes:
“pk_customer” PRIMARY KEY, btree (customer_numb)
To specify a foreign key for a table, you add a FOREIGN KEY clause:
FOREIGN KEY foreign_key_name (foreign_key_columns)
REFERENCES primary_key_table (primary_key_columns)
ON UPDATE update_action
ON DELETE delete_action
The names of the foreign key columns follow the keywords FOREIGN KEY. The REFERENCES clause contains the name of the primary key table being referenced. If the primary key columns are named in the PRIMARY KEY clause of their table, then you don't need to list the column names. However, if the columns aren't part of a PRIMARY KEY clause, you must list the primary key columns in the REFERENCES clause.
The final part of the FOREIGN KEY specification indicates what should happen when a primary key value being referenced by the foreign key is deleted or updated. There are three options that apply to both updates and deletions and one additional option for each:
◊ SET NULL: Replace the foreign key value with null. This isn't possible when the foreign key is part of the primary key of its table.
◊ SET DEFAULT: Replace the foreign key value with the column's default value.
◊ CASCADE: Delete or update all foreign key rows.
◊ NO ACTION: On update, make no modification of foreign key values.
◊ RESTRICT: Do not allow deletion of the primary key value.
The complete declarations for the rare book store database tables, which include foreign key constraints, can be found in Figure 9-4. Notice that although there are no restrictions on how to name foreign keys, the foreign keys in this database have been named to indicate the tables involved. This makes them easier to identify if you need to delete or modify a foreign key at a later date.
Note: The precise syntax allowed for foreign key declarations is somewhat implementation dependent. For example, some DBMSs do not support named foreign keys. You will need to check your DBMS's documentation to verify the exact syntax required.
B9780123756978500091/f09-04a-9780123756978.jpg is missing
B9780123756978500091/f09-04b-9780123756978.jpg is missing
B9780123756978500091/f09-04c-9780123756978.jpg is missing
Figure 9-4
Complete CREATE TABLE statements for the rare book store database including primary and foreign key declarations
Additional Foreign Key Options
The SQL Core standard provides some additional flexibility in the definition of foreign keys, including the following:
◊ Rules for determining what occurs when all or part of a foreign key is null. By default, if any part of a foreign key is null, then the DBMS will accept it. If you add a MATCH PARTIAL to the foreign key definition and part of a foreign key is null, then the nonnull portions of the foreign key must match parts of an existing foreign key. If you add MATCH FULL, a foreign key must either be completely null or match an existing primary key completely.
◊ Rules for determining the action to take when a primary key referenced by the foreign key is updated. If you specify ON UPDATE CASCADE, then the DBMS will automatically update the foreign key values when the primary key values they reference are modified. (In most cases, this will be the desired option because it maintains the consistency of cross-references through-out the database.) In addition, you can choose to ON UPDATE SET NULL (set the foreign key values to null), ON UPDATE SET DEFAULT (set the foreign key values to their columns' default value), or ON UPDATE NO ACTION (do nothing).
Note: In a well-designed relational DBMS, primary key values should never be modified. However, given that people persist inusing meaningful primary keys, you may want to be certain that cross-references are maintained.
Additional Column Constraints
There are additional constraints that you can place on columns in a table beyond primary and foreign key constraints. These include requiring unique values and predicates in CHECK clauses.
Requiring Unique Values
If you want to ensure that the values in a non-primary key column are unique, you can use the UNIQUE keyword. UNIQUE verifies that all nonnull values are unique. For example, if you were storing social security numbers in an employees table that used an employee ID as the primary key, you could also enforce unique social security numbers with
ssn CHAR (11) UNIQUE
The UNIQUE clause can also be placed at the end of a CREATE TABLE statement, along with the primary key and foreign key specifications. In that case, it takes the form
UNIQUE (column_names)
Check Clauses
The CHECK clause to which you were introduced earlier in this chapter in the Domains section can also be used with individual columns to declare column-specific constraints. To add a constraint, you place a CHECK clause after the column declaration, using the keyword VALUE in a predicate to indicate the value being checked. For example, to verify that a column used to hold T-shirt sizes should be limited to S, M, L, XL, XXL, and XXXL, you could write a CHECK clause as
CHECK (UPPER(VALUE) IN (‘S’,‘M’,‘L’,‘XL’,‘XXL’, ‘XXXL’))
Assertions
An assertion is a constraint that is applied to any or all tables in a schema, rather than to a specific table. It can therefore be based on more than one table or be used to verify that a table is not empty. Assertions exist as independent database objects that can be created and dropped as needed.
To create an assertion you use the CREATE ASSERTION command:
CREATE ASSERTION assertion_name
CHECK (logical_expression_for_validation)
For example, to ensure that the author table has at least one row, someone at the rare book store could define the following assertion:
CREATE ASSERTION validate_author
CHECK (SELECT COUNT(*) FROM author > 0);
Because an assertion is a database object, you remove one from a schema just as you would any other database object:
DROP ASSERTION assertion_name
Determining When Constraints Are Checked
Most of today's DBMSs check constraints whenever any modification is made to the contents of a table. The SQL standard, however, gives users the ability to determine when constraints are checked. Constraints may be not deferrable (the default), in which case they are checked after each SQL statement. If constraints are deferrable, they are checked at the end of a transaction.
Note: If you are working in an interactive environment where each statement is a distinct transaction, then deferring constraints essentially has no effect.
There are several places where you can specify when constraints are to be checked:
◊ When constraints are defined within a table definition. In this case you can set constraints to INITIALLY DEFERRED or INITIALLY IMMEDIATE to determine the initial setting for constraint checking. If you want to prevent anyone from deferring constraints at a later time, you can also specify that the constraints are NOT DEFERRABLE. To allow constraints to be deferred at a later time, you can specify them as DEFERRABLE.
◊ When you create a domain using CREATE DOMAIN, you can indicate that domain checking should be INITIALLY IMMEDIATE or INITIALLY DEFERRED. In addition, you can prevent domain checking from ever being deferred by adding NOT DEFERRABLE. To allow deferring of domain checking at a later time, specify DEFERRABLE.
◊ When you create an assertion using CREATE ASSERTION, you can indicate that the assertion checking should be INITIALLY IMMEDIATE or INITIALLY DEFERRED. In addition, you can prevent assertion checking from ever being deferred by adding NOT DEFERRABLE. To allow deferring of assertion check at a later time, specify DEFERRABLE.
Changing the Constraint Mode
The point at which constraints defined as database objects are checked can be altered with the SET CONSTRAINTS MODE statement:
SET CONSTRAINTS MODE constraint_name DEFERRED
or
SET CONSTRAINTS MODE constraint_name IMMEDIATE
Of course, the preceding assume that the named constraint is deferrable.
Note: If you want to affect all the named constraints in the current schema, use the keyword ALL instead of one or more constraint names.
Because the SET CONSTRAINT MODE statement requires a named constraint, it cannot be applied to constraints created within a CREATE TABLE statement unless those constraints have been named. This means that if you want to have control over when the checking of such constraints occurs, you need to add a CONSTRAINT clause to the table declaration so you have somewhere to name a constraint.
As an example, consider the table declaration in Figure 9-5. The addition of the CONSTRAINT clause allows both the primary and foreign keys to be named, making them accessible to a SET CONSTRAINTS MODE statement (although the primary key has been specified as not deferrable.)
B9780123756978500091/f09-05-9780123756978.jpg is missing
Figure 9-5
A table declaration including constraints accessible to a SET CONSTRAINTS MODE statement
..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
18.118.142.166