5.4. Integrity Constraints

The reliability of databases and the data within them is essential to every organization. Decision-making activities depend on the correctness and accuracy of any and all data contained in key applications, information systems, databases, decision support and query tools, as well as other critical systems. Even the slightest hint of unreliable data can affect decision-making capabilities, accuracy of reports, and, in those worst case scenarios, loss of user confidence in the database environment itself.

Because data should be correct and free of problems, an integral part of every database environment is a set of rules that the data should adhere to. These rules, often referred to as database-enforced constraints, are applied to the database table structure itself and determine the type and content of data that is permitted in columns and tables.

By implementing database-enforced integrity constraints, you can dramatically reduce data-related problems and additional programming work in applications. Instead of coding complex data checks and validations in individual application programs, you can build database-enforced constraints into the database itself. This work can eliminate the propagation of column duplication, invalid and missing values, lost linkages, and other data-related problems.

5.4.1. Defining Integrity Constraints

You define integrity constraints by specifying column definitions and constraints at the time a table is created with the CREATE TABLE statement or by adding, changing, or removing a table’s column definitions with the ALTER TABLE statement. The rows in a table are then validated against the defined integrity constraints.

5.4.2. Types of Integrity Constraints

The first type of integrity constraint is referred to as a column and table constraint. This type of constraint essentially establishes rules that are attached to a specific table or column. The type of constraint is generally specified through one or two clauses with their distinct values as follows.

Column and Table Constraints
  • NOT NULL

  • UNIQUE

  • CHECK

5.4.3. Preventing Null Values with a NOT NULL Constraint

A null value is essentially a missing or unknown value in the data. When unchecked, null values can often propagate themselves throughout a database. When a NULL appears in a mathematical equation, the returned result is also a null or missing value. When a NULL is used in a comparison or a logical expression, the returned result is unknown. The occurrence of null values presents problems during search, joins, and index operations. The ability to prevent the propagation of null values in a column with a NOT NULL constraint is a powerful feature of the SQL procedure. This constraint should be used as a first line of defense against potential problems resulting from the presence of null values and the interaction of queries processing data.

Using the CREATE TABLE or ALTER TABLE statement, you can apply a NOT NULL constraint to any column where missing, unknown, or inappropriate values appear in the data. Suppose you need to avoid the propagation of missing values in the CUSTCITY (Customer’s Home City) column in the CUSTOMER_CITY table. By specifying the NOT NULL constraint for the CUSTCITY column in the CREATE TABLE statement, you prevent the propagation of null values in a table.

SQL Code

PROC SQL;
  CREATE TABLE CUSTOMER_CITY
     (CUSTNUM NUM,
      CUSTCITY CHAR(20) NOT NULL);
QUIT;

Once the CUSTOMER_CITY table is created and the NOT NULL constraint is defined for the CUSTCITY column, only non-missing data for the CUSTCITY column can be entered. Using the INSERT INTO statement with a VALUES clause, you can populate the CUSTOMER_CITY table while adhering to the assigned NOT NULL integrity constraint.

SQL Code

PROC SQL;
  INSERT INTO CUSTOMER_CITY
    VALUES(101,'La Mesa Computer Land')
    VALUES(1301,'Spring Valley Byte Center'),
QUIT;

The SAS log shows the two rows of data satisfying the NOT NULL constraint and the rows successfully being added to the CUSTOMER_CITY table.

SAS Log Results

   PROC SQL;
     INSERT INTO CUSTOMER_CITY
       VALUES(101,'La Mesa Computer Land')
       VALUES(1301,'Spring Valley Byte Center'),
NOTE: 2 rows were inserted into WORK.CUSTOMER_CITY.

   QUIT;
NOTE: PROCEDURE SQL used:
      real time           0.22 seconds
      cpu time            0.02 seconds


When you define a NOT NULL constraint and then attempt to populate a table with one or more missing data values, the rows will be rejected and the table restored to its original state. Essentially the insert fails because the NOT NULL constraint prevents any missing values for a defined column from populating a table. In the next example, several rows of data with a defined NOT NULL constraint are prevented from being populated in the CUSTOMER_CITY table because one row contains a missing CUSTCITY value.

SQL Code

PROC SQL;
  INSERT INTO CUSTOMER_CITY
    VALUES(101,'La Mesa Computer Land')
    VALUES(1301,'Spring Valley Byte Center')
    VALUES(1801,''),
QUIT;

The SAS log shows that the NOT NULL constraint has prevented the three rows of data from being populated in the CUSTOMER_CITY table. The violation caused an error message that resulted in the failure of the add/update operation. The UNDO_POLICY = REQUIRED option reverses all adds/updates that have been performed to the point of the error. This prevents errors or partial data from being propagated in the database table. The following SAS log results illustrate the error condition that caused the add/update operation to fail.

SAS Log Results

   PROC SQL;
     INSERT INTO CUSTOMER_CITY
       VALUES(101,'La Mesa Computer Land')
       VALUES(1301,'Spring Valley Byte Center')
       VALUES(1801,''),
ERROR: Add/Update failed for data set WORK
.CUSTOMER_CITY because data value(s) do not comply
 with integrity constraint _NM0001_.
NOTE: This insert failed while attempting to add
 data from VALUES clause 3 to the data set.
NOTE: Deleting the successful inserts before error
 noted above to restore table to a consistent state.
   QUIT;
NOTE: The SAS System stopped processing this step
 because of errors.
NOTE: PROCEDURE SQL used:
      real time           0.02 seconds
      cpu time            0.00 seconds


A NOT NULL constraint can also be applied to a column in an existing table containing data with an ALTER TABLE statement. To successfully impose a NOT NULL constraint you should not have missing or null values in the column the constraint is being defined for. This means that the presence of one or more null values in an existing table’s column will prevent the NOT NULL constraint from being created.

Suppose the CUSTOMERS table contains one or more missing values in the CUSTCITY column. If you tried to add a NOT NULL constraint, it would be rejected. You can successfully apply the NOT NULL constraint only when missing values are reclassified or recoded.

SQL Code

PROC SQL;
  ALTER TABLE CUSTOMERS
    ADD CONSTRAINT NOT_NULL_CUSTCITY NOT NULL(CUSTCITY);
QUIT;

The SAS log shows the NOT NULL constraint cannot be defined in an existing table when a column’s data contains one or more missing values. The violation produces an error message that results in the rejection of the constraint.

SAS Log Results

   PROC SQL;
     ALTER TABLE CUSTOMERS
       ADD CONSTRAINT NOT_NULL_CUSTCITY NOT NULL
(CUSTCITY);
ERROR: Integrity constraint NOT_NULL_CUSTCITY was
 rejected because 1
observations failed the constraint.
   QUIT;
NOTE: The SAS System stopped processing this step
 because of errors.
NOTE: PROCEDURE SQL used:
      real time           0.01 seconds
      cpu time            0.00 seconds


5.4.4. Enforcing Unique Values with a UNIQUE Constraint

A UNIQUE constraint prevents duplicate values from propagating in a table. If you use a CREATE TABLE statement, you can apply a UNIQUE constraint to any column where duplicate data is not desired. Suppose you want to avoid the propagation of duplicate values in the CUSTNUM (Customer Number) column in a new table called CUSTOMER_CITY. By specifying the UNIQUE constraint for the CUSTNUM column with the CREATE TABLE statement, you prevent duplicate values from populating the table.

SQL Code

PROC SQL;
  CREATE TABLE CUSTOMER_CITY
     (CUSTNUM NUM UNIQUE,
      CUSTCITY CHAR(20));
QUIT;

When you define a UNIQUE constraint and attempt to populate a table with duplicate data values, the rows will be rejected and the table restored to its original state prior to the add operation taking place. Essentially the insert fails because the UNIQUE constraint prevents any duplicate values for a defined column from populating a table. In the next example, several rows of data with a defined UNIQUE constraint are prevented from being populated in the CUSTOMER_CITY table because one row contains a duplicate CUSTNUM value.

SQL Code

PROC SQL;
  INSERT INTO CUSTOMER_CITY
    VALUES(101,'La Mesa Computer Land')
    VALUES(1301,'Spring Valley Byte Center')
    VALUES(1301,'Chula Vista Networks'),
QUIT;

The SAS log shows the UNIQUE constraint prevented the three rows of data from being populated in the CUSTOMER_CITY table. The violation caused an error message that resulted in the failure of the add/update operation.

SAS Log Results

  PROC SQL;
    INSERT INTO CUSTOMER_CITY
      VALUES(101,'La Mesa Computer Land')
      VALUES(1301,'Spring Valley Byte Center')
      VALUES(1301,'Chula Vista Networks'),
ERROR: Add/Update failed for data set WORK
.CUSTOMER_CITY because data value(s) do not comply
 with integrity constraint _UN0001_.
NOTE: This insert failed while attempting to add
 data from VALUES clause 3 to the data set.
NOTE: Deleting the successful inserts before error
 noted above to restore table to a consistent state.
  QUIT;
NOTE: The SAS System stopped processing this step
 because of errors.
NOTE: PROCEDURE SQL used:
      real time           1.12 seconds
      cpu time            0.06 seconds


5.4.5. Validating Column Values with a CHECK Constraint

A CHECK constraint validates data values against a list of values, minimum and maximum values, as well as a range of values before populating a table. Using either a CREATE TABLE or ALTER TABLE statement, you can apply a CHECK constraint to any column that requires data validation to be performed. In the next example, suppose you want to validate data values in the PRODTYPE (Product Type) column in the PRODUCTS table. When you specify a CHECK constraint against the PRODTYPE column using the ALTER TABLE statement, product type values will first need to match the list of defined values or the rows will be rejected.

SQL Code

PROC SQL;
  ALTER TABLE PRODUCTS
    ADD CONSTRAINT CHECK_PRODUCT_TYPE
      CHECK (PRODTYPE IN ('Laptop',
						'Phone',
						'Software',
						'Workstation'));
QUIT;

With a CHECK constraint defined, each row must meet the validation rules that are specified for the column before the table is populated. If any row does not pass the validation checks based on the established validation rules, the add/update operation fails and the table is automatically restored to its original state prior to the operation taking place. In the next example, three rows of data are validated against the defined CHECK constraint established for the PRODTYPE column.

SQL Code

PROC SQL;
  INSERT INTO PRODUCTS
    VALUES(5005,'Internet Software',500,'Software',99.)
    VALUES(1701,'Elite Laptop',170,'Laptop',3900.)
    VALUES(2103,'Digital Cell Phone',210,'Fone',199.);
QUIT;

The SAS log displays the results after attempting to add the three rows of data. Because one row violates the CHECK constraint with a value of “Fone”, the rows are not added to the PRODUCTS table. The violation produced an error message that resulted in the failure of the add/update operation.

SAS Log Results

   PROC SQL;
     INSERT INTO PRODUCTS
       VALUES(5005,'Internet Software',500
,'Software',99.)
       VALUES(1701,'Elite Laptop',170,'Laptop',3900.)
       VALUES(2103,'Digital Cell Phone',210,'Fone'
,199.);
ERROR: Add/Update failed for data set WORK
.PRODUCTS because data value(s) do not comply with
 integrity constraint CHECK_PRODUCT_TYPE.
NOTE: This insert failed while attempting to add
 data from VALUES clause 3 to the data set.
NOTE: Deleting the successful inserts before error
 noted above to restore table to a consistent state.
   QUIT;
NOTE: The SAS System stopped processing this step
 because of errors.
NOTE: PROCEDURE SQL used:
      real time           0.09 seconds
      cpu time            0.02 seconds


5.4.6. Referential Integrity Constraints

The second type of constraint that is available in the SQL procedure is referred to as a referential integrity constraint. Enforced through primary and foreign keys between two or more tables, referential integrity constraints are built into a database environment to prevent data integrity issues from occurring. The specific types of referential integrity constraints and constraint action clauses are used to enforce update and delete operations and consist of the following:

Referential Integrity Constraints
  • Primary key

  • Foreign key

Referential Integrity Constraint Action Clauses
  • RESTRICT (Default)

  • SET NULL

  • CASCADE

The action clauses are discussed in the section, “Establishing a Foreign Key.”

5.4.7. Establishing a Primary Key

A primary key consists of one or more columns with a unique value that is used to identify individual rows in a table. Depending on the nature of the columns used, a single column may be all that is necessary to identify specific rows. In other cases, two or more columns may be needed to adequately identify a row in a referenced table. Suppose you needed to uniquely identify specific rows in the MANUFACTURERS table. By establishing the Manufacturer Number (MANUNUM) as the unique identifier for rows, a key is established. The next example specifies the ALTER TABLE statement to create a primary key using MANUNUM in the MANUFACTURERS table.

SQL Code

PROC SQL;
  ALTER TABLE MANUFACTURERS
    ADD CONSTRAINT PRIM_KEY PRIMARY KEY (MANUNUM);
QUIT;

The SAS log shows that the MANUFACTURERS table has been modified successfully after creating a primary key using the MANUNUM column.

SAS Log Results

   PROC SQL;
     ALTER TABLE MANUFACTURERS
       ADD CONSTRAINT PRIM_KEY PRIMARY KEY (MANUNUM);
NOTE: Table WORK.MANUFACTURERS has been modified,
 with 4 columns.
   QUIT;
NOTE: PROCEDURE SQL used:
      real time          0.07 seconds
      cpu time           0.01 seconds


Suppose you also needed to uniquely identify specific rows in the PRODUCTS table. By specifying PRODNUM (Product Number) as the primary key, the next example specifies the ALTER TABLE statement to create the unique identifier for rows in the table.

SQL Code

PROC SQL;
  ALTER TABLE PRODUCTS
    ADD CONSTRAINT PRIM_PRODUCT_KEY PRIMARY KEY (PRODNUM);
QUIT;

The SAS log shows that the PRODUCTS table has been modified successfully after establishing a primary key using the PRODNUM column.

SAS Log Results

   PROC SQL;
     ALTER TABLE PRODUCTS
       ADD CONSTRAINT PRIM_PRODUCT_KEY PRIMARY KEY
 (PRODNUM);
NOTE: Table WORK.PRODUCTS has been modified, with
 5 columns.
   QUIT;
NOTE: PROCEDURE SQL used:
      real time           0.03 seconds
      cpu time            0.01 seconds


5.4.8. Establishing a Foreign Key

A foreign key consists of one or more columns in a table that references or relates to values in another table. The column(s) used as a foreign key must match the column(s) in the table that is referenced. The purpose of a foreign key is to ensure that rows of data in one table exist in another table thereby preventing the possibility of lost or missing linkages between tables. The enforcement of referential integrity rules has a positive and direct effect on data reliability issues.

Suppose you wanted to ensure that data values in the INVENTORY table have corresponding and matching data values in the PRODUCTS table. By establishing PRODNUM (Product Number) as a foreign key in the INVENTORY table you ensure a strong level of data integrity between the two tables. This essentially verifies that key data in the INVENTORY table exists in the PRODUCTS table. In the next example a foreign key is created using the PRODNUM column in the INVENTORY table by specifying the ALTER TABLE statement.

SQL Code

PROC SQL;
 ALTER TABLE INVENTORY
  ADD CONSTRAINT FOREIGN_PRODUCT_KEY FOREIGN KEY (PRODNUM)
   REFERENCES PRODUCTS
    ON DELETE RESTRICT
    ON UPDATE RESTRICT;
QUIT;

The SAS log displays the successful creation of the PRODNUM column as a foreign key in the INVENTORY table. By specifying the default values ON DELETE RESTRICT and ON UPDATE RESTRICT clauses, you restrict the ability to change the values of primary key data when matching values are found in the foreign key. The execution of any SQL statement that could violate these referential integrity rules is prevented during SQL processing.

SAS Log Results

   PROC SQL;
    ALTER TABLE INVENTORY
     ADD CONSTRAINT FOREIGN_PRODUCT_KEY FOREIGN
 KEY (PRODNUM)
      REFERENCES PRODUCTS
       ON DELETE RESTRICT
       ON UPDATE RESTRICT;
NOTE: Table WORK.INVENTORY has been modified, with
 5 columns.
   QUIT;
NOTE: PROCEDURE SQL used:
      real time           0.01 seconds
      cpu time            0.01 seconds


Suppose a product of a particular manufacturer is no longer available and has been taken off the market. To handle this type of situation, data values in the INVENTORY table should be set to missing once the product is deleted from the PRODUCTS table. The next example establishes a foreign key using the PRODNUM column in the INVENTORY table and sets values to null with the ON DELETE clause.

SQL Code

PROC SQL;
 ALTER TABLE INVENTORY
  ADD CONSTRAINT FOREIGN_MISSING_PRODUCT_KEY FOREIGN KEY
(PRODNUM)
   REFERENCES PRODUCTS
    ON DELETE SET NULL;
QUIT;

The SAS log displays the successful creation of the PRODNUM column as a foreign key in the INVENTORY table as well as the effect of the ON DELETE SET NULL clause. Specifying this clause will change foreign key values to missing or null for all rows matching values found in the primary key. The execution of any SQL statement that could violate these referential integrity rules is prevented during SQL processing.

SAS Log Results

   PROC SQL;
    ALTER TABLE INVENTORY
     ADD CONSTRAINT FOREIGN_MISSING_PRODUCT_KEY
 FOREIGN KEY (PRODNUM)
      REFERENCES PRODUCTS
       ON DELETE SET NULL;
NOTE: Table WORK.INVENTORY has been modified, with
 5 columns.
   QUIT;
NOTE: PROCEDURE SQL used:
      real time           0.02 seconds
      cpu time            0.02 seconds


Suppose you want to ensure that changes to key values in the PRODUCTS table automatically flow over or cascade through to rows in the INVENTORY table. This is accomplished by first creating PRODNUM (Product Number) as a foreign key in the INVENTORY table using the ADD CONSTRAINT clause and referencing the PRODUCTS table. You then specify the ON UPDATE CASCADE clause to enable any changes made to the PRODUCTS table to be automatically cascaded through to the INVENTORY table. This ensures that changes to the product number values in the PRODUCTS table automatically occur in the INVENTORY table as well.

SQL Code

PROC SQL;
 ALTER TABLE INVENTORY
  ADD CONSTRAINT FOREIGN_PRODUCT_KEY FOREIGN KEY (PRODNUM)
   REFERENCES PRODUCTS
    ON UPDATE CASCADE
    ON DELETE RESTRICT  /* DEFAULT VALUE */;
QUIT;

The SAS log displays the successful creation of the PRODNUM column as a foreign key in the INVENTORY table. When the ON UPDATE and ON DELETE clauses are specified, the execution of any SQL statement that could violate referential integrity rules is strictly prohibited.

SAS Log Results

   PROC SQL;
    ALTER TABLE INVENTORY
     ADD CONSTRAINT FOREIGN_PRODUCT_KEY FOREIGN
 KEY (PRODNUM)
      REFERENCES PRODUCTS
       ON UPDATE CASCADE
      ON DELETE RESTRICT  /* DEFAULT VALUE */;
NOTE: Table WORK.INVENTORY has been modified, with
 5 columns.
   QUIT;
NOTE: PROCEDURE SQL used:
      real time           0.06 seconds
      cpu time            0.01 seconds


5.4.8.1. Constraints and Change Control

To preserve change control the SAS System prohibits changes or modifications to a table containing a defined referential integrity constraint. When you attempt to delete, rename, or replace a table containing a referential integrity constraint, an error message is generated and processing is stopped. The next example illustrates a table copy operation that is performed against a table containing a referential integrity constraint that generates an error message and stops processing.

SAS Log Results

   PROC COPY IN=SQLBOOK OUT=WORK;
     SELECT INVENTORY;
   RUN;

NOTE: Copying SQLBOOK.INVENTORY to WORK.INVENTORY 
(memtype=DATA).
ERROR: A rename/delete/replace attempt is not
 allowed for a data set involved in a referential
 integrity constraint. WORK.INVENTORY.DATA
ERROR: File WORK.INVENTORY.DATA has not been saved
 because copy could not be completed.
NOTE: Statements not processed because of errors
 noted above.
NOTE: PROCEDURE COPY used:
      real time           0.44 seconds
      cpu time            0.02 seconds

NOTE: The SAS System stopped processing this step
 because of errors.


5.4.9. Displaying Integrity Constraints

Using the DESCRIBE TABLE statement, the SQL procedure displays integrity constraints along with the table description on the SAS log. The ability to capture this type of information assists with the documentation process by describing the names and types of integrity constraints as well as the contributing columns they reference.

SQL Code

PROC SQL;
  DESCRIBE TABLE MANUFACTURERS;
QUIT;

The SAS log shows the SQL statements that were used to create the MANUFACTURERS table as well as an alphabetical list of integrity constraints that have been defined.

SAS Log Results

   PROC SQL;
     DESCRIBE TABLE MANUFACTURERS;
NOTE: SQL table WORK.MANUFACTURERS was created like:

create table WORK.MANUFACTURERS( bufsize=4096 )
  (
   manunum num label='Manufacturer Number',
   manuname char(25) label='Manufacturer Name',
   manucity char(20) label='Manufacturer City',
   manustat char(2) label='Manufacturer State'
  );
create unique index manunum on WORK.MANUFACTURERS
(manunum);

                 -----Alphabetic List of Integrity
 Constraints-----

                         Integrity
                    #    Constraint    Type       
    Variables
                   
 ___________________________________________

                    1    PRIM_KEY      Primary Key
    manunum
   QUIT;
NOTE: PROCEDURE SQL used:
      real time           0.19 seconds
      cpu time            0.01 seconds


..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
18.221.249.198