5.2. Creating Tables

An important element in table creation is table design. Table design incorporates how tables are structured — how rows and columns are defined, how indexes are created, and how columns refer to values in other columns. Readers seeking a greater understanding in this area are encouraged to review the many references identified at the end of this book. The following overview should be kept in mind during the table design process.

When building a table it is important to devote adequate time to planning its design as well as understanding the needs that each table is meant to satisfy. This process involves a number of activities such as requirements and feasibility analysis including cost/benefit of the proposed tables, the development of a logical description of the data sources, and physical implementation of the logical data model. Once these tasks are complete, you assess any special business requirements that each table is to provide. A business assessment helps by minimizing the number of changes required to a table once it has been created.

Next, determine what tables will be incorporated into your application’s database. This requires understanding the value that each table is expected to provide. It also prevents a table of little or no importance from being incorporated into a database. The final step and one of critical importance is to define each table’s columns, attributes, and contents.

Once the table design process is complete, each table is then ready to be created with the CREATE TABLE statement. The purpose of creating a table is to create an object that does not already exist. In the SAS implementation, three variations of the CREATE TABLE statement can be specified depending on your needs:

5.2.1. Creating a Table Using Column-Definition Lists

Although part of the SQL standard, the column-definition list (like the LENGTH statement in the DATA step) is a laborious and not very elegant way to create a table. The disadvantage of creating a table this way is that it requires the definition of each column’s attributes including their data type, length, informat, and format. This method is frequently used to create columns when they are not present in another table. Using this method results in the creation of an empty table (without rows). The code used to create the CUSTOMERS table appears below. It illustrates the creation of a table with column-definition lists.

SQL Code

PROC SQL;
  CREATE TABLE CUSTOMERS
    (CUSTNUM   NUM      LABEL='Customer Number',
     CUSTNAME  CHAR(25) LABEL='Customer Name',
     CUSTCITY  CHAR(20) LABEL='Customer''s Home City'),
QUIT;

SAS Log Results

     PROC SQL;
       CREATE TABLE CUSTOMERS
         (CUSTNUM   NUM      LABEL='Customer Number',
          CUSTNAME  CHAR(25) LABEL='Customer Name',
          CUSTCITY  CHAR(20) LABEL='Customer''s
 Home City'),
NOTE: Table CUSTOMERS created, with 0 rows and 3
 columns.
     QUIT;
NOTE: PROCEDURE SQL used:
     real time           0.81 seconds


Readers should be aware that the SQL procedure ignores width specifications for numeric columns. When a numeric column is defined, it is created with a width of 8 bytes, which is the maximum precision allowed by the SAS System. PROC SQL ignores numeric length specifications when the value is less than 8 bytes. To illustrate this point, a partial CONTENTS procedure output is displayed for the CUSTOMERS table below.

Results

                      The CONTENTS Procedure

      -----Alphabetic List of Variables and
 Attributes-----

   #     Variable    Type    Len    Pos    Label
  ________________________________________________
_____________
   3     CUSTCITY    Char     20     33   
 Customer's Home City
   2     CUSTNAME    Char     25      8   
 Customer Name
   1     CUSTNUM     Num       8      0    
Customer Number


To conserve storage space (CUSTNUM only requires maximum precision provided in 3 bytes), a LENGTH statement could be used in a DATA step to define CUSTNUM as a 3-byte column rather than an 8-byte column. A DROP= data set option is specified to delete the original CUSTNUM column (created by the CREATE TABLE statement) in the Program Data Vector or PDV.

DATA Step Code

DATA CUSTOMERS;
  LENGTH CUSTNUM 3.;
  SET CUSTOMERS(DROP=CUSTNUM);
  LABEL CUSTNUM = 'Customer Number';
RUN;

Results

                     The CONTENTS Procedure

     -----Alphabetic List of Variables and
 Attributes-----


   #    Variable    Type    Len    Pos    Label
  ________________________________________________
_____________
   3    CUSTCITY    Char     20     25   
 Customer's Home City
   2    CUSTNAME    Char     25      0    Customer
 Name
   1    CUSTNUM     Num       3     45    Customer
 Number


Let’s look at the column-definition list used to create the PRODUCTS table.

SQL Code

PROC SQL;
  CREATE TABLE PRODUCTS
    (PRODNUM   NUM(3)    LABEL='Product Number',
     PRODNAME  CHAR(25)  LABEL='Product Name',
     MANUNUM   NUM(3)    LABEL='Manufacturer Number',
     PRODTYPE  CHAR(15)  LABEL='Product Type',
     PRODCOST  NUM(5,2)  FORMAT=DOLLAR9.2 LABEL='Product Cost'),
QUIT;

SAS Log Results

   PROC SQL;
     CREATE TABLE PRODUCTS
       (PRODNUM   NUM(3)   LABEL='Product Number',
        PRODNAME  CHAR(25) LABEL='Product Name',
        MANUNUM   NUM(3)   LABEL='Manufacturer
 Number',
        PRODTYPE  CHAR(15) LABEL='Product Type',
        PRODCOST  NUM(5,2) FORMAT=DOLLAR9.2
 LABEL='Product Cost'),
NOTE: Table PRODUCTS created, with 0 rows and 5
 columns.
   QUIT;
NOTE: PROCEDURE SQL used:
      real time           0.00 seconds


The CONTENTS output for the PRODUCTS table shows once again that the SQL procedure ignores all width specifications for numeric columns.

Results

                       The CONTENTS Procedure

       -----Alphabetic List of Variables and
 Attributes-----

  #   Variable   Type   Len    Pos    Format     Label
  ________________________________________________
__________________
  3   MANUNUM    Num      8      8              
 Manufacturer Number
  5   PRODCOST   Num      8     16    DOLLAR9.2 
 Product Cost
  2   PRODNAME   Char    25     24              
 Product Name
  1   PRODNUM    Num      8      0              
 Product Number
  4   PRODTYPE   Char    15     49              
 Product Type


As before, to conserve storage space you can use a LENGTH statement in a DATA step to override the default 8-byte column definition for numeric columns.

DATA Step Code

DATA PRODUCTS;
  LENGTH PRODNUM MANUNUM 3.
         PRODCOST 5.;
  SET PRODUCTS(DROP=PRODNUM MANUNUM PRODCOST);
  LABEL PRODNUM  = 'Product Number'
        MANUNUM  = 'Manufacturer Number'
        PRODCOST = 'Product Cost';
  FORMAT PRODCOST DOLLAR9.2;
RUN;

Results

                      The CONTENTS Procedure

       -----Alphabetic List of Variables and
 Attributes-----

   #   Variable   Type    Len    Format       Label
   _______________________________________________
_______________
   2   MANUNUM    Num       3                
 Manufacturer Number
   3   PRODCOST   Num       5    DOLLAR9.2   
 Product Cost
   1   PRODNUM    Num       3                
 Product Number
   4   PRODNAME   Char     25                
 Product Name
   5   PRODTYPE   Char     15                
 Product Type


5.2.2. Creating a Table Using the LIKE Clause

Referencing an existing table in a CREATE TABLE statement is an effective way of creating a new table. In fact, it can be a great time-saver, because it prevents having to define each column one at a time as was shown with column-definition lists. The LIKE clause (in the CREATE TABLE statement) triggers the existing table’s structure to be copied to the new table minus any columns dropped with the KEEP= or DROP= data set (table) option. It copies the column names and attributes from the existing table structure to the new table structure. Using this method results in the creation of an empty table (without rows). To illustrate this method of creating a new table, a table called HOT_PRODUCTS will be created with the LIKE clause.

SQL Code

PROC SQL;
  CREATE TABLE HOT_PRODUCTS
    LIKE PRODUCTS;
QUIT;

SAS Log Results

     PROC SQL;
       CREATE TABLE HOT_PRODUCTS
         LIKE PRODUCTS;
NOTE: Table HOT_PRODUCTS created, with 0 rows and
 5 columns.
     QUIT;
NOTE: PROCEDURE SQL used:
      real time           0.00 seconds


Readers are reminded that, as a result of executing the LIKE clause in the CREATE TABLE statement, only those columns in the existing table are copied to the new table. What this means is that the new table has zero rows of data.

Our next example illustrates how to create a new table by selecting just the columns you have an interest in. This method is not supported by the SQL ANSI standard. Suppose you want three columns (PRODNAME, PRODTYPE, and PRODCOST) from the PRODUCTS table. The following code illustrates how the KEEP= data set (table) option can be used to accomplish this. (Note that data sets can also be called tables.)

SQL Code

PROC SQL;
  CREATE TABLE HOT_PRODUCTS(KEEP=PRODNAME PRODTYPE PRODCOST)
    LIKE PRODUCTS;
QUIT;

SAS Log Results

     PROC SQL;
        CREATE TABLE HOT_PRODUCTS(KEEP=PRODNAME
 PRODTYPE PRODCOST)
        LIKE PRODUCTS;
NOTE: Table HOT_PRODUCTS created, with 0 rows and
 3 columns.
     QUIT;
NOTE: PROCEDURE SQL used:
      real time           0.00 seconds


5.2.3. Deriving a Table and Data from an Existing Table

Deriving a new table from an existing table is by far the most popular and effective way to create a table. This method uses a query expression, and the results are stored in a new table instead of being displayed as SAS output. This method not only stores the column names and their attributes, but the rows of data that satisfies the query expression as well. The following example illustrates how a new table is created using a query expression.

SQL Code

PROC SQL;
  CREATE TABLE HOT_PRODUCTS AS
    SELECT *
      FROM PRODUCTS;
QUIT;

SAS Log Results

     PROC SQL;
       CREATE TABLE HOT_PRODUCTS AS
         SELECT *
           FROM PRODUCTS;
NOTE: Table WORK.HOT_PRODUCTS created, with 10
 rows and 5 columns.
     QUIT;
NOTE: PROCEDURE SQL used:
      real time           0.00 seconds


Readers may notice after examining the SAS log in the previous example that the SELECT statement extracted all rows from the existing table (PRODUCTS) and copied them to the new table (HOT_PRODUCTS). In the absence of a WHERE clause, the resulting table (HOT_PRODUCTS) contains the identical number of rows as the parent table PRODUCTS.

The power of the CREATE TABLE statement, then, is in its ability to create a new table from an existing table. What is often overlooked in this definition is the CREATE TABLE statement’s ability to form a subset of a parent table. More frequently than not, a new table represents a subset of its parent table. For this reason this method of creating a table is the most powerful and widely used. Suppose you want to create a new table called HOT_PRODUCTS containing a subset of “Software” and “Phones” product types. The following query-expression would accomplish this.

SQL Code

PROC SQL;
  CREATE TABLE HOT_PRODUCTS AS
    SELECT *
      FROM PRODUCTS
        WHERE UPCASE(PRODTYPE) IN ("SOFTWARE", "PHONE");
QUIT;

SAS Log Results

     PROC SQL;
       CREATE TABLE HOT_PRODUCTS AS
         SELECT *
           FROM PRODUCTS
             WHERE UPCASE(PRODTYPE) IN ("SOFTWARE"
, "PHONE");
NOTE: Table WORK.HOT_PRODUCTS created, with 7 rows
 and 5 columns.
     QUIT;
NOTE: PROCEDURE SQL used:
      real time           0.38 seconds


Let’s look at another example. Suppose you want to create another table called NOT_SO_HOT_PRODUCTS containing a subset of everything but “Software” and “Phones” product types. The following query-expression would accomplish this.

SQL Code

PROC SQL;
  CREATE TABLE NOT_SO_HOT_PRODUCTS AS
    SELECT *
      FROM PRODUCTS
        WHERE UPCASE(PRODTYPE) NOT IN ("SOFTWARE", "PHONE");
QUIT;

SAS Log Results

     PROC SQL;
       CREATE TABLE NOT_SO_HOT_PRODUCTS AS
         SELECT *
           FROM sql.PRODUCTS
             WHERE UPCASE(PRODTYPE) NOT IN 
("SOFTWARE", "PHONE");
NOTE: Table NOT_SO_HOT_PRODUCTS created, with 3
 rows and 5 columns.
     QUIT;
NOTE: PROCEDURE SQL used:
      real time           1.20 seconds


..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
18.224.51.145