Example 9.5 Creating Columns That Contain the Attributes of Other Columns

Goal

Create a table that has metadata for columns in specific tables. Determine the name, type, length, label, and format of the columns and designate these as columns in the new table.

Example Features

Featured StepPROC SQL
Featured Step Options and StatementsDICTIONARY tables COLUMNS and TABLES Subqueries
A Closer LookAccessing SAS System Information Using Dictionary Tables

Input Tables

Tables TRIAL3_WEEK01, TRIAL3_WEEK02, TRIAL3_WEEK04, and TRIAL3_WEEK08 contain medical information for two patients in a clinical trial. The tables have a combination of variables that are in common and that are unique among the four tables. Assume the tables are saved in a library that has a libref of TESTCODE.

                            TRIAL3_WEEK01

Obs   id   age gender ht_cm wt_kg systol diastol ekg          init_bmi
 1  AW37BD 52   Male   173   88    131     077   Normal         29.4
 2  23GM02 53   Male   180   83    144     090   Abnormal-LVH   25.6

                            TRIAL3_WEEK02

  Obs      id      systol    diastol    gluc    chol     tri   hdl
   1     AW37BD     132        084       100     224     122    45
   2     23GM02     136        083        86     188     100    51

                            TRIAL3_WEEK04
     Obs      id      wt_kg    systol    diastol    gluc    tri
      1     AW37BD     86       128        085       104     99
      2     23GM02     85       130        079        92    124

                            TRIAL3_WEEK08

Obs     id     wt_kg   systol   diastol   gluc   chol    tri    hdl
 1    AW37BD    85      125       080       98    210    100     48
 2    23GM02    82      126       083       90    193    118     53

SAS Log

The PROC DATASETS step lists the data members in libref TESTCODE. The four input tables of interest are in this library. Their names are highlighted.

  1  proc datasets library=testcode memtype=data;
                              Directory

                Libref         TESTCODE
                Engine         V9
                Physical Name  c:	rialsdatasets
                Filename       c:	rialsdatasets
                      Gen  Member    File
  #  Name             Num  Type      Size  Last Modified

  1  TRIAL1_WEEK01         DATA      9216  11Jan09:08:42:23
  2  TRIAL1_WEEK02         DATA      5120  18Jan09:10:05:52
  3  TRIAL1_WEEK04         DATA      5120  10Feb09:14:22:08
  4  TRIAL1_WEEK08         DATA      5120  12Mar09:16:02:49
  5  TRIAL2_WEEK01         DATA      9216  01Feb09:13:03:39

  6  TRIAL2_WEEK02         DATA      5120  09Feb09:09:23:01
  7  TRIAL2_WEEK04         DATA      5120  25Feb09:16:03:18
  8  TRIAL2_WEEK08         DATA      5120  31Mar09:15:29:41
  9  TRIAL3_WEEK01         DATA      9216  01May09:10:49:19
 10  TRIAL3_WEEK02         DATA      5120  16May09:08:32:08
 11  TRIAL3_WEEK04         DATA      5120  01Jun09:12:07:48
 12  TRIAL3_WEEK08         DATA      5120  28Jun09:17:08:20
 13  TRIAL4_WEEK01         DATA      9216  01Jul09:11:41:22
 14  TRIAL4_WEEK02         DATA      9216  15Jul09:12:33:42
 15  TRIAL4_WEEK04         DATA      9216  01Aug09:08:59:12
 16  TRIAL4_WEEK08         DATA      9216  06Sep09:11:32:50

  2   quit;

NOTE: PROCEDURE DATASETS used (Total process time):
      real time           0.01 seconds
      cpu time            0.01 seconds


Resulting Tables

Output 9.5 NUM_STUDYVARS and CHAR_STUDYVARS Tables

                        Example 9.5 NUM_STUDYVARS

  Obs   name       type   length   label                      format

    1   age        num       3     Age                         2.
    2   chol       num       4     Cholesterol Result          4.
    3   diastol    num       4     Diastolic Blood Pressure    Z3.
    4   gluc       num       4     Glucose Result              4.
    5   hdl        num       4     HDL Result                  4.
    6   ht_cm      num       4     Height(cm)                  3.
    7   init_bmi   num       8     Initial BMI (computed)      4.1
    8   systol     num       4     Systolic Blood Pressure     Z3.
    9   tri        num       4     Triglycerides Result        4.
   10   wt_kg      num       4     Weight(kg)                  2.

                     Example 9.5 CHAR_STUDYVARS

      Obs    name      type    length    label          format

       1     ekg       char      25      EKG Result
       2     gender    char       2      Gender        $GENDER.
       3     id        char       6      Study ID


Example Overview

The following example demonstrates how you can access two of the dictionary tables that SAS provides to collect information about tables and columns. The goal of the program is to make a table of the character column names and a table of the numeric column names in a group of tables in a library.

The program uses the read-only PROC SQL dictionary tables that are supplied by SAS. Accessible directly through PROC SQL programming or through views in the SASHELP library, these tables and views can provide you with information about tables, options, external files, and many other objects in your current SAS session. SAS automatically defines the DICTIONARY libref for you when your SAS session starts.

Here are the two dictionary tables that are used in this example:

  • The TABLES dictionary table. This table contains data about tables. The data that are available include library name, member name, creation date, number of observations, number of variables, and table label.

  • The COLUMNS dictionary table. This table contains data about the columns in tables. The data that are available include library name, member name, column name, column type, and column label.

The following PROC SQL step uses the TABLES dictionary table to find the set of tables that were created between May 1, 2009, and June 30, 2009. It then uses the COLUMNS dictionary table to make a table of the numeric column names and a table of the character column names in that set of tables.

The program has two CREATE TABLE statements: one to obtain the numeric columns and a second to obtain the character columns. A subquery finds the set of table names that were created between May 1, 2009, and June 30, 2009. The main query then extracts the column type information for the table names that were returned by the subquery.

The DISTINCT keyword on each SELECT clause causes duplicate rows to be removed from the results so that each column name is represented only once. For example, character column ID exists in all four tables while EKG exists only in TRIAL3_WEEK01. Without the DISTINCT keyword, four rows for column ID are output to table CHAR_STUDYVARS. The other attributes that were specified by the SELECT clauses have all been assigned identically for each column name.

Program

Create table NUM_STUDYVARS. Select specific columns from DICTIONARY.COLUMNS. Do not save duplicate rows in the output table. Select information for columns whose type is numeric. Select column information for tables in the TESTCODE library. Select column information from members in the TESTCODE library that were created between May 1, 2009, and June 30, 2009, by examining the creation date of the tables. Order by the column name.

Create table CHAR_STUDYVARS. Select specific columns from DICTIONARY.COLUMNS. Do not save duplicate rows in the output table. Select information for columns whose type is character.

      proc sql ;
        create table num_studyvars as
          select distinct name,type,length,label,format
                      from dictionary.columns

            where type='num' and

                  libname='TESTCODE' and

                   memname in
                     (select memname from dictionary.tables
                        where libname='TESTCODE' and
                              memtype='DATA' and
                  datepart(crdate) between
                        '01may2009'd and '30jun2009'd)
                  order by name;

        create table char_studyvars as
          select distinct name,type,length,label,format
              from dictionary.columns

            where type='char' and

                  libname='TESTCODE' and memname in
                     (select memname from dictionary.tables
                        where libname='TESTCODE' and
                              memtype='DATA' and
                  datepart(crdate) between
                        '01may2009'd and '30jun2009'd)
                  order by name;
      quit;

A Closer Look

Accessing SAS System Information Using Dictionary Tables

SAS supplies the read-only PROC SQL dictionary tables and associated views in the SASHELP library from which you can extract metadata about tables, options, external files, and many other objects in your current SAS session. SAS automatically defines the DICTIONARY libref for you when your SAS session starts.

Each time you reference a dictionary table, SAS collects and returns the pertinent information. When you use PROC SQL to obtain data from a dictionary table, PROC SQL optimizes the query that could potentially search libraries, open tables, and execute views. Although you can also obtain the same data by using the SASHELP view in a DATA step or PROC step, it might be more efficient to use PROC SQL to create a table that contains the information you need. After creating a table that contains only the data you need, you could then process that table with a DATA step or PROC step.

The specification of DICTIONARY.table-name is valid only in a PROC SQL step. You cannot place the specification as the object of the DATA= option in a PROC statement and you cannot place it in a SET statement in a DATA step.

The specification of SASHELP.view-name is used when you want to reference the view that is associated with a dictionary table. The views are stored in the SASHELP library and their names start with the letter "V". The remainder of each view's name is identical to or similar to the dictionary table name from which it is derived. A table in SAS documentation lists the dictionary table name and its associated view name.

The following PROC PRINT step shows how to reference a dictionary table view. It prints the variable information for ENTRANCE_EXAM by using PROC PRINT. View VCOLUMN is the view that is derived from the DICTIONARY.COLUMNS table.

proc print data=sashelp.vcolumn;
  where libname='WORK' and memname='ENTRANCE_EXAM';
run;

Here is the equivalent PROC SQL step:

proc sql;
  select * from dictionary.columns
  where libname='WORK' and memname='ENTRANCE_EXAM';
quit;

The preceding main example extracted data from two of the tables, COLUMNS and TABLES. In SAS 9.2 under Windows, many dictionary tables are available. One table, DICTIONARY.DICTIONARIES, contains metadata about all the dictionary tables. Each row in this table contains data for one column in one of the dictionary tables. Table LIBNAMES contains 11 columns of information about the libnames that are defined in your current SAS session. Therefore, in DICTIONARY.DICTIONARIES, there are 11 rows of information for the LIBNAMES dictionary table.

If you want to list the names of the dictionary tables that are available to your SAS session, submit the following step:

proc sql;
  select unique(memname) from dictionary.dictionaries;
quit;

In this step, MEMNAME is the column in DICTIONARY.DICTIONARIES that contains the dictionary name.

If you want to find out how SAS defines a specific dictionary table, use the DESCRIBE TABLE statement in a PROC SQL step. The following step writes the description of DICTIONARY.LIBNAMES to the SAS log:

proc sql;
  describe table dictionary.libnames;
quit;

For more information about dictionary tables, see SAS documentation.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
3.144.110.155