Create a table that has metadata for columns in specific tables. Determine the name, type, length, label, and format of the columns and designate these as columns in the new table.
Featured Step | PROC SQL |
Featured Step Options and Statements | DICTIONARY tables COLUMNS and TABLES Subqueries |
A Closer Look | Accessing SAS System Information Using Dictionary Tables |
Tables TRIAL3_WEEK01, TRIAL3_WEEK02, TRIAL3_WEEK04, and TRIAL3_WEEK08 contain medical information for two patients in a clinical trial. The tables have a combination of variables that are in common and that are unique among the four tables. Assume the tables are saved in a library that has a libref of TESTCODE.
TRIAL3_WEEK01 Obs id age gender ht_cm wt_kg systol diastol ekg init_bmi 1 AW37BD 52 Male 173 88 131 077 Normal 29.4 2 23GM02 53 Male 180 83 144 090 Abnormal-LVH 25.6 TRIAL3_WEEK02 Obs id systol diastol gluc chol tri hdl 1 AW37BD 132 084 100 224 122 45 2 23GM02 136 083 86 188 100 51 TRIAL3_WEEK04 Obs id wt_kg systol diastol gluc tri 1 AW37BD 86 128 085 104 99 2 23GM02 85 130 079 92 124 TRIAL3_WEEK08 Obs id wt_kg systol diastol gluc chol tri hdl 1 AW37BD 85 125 080 98 210 100 48 2 23GM02 82 126 083 90 193 118 53
The PROC DATASETS step lists the data members in libref TESTCODE. The four input tables of interest are in this library. Their names are highlighted.
1 proc datasets library=testcode memtype=data; Directory Libref TESTCODE Engine V9 Physical Name c: rialsdatasets Filename c: rialsdatasets Gen Member File # Name Num Type Size Last Modified 1 TRIAL1_WEEK01 DATA 9216 11Jan09:08:42:23 2 TRIAL1_WEEK02 DATA 5120 18Jan09:10:05:52 3 TRIAL1_WEEK04 DATA 5120 10Feb09:14:22:08 4 TRIAL1_WEEK08 DATA 5120 12Mar09:16:02:49 5 TRIAL2_WEEK01 DATA 9216 01Feb09:13:03:39 6 TRIAL2_WEEK02 DATA 5120 09Feb09:09:23:01 7 TRIAL2_WEEK04 DATA 5120 25Feb09:16:03:18 8 TRIAL2_WEEK08 DATA 5120 31Mar09:15:29:41 9 TRIAL3_WEEK01 DATA 9216 01May09:10:49:19 10 TRIAL3_WEEK02 DATA 5120 16May09:08:32:08 11 TRIAL3_WEEK04 DATA 5120 01Jun09:12:07:48 12 TRIAL3_WEEK08 DATA 5120 28Jun09:17:08:20 13 TRIAL4_WEEK01 DATA 9216 01Jul09:11:41:22 14 TRIAL4_WEEK02 DATA 9216 15Jul09:12:33:42 15 TRIAL4_WEEK04 DATA 9216 01Aug09:08:59:12 16 TRIAL4_WEEK08 DATA 9216 06Sep09:11:32:50 2 quit; NOTE: PROCEDURE DATASETS used (Total process time): real time 0.01 seconds cpu time 0.01 seconds |
Output 9.5 NUM_STUDYVARS and CHAR_STUDYVARS TablesExample 9.5 NUM_STUDYVARS Obs name type length label format 1 age num 3 Age 2. 2 chol num 4 Cholesterol Result 4. 3 diastol num 4 Diastolic Blood Pressure Z3. 4 gluc num 4 Glucose Result 4. 5 hdl num 4 HDL Result 4. 6 ht_cm num 4 Height(cm) 3. 7 init_bmi num 8 Initial BMI (computed) 4.1 8 systol num 4 Systolic Blood Pressure Z3. 9 tri num 4 Triglycerides Result 4. 10 wt_kg num 4 Weight(kg) 2. Example 9.5 CHAR_STUDYVARS Obs name type length label format 1 ekg char 25 EKG Result 2 gender char 2 Gender $GENDER. 3 id char 6 Study ID |
The following example demonstrates how you can access two of the dictionary tables that SAS provides to collect information about tables and columns. The goal of the program is to make a table of the character column names and a table of the numeric column names in a group of tables in a library.
The program uses the read-only PROC SQL dictionary tables that are supplied by SAS. Accessible directly through PROC SQL programming or through views in the SASHELP library, these tables and views can provide you with information about tables, options, external files, and many other objects in your current SAS session. SAS automatically defines the DICTIONARY libref for you when your SAS session starts.
Here are the two dictionary tables that are used in this example:
The TABLES dictionary table. This table contains data about tables. The data that are available include library name, member name, creation date, number of observations, number of variables, and table label.
The COLUMNS dictionary table. This table contains data about the columns in tables. The data that are available include library name, member name, column name, column type, and column label.
The following PROC SQL step uses the TABLES dictionary table to find the set of tables that were created between May 1, 2009, and June 30, 2009. It then uses the COLUMNS dictionary table to make a table of the numeric column names and a table of the character column names in that set of tables.
The program has two CREATE TABLE statements: one to obtain the numeric columns and a second to obtain the character columns. A subquery finds the set of table names that were created between May 1, 2009, and June 30, 2009. The main query then extracts the column type information for the table names that were returned by the subquery.
The DISTINCT keyword on each SELECT clause causes duplicate rows to be removed from the results so that each column name is represented only once. For example, character column ID exists in all four tables while EKG exists only in TRIAL3_WEEK01. Without the DISTINCT keyword, four rows for column ID are output to table CHAR_STUDYVARS. The other attributes that were specified by the SELECT clauses have all been assigned identically for each column name.
Create table NUM_STUDYVARS. Select specific columns from DICTIONARY.COLUMNS. Do not save duplicate rows in the output table. Select information for columns whose type is numeric. Select column information for tables in the TESTCODE library. Select column information from members in the TESTCODE library that were created between May 1, 2009, and June 30, 2009, by examining the creation date of the tables. Order by the column name.
Create table CHAR_STUDYVARS. Select specific columns from DICTIONARY.COLUMNS. Do not save duplicate rows in the output table. Select information for columns whose type is character.
proc sql ; create table num_studyvars as select distinct name,type,length,label,format from dictionary.columns where type='num' and libname='TESTCODE' and memname in (select memname from dictionary.tables where libname='TESTCODE' and memtype='DATA' and datepart(crdate) between '01may2009'd and '30jun2009'd) order by name; create table char_studyvars as select distinct name,type,length,label,format from dictionary.columns where type='char' and libname='TESTCODE' and memname in (select memname from dictionary.tables where libname='TESTCODE' and memtype='DATA' and datepart(crdate) between '01may2009'd and '30jun2009'd) order by name; quit;
SAS supplies the read-only PROC SQL dictionary tables and associated views in the SASHELP library from which you can extract metadata about tables, options, external files, and many other objects in your current SAS session. SAS automatically defines the DICTIONARY libref for you when your SAS session starts.
Each time you reference a dictionary table, SAS collects and returns the pertinent information. When you use PROC SQL to obtain data from a dictionary table, PROC SQL optimizes the query that could potentially search libraries, open tables, and execute views. Although you can also obtain the same data by using the SASHELP view in a DATA step or PROC step, it might be more efficient to use PROC SQL to create a table that contains the information you need. After creating a table that contains only the data you need, you could then process that table with a DATA step or PROC step.
The specification of DICTIONARY.table-name is valid only in a PROC SQL step. You cannot place the specification as the object of the DATA= option in a PROC statement and you cannot place it in a SET statement in a DATA step.
The specification of SASHELP.view-name is used when you want to reference the view that is associated with a dictionary table. The views are stored in the SASHELP library and their names start with the letter "V". The remainder of each view's name is identical to or similar to the dictionary table name from which it is derived. A table in SAS documentation lists the dictionary table name and its associated view name.
The following PROC PRINT step shows how to reference a dictionary table view. It prints the variable information for ENTRANCE_EXAM by using PROC PRINT. View VCOLUMN is the view that is derived from the DICTIONARY.COLUMNS table.
proc print data=sashelp.vcolumn; where libname='WORK' and memname='ENTRANCE_EXAM'; run;
Here is the equivalent PROC SQL step:
proc sql; select * from dictionary.columns where libname='WORK' and memname='ENTRANCE_EXAM'; quit;
The preceding main example extracted data from two of the tables, COLUMNS and TABLES. In SAS 9.2 under Windows, many dictionary tables are available. One table, DICTIONARY.DICTIONARIES, contains metadata about all the dictionary tables. Each row in this table contains data for one column in one of the dictionary tables. Table LIBNAMES contains 11 columns of information about the libnames that are defined in your current SAS session. Therefore, in DICTIONARY.DICTIONARIES, there are 11 rows of information for the LIBNAMES dictionary table.
If you want to list the names of the dictionary tables that are available to your SAS session, submit the following step:
proc sql; select unique(memname) from dictionary.dictionaries; quit;
In this step, MEMNAME is the column in DICTIONARY.DICTIONARIES that contains the dictionary name.
If you want to find out how SAS defines a specific dictionary table, use the DESCRIBE TABLE statement in a PROC SQL step. The following step writes the description of DICTIONARY.LIBNAMES to the SAS log:
proc sql; describe table dictionary.libnames; quit;
For more information about dictionary tables, see SAS documentation.
3.144.110.155