Example 8.10 Adding Rows to a Table So That All Possible Values of Specific Columns Are Present in Each BY Group

Goal

Add rows to a table so that all possible combinations of specific columns are present in each BY group in the output table. The columns that are not used to define the combinations will have missing values.

This example is similar to Example 8.9 in that it fills out BY groups so that there is the same number of rows in each BY group and each BY group has the same combinations of specific variables. It differs from Example 8.9 in that the columns that define the combinations are character columns and not columns with sequential numeric values. Sequential numeric values as in Example 8.9 are straightforward to program with arrays and the DATA step while working with character values adds some complexity to the coding.

Example Features

Featured StepPROC FREQ and PROC SQL
Featured Step Options and StatementsPROC FREQ: TABLES statement with NOPRINT, OUT=, and SPARSE options PROC SQL: Left join

Input Table

Table WOODPECKERS records the presence of six species of woodpeckers at four tract locations on seven dates. Only tract MV28 was visited on all seven dates.

Value "H" means thebird was heard. Value "N" means the bird was not seen or heard. Value "S" means the bird was observed.

                             WOODPECKERS

                       red_
 Obs visit_date tract bellied sapsucker downy hairy flicker pileated
   1 03/15/2009 JN72     H        N       S     S      N       N
   2 03/15/2009 MV28     N        N       S     N      N       N
   3 03/20/2009 JN72     S        N       S     H      N       H
   4 03/20/2009 KT05     N        S       S     N      N       N
   5 03/20/2009 MV28     S        N       H     H      N       S
   6 03/25/2009 JN72     N        H       S     S      N       N
   7 03/25/2009 MV28     N        S       S     S      N       H
   8 03/30/2009 JN72     H        S       N     H      N       N
   9 03/30/2009 KT05     N        N       H     S      N       H
  10 03/30/2009 LT83     H        S       S     S      N       H
  11 03/30/2009 MV28     S        N       S     N      N       S
  12 04/04/2009 LT83     N        N       S     S      H       N
  13 04/09/2009 JN72     S        N       S     S      N       S
  14 04/09/2009 LT83     N        N       S     H      N       N
  15 04/09/2009 MV28     S        S       S     S      N       H
  16 04/14/2009 LT83     N        N       S     S      S       H
  17 04/14/2009 MV28     H        N       S     S      H       S

Resulting Table

Output 8.10 FULLBIRDSURVEY Table

        Example 8.10 FULLBIRDSURVEY Table Created with PROC SQL

                       red_
 Obs visit_date tract bellied sapsucker downy hairy flicker pileated
   1 03/15/2009 JN72     H        N       S     S      N       N
   2 03/15/2009 KT05
   3 03/15/2009 LT83
   4 03/15/2009 MV28     N        N       S     N      N       N
   5 03/20/2009 JN72     S        N       S     H      N       H
   6 03/20/2009 KT05     N        S       S     N      N       N
   7 03/20/2009 LT83
   8 03/20/2009 MV28     S        N       H     H      N       S
   9 03/25/2009 JN72     N        H       S     S      N       N
  10 03/25/2009 KT05
  11 03/25/2009 LT83
  12 03/25/2009 MV28     N        S       S     S      N       H
  13 03/30/2009 JN72     H        S       N     H      N       N
  14 03/30/2009 KT05     N        N       H     S      N       H
  15 03/30/2009 LT83     H        S       S     S      N       H
  16 03/30/2009 MV28     S        N       S     N      N       S
  17 04/04/2009 JN72
  18 04/04/2009 KT05
  19 04/04/2009 LT83     N        N       S     S      H       N
  20 04/04/2009 MV28
  21 04/09/2009 JN72     S        N       S     S      N       S
  22 04/09/2009 KT05
  23 04/09/2009 LT83     N        N       S     H      N       N
  24 04/09/2009 MV28     S        S       S     S      N       H
  25 04/14/2009 JN72
  26 04/14/2009 KT05
  27 04/14/2009 LT83     N        N       S     S      S       H
  28 04/14/2009 MV28     H        N       S     S      H       S


Example Overview

The following program uses PROC FREQ to determine all the possible combinations of specific variables, and then combine the data set that contains the combinations with the input data set by using PROC SQL. The resulting output table has the same number of observations in each BY group.

Table WOODPECKERS records observations on the presence of six species of woodpeckers in four tracts on seven dates. Observations were made in tract MV28 for all seven dates while observations were made on five of the seven dates in tract JN72, two of the seven dates in tract KT05, and four of the seven dates in tract LT83.

The goal of the PROC SQL step is to add rows to the table so that each tract has seven rows, one for each of the seven observation dates.

By adding the SPARSE option to the TABLES statement, the PROC FREQ step creates a data set that has all possible combinations of VISIT_DATE and TRACT even when the combination does not exist in the input data set. With seven values of VISIT_DATE and the four values of TRACT in the data set, PROC FREQ creates an output data set with 28 observations.

Observations that are unique to ALLVISITS have missing values in output table FULLBIRDSUVEY for columns found only in WOODPECKERS.

Program

Create a table of all possible combinations of VISIT_DATE and TRACT. Suppress printing the table. Save the table in data set ALLVISITS. Keep only the variables that are needed to define all the possible combinations. Include all possible combinations of VISIT_DATE and TRACT even when a combination does not occur in the input data set.

Create table FULLBIRDSURVEY. Perform a left join of ALLVISITS and WOODPECKERS. Put ALLVISITS on the left side of the join so that all rows from ALLVISITS are present in FULLBIRDSURVEY whether or not a row has a match in WOODPECKERS. Assign aliases to both tables. Specify how to match the rows.

proc freq data=woodpeckers;
  tables visit_date*tract

            / noprint
              out=allvisits(keep=visit_date tract)

              sparse;


run;
proc sql;
  create table fullbirdsurvey as
    select *
      from allvisits v left join woodpeckers w




       on v.visit_date=w.visit_date and v.tract=w.tract;
quit;

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
13.59.18.83