A More Comprehensive Example

Often, a single SAS program contains a large number of data-manipulation and subsetting statements. Consider the following example, which uses the INFILE statement rather than the DATALINES statement:

 1   DATA D1;
 2      INFILE 'A:/VOLUNTEER.DAT' ;
 3      INPUT   #1   @1    (Q1-Q7)     (1.)
 4                   @9    AGE          2.
 5                   @12   IQ           3.
 6                   @16   NUMBER       2.
 7                   @19   SEX         $1.
 8              #2   @1    GREVERBAL    3.
 9                   @5    GREMATH      3.  ;
10
11      DATA D2;
12         SET D1;
13
14      Q3 = 6 - Q3;
15      Q6 = 6 - Q6;
16      RESPONSE = (Q1 + Q2 + Q3) / 3;
17      TRUST    = (Q4 + Q5 + Q6) / 3;
18      SHOULD   =  Q7;
19
20      PROC MEANS   DATA=D2;
21      RUN;
22
23      DATA D3;
24         SET D2;
25         IF SEX = 'F';
26
27      PROC MEANS   DATA=D3;
28      RUN;
29
30      DATA D4;
31         SET D2;
32         IF SEX = 'M';
33
34      PROC MEANS   DATA=D4;
35      RUN;

In the preceding program, lines 11 and 12 create a new dataset called D2 as a copy of D1. All data-manipulation statements that appear between those lines and PROC MEANS on line 20 are performed on dataset D2. Notice that a new variable called TRUST is created on line 17. TRUST is the average of participants’ responses to items 4, 5, and 6. Look over these items on the volunteerism survey to see why the name TRUST makes sense. On line 18, variable Q7 is duplicated, and the resulting new variable is called SHOULD. Why does this make sense? PROC MEANS on line 20 names D2 in its DATA= option (D2 is also the most recently created dataset at that point), so the means and other descriptive statistics are calculated for all of the numeric variables in D2. This includes all numeric variables read into dataset D1 as well as the new variables that were just created.
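The arithmetic on lines 14 through 18 can be traced by hand for a single participant. The sketch below is an illustration only: the dataset name DEMO and the seven response values are made up, and the values are read with simple list input from DATALINES rather than from VOLUNTEER.DAT. It also assumes the items are answered on a 1-to-5 scale, which this section does not state; under that assumption, subtracting a response from 6 reverses its scoring.

```sas
* Hypothetical data for one participant, used only to trace the computations;
DATA DEMO;
   INPUT Q1-Q7;                      * list input of seven space-separated values;
   Q3 = 6 - Q3;                      * 2 becomes 4;
   Q6 = 6 - Q6;                      * 1 becomes 5;
   RESPONSE = (Q1 + Q2 + Q3) / 3;    * (4 + 5 + 4) / 3;
   TRUST    = (Q4 + Q5 + Q6) / 3;    * (3 + 4 + 5) / 3;
   SHOULD   =  Q7;                   * copy of Q7;
DATALINES;
4 5 2 3 4 1 5
;
RUN;

PROC PRINT DATA=DEMO;
RUN;
```

For this made-up participant, RESPONSE works out to about 4.33, TRUST to 4, and SHOULD to 5. Because Q3 and Q6 are overwritten before the averages are computed, RESPONSE and TRUST use the recoded values, not the original responses.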

In lines 23 through 25, a new dataset called D3 is created; only responses from female participants are retained in this dataset. Notice that the SET statement reads from D2 rather than D1. This enables the newly created variables such as TRUST and SHOULD to appear in this all-female dataset. In lines 30 through 32, a new dataset called D4 is created, which is also read from D2 (not D3, which no longer contains any males). This new dataset contains data from males only.
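The statement IF SEX = 'F'; is a subsetting IF: an observation is kept only when the condition is true. As a sketch of what that means, and assuming dataset D2 from the program above already exists, the first step below writes the same selection with an explicit DELETE statement. The second step shows one alternative that skips the extra dataset by filtering inside the procedure with a WHERE statement. Neither form is taken from the original program; both are offered only for comparison.

```sas
* Same selection as the subsetting IF, written with an explicit DELETE;
DATA D3;
   SET D2;
   IF SEX NE 'F' THEN DELETE;
RUN;

* Alternative sketch: restrict the analysis to females without creating D3;
PROC MEANS DATA=D2;
   WHERE SEX = 'F';
RUN;
```

Creating separate datasets, as the original program does, is convenient when several later procedures need the same subgroup; the WHERE form is shorter when the subgroup is needed only once.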

After this program is submitted for analysis, the SAS output contains three tables of means. The first table is based on lines 1 through 21 and gives the means for all participants. The second table is based on lines 23 through 28 and gives the means for females only. The third table is based on lines 30 through 35 and gives the means for males only.
