Creating One Observation per Detail Record

The Basics of Creating One Observation per Detail Record

To create one observation per detail record, you must distinguish between header and detail records. A field that identifies the type of each record makes this task easier.
In the partial raw data file Census shown below, H indicates a header record that contains a street address, and P indicates a detail record that contains information about a person who lives at that address.
Partial Raw Data File

Retaining the Values of Variables

As you write the DATA step to read this file, remember that the address from each header record must stay in every observation until the next header record is encountered. To do this, use a RETAIN statement to retain the value of Address across iterations of the DATA step.
data perm.people; 
   infile census; 
   retain Address;
Next, read the first field in each record, which identifies the record's type. Use the single trailing @ line-hold specifier to hold the current record so that a later INPUT statement in the same iteration can read the other values in the record.
data perm.people; 
   infile census; 
   retain Address; 
   input type $1. @;

Conditionally Executing SAS Statements

You can use the value of Type to identify each record. If Type is H, use an INPUT statement to read the value for Address. However, if Type is P, use an INPUT statement to read the values for Name, Age, and Gender.
You can tell SAS to perform a given task based on a specific condition by using an IF-THEN statement.
data perm.people; 
   infile census; 
   retain Address; 
   input type $1. @; 
   if type='H' then 
      input @3 address $15.;
Expressions in conditional statements usually involve some type of comparison. In the example shown above, a variable is compared to a constant. When the condition is met, the expression is evaluated as true, and the statement that follows the keyword THEN is executed.
The expression defines a condition so that when the value of type is H, the INPUT statement reads the values for Address. However, when the value of type is not H, the expression is evaluated as false, and the INPUT statement is not executed. Notice that the value is enclosed in quotation marks because it is a character value.
Tip
When you compare values, be sure to express the values exactly as they appear in the data. For example, the expression below would evaluate to false because the values in the data are stored in uppercase letters.
if type='h' then ... ;
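If the case of the record-type field cannot be guaranteed, one possible approach is to normalize the value with the UPCASE function before comparing it. The following is a minimal sketch, not part of the original example: it assumes that the fileref census is already assigned, and the data set name work.check_type is hypothetical.

```sas
data work.check_type;            /* hypothetical data set name */
   infile census;                /* assumes the fileref census is already assigned */
   retain Address;
   input type $1. @;
   /* UPCASE returns the value in uppercase, so 'h' and 'H' both match */
   if upcase(type)='H' then
      input @3 Address $15.;
run;
```

Because the comparison is made against the uppercase result, the stored value of Type itself is left unchanged.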

Reading a Detail Record

You can use a subsetting IF statement to check for the condition that type is P. The remaining DATA step statements execute only when the condition is true. If type is not P, then the values for Name, Age, and Gender are not read, the values in the program data vector are not written to the data set as an observation, and control returns to the top of the DATA step. However, Address is retained.
If type is P, Name, Age, and Gender are read, and an observation is written to the data set. Remember that you want to create an observation for detail records only.
data perm.people; 
   infile census; 
   retain Address; 
   input type $1. @; 
   if type='H' then input @3 address $15.; 
   if type='P'; 
   input @3 Name $10. @13 Age 3. @16 Gender $1.; 
run;
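The same result can also be sketched with IF-THEN/ELSE logic and an explicit OUTPUT statement instead of a subsetting IF. This is an alternative shown only for comparison, not the method used in this section; it assumes that the fileref census is already assigned, and the data set name work.people_alt is hypothetical.

```sas
data work.people_alt;            /* hypothetical data set name */
   infile census;                /* assumes the fileref census is already assigned */
   retain Address;
   input type $1. @;
   if type='H' then
      input @3 Address $15.;     /* header record: update the retained address */
   else if type='P' then do;
      input @3 Name $10. @13 Age 3. @16 Gender $1.;
      output;                    /* write an observation for detail records only */
   end;
run;
```

When a DATA step contains an explicit OUTPUT statement, SAS no longer writes an observation automatically at the end of each iteration, so header records produce no observations here either.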

Dropping Variables

Type is useful only for identifying a record's type. Therefore, drop the variable from the data set. The DROP= option in the DATA statement shown here prevents the variable Type from being written to the data set.
data perm.people (drop=type);  
   infile census; 
   retain Address; 
   input type $1. @; 
   if type='H' then input @3 address $15.; 
   if type='P'; 
   input @3 Name $10. @13 Age 3. @16 Gender $1.; 
run;

Processing a DATA Step That Creates One Observation per Detail Record

At compile time, the variable Type is flagged so that its values are not written to the data set. Address is flagged so that its value is retained across iterations of the DATA step.
data perm.people (drop=type);  
   infile census; 
   retain Address; 
   input type $1. @; 
   if type='H' then input @3 address $15.; 
   if type='P'; 
   input @3 Name $10. @13 Age 3. @16 Gender $1.; 
run;
As the DATA step begins to execute, the INPUT statement reads the value for Type and holds the first record.
The condition Type='H' is checked and found to be true. Therefore, the INPUT statement reads the value for Address in the first record.
Next, the subsetting IF statement checks for the condition Type='P'. Because the condition is not true, the remaining statements are not executed, no observation is written, and control returns to the top of the DATA step. The variables in the PDV are reset to missing, but the value of Address is retained.
As the second iteration begins, the input pointer moves to the next record and a new value for Type is read. The condition expressed in the IF-THEN statement is not true, so the statement following the THEN keyword is not executed.
Now the subsetting IF statement checks for the condition Type='P'. In this iteration, the condition is true, so the final INPUT statement reads the values for Name, Age, and Gender.
Next, the values in the program data vector are written as the first observation, and control returns to the top of the DATA step. Notice that the values for Type are not included.
As execution continues, observations are produced from the third and fourth records. However, notice that the fifth record is a header record. During the fifth iteration, the condition Type='H' is true, so a new address is read into the program data vector, overwriting the previous value.

Displaying Your Results

When the execution phase is complete, you can display the data set by using the PRINT procedure. The first 10 observations are displayed.
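A minimal sketch of such a PROC PRINT step follows. It assumes that the libref perm is still assigned; the OBS= data set option is one way to limit the listing to the first 10 observations and is not necessarily how the original output was produced.

```sas
/* Print only the first 10 observations of the new data set */
proc print data=perm.people (obs=10);
run;
```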
Figure 22.1 Output from the PRINT Procedure
Last updated: January 10, 2018