Reading Repeating Blocks of Data

The Basics of Reading Repeating Blocks of Data

Consider this example data. Each record in the file Tempdata contains three blocks of data. Each block contains a date followed by the day's high temperature in a small city located in the southern United States.
Figure 21.1 Raw Data File Tempdata
Raw Data File Tempdata showing the date and high temperature.
You could write a DATA step that reads each record and creates three different Date and Temp variables.
Figure 21.2 Three Date and Temp Variables
SAS data set showing three Date and Temp variables.
Alternatively, you could create a separate observation for each block of data in a record. Because each observation then holds one date and one temperature, this data set is better structured for analysis and reporting with SAS procedures.
Figure 21.3 Separate Observations for Each Block of Data in a Record
SAS data set showing a separate observation for each block of data in a record
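As a minimal sketch of the first approach, the step below reads each record into three pairs of variables. The data values, the data set name work.wide, and the variable names Date1-Date3 and Temp1-Temp3 are hypothetical, because the actual contents of Tempdata are not reproduced here. In-stream data lines stand in for the raw data file.

```sas
/* Sketch only: hypothetical values stand in for the Tempdata file. */
data work.wide;
   /* One INPUT statement reads all three blocks into separate variables. */
   input Date1 : date. Temp1
         Date2 : date. Temp2
         Date3 : date. Temp3;
   format Date1-Date3 date9.;
   datalines;
01APR10 68 02APR10 67 03APR10 70
04APR10 74 05APR10 72 06APR10 73
;
run;
```

Each raw record becomes one observation with six variables, which is the structure that the second approach avoids.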

Holding the Current Record with a Line-Hold Specifier

You need to hold the current record so that the INPUT statement can read, and SAS can generate output from, repeated blocks of data in the same record. To do this, use a line-hold specifier in the INPUT statement.
SAS provides two line-hold specifiers.
  • The trailing at-sign (@) holds the input record for the execution of the next INPUT statement within the same iteration of the DATA step.
  • The double trailing at-sign (@@) holds the input record for the execution of the next INPUT statement, even across iterations of the DATA step.
The term trailing indicates that the @ or @@ must be the last item that is specified in the INPUT statement. Here is an example.
input Name $20. @;
or
input Name $20. @@;
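To illustrate the single trailing @, here is a small sketch in which a second INPUT statement continues reading the record that the first one held. The data set name work.held, the variables, and the data values are hypothetical, and list input is used instead of the $20. informat to keep the data lines short.

```sas
/* Sketch only: hypothetical names and values. */
data work.held;
   input Name $ @;   /* read Name and hold the current record           */
   input Age;        /* continue on the same record, then release it    */
   datalines;
Ann 34
Bob 41
;
run;
```

Because the second INPUT statement has no line-hold specifier, the record is released and the next iteration starts on a new record. This step should create two observations, one per data line.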

Using the Double Trailing At-Sign (@@) to Hold the Current Record

Normally, each time a DATA step executes, the INPUT statement reads the next record. But when you use the trailing @@, the INPUT statement continues reading from the same record.
Here are several facts about the double trailing at-sign (@@):
  • It works like the trailing @, except that it holds the data line in the input buffer across multiple executions of the DATA step.
  • It typically is used to read multiple SAS observations from a single data line.
  • It should not be used with the @ pointer control, with column input, or with the MISSOVER option.
A record that is held by the double trailing at-sign (@@) is not released until either of the following events occurs:
  • The input pointer moves past the end of the record. Then the input pointer moves down to the next record.
  • An INPUT statement that has no trailing @ executes, as the second INPUT statement in this example does.
    input ID $ @@; 
    .  
    . 
    input Department 5.;
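The first release condition can be seen in a short sketch in which the records hold different numbers of values. The data set name work.flow and the ID values are hypothetical.

```sas
/* Sketch only: hypothetical names and values. */
data work.flow;
   /* The @@ holds each record until the pointer passes its end. */
   input ID $ @@;
   datalines;
A1 A2 A3
B1 B2
;
run;
```

Assuming the default FLOWOVER behavior, the INPUT statement moves to the second record only after the pointer passes the end of the first, so this step should create five observations.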
The following example requires only one INPUT statement to read the values for Date and HighTemp, but the INPUT statement must execute three times for each record.
The INPUT statement reads a block of values for Date and HighTemp, and holds the current record by using the trailing @@. The values in the program data vector are written to the data set as an observation, and control returns to the top of the DATA step.
data perm.april10; 
   infile tempdata; 
   input Date : date. HighTemp @@;
Figure 21.4 Control Returned to the Top of the DATA Step
Raw data file showing the control returned to the top of the DATA step.
In the next iteration, the INPUT statement reads the next block of values for Date and HighTemp from the same record.
Figure 21.5 Date and High Temp
Raw data file showing the date and high temp highlighted.

Completing the DATA Step

You can add a FORMAT statement to the DATA step to display the date or time values with a specified format. The FORMAT statement below uses the DATEw. format to display the values for Date in the form ddmmmyyyy.
data perm.april10; 
   infile tempdata; 
   input Date : date. HighTemp @@; 
   format date date9.; 
run;
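The step above depends on the perm libref and the tempdata fileref, which are defined elsewhere. As a self-contained sketch, the same logic can be tried with in-stream data lines and the WORK library. The data values shown here are hypothetical and are not the contents of Tempdata.

```sas
/* Sketch only: hypothetical values; WORK replaces the perm library. */
data work.april10;
   /* Read one Date/HighTemp block per iteration and hold the record. */
   input Date : date. HighTemp @@;
   format Date date9.;
   datalines;
01APR10 68 02APR10 67 03APR10 70
04APR10 74 05APR10 72 06APR10 73
;
run;
```

With two records of three blocks each, this step should create six observations.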
Figure 21.6 Displaying Dates in a Specified Format
Raw data file showing the date values displayed in the specified format.

DATA Step Processing of Repeating Blocks of Data

Complete DATA Step

data perm.april10; 
   infile tempdata; 
   input Date : date. HighTemp @@; 
   format date date9.; 
run;

Example: How the DATA Step Processes Repeated Blocks of Data

As the execution phase begins, the input pointer rests on column 1 of record 1.
Figure 21.7 Input Pointer on Column 1 of Record 1
Raw data showing the input pointer on column 1 of record 1. Program data vector showing empty values for Date and HighTemp.
During the first iteration of the DATA step, the first block of values for Date and HighTemp is read into the program data vector.
Figure 21.8 Reading the First Block of Values
Raw data file showing first block of values being read. Program data vector showing first block of values for Date and HighTemp.
The first observation is written to the data set.
Figure 21.9 Control Returns to the Top of the DATA Step
Raw data file showing control returned to the top of the DATA step after the first observation has been written to the data set. Program data vector showing empty values for Date and HighTemp. SAS data set showing values for Date and HighTemp.
Control returns to the top of the DATA step, and the values are reset to missing.
Figure 21.10 Reset Values
Program Data Vector with values reset to missing.
During the second iteration, the @@ prevents the input pointer from moving down to the next record. Instead, the INPUT statement reads the second block of values for Date and HighTemp from the first record.
Figure 21.11 Reading the Second Block of Values from the First Record
Raw data file showing the input pointer on the second block of values for Date and HighTemp from the first record. Program data vector showing values from the second block of the first record.
The second observation is written to the data set, and control returns to the top of the DATA step.
Figure 21.12 Writing the Second Observation to the Data Set
Raw data file showing the control returned to the top of the DATA step. Program data vector and SAS data set showing the values from the second observation.
During the third iteration, the last block of values is read and written to the data set as the third observation.
Figure 21.13 Writing the Third Observation to the Data Set
Raw data file showing the third observation highlighted. Program data vector and SAS data set showing the values from the third observation.
During the fourth iteration, the first block of values in the second record is read and written as the fourth observation.
Figure 21.14 Writing the Fourth Observation to the Data Set
Raw data file showing the fourth observation highlighted. Program data vector and SAS data set showing the values from the fourth observation.
The execution phase continues until the last block of data is read.
Figure 21.15 Writing the Last Observation to the Data Set
Raw data file showing the last observation highlighted. Program data vector and SAS data set showing the values from the last observation.
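One way to follow these iterations yourself is to write the automatic variable _N_ and the current values to the SAS log on each pass. The sketch below assumes hypothetical data values and a hypothetical data set name, work.trace.

```sas
/* Sketch only: hypothetical values; traces each DATA step iteration. */
data work.trace;
   input Date : date. HighTemp @@;
   format Date date9.;
   /* _N_ counts iterations, so it increases once per block, not per record. */
   putlog _N_= Date= HighTemp=;
   datalines;
01APR10 68 02APR10 67 03APR10 70
04APR10 74 05APR10 72 06APR10 73
;
run;
```

The log should show one line per block, with _N_ reaching 4 when the first block of the second record is read.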
You can display the data set with the PRINT procedure.
proc print data=perm.april10; 
run;
Figure 21.16 PROC PRINT Output of the Data Set
PROC PRINT output of the data set.
Last updated: January 10, 2018