Using Record Formats

The record format of an external file might affect how data is read with column input and formatted input. A record format specifies how records are organized in a file. Two common record formats are fixed-length records and variable-length records.

Fixed-Length Records

External files that have a fixed-length record format have an end-of-record marker after a predetermined number of columns. Every record in a fixed-length file ends at the same column (for example, column 80).
Figure: Fixed-Length Records

Variable-Length Records

Files that have a variable-length record format have an end-of-record marker after the last field in each record.
As you can see, the length of each record varies.
Figure: Variable-Length Records

Reading Variable-Length Records

When you work with variable-length records that contain fixed-field data, you might have values that are shorter than others or that are missing. This can cause problems when you try to read the raw data into your SAS data set.
For example, notice that the following INPUT statement specifies a field width of 8 columns for Receipts. In the third record, the input pointer encounters an end-of-record marker before the eighth column.
input Dept $ 1-11 @13 Receipts comma8.;
Note: The asterisk symbolizes the end-of-record marker and is not part of the data.
Figure: Variable-Length Records: End-of-Record Marker
The input pointer moves down to the next record in an attempt to complete the value for Receipts. However, GRILL is a character value, and Receipts is a numeric variable. Thus, an invalid data error occurs, and the value for Receipts is set to missing.
Figure: Reading Variable-Length Records

The PAD Option

When reading variable-length records that contain fixed-field data, you can avoid problems by using the PAD option in the INFILE statement. The PAD option pads each record with blanks so that all data lines have the same length.
infile receipts pad;
Figure: The PAD Option
When you use column input or formatted input to read fixed-field data in variable-length records, remember to determine whether you need to use the PAD option.
Note: Use the PAD option only when missing data occurs at the end of a record.
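The difference between reading with and without PAD can be sketched as follows. This is a minimal illustration, not the original example: the fileref is pointed at a temporary file, and the four data values and the data set names are made up for the sketch. The third and fourth records are deliberately shorter than column 20, where the Receipts field ends.

```sas
/* Assumption: a scratch file stands in for the real receipts file. */
filename receipts temp;

/* Write four variable-length records; the values are illustrative only. */
data _null_;
   file receipts;
   put @1 'BED/BATH'   @13 '1,354.93';
   put @1 'HOUSEWARES' @13 '2,464.05';
   put @1 'GARDEN'     @13 '923.34';    /* record ends at column 18 */
   put @1 'GRILL'      @13 '598.97';    /* record ends at column 18 */
run;

/* Without PAD: the pointer reaches the end of the short record   */
/* and SAS moves to the next record to finish reading Receipts.   */
data work.nopad;
   infile receipts;
   input Dept $ 1-11 @13 Receipts comma8.;
run;

/* With PAD: each record is padded with blanks, so the 8-column   */
/* field can be read from the current record.                     */
data work.withpad;
   infile receipts pad;
   input Dept $ 1-11 @13 Receipts comma8.;
run;

proc print data=work.nopad;   run;
proc print data=work.withpad; run;
```

Comparing the two listings should show the behavior described above: without PAD, Receipts for GARDEN is set to missing and the GRILL record is consumed in the attempt, and the log should note that SAS went to a new line. With PAD, each record should be read on its own line.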
The default value of the maximum record length is determined by your operating system. If you encounter unexpected results when reading many variables, you might need to change the maximum record length by specifying the LRECL= option in the INFILE statement.
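As a rough sketch of the LRECL= option, the following program writes one long record to a temporary file and reads a field near column 500. The fileref, the variable names, the column positions, and the value 600 are all arbitrary choices for illustration. Pick a value at least as large as the longest record in your own file.

```sas
/* Assumption: a scratch file with one long record, for illustration. */
filename wide temp;

data _null_;
   file wide lrecl=600;           /* allow output records up to 600 columns */
   put @1 'A001' @500 '42';
run;

data work.wide;
   infile wide lrecl=600 pad;     /* read up to 600 columns per record */
   input Id $ 1-4 Score 500-501;
run;

proc print data=work.wide; run;
```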

The TRUNCOVER Option

Use the TRUNCOVER option when you are reading data using column or formatted input. If a variable’s field extends past the end of the data line, then, by default, SAS goes to the next line to start reading the variable’s value. The TRUNCOVER option tells SAS to read data for the variable until it reaches the end of the data line, or until it reaches the last column that is specified in the format or column range, whichever comes first.
The following file contains addresses and must be read using column or formatted input because the street names have embedded blanks. Note that the data lines are all different lengths:
John Garcia     114  Maple Ave.
Sylvia Chung   1302  Washington Drive
Martha Newton    45  S.E. 14th St.
The program uses column input to read the address file. Because some of the addresses stop before the end of the variable Street’s field (columns 22 through 37), the TRUNCOVER option is necessary. Without the TRUNCOVER option, SAS would try to go to the next line to read the data for Street on the first and third records.
DATA homeaddress;
	INFILE 'c:\MyRawData\Address.dat' TRUNCOVER;
	INPUT Name $ 1-15 Number 16-19 Street $ 22-37;
RUN;
TRUNCOVER is similar to MISSOVER. Both assign missing values to variables if the data line ends before the variable’s field starts. But when the data line ends in the middle of a variable’s field, TRUNCOVER assigns the variable whatever data is available in the field, whereas MISSOVER assigns the variable a missing value.
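To compare the two options side by side, the following sketch writes the three address lines to a temporary file and reads them once with TRUNCOVER and once with MISSOVER. The fileref and the data set names are assumptions made for this illustration. The column positions follow the INPUT statement used for the address file.

```sas
/* Assumption: a scratch file stands in for the address file. */
filename addr temp;

data _null_;
   file addr;
   put @1 'John Garcia'   @17 '114'  @22 'Maple Ave.';
   put @1 'Sylvia Chung'  @16 '1302' @22 'Washington Drive';
   put @1 'Martha Newton' @18 '45'   @22 'S.E. 14th St.';
run;

/* TRUNCOVER: a field that is cut short keeps the data that is there. */
data work.trunc;
   infile addr truncover;
   input Name $ 1-15 Number 16-19 Street $ 22-37;
run;

/* MISSOVER: a field that is cut short is set to missing. */
data work.miss;
   infile addr missover;
   input Name $ 1-15 Number 16-19 Street $ 22-37;
run;

proc print data=work.trunc; run;
proc print data=work.miss;  run;
```

Following the description above, Street should be filled in for all three records in the TRUNCOVER listing, while in the MISSOVER listing it should be missing for the first and third records, whose lines end before column 37.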
Last updated: January 10, 2018