Using Informats

The Basics of Using Informats

An informat is an instruction that tells SAS how to read raw data. SAS provides many informats for reading standard and nonstandard data values. Here is a small sample.
Table 17.1 Selected Informats for Reading Data
PERCENTw.d    DATEw.        NENGOw.
$BINARYw.     DATETIMEw.    PDw.d
$VARYINGw.    HEXw.         PERCENTw.
$w.           JULIANw.      TIMEw.
COMMAw.d      MMDDYYw.      w.d
Here are some facts about informats:
  • Each informat contains a w value to indicate the width of the raw data field.
  • Each informat also contains a period, which is a required delimiter.
  • For some informats, the optional d value specifies the number of implied decimal places.
  • Informats for reading character data always begin with a dollar sign ($).

Reading Character Values with the $w. Informat

The $w. informat enables you to read character data. The w represents the field width of the data value (the total number of columns that contain the raw data field).
In the example below, the $ indicates that FirstName is a character variable, the 5 indicates a field width of five columns, and a period ends the informat.
input @9 FirstName $5.;
[Figure: Reading Character Values]
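The INPUT statement above can be tried in a small, self-contained step. This is only a sketch: the data set name work.names and the two data lines are made up for illustration, and they assume a layout with the last name in columns 1-7 and the first name in columns 9-13.

```sas
/* Hypothetical data: last name in columns 1-7, first name in columns 9-13 */
data work.names;
   input @9 FirstName $5.;   /* move to column 9, read 5 columns as character */
   datalines;
EVANS   DONNY
HELMS   LISA
;
run;

proc print data=work.names;
run;
```

Because the informat begins with a dollar sign, FirstName is created as a character variable, and the width of 5 covers columns 9 through 13.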

Reading Standard Numeric Data with the w.d Informat

The informat for reading standard numeric data is the w.d informat.
The w specifies the field width of the raw data value, the period serves as a delimiter, and the d specifies the number of implied decimal places for the value. The w.d informat ignores any specified d value if the data already contains a decimal point.
For example, the raw data value shown below contains six digits (four are decimals) and one decimal point. Because the decimal point is already present in the data, no d value is needed, and the w. informat requires a field width of only 7 to read the raw data value correctly.
[Figure: Reading Standard Numeric Data]
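The effect of the d value can be checked by reading the same field twice, once with a d value and once without. The sketch below uses hypothetical values and a made-up data set name (work.decimals); each value occupies columns 1-7.

```sas
/* Hypothetical values, each in columns 1-7 */
data work.decimals;
   input @1 WithD 7.2     /* d=2 applies only if the field has no decimal point */
         @1 NoD   7.;     /* same field reread with no d value */
   datalines;
1234567
34.0008
;
run;

proc print data=work.decimals;
run;
```

If the informat behaves as described above, WithD should be 12345.67 for the first line (two implied decimal places) and 34.0008 for the second (the d value is ignored), while NoD should be 1234567 and 34.0008.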
In the example shown below, the values for JobTitle in columns 15-17 contain only digits. Remember that standard numeric data values can contain only digits, decimal points, numbers in scientific notation, and plus and minus signs.
A d value is not necessary to read the values for JobTitle. Simply move the column pointer forward seven spaces to column 15, name the variable, and specify a field width of 3.
input @9 FirstName $5. @1 LastName $7. +7 JobTitle 3.;
[Figure: Reading Standard Numeric Data]
Note: Be certain to specify the period in the informat name.
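To see how the @n and +n pointer controls combine with the $w. and w. informats, the INPUT statement can be run against in-stream data. The sketch below is an assumption-laden illustration: work.staff and the two data lines are invented, laid out with LastName in columns 1-7, FirstName in columns 9-13, and JobTitle in columns 15-17.

```sas
/* Hypothetical layout: LastName 1-7, FirstName 9-13, JobTitle 15-17 */
data work.staff;
   input @9 FirstName $5.    /* absolute move to column 9              */
         @1 LastName  $7.    /* back to column 1; pointer ends at 8    */
         +7 JobTitle  3.;    /* relative move of 7 columns, to column 15 */
   datalines;
EVANS   DONNY 112
HELMS   LISA  105
;
run;

proc print data=work.staff;
run;
```

After LastName is read, the pointer rests in column 8, so +7 lands on column 15, where the three-column JobTitle field begins.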

Reading Nonstandard Numeric Data with the COMMAw.d Informat

The COMMAw.d informat is used to read numeric values and to remove embedded items such as these:
  • blanks
  • commas
  • hyphens
  • dollar signs
  • percent signs
  • close parentheses
  • open parentheses at the beginning of a field, which are converted to minus signs
The COMMAw.d informat has three parts:
1. COMMA: the informat name
2. w.: a value that specifies the width of the field to be read (including dollar signs, decimal places, or other special characters), followed by a period
3. d: an optional value that specifies the number of implied decimal places for a value (not necessary if the value already contains decimal places)
In the example below, the values for Salary contain commas, which means that they are nonstandard numeric values.
The values for Salary begin in column 19, so an @n or +n pointer control (here, @19) moves the pointer to column 19, and the variable name follows.
The following code adds the COMMAw.d informat and specifies the field width. The values end in column 27, so the field width is nine columns.
data sasuser.empinfo;
   infile empdata;
   input @9 FirstName $5. @1 LastName $7. +7 JobTitle 3.
         @19 Salary comma9.;
run;
[Figure: Reading Nonstandard Numeric Data]
If you use PROC PRINT to display the data set, the commas are removed from the values for Salary in the resulting output.
data sasuser.empinfo; 
  infile empdata; 
  input @9 FirstName $5. @1 LastName $7. +7 JobTitle 3. 
        @19 Salary comma9.; 
run; 
proc print data=sasuser.empinfo; 
run;
Figure 17.3 Output from the PRINT Procedure
Thus, the COMMAw.d informat does more than simply read the raw data values. It removes special characters such as commas from numeric data and stores only numeric values in a SAS data set.
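The same behavior applies to the other embedded characters listed earlier, not only to commas. The following sketch reads a few hypothetical amounts (the data set name work.amounts and the values are invented for illustration), each within columns 1-10.

```sas
/* Hypothetical amounts, each within columns 1-10 */
data work.amounts;
   input @1 Amount comma10.;   /* width 10 covers symbols as well as digits */
   datalines;
$1,234.50
(2,500)
  12,345
;
run;

proc print data=work.amounts;
run;
```

With the rules described above, the stored values should be 1234.5, -2500, and 12345: the dollar sign, commas, and blanks are removed, and the leading open parenthesis is treated as a minus sign.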

DATA Step Processing of Informats

Remember that after the DATA step is submitted, it is compiled and then executed.
data sasuser.empinfo; 
   infile empdata; 
   input @9 FirstName $5. @1 LastName $7. +7 JobTitle 3. 
         @19 Salary comma9.; 
run;
During the compilation phase, the character variables in the program data vector (PDV) are defined with the exact length specified by their informats. The lengths defined in the PDV for the numeric variables JobTitle and Salary, however, differ from the widths specified by their informats.
[Figure: DATA Step Processing of Informats]
By default, SAS stores numeric values (no matter how many digits the value contains) as floating-point numbers in 8 bytes of storage. The length of a stored numeric variable is not affected by an informat's width or by other column specifications in an INPUT statement.
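One way to confirm the stored lengths is to run PROC CONTENTS on the resulting data set. The sketch below substitutes in-stream data for the empdata file so that it runs on its own; the data set name work.lengths and the two data lines are hypothetical.

```sas
/* Hypothetical rows: LastName 1-7, FirstName 9-13, JobTitle 15-17, Salary 19-27 */
data work.lengths;
   input @9 FirstName $5. @1 LastName $7. +7 JobTitle 3.
         @19 Salary comma9.;
   datalines;
EVANS   DONNY 112 29,996.63
HELMS   LISA  105 18,567.23
;
run;

/* Lists each variable with its type and stored length */
proc contents data=work.lengths;
run;
```

The variable listing should show FirstName and LastName as character variables with lengths 5 and 7, and JobTitle and Salary as numeric variables with length 8, regardless of the widths 3 and 9 in their informats.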
Last updated: January 10, 2018