24;77 195 177 163
24;31 220 213 198
24;56 173 166 155
24;12 135 125 116
;;;;
External Files
The following example shows how to read in raw data from an external file using the
INFILE and INPUT statements:
data weight;
infile file-specification or path-name;
input PatientID $ Week1 Week8 Week16;
loss=Week1-Week16;
run;
Note: See the SAS documentation for your operating environment for information about
how to specify a file with the INFILE statement.
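As a minimal sketch, the placeholder in the INFILE statement might be filled in with a fileref. The path C:\mydata\weight.txt and the fileref rawwt are hypothetical and assume a blank-delimited text file on a Windows host:

```sas
/* Hypothetical path and fileref, shown only for illustration */
filename rawwt 'C:\mydata\weight.txt';

data weight;
   infile rawwt;                          /* refer to the file by its fileref */
   input PatientID $ Week1 Week8 Week16;
   loss=Week1-Week16;
run;
```

A quoted physical file name can also be given directly in the INFILE statement in place of the fileref.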
Reading Raw Data with the INPUT Statement
Choosing an Input Style
The INPUT statement reads raw data from instream data lines or external files into a
SAS data set. You can use the following different input styles, depending on the layout
of data values in the records:
• list input
• column input
• formatted input
• named input
You can also combine styles of input in a single INPUT statement. For details about the
styles of input, see the INPUT statement in SAS Statements: Reference.
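As an illustration of combining styles, the following sketch reads one variable with column input, one with list input, and one with formatted input. The data values and column positions are assumptions chosen for the example:

```sas
data scores;
   /* name:   column input (columns 1-12)                   */
   /* score1: list input (scans for the next nonblank value) */
   /* score2: formatted input, positioned with @19           */
   input name $ 1-12 score1 @19 score2 comma5.;
   datalines;
Riley        1132 1,187
Henderson    1015 1,102
;
```

The @19 column pointer control makes the start of the formatted field explicit instead of relying on where list input leaves the pointer.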
List Input
List input uses a scanning method for locating data values. Data values are not required
to be aligned in columns but must be separated by at least one blank (or other defined
delimiter). List input requires only that you specify the variable names and a dollar sign
($), if defining a character variable. You do not have to specify the location of the data
fields.
An example of list input follows:
data scores;
length name $ 12;
input name $ score1 score2;
datalines;
Riley 1132 1187
Henderson 1015 1102
;
List input has several restrictions on the type of data that it can read:
• Input values must be separated by at least one blank (the default delimiter) or by the
  delimiter specified with the DLM= or DLMSTR= option in the INFILE statement. If
  you want SAS to read consecutive delimiters as if there is a missing value between
  them, specify the DSD option in the INFILE statement.
• Blanks cannot represent missing values. A real value, such as a period, must be used
  instead.
• To read and store a character input value longer than 8 bytes, define a variable's
  length by using a LENGTH, INFORMAT, or ATTRIB statement before the INPUT
  statement, or by using modified list input, which consists of an informat and the
  colon modifier in the INPUT statement. See “Modified List Input” on page 436 for
  more information.
• Character values cannot contain embedded blanks when the file is delimited by
  blanks.
• Fields must be read in order.
• Data must be in standard numeric or character format.
Note: Nonstandard numeric values, such as packed decimal data, must use the formatted
style of input. See “Formatted Input” on page 438 for more information.
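The delimiter and missing-value restrictions can be sketched with comma-delimited data. In this example (the data values are made up), the DSD option makes the comma the default delimiter and treats two consecutive commas as a missing value:

```sas
data scores;
   length name $ 12;        /* allow character values longer than 8 bytes */
   infile datalines dsd;    /* comma delimiter; ",," is read as missing   */
   input name $ score1 score2;
   datalines;
Riley,1132,1187
Henderson,,1102
;
```

With blank-delimited data and no DSD option, the missing score would instead have to be entered as a period.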
Modified List Input
A more flexible version of list input, called modified list input, includes format
modifiers. The following format modifiers enable you to use list input to read
nonstandard data by using SAS informats:
• The & (ampersand) format modifier enables you to read character values that contain
  one or more embedded blanks with list input and to specify a character informat.
  SAS reads until it encounters two consecutive blanks, the defined length of the
  variable, or the end of the input line, whichever comes first.
• The : (colon) format modifier enables you to use list input but also to specify an
  informat after a variable name, whether character or numeric. SAS reads until it
  encounters a blank column, the defined length of the variable (character only), or the
  end of the data line, whichever comes first.
• The ~ (tilde) format modifier enables you to read and retain single quotation marks,
  double quotation marks, and delimiters within character values.
The following is an example of the : and ~ format modifiers. You must use the DSD
option in the INFILE statement. Otherwise, the INPUT statement ignores the ~ format
modifier.
data scores;
infile datalines dsd;
input Name : $9. Score1-Score3 Team ~ $25. Div $;
datalines;
Smith,12,22,46,"Green Hornets, Atlanta",AAA
Mitchel,23,19,25,"High Volts, Portland",AAA
Jones,09,17,54,"Vulcans, Las Vegas",AA
;
436 Chapter 19 Reading Raw Data
proc print data=scores;
run;
Output 19.1 Output from Example with Format Modifiers
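The & modifier can be sketched in the same way. The example below is an assumption-based illustration with made-up data: each team name contains a single embedded blank, and two consecutive blanks mark the end of the character value:

```sas
data teams;
   /* & lets Team contain single embedded blanks;     */
   /* two consecutive blanks end the character value  */
   input Team & $25. Score;
   datalines;
Green Hornets  46
High Volts  25
;
```

If only one blank separated the name from the score, SAS would keep reading the score as part of Team.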
Column Input
Column input enables you to read standard data values that are aligned in columns in the
data records. Specify the variable name, followed by a dollar sign ($) if it is a character
variable, and specify the columns in which the data values are located in each record:
data scores;
infile datalines truncover;
input name $ 1-12 score2 17-20 score1 27-30;
datalines;
Riley           1132       987
Henderson       1015      1102
;
Note: Use the TRUNCOVER option in the INFILE statement to ensure that SAS
handles data values of varying lengths appropriately.
To use column input, data values must be:
• in the same field on all the input lines
• in standard numeric or character form
Note: You cannot use an informat with column input.
Features of column input include the following:
• Character values can contain embedded blanks.
• Character values can be from 1 to 32,767 characters long.
• Placeholders, such as a single period (.), are not required for missing data.
• Input values can be read in any order, regardless of their position in the record.
• Values or parts of values can be reread.
• Both leading and trailing blanks within the field are ignored.
• Values do not need to be separated by blanks or other delimiters.
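Two of these features, reading values in any order and rereading part of a value, can be sketched as follows. The variable prefix and the data values are hypothetical:

```sas
data scores;
   infile datalines truncover;
   /* score1 is read before name, and columns 1-3 are read a second time */
   input score1 17-20 name $ 1-12 prefix $ 1-3;
   datalines;
Riley           1132
Henderson       1015
;
```

Here prefix holds the first three characters of each name, taken from columns that name has already used.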
CAUTION:
If you insert tabs while entering data in the DATALINES statement in column
format, you might get unexpected results. This issue exists when you use the
SAS Enhanced Editor or SAS Program Editor. To avoid the issue, do one of the
following:
• Replace all tabs in the data with single spaces using another editor outside of SAS.
• Use the %INCLUDE statement from the SAS editor to submit your code.
• If you are using the SAS Enhanced Editor, select Tools ⇒ Options ⇒ Enhanced
  Editor to change the tab size from 4 to 1.
Formatted Input
Formatted input combines the flexibility of using informats with many of the features of
column input. By using formatted input, you can read nonstandard data for which SAS
requires additional instructions. Formatted input is typically used with pointer controls
that enable you to control the position of the input pointer in the input buffer when you
read data.
The INPUT statement in the following DATA step uses formatted input and pointer
controls. Note that $12. and COMMA5. are informats; +4 and +6 are column pointer
controls.
data scores;
input name $12. +4 score1 comma5. +6 score2 comma5.;
datalines;
Riley           1,132      1,187
Henderson       1,015      1,102
;
Note: You can also use informats to read data that is not aligned in columns. See
“Modified List Input” on page 436 for more information.
Important points about formatted input are:
• Character values can contain embedded blanks.
• Character values can be from 1 to 32,767 characters long.
• Placeholders, such as a single period (.), are not required for missing data.
• With the use of pointer controls to position the pointer, input values can be read in
  any order, regardless of their positions in the record.
• Values or parts of values can be reread.
• Formatted input enables you to read data stored in nonstandard form, such as packed
  decimal or numbers with commas.
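The points about reading in any order and rereading can be sketched with the @n column pointer control, which moves the pointer to an absolute column. The variable initial and the data values are assumptions for illustration:

```sas
data scores;
   /* @17 jumps ahead to the score; @1 returns to the start of the record */
   input @17 score1 comma5. @1 name $12. @1 initial $1.;
   datalines;
Riley           1,132
Henderson       1,015
;
```

The COMMA5. informat removes the embedded comma, so score1 is stored as an ordinary numeric value.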
Named Input
You can use named input to read records in which data values are preceded by the name
of the variable and an equal sign (=). The following INPUT statement reads the data
lines containing equal signs.
data games;
input name=$ score1= score2=;
datalines;
name=riley score1=1132 score2=1187
;
Note: When an equal sign follows a variable in an INPUT statement, SAS expects that
data remaining on the input line contains only named input values. You cannot
switch to another form of input in the same INPUT statement after using named
input. Also, note that any variable that exists in the input data but is not defined in
the INPUT statement generates a note in the SAS log indicating a missing field.
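As a sketch of the rule in this note, another input style can precede named input but cannot follow it. In the example below, id is a hypothetical variable read with list input, and the named values on the second data line are deliberately given in a different order:

```sas
data games;
   length name $ 12;
   /* id uses list input; everything after it is named input */
   input id name=$ score1= score2=;
   datalines;
101 name=riley score1=1132 score2=1187
102 score2=1102 name=henderson score1=1015
;
```

Because each value carries its variable name, the order of the named values on a data line does not have to match the INPUT statement.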
Additional Data-Reading Features
In addition to different styles of input, there are many tools to meet the needs of different
data-reading situations. You can use options in the INFILE statement in combination
with the INPUT statement to give you additional control over the reading of data
records. The following table lists common data-reading tasks and the appropriate
features available in the INPUT and INFILE statements.
Table 19.5 Additional Data-Reading Features

Input                         Goal                          Use
----------------------------  ----------------------------  ------------------------------------------
multiple records              create a single observation   #n or / line pointer control in the INPUT
                                                            statement with a DO loop.
a single record               create multiple observations  trailing @@ in the INPUT statement.
                                                            trailing @ with multiple INPUT and OUTPUT
                                                            statements.
variable-length data fields   read delimited data           list input with or without a format
and records                                                 modifier in the INPUT statement and the
                                                            TRUNCOVER, DLM=, DLMSTR=, or DSD options
                                                            in the INFILE statement.
                              read non-delimited data       $VARYINGw. informat in the INPUT statement
                                                            and the LENGTH= and TRUNCOVER options in
                                                            the INFILE statement.
a file with varying record                                  IF-THEN statements with multiple INPUT
layouts                                                     statements, using trailing @ or @@ as
                                                            necessary.
hierarchical files                                          IF-THEN statements with multiple INPUT
                                                            statements, using trailing @ as necessary.
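
Two of the features listed in the table can be sketched briefly. The data sets, the record-type codes S and T, and the data values below are hypothetical:

```sas
/* Trailing @@: one record creates several observations */
data scores;
   length name $ 12;
   input name $ score @@;
   datalines;
Riley 1132 Henderson 1015 Lee 987
;

/* Trailing @: hold the record, test its type, then read the rest */
data mixed;
   input type $ @;
   if type='S' then input name $ score;
   else if type='T' then input name $ team $;
   datalines;
S Riley 1132
T Riley Hornets
;
```

In the second step, score is left missing on T records and team is left blank on S records, because only one of the two INPUT statements runs for each record.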