Identifying Observations

Using the ID Statement in PROC PRINT

The ID statement identifies observations using variable values, such as an identification number, instead of observation numbers.
Syntax, ID statement in the PRINT procedure:
ID variable(s);
variable(s) specifies one or more variables whose values are printed at the beginning of each row of the report in place of the observation number.

Example: ID Statement

In the following example, the OBS column in the output is replaced with the variable values for IDnum and LastName.
proc print data=cert.reps;
  id idnum lastname;
run;
Here is the output produced by PROC PRINT:
Figure 6.4 PROC PRINT: ID Statement Output

Example: ID and VAR Statements

You can use the ID and VAR statements together to control which variables are printed and in which order. If a variable in the ID statement also appears in the VAR statement, the output contains two columns for that variable.
proc print data=cert.reps;
  id idnum lastname;                 /*#1*/
  var idnum sex jobcode salary;      /*#2*/
run;
1 The ID statement replaces the OBS column in the output with the IDnum and LastName variable values.
2 The VAR statement selects the variables that appear in the output and determines the order.
The variable IDnum appears in both the ID statement and the VAR statement. Therefore, IDnum appears twice in the output.
Output 6.1 PROC PRINT: ID and VAR Statement Output
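If the duplicate column is not wanted, one option is to leave the ID variables out of the VAR statement. The following is a minimal sketch, not part of the original example. It assumes the Sashelp.Class sample data set that ships with SAS (rather than this chapter's Cert library) and uses Name as the ID variable.

```sas
/* Sketch only: Sashelp.Class is assumed as stand-in data. */
proc print data=sashelp.class;
  id name;              /* Name replaces the Obs column */
  var sex age height;   /* Name is not listed here, so it is printed only once */
run;
```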

Selecting Observations

By default, a PROC PRINT step lists all the observations in a data set. You can control which observations are printed by adding a WHERE statement to your PROC PRINT step. There should be only one WHERE statement in a step. If multiple WHERE statements are issued, only the last statement is processed.
Syntax, WHERE statement:
WHERE where-expression;
where-expression specifies a condition for selecting observations. The where-expression can be any valid SAS expression.
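The rule that only the last WHERE statement is processed can be seen in a short sketch. The example below assumes the Sashelp.Class sample data set that ships with SAS, which is not one of this chapter's data sets. The second step uses WHERE SAME AND, a form of the WHERE statement that adds a condition to the existing one instead of replacing it.

```sas
/* Sketch only: Sashelp.Class is assumed as stand-in data. */

/* Only the second WHERE statement is applied; it replaces the first. */
proc print data=sashelp.class;
  where age>13;
  where sex='F';
run;

/* WHERE SAME AND keeps the first condition and adds the second. */
proc print data=sashelp.class;
  where age>13;
  where same and sex='F';
run;
```

The first step prints every observation where Sex is F, regardless of Age. The second step prints only the observations that meet both conditions.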
Example Code 6.1 Using the WHERE Statement in PROC PRINT
proc print data=cert.admit;
  var age height weight fee;         /*#1*/
  where age>30;                      /*#2*/
run;
1 The VAR statement selects the variables Age, Height, Weight, and Fee and displays them in the output in that order.
2 The WHERE statement selects only the observations for which the value of Age is greater than 30 and prints them in the output.
The following output displays only the observations where the value of Age is greater than 30.
Figure 6.5 PROC PRINT Output with a WHERE Statement

Specifying WHERE Expressions

In the WHERE statement, you can specify any variable in the SAS data set, not just the variables that are specified in the VAR statement. The WHERE statement works for both character and numeric variables. To specify a condition based on the value of a character variable, follow these rules:
  • Enclose the value in quotation marks.
  • Write the value with lowercase, uppercase, or mixed case letters exactly as it appears in the data set.
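A brief sketch of the case rule follows. It assumes the Sashelp.Class sample data set that ships with SAS, in which the variable Sex is stored as uppercase F or M. The UPCASE function in the second step is one possible way to compare without regard to case. It is shown here as an illustration, not as a technique from this section.

```sas
/* Sketch only: Sashelp.Class is assumed as stand-in data. */

/* Selects no observations, because the stored values are uppercase. */
proc print data=sashelp.class;
  where sex='f';
run;

/* Converts the stored value to uppercase before comparing. */
proc print data=sashelp.class;
  where upcase(sex)='F';
run;
```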
You use the following comparison operators to express a condition in the WHERE statement:
Table 6.1 Comparison Operators in a WHERE Statement
Symbol      Meaning                     Sample Program Code
= or eq     equal to                    where name='Jones, C.';
^= or ne    not equal to                where temp ne 212;
> or gt     greater than                where income>20000;
< or lt     less than                   where partno lt "BG05";
>= or ge    greater than or equal to    where id>='1543';
<= or le    less than or equal to       where pulse le 85;
For more information about valid SAS expressions, see Creating SAS Data Sets.

Using the CONTAINS Operator

The CONTAINS operator selects observations that include the specified substring. The symbol for the CONTAINS operator is ?. You can use either the CONTAINS keyword or the symbol in your code, as shown below.
where firstname CONTAINS 'Jon'; 
where firstname ? 'Jon';
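The two forms can be tried in complete steps, as in the following sketch. It assumes the Sashelp.Class sample data set that ships with SAS and its Name variable, neither of which appears in the original text. Because character comparisons are case sensitive, a lowercase substring such as 'jo' does not match a value such as John.

```sas
/* Sketch only: Sashelp.Class is assumed as stand-in data. */

/* Keyword form: selects names that include the substring 'Jo'. */
proc print data=sashelp.class;
  where name contains 'Jo';
run;

/* Symbol form with a lowercase substring: 'jo' does not match 'Jo'. */
proc print data=sashelp.class;
  where name ? 'jo';
run;
```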

Specifying Compound WHERE Expressions

You can also use WHERE statements to select observations that meet multiple conditions. To link a sequence of expressions into compound expressions, you use logical operators, including the following:
Table 6.2 Compound WHERE Expression Operators
Operator    Symbol    Description
AND         &         and, both. If both expressions are true, then the compound expression is true.
OR          |         or, either. If either expression is true, then the compound expression is true.

Examples of WHERE Statements

  • You can use compound expressions like these in your WHERE statements:
    where age<=55 and pulse>75; 
    where area='A' or region='S';  
    where ID>'1050' and state='NC';
  • When you test for multiple values of the same variable, you specify the variable name in each expression:
    where actlevel='LOW' or actlevel='MOD'; 
    where fee=124.80 or fee=178.20;
  • You can use the IN operator as a convenient alternative:
    where actlevel in ('LOW','MOD'); 
    where fee in (124.80,178.20);
  • To control how compound expressions are evaluated, you can use parentheses (expressions in parentheses are evaluated first):
    where (age<=55 and pulse>75) or area='A'; 
    where age<=55 and (pulse>75 or area='A');
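Parentheses matter because, when they are omitted, SAS evaluates AND before OR. The following sketch illustrates the difference. It assumes the Sashelp.Class sample data set that ships with SAS, and the cutoff values are arbitrary choices for illustration.

```sas
/* Sketch only: Sashelp.Class is assumed as stand-in data. */

/* No parentheses: AND is evaluated before OR. */
proc print data=sashelp.class;
  where age<=12 and sex='F' or height>65;
run;

/* Same condition as the step above, with the grouping made explicit. */
proc print data=sashelp.class;
  where (age<=12 and sex='F') or height>65;
run;

/* Different condition: Age must be 12 or less in every selected observation. */
proc print data=sashelp.class;
  where age<=12 and (sex='F' or height>65);
run;
```

Writing the parentheses even when they match the default order makes the intended grouping easier to read.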

Using System Options to Specify Observations

SAS system options set the preferences for a SAS session. You can use the FIRSTOBS= and OBS= options in an OPTIONS statement to specify the observations to process from SAS data sets.
Specify either or both of these options as needed:
  • FIRSTOBS= starts processing at a specific observation.
  • OBS= stops processing after a specific observation.
Note: Using FIRSTOBS= and OBS= together processes a specific group of observations.
Syntax, FIRSTOBS= and OBS= options in an OPTIONS statement:
FIRSTOBS=n
OBS=n
n is a positive integer. For FIRSTOBS=, n specifies the number of the first observation to process. For OBS=, n specifies the number of the last observation to process. By default, FIRSTOBS=1. The default value for OBS= is MAX, which is the largest signed, 8-byte integer that is representable in your operating environment. The number can vary depending on your operating system.
To reset the number of the last observation to process, you can specify OBS=MAX in the OPTIONS statement.
options obs=max;
This instructs any subsequent SAS programs in the SAS session to process through the last observation in the data set that is being read.
CAUTION:
Each of these options applies to every input data set that is used in a program or a SAS process because a system option sets the preference for the SAS session.
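Because the settings persist for the session, it can help to check the current values and to restore the defaults after a step that needed them. The following is a minimal sketch. It assumes the Sashelp.Class sample data set that ships with SAS, and it uses the GETOPTION function through %SYSFUNC to write the current settings to the log.

```sas
/* Sketch only: Sashelp.Class is assumed as stand-in data. */

/* Limit processing to observations 5 through 8 for all following steps. */
options firstobs=5 obs=8;
%put FIRSTOBS=%sysfunc(getoption(firstobs)) OBS=%sysfunc(getoption(obs));

proc print data=sashelp.class;
run;

/* Restore the defaults so that later steps read every observation. */
options firstobs=1 obs=max;
%put FIRSTOBS=%sysfunc(getoption(firstobs)) OBS=%sysfunc(getoption(obs));
```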

Examples: FIRSTOBS= and OBS= Options

The following examples use the data set Cert.Heart, which contains 20 observations and 8 variables.
Example Code 6.2 Using the FIRSTOBS= Option
options firstobs=10;         /*#1*/
proc print data=cert.heart;  /*#2*/
run;
1 Use the OPTIONS statement to specify the FIRSTOBS= option. In this example, FIRSTOBS=10 causes SAS to begin reading at the 10th observation of the data set and to continue through the last observation.
2 A total of 11 observations are printed using the PROC PRINT step.
Here is the output:
Figure 6.6 PROC PRINT Output with FIRSTOBS=10
You can specify the FIRSTOBS= and OBS= options together. In the following example, SAS reads only through the 10th observation.
Example Code 6.3 Using the FIRSTOBS= and OBS= Options
options firstobs=1 obs=10;      /*#1*/
proc print data=cert.heart;     /*#2*/
run;
1 The FIRSTOBS=1 option resets the FIRSTOBS= option to its default value, so SAS begins reading at the first observation in the data set. When you specify OBS=10 in the OPTIONS statement, SAS reads through the 10th observation.
2 A total of 10 observations are printed using the PROC PRINT step.
Here is the output:
Figure 6.7 PROC PRINT Output with FIRSTOBS=1 and OBS=10
You can also combine FIRSTOBS= and OBS= to process observations in the middle of the data set.
Example Code 6.4 Processing Middle Observations of a Data Set
options firstobs=10 obs=15;    /*#1*/
proc print data=cert.heart;    /*#2*/
run;
1 When you set FIRSTOBS=10 and OBS=15, the program processes only observations 10 through 15.
2 A total of six observations are printed using the PROC PRINT step.
Here is the output:
Figure 6.8 PROC PRINT Output with FIRSTOBS=10 and OBS=15

Using FIRSTOBS= and OBS= for Specific Data Sets

The FIRSTOBS= and OBS= system options determine the first and last observation, respectively, that is read in every step for the duration of your current SAS session or until you change the setting. However, you can still do the following:
  • override these options for a given data set
  • apply these options to a specific data set only
To affect any single file, use FIRSTOBS= or OBS= as data set options instead of using them as system options. You specify data set options in parentheses immediately following the input data set name.
Tip
A FIRSTOBS= or OBS= specification from a data set option overrides the corresponding FIRSTOBS= or OBS= system option, but only for that data set in that step.

Example: FIRSTOBS= and OBS= as Data Set Options

The following program uses system options to process only observations 10 through 15, for a total of six observations:
options firstobs=10 obs=15; 
proc print data=clinic.heart; 
run;
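You can override the system options for a single step by specifying FIRSTOBS= and OBS= as data set options, as follows. The data set options take precedence over the system options for this instance only, so this step starts at observation 20 and stops at observation 30 (or at the last observation, if the data set contains fewer than 30 observations).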
options firstobs=10 obs=15; 
proc print data=clinic.heart(firstobs=20 obs=30); 
run;
To specify FIRSTOBS= or OBS= for this program only, you could omit the OPTIONS statement altogether and simply use the data set options.
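A minimal sketch of that approach follows. It assumes the Sashelp.Class sample data set that ships with SAS, and it assumes that the FIRSTOBS= and OBS= system options are at their default values when the steps run.

```sas
/* Sketch only: Sashelp.Class is assumed as stand-in data. */

/* No OPTIONS statement: the data set options limit this step only. */
/* Observations 5 through 8 are printed, for a total of four.       */
proc print data=sashelp.class(firstobs=5 obs=8);
run;

/* A later step without data set options is not limited by them. */
proc print data=sashelp.class;
run;
```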
Last updated: August 23, 2018