Identifying Observations

Using the ID Statement in PROC PRINT

This statement identifies observations by using the formatted values of the variables that you list instead of by using observation numbers. This statement is particularly useful when observations are too long to print on one line.
Syntax, ID statement in the PRINT procedure:
ID variable(s);
variable(s) specifies one or more variables to print instead of the observation number at the beginning of each row of the report.

Example: ID Statement and VAR Statement

To replace the Obs column and identify observations based on an employee's ID number and last name, submit the following program.
proc print data=sasuser.reps; 
    id idnum lastname; 
run;
This is HTML output from the program:
Figure 5.4 HTML Output
HTML Output
If a variable in the ID statement also appears in the VAR statement, the output contains two columns for that variable. In the example below, the variable IDnum appears twice.
proc print data=sasuser.reps; 
    id idnum lastname; 
    var idnum sex jobcode salary; 
run;
Figure 5.5 IDNUM Output
IDNUM Output

Selecting Observations

By default, a PROC PRINT step lists all the observations in a data set. You can control which observations are printed by adding a WHERE statement to your PROC PRINT step. There should be only one WHERE statement in a step. If multiple WHERE statements are issued, only the last statement is processed.
Syntax, WHERE statement:
WHERE where-expression;
where-expression specifies a condition for selecting observations. The where-expression can be any valid SAS expression.
For example, the following WHERE statement selects only observations for which the value of Age is greater than 30:
proc print data=clinic.admit; 
   var age height weight fee; 
   where age>30; 
run;
Here is the output from the PROC PRINT step with the WHERE statement:
Figure 5.6 PROC PRINT Output with WHERE Statement
PROC PRINT Output with WHERE Statement

Specifying WHERE Expressions

In the WHERE statement, you can specify any variable in the SAS data set, not just the variables that are specified in the VAR statement. The WHERE statement works for both character and numeric variables. To specify a condition based on the value of a character variable:
  • Enclose the value in quotation marks.
  • Write the value with lowercase, uppercase, or mixed case letters exactly as it appears in the data set.
You use the following comparison operators to express a condition in the WHERE statement:
Table 5.1 Comparison Operators in a WHERE Statement
Symbol
Meaning
Sample Program Code
= or eq
equal to
where name='Jones, C.';
^= or ne
not equal to
where temp ne 212;
> or gt
greater than
where income>20000;
< or lt
less than
where partno lt "BG05";
>= or ge
greater than or equal to
where id>='1543';
<= or le
less than or equal to
where pulse le 85;
For more information about valid SAS expressions, see Creating SAS Data Sets from External Files.

Using the CONTAINS Operator

The CONTAINS operator selects observations that include the specified substring. The symbol for the CONTAINS operator is ?. You can use either the CONTAINS keyword or the symbol in your code, as shown below.
where firstname CONTAINS 'Jon'; 
where firstname ? 'Jon';

Specifying Compound WHERE Expressions

You can also use WHERE statements to select observations that meet multiple conditions. To link a sequence of expressions into compound expressions, you use logical operators, including the following:
Table 5.2 Compound WHERE Expression Operators
Operator, Symbol
Description
AND
&
and, both. If both expressions are true, then the compound expression is true.
OR
|
or, either. If either expression is true, then the compound expression is true.

Examples of WHERE Statements

  • You can use compound expressions like these in your WHERE statements:
    where age<=55 and pulse>75; 
    where area='A' or region='S';  
    where ID>'1050' and state='NC';
  • When you test for multiple values of the same variable, you specify the variable name in each expression:
    where actlevel='LOW' or actlevel='MOD'; 
    where fee=124.80 or fee=178.20;
  • You can use the IN operator as a convenient alternative:
    where actlevel in ('LOW','MOD'); 
    where fee in (124.80,178.20);
  • To control how compound expressions are evaluated, you can use parentheses (expressions in parentheses are evaluated first):
    where (age<=55 and pulse>75) or area='A'; 
    where age<=55 and (pulse>75 or area='A');
Last updated: January 10, 2018
..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
18.118.253.223