Subsetting Rows By Using Conditional Operators

Overview

In the WHERE clause of a PROC SQL query, you can specify any valid SAS expression to subset or restrict the data that is displayed in output. The expression might contain any of various types of operators, such as the following.
Type of Operator
Example
comparison
where membertype='GOLD'
logical
where visits<=3 or status='new'
concatenation
where name=trim(last) ||', '||first
Note: For a complete list of operators that can be used in SAS expressions, see the SAS documentation.

Using Operators in PROC SQL

Comparison, logical, and concatenation operators are used in PROC SQL as they are used in other SAS procedures. For example, the following WHERE clause contains
  • the logical operator AND, which joins multiple conditions
  • two comparison operators: an equal sign (=) and a greater than symbol (>).
proc sql;
   select ffid, name, state, pointsused
      from sasuser.frequentflyers
      where membertype='GOLD' and pointsused>0
      order by pointsused;
In PROC SQL queries, you can also use the following conditional operators. All of these operators except for ANY, ALL, and EXISTS, can also be used in other SAS procedures.
Conditional Operator
Tests for ...
Example
BETWEEN-AND
values that occur within an inclusive range
where salary between 70000
             and 80000
CONTAINS or ?
values that contain a specified string
where name contains 'ER' 
where name ? 'ER'
IN
values that match one of a list of values
where code in ('PT' , 'NA', 'FA')
IS MISSING or IS NULL
missing values
where dateofbirth is missing
where dateofbirth is null
LIKE (with  %, _)
values that match a specified pattern
where address like '% P%PLACE'
=*
values that sound like a specified value
where lastname=* 'Smith'
ANY
values that meet a specified condition with respect to any one of the values returned by a subquery
where dateofbirth < any
   (select dateofbirth
      from sasuser.payrollmaster
      where jobcode='FA3')
ALL
values that meet a specified condition with respect to all the values returned by a subquery
where dateofbirth < all
   (select dateofbirth
      from sasuser.payrollmaster
      where jobcode='FA3')
EXISTS
the existence of values returned by a subquery
where exists
   (select *
      from sasuser.flightschedule
      where fa.empid=
            flightschedule.empid) 
Tip
To create a negative condition, you can precede any of these conditional operators, except for ANY and ALL, with the NOT operator.
Most of these conditional operators, and their uses, are covered in the next several sections. ANY, ALL, and EXISTS are discussed later in the chapter.

Using the BETWEEN-AND Operator to Select within a Range of Values

To select rows based on a range of numeric or character values, you use the BETWEEN-AND operator in the WHERE clause. The BETWEEN-AND operator is inclusive, so the values that you specify as limits for the range of values are included in the query results, in addition to any values that occur between the limits.
General form, BETWEEN-AND operator:
BETWEEN value-1 AND value-2
Here is an explanation of the syntax:
value-1
is the value at the one end of the range
value-2
is the value at the other end of the range.
Note: When specifying the limits for the range of values, it is not necessary to specify the smaller value first.
Here are several examples of WHERE clauses that contain the BETWEEN-AND operator. The last example shows the use of the NOT operator with the BETWEEN-AND operator.
Example
Returns rows in which...
where date between '01mar2000'd
        and '07mar2000'd
In this example, the values are specified as date constants.
the value of Date is 01mar2000, 07mar2000, or any date value in between
where salary between 70000
        and 80000
the value of Salary is 70000, 80000, or any numeric value in between
where salary not between 70000
        and 80000
the value of Salary is not between or equal to 70000 and 80000

Using the CONTAINS or Question Mark (?) Operator to Select a String

The CONTAINS or question mark (?) operator is usually used to select rows for which a character column includes a particular string. These operators are interchangeable.
General form, CONTAINS operator:
sql-expression CONTAINS sql-expression
sql-expression ? sql-expression
Here is an explanation of the syntax:
sql-expression
is a character column, string (character constant), or expression. A string is a sequence of characters to be matched that must be enclosed in quotation marks.
Note: PROC SQL retrieves a row for output no matter where the string (or second sql-expression) occurs within the column's (or first sql-expression's) values. Matching is case sensitive when making comparisons.
Note: The CONTAINS or question mark (?) operator is not part of the ANSI standard; it is a SAS enhancement.

Example

The following PROC SQL query uses CONTAINS to select rows in which the Name column contains the string ER. As the output shows, all rows that contain ER anywhere within the Name column are displayed.
proc sql outobs=10;
   select name
      from sasuser.frequentflyers
      where name contains 'ER';
Rows that Contain 'ER' in the Name Column

Using the IN Operator to Select Values from a List

To select only the rows that match one of the values in a list of fixed values, either numeric or character, use the IN operator.
General form, IN operator:
column IN (constant-1<,...constant-n>)
Here is an explanation of the syntax:
column
specifies the selected column name
constant-1 and constant-n
represent a list that contains one or more specific values. The list of values must be enclosed in parentheses and separated by either commas or spaces. Values can be either numeric or character. Character values must be enclosed in quotation marks.
Here are examples of WHERE clauses that contain the IN operator.
Example
Returns rows in which...
where jobcategory in ('PT','NA','FA')
the value of JobCategory is PT, NA, or FA
where dayofweek in (2,4,6)
the value of DayOfWeek is 2, 4, or 6
where chesspiece not in
('pawn','king','queen')
the value of chesspiece is rook, knight, or bishop

Using the IS MISSING or IS NULL Operator to Select Missing Values

To select rows that contain missing values, both character and numeric, use the IS MISSING or IS NULL operator. These operators are interchangeable.
General form, IS MISSING or IS NULL operator:
column IS MISSING
column IS NULL
Here is an explanation of the syntax:
column
specifies the selected column name.
Note: The IS MISSING operator is not part of the ANSI standard for SQL. It is a SAS enhancement.

Example

Suppose you want to find out whether the table Sasuser. Marchflights has any missing values in the column Boarded. You can use the following PROC SQL query to retrieve rows from the table that have missing values:
proc sql;
   select boarded, transferred,
          nonrevenue, deplaned
      from sasuser.marchflights
      where boarded is missing;
The output shows that two rows in the table have missing values for Boarded.
Sasuser. Marchflights
Tip
Alternatively, you can specify missing values without using the IS MISSING or IS NULL operator, as shown in the following examples:
where boarded = .
where flight = ' '
However, the advantage of using the IS MISSING or IS NULL operator is that you do not have to specify the data type (character or numeric) of the column.

Using the LIKE Operator to Select a Pattern

To select rows that have values that match as specific pattern of characters rather than a fixed character string, use the LIKE operator. For example, using the LIKE operator, you can select all rows in which the LastName value starts with H. (If you wanted to select all rows in which the last name contains the string HAR, you would use the CONTAINS operator.)
General form, LIKE operator:
column LIKE 'pattern'
Here is an explanation of the syntax:
column
specifies the column name
pattern
specifies the pattern to be matched and contains one or both of the special characters underscore ( _ ) and percent sign (%). The entire pattern must be enclosed in quotation marks and matching is case sensitive.
When you use the LIKE operator in a query, PROC SQL uses pattern matching to compare each value in the specified column with the pattern that you specify using the LIKE operator in the WHERE clause. The query output displays all rows in which there is a match.
You specify a pattern using one or both of the special characters shown below.
Special Character
Represents
underscore ( _ )
any single character
percent sign (%)
any sequence of zero or more characters
Note: The underscore (_) and percent sign (%) are sometimes referred to as wildcard characters.

Specifying a Pattern

To specify a pattern, combine one or both of the special characters with any other characters that you want to match. The special characters can appear before, after, or on both sides of other characters.
Consider how the special characters can be combined to specify a pattern. Suppose you are working with a table column that contains the following list of names:
  • Diana
  • Diane
  • Dianna
  • Dianthus
  • Dyan
Here are several patterns that you can use to select one or more of the names from the list. Each pattern uses one or both of the special characters.
LIKE Pattern
Name(s) Selected
LIKE 'D_an'
Dyan
LIKE 'D_an_'
Diana, Diane
LIKE 'D_an__
Dianna
LIKE 'D_an%'
all names from the list

Example

The following PROC SQL query uses the LIKE operator to find all frequent-flyer club members whose street name begins with P and ends with the word PLACE. The following PROC SQL step performs this query:
proc sql;
   select ffid, name, address
      from sasuser.frequentflyers
      where address like '% P%PLACE';
The pattern '% P%PLACE' specifies the following sequence:
  • any number of characters (%)
  • a space
  • the letter P
  • any number of characters (%)
  • the word PLACE.
Here are the results of this query.
PROC SQL query

Using the Sounds-Like (=*) Operator to Select a Spelling Variation

To select rows that contain a value that sounds like another value that you specify, use the sounds-like operator (=*) in the WHERE clause.
General form, sounds-like (=*) operator:
sql-expression =* sql-expression
Here is an explanation of the syntax:
sql-expression
is a character column, string (character constant), or expression. A string is a sequence of characters to be matched that must be enclosed in quotation marks.
The sounds-like (=*) operator uses the SOUNDEX algorithm to compare each value of a column (or other sql-expression) with the word or words (or other sql-expression) that you specify. Any rows that contain a spelling variation of the value that you specified are selected for output.
For example, here is a WHERE clause that contains the sounds-like operator:
where lastname =* 'Smith';
The sounds-like operator does not always select all possible values. For example, suppose you use the preceding WHERE clause to select rows from the following list of names that sound like Smith:
  • Schmitt
  • Smith
  • Smithson
  • Smitt
  • Smythe
Two of the names in this list are not selected: Schmitt and Smithson.
Note: The SOUNDEX algorithm is English-biased and is less useful for languages other than English. For more information about the SOUNDEX algorithm, see the SAS documentation.
..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
18.116.67.70