Subsetting Data By Using Correlated Subqueries

Overview

Correlated subqueries cannot be evaluated independently, but depend on the values passed to them by the outer query for their results. Correlated subqueries are evaluated for each row in the outer query and therefore tend to require more processing time than noncorrelated subqueries.
Note: Usually, a PROC SQL join is a more efficient alternative to a correlated subquery. You should already be familiar with basic PROC SQL joins.

Example

Consider an example of a PROC SQL query that contains a correlated subquery. The following query displays the names of all navigators who are also managers. The WHERE clause in the subquery lists the column Staffmaster.EmpID. The outer query must pass this column to the correlated subquery.
proc sql;
   select lastname, firstname
      from sasuser.staffmaster
      where 'NA'=
             (select jobcategory
                from sasuser.supervisors
                where staffmaster.empid = 
                      supervisors.empid);
 
Note: When a column appears in more than one table, the column name is preceded by the table name or alias to avoid ambiguity. In this example, EmpID appears in both tables, so the appropriate table name is specified in front of each reference to that column.
The output from this query is shown below. There are three navigators who are also managers.
List of navigators who area also managers

Using the EXISTS and NOT EXISTS Conditional Operators

In the WHERE clause or in the HAVING clause of an outer query, you can use the EXISTS or NOT EXISTS conditional operator to test for the existence or non-existence of a set of values returned by a subquery.
Condition
Is true if...
EXISTS
the subquery returns at least one row
NOT EXISTS
the subquery returns no data
Note: The operators EXISTS and NOT EXISTS can be used with both correlated and noncorrelated subqueries.

Example: Correlated Subquery with NOT EXISTS

Consider a sample PROC SQL query that includes the NOT EXISTS conditional operator. Suppose you are working with the following tables:
  • Sasuser.Flightattendants contains the names and employee ID numbers of all flight attendants.
  • Sasuser.Flightschedule contains one row for each crew member assigned to a flight for each date.
As shown in the diagram below, the intersection of these two tables contains data for all flight attendants who have been scheduled to work.
Venn diagram of flight attendants and employees scheduled to work
Now suppose you want to list by name the flight attendants who are not scheduled. That is, you want to identify the data in the area highlighted below.
Venn diagram of flight attendants and employees scheduled to work, flight attendant's not scheduled is highlighted
The following PROC SQL query accomplishes this task by using a correlated subquery and the NOT EXISTS operator:
 proc sql;
    select lastname, firstname
       from sasuser.flightattendants
       where not exists
          (select *
             from sasuser.flightschedule
		                  where flightattendants.empid=
		                        flightschedule.empid);
The output is shown below.
List of employees not scheduled for a flight
..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
18.189.195.34