Subsetting Data By Using Subqueries

Introducing Subqueries

The WHERE and HAVING clauses both subset data based on an expression. In the query examples shown earlier in this chapter, the WHERE and HAVING clauses contained standard SAS expressions. For example, the expression in the following WHERE clause uses the BETWEEN-AND conditional operator and specifies the Salary column as an operand:
where salary between 70000 and 80000
PROC SQL also offers another type of expression that can be used for subsetting in WHERE and HAVING clauses: a query expression or subquery. A subquery is a query that is nested in, and is part of, another query. A PROC SQL query might contain subqueries at one or more levels.
Note: Subqueries are also known as nested queries, inner queries, and sub-selects.
The following PROC SQL query contains a subquery in the HAVING clause that returns all jobcodes where the average salary for that jobcode is greater than the company average salary.
proc sql;
   select jobcode, 
          avg(salary) as AvgSalary 
          format=dollar11.2
      from sasuser.payrollmaster
      group by jobcode
      having avg(salary) >
         (select avg(salary)
            from sasuser.payrollmaster);
Tip
It is recommended that you enclose a subquery (inner query) in parentheses, as shown here.
A subquery selects one or more rows from a table, and then returns single or multiple values to be used by the outer query. The subquery shown above is a single-value subquery; it returns a single value, the average salary from the table Sasuser.Payrollmaster, to the outer query. A subquery can return values for multiple rows but only for a single column.
The table that a subquery references can be either the same as or different from the table referenced by the outer query. In the PROC SQL query shown above, the subquery selects data from the same table as the outer query.

Types of Subqueries

There are two types of subqueries.
Type of Subquery
Description
noncorrelated
a self-contained subquery that executes independently of the outer query
correlated
a dependent subquery that requires one or more values to be passed to it by the outer query before the subquery can return a value to the outer query
Both noncorrelated and correlated subqueries can return either single or multiple values to the outer query.
The next few sections provide a more in-depth look at noncorrelated and correlated subqueries, and how they are processed.
..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
3.133.124.145