7.6. Outer Joins

As the previous examples in this chapter have shown, an inner join disregards any rows where the search condition is not met. This differs significantly from the way an outer join groups tables. In contrast with an inner join, an outer join keeps rows that match the ON (search) condition, as well as preserving some or all of the unmatched data from one or both of the tables. Essentially, an outer join retains rows from one table even when they do not match rows in the second table. This distinction is critical because this is what truly differentiates an outer join from an inner join.

Next, an outer join is capable of processing a maximum of two tables at a time, whereas (under the SAS implementation) an inner join is able to process a maximum of 32 tables.

Another difference has to do with how you specify outer join syntax. The comma used to designate or delimit one table from the other in the FROM clause of inner joins is replaced with one of the following keywords: LEFT JOIN, RIGHT JOIN, or FULL JOIN in outer joins Additionally, the WHERE clause expression used to restrict what rows are kept in the result table is replaced with the ON keyword.

Finally, an outer join is considered to be an asymmetric join (Lorie, Raymond A. and Jean-Jacques Daudenarde, SQL & Its Applications, page 87). Unlike inner joins, an outer join does not select rows proportionally from its parts or tables.

7.6.1. Left Outer Joins

Let’s look at how a left join is applied in a real-world situation. Suppose you want to see a list of all manufacturers, their city locations, manufacturer numbers, their product types, and product costs (if available) without leaving out those manufacturers that do not have products yet. This means that the MANUFACTURERS table (left table) acts as the master table having its rows preserved while the PRODUCTS table (right table) acts as the contributing table (subordinate table). The following left outer join example effectively retains those matched rows from both tables as well as those rows from the left table that have no match in the right table.

SQL Code

PROC SQL;
  SELECT manuname, manucity, manufacturers.manunum,
         products.prodtype, products.prodcost
    FROM MANUFACTURERS LEFT JOIN PRODUCTS    1
						ON manufacturers.manunum =
						2
						products.manunum;
QUIT;

The LEFT JOIN specification preserves all the rows in the left table (MANUFACTURERS) even when there are no matching rows in the right table (PRODUCTS).

The ON clause acts as a WHERE clause to select the desired rows in the join results.

As the results from the left outer join illustrate, the rows in the left (MANUFACTURERS) table that match rows in the right (PRODUCTS) table are included in the result table. As a result, eight rows match as evidenced by the value assigned to product type and product cost. Additionally, two rows from the left table that do not match rows in the right table (based on the search condition) are also retained (bolded). Therefore, each row from the MANUFACTURERS table that does not have a matching value in the PRODUCTS table is added to the resulting virtual table, accompanied by null values in the product type and product cost columns.

Results

                                The SAS System

                                           
 Manufacturer                Product
   Manufacturer Name     Manufacturer City       
 Number  Product Type  Cost
__________________________________________________
_______________________________
   Cupid Computer        Houston                  
  111   Workstation   $3,200.00
   Storage Devices Inc   San Mate                 
  120   Workstation   $3,300.00
   Global Comm Corp      San Diego                
  210   Phone           $175.00
   Global Comm Corp      San Diego                
  210   Phone            $35.00
   KPL Enterprises       San Diego                
  500   Software        $299.00
   KPL Enterprises       San Diego                
  500   Software        $299.00
   KPL Enterprises       San Diego                
  500   Software        $299.00
   KPL Enterprises       San Diego                
  500   Software        $399.00
   World Internet Corp   Miami                    
  600                         .
												San Diego PC Planet   San Diego                
  700                         .


7.6.1.1. Specifying a WHERE Clause

To provide greater subsetting capabilities as well as added flexibility, the SQL procedure also permits the specification of an optional WHERE clause in addition to an ON clause when constructing outer joins. The ability to specify a WHERE clause in conjunction with an ON clause permits greater control over the subsetting of rows. An example will help illustrate how a WHERE clause is used in an outer join. Suppose you want to limit the results from the previous left outer join to only those products costing less than $300. In this example, the left outer join syntax uses a WHERE clause to subset row results to nonmissing products that cost less than $300.

SQL Code

PROC SQL;
  SELECT manuname, manucity, manufacturers.manunum,
         products.prodtype, products.prodcost
    FROM MANUFACTURERS LEFT JOIN PRODUCTS
      ON manufacturers.manunum =
         products.manunum
        WHERE prodcost < 300 AND
							1
							prodcost NE .;
QUIT;

The optional WHERE clause specified in addition to an ON clause in an outer join further subsets the joined results.

Results

                               The SAS System
                                                  

													1
													 Manufacturer                      Product
Manufacturer Name     Manufacturer City       
 Number   Product Type          Cost
__________________________________________________
_________________________________
Global Comm Corp      San Diego                  
 210   Phone              $175.00
Global Comm Corp      San Diego                  
 210   Phone               $35.00
KPL Enterprises       San Diego                  
 500   Software           $299.00
KPL Enterprises       San Diego                  
 500   Software           $299.00
KPL Enterprises       San Diego                  
 500   Software           $299.00


7.6.1.2. Specifying Aggregate Functions

Suppose you need to produce a monthly report consisting of a total invoice amount by manufacturer. An aggregate function can be specified with outer join syntax to perform a group computation using a GROUP BY clause. In the next example, a left join computes the total invoice amount for each manufacturer with a SUM function and GROUP BY clause.

SQL Code

PROC SQL;
  SELECT manuname,
         SUM(invoice.invprice) AS Total_Invoice_Amt
							1
             FORMAT=DOLLAR10.2
    FROM MANUFACTURERS LEFT JOIN  INVOICE
      ON manufacturers.manunum =
         invoice.manunum
        GROUP BY MANUNAME;
							2
QUIT;

The SUM function computes the total invoice amount for each manufacturer.

The GROUP BY clause groups all rows associated with a manufacturer into a single row.

The results show that manufacturers with no activity have a null or missing value in the aggregated Total_Invoice_Amt column.

Results

                    The SAS System

                                           1
													2                           Total_
        Manufacturer Name           Invoice_Amt
        _______________________________________
        Cupid Computer                $9,600.00
        Global Comm Corp                $245.00
        KPL Enterprises              $25,789.00
        San Diego PC Planet                 .
        Storage Devices Inc                 .
        World Internet Corp           $1,598.00


7.6.2. Right Outer Joins

Right joins are similar to left joins, except the rows in the right (second) table are preserved. Consequently, the results will contain the rows of the symmetric join plus a row for each unmatched row in the right table. Nulls are automatically substituted for values from the left table. Suppose you want to see all manufacturers and their respective products. In the next example, a simple report containing products, product type, manufacturer number, and manufacturer name is produced from the PRODUCTS and MANUFACTURERS tables using a right outer join construct.

SQL Code

PROC SQL;
   SELECT prodname, prodtype,
          products.manunum, manuname
     FROM PRODUCTS RIGHT JOIN MANUFACTURERS     1
						ON products.manunum =
						manufacturers.manunum;
QUIT;

The RIGHT JOIN specification preserves all the rows in the right table (MANUFACTURERS) even when there are no matching rows in the left table (PRODUCTS).

The results show that manufacturers appearing in the MANUFACTURERS table with no products listed in the PRODUCTS table have null or missing values in the Product Name, Product Type, and Manufacturer Number columns.

Note:To remove rows with missing values in the results, a WHERE clause could be specified.

Results

                               The SAS System
                                                  

												1
												 Manufacturer   Manufacturer
     Product Name            Product Type       
 Number   Name
     _____________________________________________
___________________________
     Dream Machine           Workstation          
  111   Cupid Computer
     Business Machine        Workstation          
  120   Storage Devices Inc
     Digital Cell Phone      Phone                
  210   Global Comm Corp
     Analog Cell Phone       Phone                
  210   Global Comm Corp
     Spreadsheet Software    Software             
  500   KPL Enterprises
     Graphics Software       Software             
  500   KPL Enterprises
     Wordprocessor Software  Software             
  500   KPL Enterprises
     Database Software       Software             
  500   KPL Enterprises
                                                  
    .   World Internet Corp
                                                  
    .   San Diego PC Planet


7.6.3. Full Outer Joins

Full outer joins combine the power of left and right joins by preserving rows from both the left and right tables. Although a full join is not used as frequently as left join or right join constructs, it can be useful when information from both tables is missing. In the next example, a full outer join is specified to produce a report containing manufacturers with no products and products with no known manufacturers.

SQL Code

PROC SQL;
  SELECT prodname, prodtype,
         products.manunum, manuname
    FROM PRODUCTS FULL JOIN MANUFACTURERS     1
						ON products.manunum =
						manufacturers.manunum;
QUIT;

The full join specification preserves all the rows in the left table (PRODUCTS) as well as all rows in the right table (MANUFACTURERS) even when there are no matching rows.

Results

                                The SAS System
                                                  

												1
												 Manufacturer
Product Name               Product Type           
 Number   Manufacturer Name
__________________________________________________
___________________________
Dream Machine              Workstation            
    111   Cupid Computer
Business Machine           Workstation            
    120   Storage Devices Inc
Travel Laptop              Laptop                 
    170
Digital Cell Phone         Phone                  
    210   Global Comm Corp
Analog Cell Phone          Phone                  
    210   Global Comm Corp
Office Phone               Phone                  
    220
Spreadsheet Software       Software               
    500   KPL Enterprises
Graphics Software          Software               
    500   KPL Enterprises
Wordprocessor Software     Software               
    500   KPL Enterprises
Database Software          Software               
    500   KPL Enterprises
                                                  
      .   World Internet Corp
                                                  
      .   San Diego PC Planet


..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
3.142.133.54