7.5. Inner Joins

As was mentioned earlier, inner joins can handle a maximum of 32 tables at a time, and are the most recognized and widely used type of join. They are principally used to restrict rows where the specific search condition is not met. As a result, only rows satisfying the conditions specified in the WHERE clause are kept. This is in direct contrast with outer joins (discussed in a later section).

7.5.1. Equijoins

The most common form of inner join often referred to as an equijoin uses an equal sign “=” in the WHERE clause to indicate equality between the columns in two or more tables. Suppose you wanted to match products with their corresponding manufacturers so that all products from each manufacturer would be listed. An equijoin is performed to equate the manufacturer number from tables PRODUCTS and MANUFACTURERS.

SQL Code

PROC SQL;
  SELECT prodname, prodcost,
         manufacturers.manunum, manuname
             1
						2
						FROM PRODUCTS, MANUFACTURERS
						WHERE products.manunum =
						3
						manufacturers.manunum;
QUIT;

The PRODUCTS table is the first table specified in the FROM clause.

The MANUFACTURERS table is the second table specified in the FROM clause.

The specification of an equal sign “=” in a WHERE clause between the columns in the tables indicates an equality type of join.

Results

                           The SAS System

                          Product  Manufacturer
  Product Name               Cost  Number 
 Manufacturer Name
  ________________________________________________
_________________
  Dream Machine         $3,200.00     111  Cupid
 Computer
  Business Machine      $3,300.00     120  Storage
 Devices Inc
  Analog Cell Phone        $35.00     210  Global
 Comm Corp
  Digital Cell Phone      $175.00     210  Global
 Comm Corp
  Spreadsheet Software    $299.00     500  KPL
 Enterprises
  Database Software       $399.00     500  KPL
 Enterprises
  Wordprocessor Software  $299.00     500  KPL
 Enterprises
  Graphics Software       $299.00     500  KPL
 Enterprises


The previous example can be further qualified by adding another condition in the WHERE clause. For example, suppose you wanted to display only those products from the manufacturer KPL Enterprises. The following join identifies all the products manufactured by KPL Enterprises as specified in the WHERE clause (all rows not meeting the condition of the WHERE clause are automatically excluded from the results of the join).

Note:This join assumes you know KPL Enterprises’s unique manufacturer number.

SQL Code

PROC SQL;
  SELECT prodname, prodcost,
         manufacturers.manunum, manuname
    FROM PRODUCTS, MANUFACTURERS
      WHERE products.manunum =
						1
						manufacturers.manunum AND
						products.manunum = 500;
QUIT;

The specification of the AND logical operator in the WHERE clause indicates that both conditions must be true in order to retrieve rows from both tables.

Results

                           The SAS System

                          Product  Manufacturer
  Product Name               Cost  Number 
 Manufacturer Name
  ________________________________________________
_______________
  Spreadsheet Software    $299.00      500  KPL
 Enterprises
  Database Software       $399.00      500  KPL
 Enterprises
  Wordprocessor Software  $299.00      500  KPL
 Enterprises
  Graphics Software       $299.00      500  KPL
 Enterprises


Let’s extend our knowledge of equijoins a bit further by identifying how much money is tied up with products manufactured by KPL Enterprises. To accomplish this, you need to do two things. First, you need to sum the product cost (PRODCOST) column across all rows that match the WHERE clause condition. Because the objective of the equijoin is to compute a total amount for products manufactured by KPL Enterprises, you need to prevent duplicate rows from displaying in the result. To do so, specify the DISTINCT keyword.

SQL Code

PROC SQL;
  SELECT DISTINCT SUM(prodcost) AS Total_Cost
						1
						FORMAT=DOLLAR10.2,
         manufacturers.manunum
    FROM PRODUCTS, MANUFACTURERS
      WHERE products.manunum =
						manufacturers.manunum AND
						manufacturers.manuname = 'KPL Enterprises';

QUIT;

The DISTINCT keyword prevents duplicate rows from appearing in the result.

Results

                   The SAS System

          Total_Cost    Manufacturer Name
          _____________________________________

           $1,296.00    KPL Enterprises


7.5.2. Non-Equijoins

Another type of inner join is known as a non-equijoin. As you might guess from its name, a non-equijoin does not have an equal sign “=” specified in its WHERE clause. For example, suppose you want to display products manufactured by KPL Enterprises that cost more than $299.00. The use of the greater than “>”operator gives this type of join its name.

Note:When the SQL procedure optimizer is unable to optimize a join query by reducing the Cartesian product, a message is displayed in the SAS log indicating that the join requires performing one or more Cartesian product joins and cannot be optimized.

SQL Code

PROC SQL;
  SELECT prodname, prodtype, prodcost,
         manufacturers.manunum, manufacturers.manuname
    FROM PRODUCTS, MANUFACTURERS
      WHERE manufacturers.manunum = 500 AND
						prodtype = 'Software' AND
						prodcost > 299.00;
						1
QUIT;

The specification of the greater than “>”operator in the WHERE clause indicates a non-equijoin scenario.

SAS Log Results

   PROC SQL;
     SELECT prodname, prodtype, prodcost,
            manufacturers.manunum, manufacturers
.manuname
       FROM PRODUCTS, MANUFACTURERS
         WHERE manufacturers.manunum = 500 AND
               prodtype = 'Software' AND
               prodcost > 299.00;
NOTE: The execution of this query involves
 performing one or more Cartesian
product joins that can not be optimized.
   QUIT;
NOTE: PROCEDURE SQL used:
      real time           0.01 seconds
      cpu time            0.01 seconds


Results

                             The SAS System

                     Product        Product 
 Manufacturer  Manufacturer
  Product Name       Type              Cost       
 Number  Name
  ________________________________________________
________________________
  Database Software  Software       $399.00       
    500  KPL Enterprises


7.5.3. Reflexive or Self Joins

The final type of inner join is referred to as a reflexive join, or as it is sometimes called by practitioners a self join. As its name implies, a self join makes an internal copy of a table and joins the copy to itself. Essentially a join of this type joins one copy of a table to itself for the purpose of exploiting and illustrating comparisons between table values. For example, suppose you want to compare the prices of products side-by-side by product type with the less expensive product appearing first (in the first three columns of example result below).

SQL Code

PROC SQL;
  SELECT products.prodname, products.prodtype,
						products.prodcost,
						products_copy.prodname, products_copy.prodtype,
						products_copy.prodcost
						1
						2
						FROM PRODUCTS, PRODUCTS PRODUCTS_COPY
						WHERE products.prodtype =
						3
						products_copy.prodtype AND
						products.prodcost <
						products_copy.prodcost;
QUIT;

The PRODUCTS table is the primary table specified in the FROM clause.

A copy of the PRODUCTS table called PRODUCTS_COPY is joined with the PRODUCTS table.

The WHERE clause requests the same type of products to be compared side-by-side with the less expensive product appearing first.

Results

                                      The SAS System

                     1                 Product    

												2                  Product
 Product Name          Product Type      Cost  
 Product Name           Product Type       Cost
 _________________________________________________
_____________________________________________
3 Dream Machine         Workstation  $3,200.00  
Business Machine       Workstation    $3,300.00
 Analog Cell Phone      Phone           $35.00 
 Digital Cell Phone     Phone            $175.00
 Analog Cell Phone      Phone           $35.00 
 Office Phone           Phone            $130.00
 Office Phone           Phone          $130.00 
 Digital Cell Phone     Phone            $175.00
 Spreadsheet Software   Software       $299.00 
 Database Software      Software         $399.00
 Wordprocessor Software Software       $299.00 
 Database Software      Software         $399.00
 Graphics Software      Software       $299.00 
 Database Software      Software         $399.00


Looking at another example, suppose you want to find out the names and invoice amounts where, for each customer, you list the names and invoice amounts of each customer with larger invoice amounts. The next example illustrates a very useful application of a self join.

SQL Code

PROC SQL;
  SELECT invoice.custnum, invoice.invprice,
						invoice_copy.custnum, invoice_copy.invprice
						1
						2
						FROM INVOICE, INVOICE INVOICE_COPY
						WHERE invoice.invprice <
						3
						invoice_copy.invprice;
QUIT;

The INVOICE table is the primary table specified in the FROM clause.

A copy of the INVOICE table called INVOICE_COPY is joined with the INVOICE table.

The WHERE clause produces names of customers with larger invoice amounts.

Results

                     The SAS System

     Customer  Invoice Unit  Customer  Invoice Unit
       Number         Price    Number         Price
    _______________________________________________
          201     $1,495.00      1301     $1,598.00
          201     $1,495.00       501     $9,600.00
          201     $1,495.00       401    $23,100.00
         1301     $1,598.00       501     $9,600.00
         1301     $1,598.00       401    $23,100.00
          101       $245.00       201     $1,495.00
          101       $245.00      1301     $1,598.00
          101       $245.00       501     $9,600.00
          101       $245.00       801       $798.00
          101       $245.00       901       $396.00
          101       $245.00       401    $23,100.00
          501     $9,600.00       401    $23,100.00
          801       $798.00       201     $1,495.00
          801       $798.00      1301     $1,598.00
          801       $798.00       501     $9,600.00
          801       $798.00       401    $23,100.00
          901       $396.00       201     $1,495.00
          901       $396.00      1301     $1,598.00
          901       $396.00       501     $9,600.00
          901       $396.00       801       $798.00
          901       $396.00       401    $23,100.00


7.5.4. Using Table Aliases in Joins

Every table in a SAS library must have a unique name to reference it. Table names must conform to valid SAS naming conventions having a maximum length of 32 characters and starting with a letter or underscore (see the SAS Language Reference: Concepts for further details).

To minimize the number of keystrokes when referencing the tables specified in a join query, you can assign an alias or temporary table name reference to each table. When assigned, these arbitrary aliases provide a short-cut method to the tables themselves and are in effect for the duration of the join query but no longer. In the next example, the table alias “P” is assigned to the PRODUCTS table and the alias “M” is assigned to the MANUFACTURERS table in the FROM clause. Table name references in the SELECT statement and WHERE clause are made easier as well.

SQL Code

PROC SQL;
  SELECT prodnum, prodname, prodtype, M.manunum
    FROM PRODUCTS  P, MANUFACTURERS M
						1
						WHERE P.manunum  = M.manunum AND
						M.manuname = 'KPL Enterprises';
QUIT;

The assignment of the table alias “P” and the table alias “M” in the FROM clause provides a short-cut method of referencing the longer table names PRODUCTS and MANUFACTURERS.

Results

                         The SAS System

    Product                                       
      Manufacturer
     Number  Product Name              Product
 Type           Number
    ______________________________________________
__________________
       5001  Spreadsheet Software      Software   
               500
       5002  Database Software         Software   
               500
       5003  Wordprocessor Software    Software   
               500
       5004  Graphics Software         Software   
               500


7.5.5. Performing Computations in Joins

Join queries, as with simpler queries, can take full advantage of the power of the SQL procedure. Logical and arithmetic operators, predicates, and summary functions are all available for you to use. The join query is an essential component because stored information is not always available in the form we need.

PROC SQL provides the ability to perform basic arithmetic operations such as addition, subtraction, multiplication, and division with columns containing numeric values. Essentially, this enables any query to perform column addition, subtraction, multiplication, and division. Suppose you had to compute the sales tax of 7.75% for all manufactured products sold in the state of California. In the next example, the SELECT statement shows the California sales tax (using the product cost column and the fixed sales tax percentage) computation, assigns a column alias to the result column as well as a format and label to enhance the readability of the result.

SQL Code

PROC SQL;
  SELECT prodname, prodtype, prodcost,
						prodcost * .0775 AS SalesTax
						1
						FORMAT=dollar10.2 LABEL='California Sales Tax'
    FROM PRODUCTS  P, MANUFACTURERS M
      WHERE P.manunum  = M.manunum AND
            M.manustat = 'CA';
QUIT;

The ability to perform basic arithmetic operations in a SELECT statement as well as assign a column alias to the result is part of the SQL ANSI standard.

Results

                          The SAS System
                                                  

												1
												 Product  California
   Product Name               Product Type        
 Cost   Sales Tax
   _______________________________________________
_________________
   Business Machine           Workstation     $3
,300.00     $255.75
   Analog Cell Phone          Phone             
 $35.00       $2.71
   Digital Cell Phone         Phone            
 $175.00      $13.56
   Spreadsheet Software       Software         
 $299.00      $23.17
   Database Software          Software         
 $399.00      $30.92
   Wordprocessor Software     Software         
 $299.00      $23.17
   Graphics Software          Software         
 $299.00      $23.17


7.5.6. Joins with Three Tables

Up to this point, our examples have been limited to two-table joins. But what if more information is needed than the two tables can provide? To extract the required information, access to a third table may be necessary. A join with three tables is a fairly simple extension of a two-table join.

As before, each joinable column must possess the same column attributes and contain the same type of information. Besides listing all required tables in the FROM clause, the WHERE clause would need to include any and all restrictions to subset only the rows desired. For example, suppose you want to display only those products along with their invoice quantity that appear in the INVOICE table for the manufacturer KPL Enterprises (manunum=500).

SQL Code

PROC SQL;
  SELECT P.prodname,
         P.prodcost,
         M.manuname,
         I.invqty
    FROM PRODUCTS P,
						MANUFACTURERS  M,
						INVOICE I
      WHERE P.manunum = M.manunum AND
            P.prodnum = I.prodnum AND
            M.manunum = 500;
QUIT;

Results

                         The SAS System

                                                  
         Invoice
                                                  
       Quantity
                          Product                 
        - Units
   Product Name              Cost    Manufacturer
 Name         Sold
   _______________________________________________
__________________
   Spreadsheet Software   $299.00    KPL
 Enterprises       5
   Database Software      $399.00    KPL
 Enterprises       2


Let’s examine the construction of the WHERE clause for this three-way join a bit further. The column containing the manufacturer number from the PRODUCTS, MANUFACTURERS, and INVOICE tables is joined using an AND logical operator in the WHERE clause. Additionally, the WHERE clause restricts the resulting table to only product invoices for manufacturer (manunum=500). In the next example, a three-way join lists the product names and costs, along with the customer who bought each product.

SQL Code

PROC SQL;
  SELECT P.prodname,
         P.prodcost,
         C.custname,
         I.invprice
    FROM PRODUCTS  P,
						INVOICE   I,
						CUSTOMERS C
      WHERE P.prodnum = I.prodnum AND
            I.custnum = C.custnum;
QUIT;

Results

                         The  SAS System

                       Product                    
               Invoice
  Product Name            Cost   Customer Name    
                 Price
  ________________________________________________
______________________
  Analog Cell Phone      $35.00  La Mesa Computer
 Land           $245.00
  Spreadsheet Software  $299.00  Vista Tech Center
             $1,495.00
  Business Machine    $3,300.00  La Jolla
 Computing           $23,100.00
  Dream Machine       $3,200.00  Alpine Technical
 Center       $9,600.00
  Database Software     $399.00  Jamul Hardware & 
Software       $798.00


7.5.7. Joins with More Than Three Tables

Occasionally, information needs to be extracted from four, five, or more tables (up to a maximum of 32 tables). Joins of four or more tables can be constructed just like those accessing two or three tables. The only difference is the number of table references in the FROM clause and the level of complexity in the WHERE clause to restrict what rows are kept. Suppose you want to know, based on invoices, the number of products ordered before September 1, 2000. One way to find this information is to perform a join with four tables.

SQL Code

PROC SQL;
  SELECT sum(inventory.invenqty)
						AS Products_Ordered_Before_09012000
    FROM PRODUCTS,
         INVOICE,
         CUSTOMERS,
         INVENTORY
      WHERE inventory.orddate < mdy(09,01,00) AND
            products.prodnum  = invoice.prodnum AND
            invoice.custnum   = customers.custnum AND
            invoice.prodnum   = inventory.prodnum;
QUIT;

Results

                      The SAS System

                         Products_
                   Ordered_Before_
                          09012000
                         _________
                                 8


If you were wondering whether this result could have been derived another way, you would be correct. You could also determine, based on invoices, the number of products ordered before September 1, 2000, with the following two-way join code. As can be seen, there is often more than one way to construct a join to extract the information you want.

SQL Code

PROC SQL;
  SELECT sum(inventory.invenqty)
						AS Products_Ordered_Before_09012000
    FROM INVOICE   I,
         INVENTORY I2
      WHERE inventory.orddate < mdy(09,01,00) AND
            invoice.prodnum   = inventory.prodnum;
QUIT;

Results

                      The SAS System

                         Products_
                   Ordered_Before_
                          09012000
                         _________
                                 8


To expand your understanding of joins with more than three tables, we will illustrate a four-table join. Suppose you want to know the products being purchased and who is purchasing them. The next example shows a four-way inner join that combines data from the MANUFACTURERS, PRODUCTS, INVOICE, and CUSTOMERS tables.

SQL Code

PROC SQL;
  SELECT products.prodname,
         products.prodtype,
         customers.custname,
         manufacturers.manuname
    FROM MANUFACTURERS,
         PRODUCTS,
         INVOICE,
         CUSTOMERS
      WHERE manufacturers.manunum = products.manunum  AND
            manufacturers.manunum = invoice.manunum   AND
            products.prodnum      = invoice.prodnum   AND
            invoice.custnum       = customers.custnum;
QUIT;

Results

                                  The SAS System

Product Name          Product Type  Customer Name 
             Manufacturer Name
--------------------------------------------------
---------------------------------
Analog Cell Phone     Phone         La Mesa
 Computer Land      Global Comm Corp
Spreadsheet Software  Software      Vista Tech
 Center          Incredible Software
Dream Machine         Workstation   Alpine
 Technical Center    Cupid Computer
Database Software     Software      Jamul Hardware
 & Software  KPL Enterprises


..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
3.139.80.209