Using Outer Joins

A Brief Overview

An outer join returns nonmatching rows as well as matching rows.
Type of Outer Join
Output
Left
Returns all rows from the left table (first table) and matching rows from the right table (second table) that are specified in the FROM clause.
Venn Diagram, Left Outer Join
Right
Returns all rows from the right table (second table) and matching rows from the left table (first table).
Venn Diagram, Right Outer Join
Full
Returns all matching and nonmatching rows from all the tables.
Venn Diagram, Full Outer Join

Outer Join Syntax

Syntax, SELECT statement for outer join:
SELECT column-1<,...column-n>
FROM table-1 | view-1 LEFT JOIN | RIGHT JOIN | FULL JOIN table-2 | view-2
ON table1.column = table2.column
<other clauses>;
LEFT JOIN, RIGHT JOIN, FULL JOIN
are keywords that specify the type of outer join.
table
specifies the name of the source table.
ON
specifies join conditions, which are expressions that specify the column or columns on which the tables are to be joined.
table.column
refers to the source table and the column name on which the join occurs.
<other clauses>
refers to optional PROC SQL clauses.
Note: To further subset the rows in the query output, you can follow the ON clause with a WHERE clause. The WHERE clause subsets the individual detail rows before the outer join is performed. The ON clause then specifies how the remaining rows are to be selected for output.
Note: You can perform an outer join on only two tables or views at a time.

Example: Using a Left Outer Join

A left outer join retrieves all rows that match across tables, based on the join conditions, plus nonmatching rows from the left table (the first table specified in the FROM clause).
Suppose you are using a PROC SQL left join to combine the two tables One and Two. The join condition is stated in the expression following the ON keyword.
proc sql;
   select *
      from certadv.one left join
           certadv.two 
           on one.x=two.x
;
quit;
The two tables and the three rows of output are shown below:
Output 3.10 PROC SQL Query Result: Left Outer Join Output
PROC SQL Query Result: Left Outer Join Output
In each row of output, the first two columns correspond to table One (the left table) and the last two columns correspond to table Two (the right table).
Because this is a left join, all rows, both matching and nonmatching, from table One are included in the output. Rows from table Two are included in the output only if they match a row from table One. In the output, the rows from table One are the first two columns and the rows from table Two are the last two columns.
The second row of output is the only row in which the row from table One matched a row from table Two, based on the join conditions specified in the ON clause. In the first and third rows of output, the row from table One had no matching row in table Two.
Note: In all three types of outer joins (left, right, and full), the columns in the result (combined) row that are from the unmatched row are set to missing values.

Example: Eliminating Duplicate Columns in a Left Outer Join

To eliminate one of the duplicate columns in any outer join, you can modify the SELECT clause to list the specific columns that will be displayed. The SELECT clause from the preceding query has been modified to remove the duplicate X column.
proc sql;
   select one.x, a, b
      from certadv.one left join
           certadv.two
           on one.x=two.x
;
quit;
Output 3.11 PROC SQL Query Result: Left Outer Join without Duplicate Rows
PROC SQL Query Result: Left Outer Join Without Duplicate Rows

Example: Using a Right Outer Join

A right outer join retrieves all rows that match across tables, based on the join conditions, plus nonmatching rows from the second table that are specified in the FROM clause. The following PROC SQL query uses a right join to combine rows from tables One and Two, based on the join conditions that were specified in the ON clause.
proc sql;
   select *
      from certadv.one right join
           certadv.two
           on one.x=two.x
;
quit;
Output 3.12 PROC SQL Query Result: Right Outer Join Output
PROC SQL Query Result: Right Outer Join Output
In each row of output, the first two columns correspond to table One and the last two columns correspond to table Two.
Because this is a right join, all rows, both matching and nonmatching, from table Two are included in the output. Rows from table One are displayed in the output only if they match a row from table Two. There is only one row in table One that matches a value of X in table Two, and these two matching rows combine to form the first row of output. In the remaining rows of output, there is no match and the columns corresponding to table One are set to missing values.

Example: Using a Full Outer Join

A full outer join retrieves both matching rows and nonmatching rows from both tables.
proc sql;
   select *
      from certadv.one full join
           certadv.two
           on one.x=two.x
;
quit;
Output 3.13 PROC SQL Query Result: Full Outer Join Output
PROC SQL Query Result: Full Outer Join Output
Because this is a full join, all rows, both matching and nonmatching, from both tables are included in the output. There is only one match between table One and table Two, so only one row of output displays values in all columns. All remaining rows of output contain only values from table One or table Two, with the remaining columns set to missing values.

Example: Complex Outer Join

Suppose you want to list all of an airline's flights that were scheduled for March, along with any available corresponding delay information. Each flight is identified by both a flight date and a flight number. Your output should display the following data: flight date, flight number, destination, and length of delay in minutes.
The data that you need is stored in the two tables shown below. The applicable columns from each table are identified.
Table
Relevant Columns
Certadv.Marchflights
Date, FlightNumber, Destination
Certadv.Flightdelays
Date, FlightNumber, Destination, Delay
Your output should include the columns that are listed above and all of the following rows:
  • rows that have matching values of Date and FlightNumber across the two tables
  • rows from Certadv.Marchflights that have no matching row in Certadv.Flightdelays
To generate the output that you want, the following PROC SQL query uses a left outer join, with Certadv.Marchflights specified as the left table.
proc sql outobs=20;
title 'All March Flights';
   select m.date,                                  /*1*/
          m.flightnumber label='Flight Number',
          m.destination  label='Left',
          f.destination  label='Right',
          delay label='Delay in Minutes'
   from certadv.marchflights as m left join        /*2*/
        certadv.flightdelays as f
      on m.date=f.date                             /*3*/
         and m.flightnumber=f.flightnumber
   order by delay;                                 /*4*/
quit;
1 The SELECT clause eliminates the duplicate Date and FlightNumber columns by specifying their source as Certadv.Marchflights. However, the SELECT clause list specifies the Destination columns from both tables and assigns a table alias to each to distinguish between them.
2 The FROM clause lists the tables to select from. The FROM clause uses the AS keyword to distinguish a table alias from other table names. The FROM clause also specifies the LEFT JOIN keyword to specify the join type.
3 The ON clause contains two join conditions, which match the tables on the two columns Date and FlightNumber.
4 The ORDER BY clause specifies the order in which rows are displayed in the result table.
The first 12 rows in the output display rows from Certadv.Marchflights. The rows that have no matching values with Certadv.Flightdelays display missing values for those columns. Therefore, the first 12 rows have missing values.
Output 3.14 PROC SQL Query Result: Outer Join
PROC SQL Query Result: Outer Join
Note: The same results could be generated by using a right outer join, with Certadv.Marchflights specified as the right (second) table.
Last updated: October 16, 2019
..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
3.144.193.231