Using Outer Joins

Introducing Types of Outer Joins

An outer join combines and displays all rows that match across tables, based on the specified matching criteria (also known as join conditions), plus some or all of the rows that do not match. You can think of an outer join as an augmentation of an inner join: an outer join returns all rows generated by an inner join, plus additional (nonmatching) rows.
Type of Outer Join
Output
Left
All matching rows plus nonmatching rows from the first table specified in the FROM clause (the left table)
Venn Diagram, Left Outer Join
Right
All matching rows plus nonmatching rows from the second table specified in the FROM clause (the right table)
Venn Diagram, Right Outer Join
Full
All matching rows plus nonmatching rows in both tables
Venn Diagram, Full Outer Join
The syntax of an outer join is shown below.
General form, SELECT statement for outer join:
SELECT column-1<,...column-n>
FROM table-1 | view-1
LEFT JOIN | RIGHT JOIN | FULL JOIN
table-2 | view-2
ON join-condition(s)
<other clauses>;
Here is an explanation of the syntax:
LEFT JOIN, RIGHT JOIN, FULL JOIN
are keywords that specify the type of outer join.
ON
specifies join-condition(s), which are expression(s) that specify the column or columns on which the tables are to be joined.
<other clauses>
refers to optional PROC SQL clauses.
Note: To further subset the rows in the query output, you can follow the ON clause with a WHERE clause. The WHERE clause subsets the individual detail rows before the outer join is performed. The ON clause then specifies how the remaining rows are to be selected for output.
Note: You can perform an outer join on only two tables or views at a time. Views are covered later in this chapter.
Consider how each type of outer join works.

Using a Left Outer Join

A left outer join retrieves all rows that match across tables, based on the specified matching criteria (join conditions), plus nonmatching rows from the left table (the first table specified in the FROM clause).
Suppose you are using the following PROC SQL left join to combine the two tables One and Two. The join condition is stated in the expression following the ON keyword. The two tables and the three rows of output are shown below:
proc sql;
    select *
       from one
        left join
        two
       on one.x=two.x;
Left Outer Join Output
In each row of output, the first two columns correspond to table One (the left table) and the last two columns correspond to table Two (the right table).
Because this is a left join, all rows (both matching and nonmatching) from table One (the left table) are included in the output (the first two columns). Rows from table Two (the right table) are displayed in the output (the last two columns) only if they match a row from table One.
In this example, the second row of output is the only row in which the row from table One matched a row from table Two, based on the matching criteria (join conditions) specified in the ON clause. In the first and third rows of output, the row from table One had no matching row in table Two.
Note: In all three types of outer joins (left, right, and full), the columns in the result (combined) row that are from the unmatched row are set to missing values.
To eliminate one of the duplicate columns (in this case, X) in any outer join, as shown earlier with an inner join, you can modify the SELECT clause to list the specific columns that is displayed. Here, the SELECT clause from the preceding query has been modified to remove the duplicate X column:
proc sql;
    select one.x, a, b
       from one
       left join
       two
       on one.x=two.x;
Left Outer Join Output, Duplicate Columns Removed

Using a Right Outer Join

A right outer join retrieves all rows that match across tables, based on the specified matching criteria (join conditions), plus nonmatching rows from the right table (the second table specified in the FROM clause).
Consider what happens when you use a right join to combine the two tables used in the previous example. The following PROC SQL query uses a right join to combine rows from One and Two, based on the join conditions specified in the ON clause:
proc sql;
    select *
       from one
       right join
       two
       on one.x=two.x;
Right Outer Join Output
In each row of output, the first two columns correspond to table One (the left table) and the last two columns correspond to table Two (the right table).
Because this is a right join, all rows (both matching and nonmatching) from table Two (the right table) are included in the output (the last two columns). Rows from table One (the left table) are displayed in the output (the first two columns) only if they match a row from table Two.
In this example, there is only one row in table One that matches a value of X in table Two, and these two matching rows combine to form the first row of output. In the remaining rows of output, there is no match and the columns corresponding to table One are set to missing values.

Using a Full Outer Join

A full outer join retrieves both matching rows and nonmatching rows from both tables.
Combine the same two tables again, this time using a full join. The PROC SQL query, the tables, and the output are shown below:
proc sql;
   select *
      from one 
      full join
      two
      on one.x=two.x;
Full Outer Join Output
Because this is a full join, all rows (both matching and nonmatching) from both tables are included in the output. There is only one match between table One and table Two, so only one row of output displays values in all columns. All remaining rows of output contain only values from table One or table Two. The remaining columns are set to missing values.

Example: Outer Join

Now that you have seen how the three types of outer joins work, consider a realistic situation requiring the use of an outer join.
Suppose you want to list all of an airline's flights that were scheduled for March, along with corresponding delay information (if it exists). Each flight is identified by both a flight date and a flight number. Your output should display the following data: flight date, flight number, destination, and length of delay in minutes.
The data that you need is stored in the two tables shown below. The applicable columns from each table are identified.
Table
Relevant Columns
Sasuser.Marchflights
Date, FlightNumber, Destination
Sasuser.Flightdelays
Date, FlightNumber, Destination, Delay
Your output should include the columns that are listed above and all of the following rows:
  • rows that have matching values of Date and FlightNumber across the two tables
  • rows from Sasuser.Marchflights that have no matching row in Sasuser.Flightdelays.
To generate the output that you want, the following PROC SQL query uses a left outer join. Sasuser.Marchflights is specified as the left (first) table.
proc sql outobs=20;
title 'All March Flights';
   select m.date,
          m.flightnumber 
	             label='Flight Number',
          m.destination
          label='Left',
          f.destination
          label='Right',
          delay
          label='Delay in Minutes'
      from sasuser.marchflights as m 
           left join 
           sasuser.flightdelays as f
      on m.date=f.date
         and m.flightnumber=
             f.flightnumber
      order by delay;
Notice the following:
  • The SELECT clause eliminates the duplicate Date and FlightNumber columns by specifying their source as Sasuser.Marchflights. However, the SELECT clause list specifies the Destination columns from both tables and assigns a table alias to each to distinguish between them.
  • The ON clause contains two join conditions, which match the tables on the two columns Date and FlightNumber.
The query output is shown below.
query output
The first 12 rows of output display rows from Sasuser.Marchflights (the left table) that have no matching rows in Sasuser.Flightdelays. Therefore, in these 12 rows, the last 2 columns are set to missing values.
Note: The same results could be generated by using a right outer join. Sasuser.Marchflights is specified as the right (second) table.
..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
52.15.129.253