Join Types

So far, you have used only simple joins known as inner joins or equijoins. You'll now take a look at three additional join types: the self join, the natural join, and the outer join.

Creating Self-Joins

As I mentioned earlier, one of the primary reasons to use table aliases is to be able to refer to the same table more than once in a single SELECT statement. An example will demonstrate this.

Suppose you wanted to send a mailing to all the customer contacts who work for the same company for which Jim Jones works. This query requires that you first find out which company Jim Jones works for, and next which customers work for that company. The following is one way to approach this problem:

SELECT cust_id, cust_name, cust_contact
FROM Customers
WHERE cust_name = (SELECT cust_name
                   FROM Customers
                   WHERE cust_contact = 'Jim Jones'),

cust_id                 cust_name          cust_contact
--------                --------------     --------------
1000000003              Fun4All            Jim Jones
1000000004              Fun4All            Denise L. Stephens

This first solution uses subqueries. The inner SELECT statement does a simple retrieval to return the cust_name of the company that Jim Jones works for. That name is the one used in the WHERE clause of the outer query so that all employees who work for that company are retrieved. (You learned all about subqueries in Lesson 11. Refer to that lesson for more information.)

Now look at the same query using a join:

SELECT c1.cust_id, c1.cust_name, c1.cust_contact
FROM Customers AS c1, Customers AS c2
WHERE c1.cust_name = c2.cust_name
 AND c2.cust_contact = 'Jim Jones';

cust_id            cust_name            cust_contact
-------            -----------          --------------
1000000003         Fun4All              Jim Jones
1000000004         Fun4All              Denise L. Stephens

Here is the Oracle version of this statement:

SELECT c1.cust_id, c1.cust_name, c1.cust_contact
FROM Customers c1, Customers c2
WHERE c1.cust_name = c2.cust_name
 AND c2.cust_contact = 'Jim Jones';

The two tables needed in this query are actually the same table, and so the Customers table appears in the FROM clause twice. Although this is perfectly legal, any references to table Customers would be ambiguous because the DBMS does not know which Customers table you are referring to.

To resolve this problem table aliases are used. The first occurrence of Customers has an alias of C1, and the second has an alias of C2. Now those aliases can be used as table names. The SELECT statement, for example, uses the C1 prefix to explicitly state the full name of the desired columns. If it did not, the DBMS would return an error because there are two columns named cust_id, cust_name, and cust_contact. It cannot know which one you want (even though, in truth, they are one and the same). The WHERE clause first joins the tables, and then it filters the data by cust_contact in the second table to return only the desired data.

Tip

Self Joins Instead of Subqueries Self joins are often used to replace statements using subqueries that retrieve data from the same table as the outer statement. Although the end result is the same, many DBMSs process joins far more quickly than they do subqueries. It is usually worth experimenting with both to determine which performs better.


Natural Joins

Whenever tables are joined, at least one column will appear in more than one table (the columns being joined). Standard joins (the inner joins that you learned about in the last lesson) return all data, even multiple occurrences of the same column. A natural join simply eliminates those multiple occurrences so that only one of each column is returned.

How does it do this? The answer is it doesn't—you do it. A natural join is a join in which you select only columns that are unique. This is typically done using a wildcard (SELECT *) for one table and explicit subsets of the columns for all other tables. The following is an example:

SELECT C.*, O.order_num, O.order_date, OI.prod_id, OI. quantity, OI.item_price
FROM Customers AS C, Orders AS O, OrderItems AS OI
WHERE C.cust_id = O.cust_id
 AND OI.order_num = O.order_num
 AND prod_id = 'RGAN01';

Oracle users should use the following syntax (minus the AS keyword used above):

SELECT C.*, O.order_num, O.order_date, OI.prod_id, OI. quantity, OI.item_price
FROM Customers C, Orders O, OrderItems OI
WHERE C.cust_id = O.cust_id
 AND OI.order_num = O.order_num
 AND prod_id = 'RGAN01';

In this example, a wildcard is used for the first table only. All other columns are explicitly listed so that no duplicate columns are retrieved.

The truth is, every inner join you have created thus far is actually a natural join, and you will probably never even need an inner join that is not a natural join.

Outer Joins

Most joins relate rows in one table with rows in another. But occasionally, you will want to include rows that have no related rows. For example, you might use joins to accomplish the following tasks:

  • Count how many orders were placed by each customer, including customers that have yet to place an order

  • List all products with order quantities, including products not ordered by anyone

  • Calculate average sale sizes, taking into account customers that have not yet placed an order

In each of these examples, the join includes table rows that have no associated rows in the related table. This type of join is called an outer join.

Caution

Syntax Differences It is important to note that the syntax used to create an outer join can vary slightly among different SQL implementations. The various forms of syntax described in the following section cover most implementations, but refer to your DBMS documentation to verify its syntax before proceeding.


The following SELECT statement is a simple inner join. It retrieves a list of all customers and their orders:

SELECT Customers.cust_id, Orders.order_num
FROM Customers, Orders
WHERE Customers.cust_id = Orders.cust_id;

To retrieve a list of all customers, including those who have placed no orders, you can do the following:

SELECT Customers.cust_id, Orders.order_num
FROM Customers, Orders
WHERE Customers.cust_id *= Orders.cust_id;

cust_id         order_num
----------      ---------
1000000001      20005
1000000001      20009
1000000002      NULL
1000000003      20006
1000000004      20007
1000000005      20008

The only difference between this SELECT statement and the previous one is the operator in the WHERE clause. Instead of testing for equality with a =, the *= operator is used to specify that every row in the Customers table should be included. *= is the left outer join operator. It retrieves all the rows from the left table.

The opposite of the left outer join is the right outer join specified by the =* operator. It can be used to return all rows from the table listed to the right of the operator, as seen in this next example:

SELECT Customers.cust_id, Orders.order_num
FROM Customers, Orders
WHERE Orders.cust_id =* Customers.cust_id;

As explained in the last lesson, some DBMSs require the use of a slightly different join syntax. The following example of the same SELECT statement uses explicit OUTER JOIN syntax instead of a WHERE clause operator. The end result is the same, however.

SELECT Customers.cust_id, Orders.order_num
FROM Customers LEFT OUTER JOIN Orders
 ON Customers.cust_id = Orders.cust_id;

Like the inner join seen in the last lesson, this SELECT statement uses the keywords OUTER JOIN to specify the join type (instead of specifying it in the WHERE clause). But unlike inner joins, which relate rows in both tables, outer joins also include rows with no related rows. And as seen in the preceding example, when creating an outer join, you must specify the table from which you wish to include all the rows. In the previous examples, you used *= and =* for this. When using OUTER JOIN syntax, however, you must use the RIGHT or LEFT keywords. The previous example uses LEFT OUTER JOIN to select all the rows from the table on the left in the FROM clause (the Customers table). To select all the rows from the table on the right, you use a RIGHT OUTER JOIN as seen in this next example:

SELECT Customers.cust_id, Orders.order_num
FROM Customers RIGHT OUTER JOIN Orders
 ON Orders.cust_id = Customers.cust_id;

Yet another form of the OUTER JOIN syntax (used by Oracle) requires the use of a + character after the table name as follows:

SELECT Customers.cust_id, Orders.order_num
FROM Customers, Orders
WHERE Customers.cust_id = Orders.cust_id

There is one other variant of the outer join, and that is the FULL OUTER JOIN that retrieves all rows from both tables and relates those that can be related. Unlike a LEFT OUTER JOIN or RIGHT OUTER JOIN, which includes unrelated rows from a single table, the FULL OUTER JOIN includes unrelated rows from both tables. The syntax for a FULL OUTER JOIN is as follows:

Tip

Outer Join Types Regardless of the form of outer join used, there are always two basic forms of outer joins—the LEFT OUTER JOIN and the RIGHT OUTER JOIN. The only difference between them is the order of the tables that they are relating. In other words, a LEFT OUTER JOIN can be turned into a RIGHT OUTER JOIN simply by reversing the order of the tables in the FROM or WHERE clause. As such, the two types of outer join can be used interchangeably, and the decision about which one is used is based purely on convenience.


SELECT Customers.cust_id, Orders.order_num
FROM Orders FULL OUTER JOIN Customers
 ON Orders.cust_id = Customers.cust_id;

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
3.21.93.168