Chapter 7. SELECT Statement: The FROM Clause

Introduction

The processing of a table expression begins at the FROM clause. In fact, this is the starting point of processing a table expression, which is why this clause is discussed in detail first.

In this chapter, we describe the basic features of the FROM clause. In previous chapters, you saw many examples of this clause. The FROM clause is an important clause because each table from which we “use” columns in the other clauses should be specified here. By “using,” we mean, for example, that a column appears in a condition or in the SELECT clause. Simply, in the FROM clause, we specify the tables from which the result of a table expression will be retrieved.

The FROM clause has many different forms. We start this chapter with the simplest form.

Table Specifications in the FROM Clause

The FROM clause is used for specifying which tables are to be queried. This is done by means of table references. A table reference consists of a table specification possibly followed by a pseudonym. Table specifications are discussed in this section; pseudonyms are discussed later in this chapter.

<from clause> ::=
   FROM <table reference> [ { , <table reference> 
Table Specifications in the FROM Clause}... ]

<table reference> ::=
   <table specification> [ [ AS ] <pseudonym> ]

<table specification> ::=
   [ <database name> . | <user name> . ] <table name>

<pseudonym> ::= <name>

A table specification normally consists of the name of a table, but you can specify the name of a view. In both cases, we will use the term table specification.

Within a table specification another name can be specified in front of a table name. Those two names are separated by a full stop, as in NAME1.PLAYERS. What the extra name refers to depends on the SQL product in use. For one group of products to which MySQL belongs, it has to be the name of an existing database. For other products, it is the name of the owner of the table. In most cases, that is the name of the user who created the table.

MySQL belongs to that second group of products, so let’s begin there. Each table is stored in a specific database. It also holds that an application must open a connection to access tables. And a connection always requires one current database. However, sometimes the need arises to stay within the connection and to query a table that is stored outside the current database. We do this by extending table specifications with database names.

Example 7.1. Create a new database called EXTRA with a new table called CITIES.

CREATE DATABASE EXTRA

USE EXTRA

CREATE TABLE CITIES
      (CITYNO      INTEGER NOT NULL PRIMARY KEY,
       CITYNAME    CHAR(20) NOT NULL)

INSERT INTO CITIES VALUES
   (1, 'Stratford')

INSERT INTO CITIES VALUES
   (2, 'Inglewood')

Explanation: Do not forget to change the current database into EXTRA with the USE statement after the CREATE DATABASE statement.

Example 7.2. Show the entire contents of the CITIES table; assume that a connection has been made to the TENNIS database.

SELECT   *
FROM     EXTRA.CITIES

ExplanationThe compound name EXTRA.CITIES is the table specification. (Note the full stop between the name of the database and the table name; this full stop is mandatory). We say that the table name CITIES is qualified with the database name EXTRA.

In fact, a table name can always be qualified, even when a table from the current database is queried.

Example 7.3. Show the contents of the TEAMS table.

SELECT   *
FROM     TENNIS.TEAMS

Products such as DB2 and Oracle belong to the other group. Here, we can qualify tables with the name of another user. So you can understand this, we first have to explain the concept owner.

Every table has an owner. In most cases, the SQL user who enters a CREATE TABLE statement is the owner of the table. Names of tables are unique within a user. In other words, two users can both create a table with the same name, but one user is not allowed to assign the same name to two of his tables. If SQL users want to access tables of others, they have to indicate in some way which table they intend to access. This means that they have to state the owner of the table that they want to access. For this purpose, the definition of the table specification has been extended.

In a FROM clause, if an SQL user wants to refer to a table created by someone else, the name of the owner must be specified in front of the table name. This is not required if that user is the owner of the table.

Example 7.4. JIM wants to retrieve the entire contents of the PENALTIES table, which has been created by BOB. (Assume that JIM has the authority to query this table.)

SELECT   *
FROM     BOB.PENALTIES

Explanation: The new compound name BOB.PENALTIES is the table specification. The table name PENALTIES has been qualified by the owner’s name BOB.

If user BOB wants to see the contents of the PENALTIES table, he could use the previous statement, but he might also leave out his own name:

SELECT   *
FROM     PENALTIES

Again, the Column Specification

In the previous section, we saw that, depending on the SQL product, a table can be qualified with the name of the database or with the name of the owner of the table. When specifying columns (in the SELECT clause, for example), you can also qualify them by specifying the table to which the columns belong. Each column specification consists of three parts; see the definition.

<column specification> ::=
   [ <table specification> . ] <column name>

<table specification> ::=
   [ <database name> . | <user name> . ] <table name>

The last part is the column name itself, such as PLAYERNO or NAME. This is the only mandatory part. The second part is the table name, such as PLAYERS or TEAMS. The first one is the name of the database or the user. You do not have to specify all these parts, but it is not wrong to do so.

Example 7.5. Find the number of each team. Here are three possible solutions; we assume that the TEAMS table is stored in the TENNIS database and that we use MySQL.

SELECT   TEAMNO
FROM     TEAMS

and

SELECT   TEAMS.TEAMNO
FROM     TEAMS

and

SELECT   TENNIS.TEAMS.TEAMNO
FROM     TENNIS.TEAMS

Multiple Table Specifications in the FROM Clause

Until now, we have used only one table specification in the FROM clause. If we want to present data from different tables in our result table, we must specify multiple tables in the FROM clause.

Example 7.6. Get the team number and the name of the captain of each team.

The TEAMS table holds information about team numbers and the player numbers of each team. However, the names of the captains are not stored in the TEAMS table but in the PLAYERS table. In other words, we need both tables. Both must be mentioned in the FROM clause.

SELECT   TEAMNO, NAME
FROM     TEAMS, PLAYERS
WHERE    TEAMS.PLAYERNO = PLAYERS.PLAYERNO

The intermediate result of the FROM clause is:

TEAMNO  PLAYERNO  DIVISION  PLAYERNO  NAME       ...
------  --------  --------  --------  ---------  ---
     1         6  first            6  Parmenter  ...
     1         6  first           44  Baker      ...
     1         6  first           83  Hope       ...
     1         6  first            2  Everett    ...
     1         6  first           27  Collins    ...
     1         6  first          104  Moorman    ...
     1         6  first            7  Wise       ...
     1         6  first           57  Brown      ...
     1         6  first           39  Bishop     ...
     1         6  first          112  Bailey     ...
     1         6  first            8  Newcastle  ...
     1         6  first          100  Parmenter  ...
     1         6  first           28  Collins    ...
     1         6  first           95  Miller     ...
     2        27  second           6  Parmenter  ...
     2        27  second          44  Baker      ...
     2        27  second          83  Hope       ...
     2        27  second           2  Everett    ...
     2        27  second          27  Collins    ...
     2        27  second         104  Moorman    ...
     2        27  second           7  Wise       ...
     2        27  second          57  Brown      ...
     2        27  second          39  Bishop     ...
     2        27  second         112  Bailey     ...
     2        27  second           8  Newcastle  ...
     2        27  second         100  Parmenter  ...
     2        27  second          28  Collins    ...
     2        27  second          95  Miller     ...

ExplanationEach row of the PLAYERS table is aligned “beside” each row of the TEAMS table. This results in a table in which the total number of columns equals the number of columns in one table plus the number of columns in the other table, and in which the total number of rows equals the number of rows in one table multiplied by the number of rows in the other table. We call this result the Cartesian product of the tables concerned.

In the WHERE clause, each row where the value in the TEAMS.PLAYERNO column equals the one in the PLAYERS.PLAYERNO column is selected:

TEAMNO  PLAYERNO  DIVISION  PLAYERNO  NAME       ...
------  --------  --------  --------  ---------  ---
     1         6  first            6  Parmenter  ...
     2        27  second          27  Collins    ...

The end result is:

TEAMNO  NAME
------  ---------
     1  Parmenter
     2  Collins

In this example, it is essential to specify the table name in front of the PLAYERNO column. Without qualifying the column name, it would be impossible for SQL to determine which column was intended.

Conclusion: If you use a column name that appears in more than one table specified in the FROM clause, it is mandatory to include a table specification with the column specification.

Example 7.7. For each penalty, find the payment number, the amount of the penalty, the player number, the name, and the initials of the player who incurred the penalty.

The payment numbers, the amounts, and the player numbers are held in the PENALTIES table, while names and initials are found in the PLAYERS table. Both tables must be included in the FROM clause:

SELECT   PAYMENTNO, PENALTIES.PLAYERNO, AMOUNT,
         NAME, INITIALS
FROM     PENALTIES, PLAYERS
WHERE    PENALTIES.PLAYERNO = PLAYERS.PLAYERNO

The intermediate result from the FROM clause is (not all the rows have been included):

PAYMENTNO  PLAYERNO  AMOUNT  ...  PLAYERNO  NAME       INITIALS  ...
---------  --------  ------  ---  --------  ---------  --------  ---
        1         6  100.00  ...         6  Parmenter  R         ...
        1         6  100.00  ...        44  Baker      E         ...
        1         6  100.00  ...        83  Hope       PK        ...
        1         6  100.00  ...         2  Everett    R          ..
        :         :       :              :  :          :
        2        44   75.00  ...         6  Parmenter  R         ...
        2        44   75.00  ...        44  Baker      E          ..
        2        44   75.00  ...        83  Hope       PK        ...
        2        44   75.00  ...         2  Everett    R         ...
        :         :       :              :  :          :
        3        27  100.00  ...         6  Parmenter  R         ...
        3        27  100.00  ...        44  Baker      E         ...
        3        27  100.00  ...        83  Hope       PK        ...
        3        27  100.00  ...         2  Everett    R         ...
        :         :       :              :  :          :
        :         :       :              :  :          :

The intermediate result after processing the FROM clause is:

PAYMENTNO  PLAYERNO  AMOUNT  ...  PLAYERNO  NAME      INITIALS  ...
---------  --------  ------  ---  --------  --------- --------  ---
        1         6  100.00  ...         6  Parmenter R         ...
        2        44   75.00  ...        44  Baker     E         ...
        3        27  100.00  ...        27  Collins   DD        ...
        4       104   50.00  ...       104  Moorman   D         ...
        5        44   25.00  ...        44  Baker     E         ...
        6         8   25.00  ...         8  Newcastle B         ...
        7        44   30.00  ...        44  Baker     E         ...
        8        27   75.00  ...        27  Collins   DD        ...

The end result is:

PAYMENTNO  PLAYERNO  AMOUNT  NAME       INITIALS
---------  --------  ------  ---------  --------
        1         6  100.00  Parmenter  R
        2        44   75.00  Baker      E
        3        27  100.00  Collins    DD
        4       104   50.00  Moorman    D
        5        44   25.00  Baker      E
        6         8   25.00  Newcastle  B
        7        44   30.00  Baker      E
        8        27   75.00  Collins    DD

To avoid ambiguity, the table name must be specified in front of the PLAYERNO column in the SELECT clause.

The order of the table specifications in a FROM clause does not affect the result of this clause and the end result of the table expression. The SELECT clause is the only clause that determines the order of the columns in the result. The ORDER BY clause is used to determine the order in which the rows will be presented. Thus, the results of the next two statements are equal:

SELECT   PLAYERS.PLAYERNO
FROM     PLAYERS, TEAMS
WHERE    PLAYERS.PLAYERNO = TEAMS.PLAYERNO

and

SELECT   PLAYERS.PLAYERNO
FROM     TEAMS, PLAYERS
WHERE    PLAYERS.PLAYERNO = TEAMS.PLAYERNO

Exercise 7.1:

Indicate why these SELECT statements are not correctly formulated:

  1. SELECT   PLAYERNO
    FROM     PLAYERS, TEAMS
    
  2. SELECT   PLAYERS.PLAYERNO
    FROM     TEAMS
    

Exercise 7.2:

For each clause of the following statement, determine the intermediate result and the result. Also, give a description of the question that underlies the statement.

SELECT   PLAYERS.NAME
FROM     TEAMS, PLAYERS
WHERE    PLAYERS.PLAYERNO = TEAMS.PLAYERNO

Exercise 7.3:

For each penalty, find the payment number, the amount, and the number and name of the player who incurred it.

Exercise 7.4:

For each penalty incurred by a team captain, find the payment number and the captain’s name.

Pseudonyms for Table Names

When multiple table specifications appear in the FROM clause, it is sometimes easier to use so-called pseudonyms. Another name for pseudonym is an alias. Pseudonyms are temporary alternative names for table names. In the previous examples, to qualify a column, we specified the full table name. Instead of using table names, we can use pseudonyms.

Example 7.8. For each penalty, get the payment number, the amount of the penalty, the player number, and the name and initials of the player who incurred the penalty. Make use of pseudonyms.

SELECT   PAYMENTNO, PEN.PLAYERNO, AMOUNT,
         NAME, INITIALS
FROM     PENALTIES AS PEN, PLAYERS AS P
WHERE    PEN.PLAYERNO = P.PLAYERNO

Explanation: In the FROM clause, the pseudonyms are specified or declared after the table names. In other clauses, we must use these pseudonyms instead of the real table names.

Because pseudonyms have been used, it is not possible to mention the original table names in the other clauses anymore. The presence of a pseudonym implies that a table name cannot be used in this SQL statement.

The fact that the pseudonym PEN has been used earlier in the statement (in the SELECT clause) than its declaration (in the FROM clause) does not cause any problems. As we have seen, the FROM clause might not be the first clause we specify, but it is the first processed.

The word AS in the definition is optional. So, the previous statement has the same result as the following:

SELECT   PAYMENTNO, PEN.PLAYERNO, AMOUNT,
         NAME, INITIALS
FROM     PENALTIES PEN, PLAYERS P
WHERE    PEN.PLAYERNO = P.PLAYERNO

In both examples, the use of pseudonyms is not vital. However, later in this book we formulate SELECT statements where table names would have to be repeated many times. Adding pseudonyms makes it easier to formulate and read those statements.

A pseudonym must satisfy the naming rules for table names. More about this subject comes in Section 15.6, in Chapter 15, “Creating Tables.” Two pseudonyms in the same statement cannot have the same name.

Exercise 7.5:

Get for each team the number and the last name of the captain.

Exercise 7.6:

Get for each match the match number, the last name of the player, and the division of the team.

Various Examples of Joins

This section looks at some examples to illustrate various aspects of the FROM clause. Plus, we introduce several new terms.

Example 7.9. Get the numbers of the captains who have incurred at least one penalty.

SELECT   T.PLAYERNO
FROM     TEAMS AS T, PENALTIES AS PEN
WHERE    T.PLAYERNO = PEN.PLAYERNO

Explanation: The TEAMS table includes all the players who are captains. By using the player numbers, we can search the PENALTIES table for those captains who have incurred at least one penalty. For that reason, both tables are included in the FROM clause. The intermediate result from the FROM clause becomes:

TEAMNO  PLAYERNO  DIVISION  PAYMENTNO  PLAYERNO  ...
------  --------  --------  ---------  --------  ---
     1         6  first             1         6  ...
     1         6  first             2        44  ...
     1         6  first             3        27  ...
     1         6  first             4       104  ...
     1         6  first             5        44  ...
     1         6  first             6         8  ...
     1         6  first             7        44  ...
     1         6  first             8        27  ...
     2        27  second            1         6  ...
     2        27  second            2        44  ...
     2        27  second            3        27  ...
     2        27  second            4       104  ...
     2        27  second            5        44  ...
     2        27  second            6         8  ...
     2        27  second            7        44  ...
     2        27  second            8        27  ...

The intermediate result from the WHERE clause is:

TEAMNO  PLAYERNO  DIVISION  PAYMENTNO  PLAYERNO  ...
------  --------  --------  ---------  --------  ---
     1         6  first             1         6  ...
     2        27  second            3        27  ...
     2        27  second            8        27  ...

The end result is thus:

PLAYERNO
--------
       6
      27
      27

When data of different tables is merged into one table, it is called a join of tables. The columns on which the join is executed are called the join columns. In the previous SELECT statement, these are the columns TEAMS.PLAYERNO and PENALTIES.PLAYERNO. The condition in the WHERE clause, with which we compare the PLAYERNO column of the TEAMS table with the one of the PENALTIES table, is called the join condition.

Note that the result of the earlier statement contains duplicate rows. SQL does not automatically remove duplicate rows from the end result. In our example, player 27 appears twice because she incurred two penalties. When you do not want duplicate rows in your result, you should specify the word DISTINCT directly behind the word SELECT. (Chapter 9, “SELECT Statement: SELECT Clause and Aggregation Functions,” discusses DISTINCT extensively.)

Example 7.10. Get the numbers of the captains who have incurred at least one penalty. Remove the duplicate numbers.

SELECT   DISTINCT T.PLAYERNO
FROM     TEAMS AS T, PENALTIES AS PEN
WHERE    T.PLAYERNO = PEN.PLAYERNO

The end result then becomes:

PLAYERNO
--------
       6
      27

Example 7.11. Get the names and initials of the players who have played at least one match. Warning: A competition player does not have to appear in the MATCHES table. (Perhaps he or she has been injured for the whole season.)

SELECT   DISTINCT P.NAME, P.INITIALS
FROM     PLAYERS AS P, MATCHES AS M
WHERE    P.PLAYERNO = M.PLAYERNO

The result is:

NAME       INITIALS
---------  --------
Parmenter  R
Baker      E
Hope       PK
Everett    R
Collins    DD
Moorman    D
Brown      M
Bailey     IP
Newcastle  B

Work out for yourself how this SELECT statement could give rise to duplicate values if DISTINCT is not used.

A join is not restricted to two tables. A FROM clause can contain many tables.

Example 7.12. For each match, get the match number, the player number, the team number, the name of the player, and the division in which the team plays.

SELECT   M.MATCHNO, M.PLAYERNO, M.TEAMNO, P.NAME, T.DIVISION
FROM     MATCHES AS M, PLAYERS AS P, TEAMS AS T
WHERE    M.PLAYERNO = P.PLAYERNO
AND      M.TEAMNO = T.TEAMNO

The result is:

MATCHNO  PLAYERNO  TEAMNO  NAME       DIVISION
-------  --------  ------  ---------  --------
      1         6       1  Parmenter  first
      2         6       1  Parmenter  first
      3         6       1  Parmenter  first
      4        44       1  Baker      first
      5        83       1  Hope       first
      6         2       1  Everett    first
      7        57       1  Brown      first
      8         8       1  Newcastle  first
      9        27       2  Collins    second
     10       104       2  Moorman    second
     11       112       2  Bailey     second
     12       112       2  Bailey     second
     13         8       2  Newcastle  second

Example 7.13. Get the payment number, the player number, and the date of each penalty incurred in the year in which the player concerned joined the club.

SELECT   PEN.PAYMENTNO, PEN.PLAYERNO, PEN.PAYMENT_DATE
FROM     PENALTIES AS PEN, PLAYERS AS P
WHERE    PEN.PLAYERNO = P.PLAYERNO
AND      YEAR(PEN.PAYMENT_DATE) = P.JOINED

The result is:

PAYMENTNO  PLAYERNO  PEN.PAYMENT_DATE
---------  --------  ----------------
        3        27  1983-09-10
        4       104  1984-12-08
        5        44  1980-12-08
        6         8  1980-12-08

ExplanationMost join conditions compare key columns with each other. However, that is not a requirement. In this example, the date on which the penalty has been paid is compared to the year in which the player joined the club.

Exercise 7.7:

Get the numbers and names of players who have been chairmen.

Exercise 7.8:

Get the number of each player who on the same day that he became a committee member also incurred a penalty.

Mandatory Use of Pseudonyms

In some SELECT statements, there is no choice about whether a pseudonym is to be used. This situation arises when the same table is mentioned more than once in the FROM clause. Consider this example.

Example 7.14. Get the numbers of the players who are older than R. Parmenter; in this example, we assume that the combination of name and initials is unique.

SELECT   P.PLAYERNO
FROM     PLAYERS AS P, PLAYERS AS PAR
WHERE    PAR.NAME = 'Parmenter'
AND      PAR.INITIALS = 'R'
AND      P.BIRTH_DATE < PAR.BIRTH_DATE

The intermediate result from the WHERE clause is a multiplication of the PLAYERS table by itself. (For simplicity, we have shown only the rows from the PAR.PLAYERS table in which player 6, named R. Parmenter, is found.)

PLAYERNO  ...  BIRTH_DATE  ...  PLAYERNO  ...  BIRTH_DATE  ...
--------  ---  ----------  ---  --------  ---  ----------  ---
       6  ...  1964-06-25  ...         6  ...  1964-06-25  ...
      44  ...  1963-01-09  ...         6  ...  1964-06-25  ...
      83  ...  1956-11-11  ...         6  ...  1964-06-25  ...
       2  ...  1948-09-01  ...         6  ...  1964-06-25  ...
      27  ...  1964-12-28  ...         6  ...  1964-06-25  ...
     104  ...  1970-05-10  ...         6  ...  1964-06-25  ...
       7  ...  1963-05-11  ...         6  ...  1964-06-25  ...
      57  ...  1971-08-17  ...         6  ...  1964-06-25  ...
      39  ...  1956-10-29  ...         6  ...  1964-06-25  ...
     112  ...  1963-10-01  ...         6  ...  1964-06-25  ...
       8  ...  1962-07-08  ...         6  ...  1964-06-25  ...
     100  ...  1963-02-28  ...         6  ...  1964-06-25  ...
      28  ...  1963-06-22  ...         6  ...  1964-06-25  ...
      95  ...  1963-05-14  ...         6  ...  1964-06-25  ...
       :  :    :           :           :  :    :           :
       :  :    :           :           :  :    :           :

The intermediate result of the WHERE clause is:

PLAYERNO  ...  BIRTH_DATE  ...  PLAYERNO  ...  BIRTH_DATE  ...
———-  —-  ————-  —-  ———-  —-  ————-  —-
      44  ...  1963-01-09  ...         6  ...  1964-06-25  ...
      83  ...  1956-11-11  ...         6  ...  1964-06-25  ...
       2  ...  1948-09-01  ...         6  ...  1964-06-25  ...
       7  ...  1963-05-11  ...         6  ...  1964-06-25  ...
      39  ...  1956-10-29  ...         6  ...  1964-06-25  ...
     112  ...  1963-10-01  ...         6  ...  1964-06-25  ...
       8  ...  1962-07-08  ...         6  ...  1964-06-25  ...
     100  ...  1963-02-28  ...         6  ...  1964-06-25  ...
      28  ...  1963-06-22  ...         6  ...  1964-06-25  ...
      95  ...  1963-05-14  ...         6  ...  1964-06-25  ...

The end result is:

PLAYERNO
--------
      44
      83
       2
       7
      39
     112
       8
     100
      28
      95

In the previous examples, table names were specified in front of column names to identify columns uniquely. That would not help in the previous example because both tables have the same name. In other words, if a FROM clause refers to two tables with the same name, pseudonyms must be used.

Note that it would have been sufficient to assign only one of the two tables a pseudonym in the earlier example:

SELECT   P.PLAYERNO
FROM     PLAYERS AS P, PLAYERS
WHERE    PLAYERS.NAME = 'Parmenter'
AND      PLAYERS.INITIALS = 'R'
AND      P.BIRTH_DATE < PLAYERS.BIRTH_DATE

Exercise 7.9:

Get the numbers and names of the players who live in the same town as player 27. Player 27 should not appear in the end result.

Exercise 7.10:

Get the number and name of every competition player as well as the number and name of the captain of each team for whom that player has ever competed. The result may not contain competition players who are captains of a team. Desired result:

PLAYERNO  NAME (PLAYERS)  PLAYERNO  NAME (CAPTAIN)
--------  --------------  --------  --------------
      44  Baker                  6  Parmenter
       8  Newcastle              6  Parmenter
       8  Newcastle             27  Collins
       :  :                      :  :
       :  :                      :  :

Exercise 7.11:

Get the numbers of the penalties for which the penalty amount is equal to a penalty amount belonging to player 44. The result should not contain the penalties of player 44.

Explicit Joins in the FROM Clause

So far, we have talked about the concept of joins, but we have not seen the word JOIN in the table expression yet. The reason is that, until now, we have shown only examples in which the join is “hidden” in the SELECT statement. Sometimes, this join is referred to as an implicit join. In this case, a join is then made up of several specifications from the FROM clause (the table specifications), together with one or more conditions from the WHERE clause.

Explicitly adding the join to the SELECT statement started in the SQL2 standard. This new, explicit join is entirely specified in the FROM clause, resulting in a considerable increase in features of this clause. The effect is that it is much easier to formulate certain statements. The extended definition of the FROM clause is shown next. Most important in this definition is that a table reference is not restricted to a simple table specification but can form a complete join.

<from clause> ::=
   FROM <table reference> [ { , <table reference> 
Explicit Joins in the FROM Clause}... ]

<table reference> ::=
   { <table specification> |
     <join specification>  }
   [ [ AS ] <pseudonym> ]

<join specification> ::=
   <table reference> <join type> <table reference>
      [ <join condition> ]

<join condition> ::=
    ON <condition> | USING <column list>

<join type> ::=
   [ INNER ] JOIN       |
   LEFT  [ OUTER ] JOIN |
   RIGHT [ OUTER ] JOIN |
   FULL  [ OUTER ] JOIN |
   UNION JOIN           |
   CROSS JOIN

<column list> ::=
   ( <column name> [ { , <column name> }... ] )

According to this definition, the following FROM clause is correct:

FROM     PLAYERS INNER JOIN PENALTIES
         ON (PLAYERS.PLAYERNO = PENALTIES.PLAYERNO)

In this example, PLAYERS and PENALTIES are the tables to be joined, and the join condition is placed between brackets after the word ON. The type of join that must be performed is the inner join. We next illustrate with an example the meaning of these specifications.

Example 7.15. For each player born after June 1920, find the player number, the name, and the penalty amounts incurred by him or her.

In the previous chapters, we showed that we can answer this question with the following formulation:

SELECT   PLAYERS.PLAYERNO, NAME, AMOUNT
FROM     PLAYERS, PENALTIES
WHERE    PLAYERS.PLAYERNO = PENALTIES.PLAYERNO
AND      BIRTH_DATE > '1920-06-30'

which has the result:

PLAYERNO  NAME       AMOUNT
--------  ---------  ------
       6  Parmenter  100.00
      44  Baker       75.00
      27  Collins    100.00
     104  Moorman     50.00
      44  Baker       25.00
       8  Newcastle   25.00
      44  Baker       30.00
      27  Collins     75.00

There is also a join “hidden” in this statement. The specifications that together form the join are spread out over the FROM and WHERE clauses. With the new definition of the FROM clause, this join can be presented explicitly, and for this, we use the FROM clause we have already given:

SELECT   PLAYERS.PLAYERNO, NAME, AMOUNT
FROM     PLAYERS INNER JOIN PENALTIES
         ON (PLAYERS.PLAYERNO = PENALTIES.PLAYERNO)
WHERE    BIRTH_DATE > '1920-06-30'

This statement leads to the same result as the previous one; the difference is that now, during the processing of the FROM clause, much more work is done. In the first formulation, the (intermediate) result of the FROM clause is equal to the Cartesian product of the two specified tables (see also Section 7.4, in Chapter 7, “SELECT Statement: The FROM Clause”). For the second formulation, the result is the Cartesian product to which the condition already has been applied. For the processing of the WHERE clause, less work has to be done.

Both statements return the same result, but do they satisfy our requirements? The answer is no! These SELECT statements return only the player number and the name of each player who has incurred at least one penalty. That brings us to the specification INNER JOIN. Because SQL is presenting only data about the players appearing in both the tables PLAYERS and PENALTIES, this join is called an inner join. Only those players who appear in the intersection of the sets of the two join columns are included in the end result.

Whether an inner join does or does not give what we want depends entirely, on one hand, on our question and, on the other hand, on the relationship between the join columns. In the previous example, we lose players (from the PLAYERS table) because the sets of the two join columns are not equal; one is a subset of the other. Had the question in the example above been ‘For each player who incurred at least one penalty, find the player number . . .’, the formulation of the statement would have been correct.

There always exists a certain type of relationship between join columns. ‘Being a subset of’ is just one possibility. There are four types of relationships possible. When a join is specified, it is very important to know what the type of relationship is because it has a serious influence on the result of the SELECT statement in which the join appears.

If C1 and C2 are two columns, the four types of relationships between C1 and C2 are as follows:

  1. The population of C1 and C2 are equal.

  2. The population of C1 is a subset of that of C2 (or C2 is a subset of C1).

  3. The populations of C1 and C2 are conjoint; they have some values in common.

  4. The populations of C1 and C2 are disjoint; they have no values in common.

If C1 and C2 are considered to be sets with values, the four relationships can be defined using set theory terminology as follows:

  1. C1 = C2

  2. C1 ⊂ C2 (or C2 ⊂ C1)

  3. C1 – C2 ≠ Ø ∧ C2 – C1 ≠ Ø

  4. C1 – C2 = C1 ∧ C2 – C1 = C2

Example 7.16. For each team, find the team number and the name of the captain. With the help of an implicit join:

SELECT   TEAMNO, NAME
FROM     TEAMS, PLAYERS
WHERE    TEAMS.PLAYERNO = PLAYERS.PLAYERNO

With an explicit join, the previous statement looks as follows:

SELECT   TEAMNO, NAME
FROM     TEAMS INNER JOIN PLAYERS
         ON TEAMS.PLAYERNO = PLAYERS.PLAYERNO

Explanation: It is obvious, again, that the TEAMS and PLAYERS tables are joined with an inner join. The join condition (after the word ON) is used to compare the PLAYERNO columns in the two tables. The result of these two statements is equal. Because the PLAYERNO column in the TEAMS table is a subset of that of the PLAYERS table, the result contains all those players who appear in the TEAMS table (which is in accordance with the question).

The word INNER in the join specification can be omitted. It has been added only to show which type of join will be executed. Therefore, the previous statement is equal to the next:

SELECT    TEAMNO, NAME
FROM      TEAMS JOIN PLAYERS
          ON TEAMS.PLAYERNO = PLAYERS.PLAYERNO

Multiple tables can be joined with one FROM clause. Imagine that T1, T2, T3, and T4 are tables, and C is a join condition to join two tables. Then, the following examples are all allowed:

  • T1 INNER JOIN T2 ON C

  • T1 INNER JOIN T2 ON C INNER JOIN T3 ON C

  • (T1 INNER JOIN T2 ON C) INNER JOIN T3 ON C

  • T1 INNER JOIN (T2 INNER JOIN T3 ON C) ON C

  • (T1 INNER JOIN T2 ON C) INNER JOIN (T3 INNER JOIN T4 ON C) ON C

Exercise 7.12:

For each team, find the number and the name of the captain. In Exercise 7.5, an implicit join was used; use an explicit join now.

Exercise 7.13:

Find the numbers and the names of the players who live in the same town as player 27. In Exercise 7.9 an implicit join was used; use an explicit join now.

Exercise 7.14:

For each match, get the match number, the name of the player, and the division of the team. In Exercise 7.6, an implicit join was used; use an explicit join now.

Joins with USING

If the names of the join columns are equal, USING can also be used instead of the condition. Therefore, the following two FROM clauses are equal:

FROM      TEAMS INNER JOIN PLAYERS
          ON TEAMS.PLAYERNO = PLAYERS.PLAYERNO

and

FROM      TEAMS INNER JOIN PLAYERS
          USING (PLAYERNO)

The use of USING has no influence on the result and does not create any additional possibilities with respect to the other form. It has only two limited advantages. First, the statement is a little shorter and, therefore, easier to read. Second, when a join of two or more columns must be specified, the formulation becomes much more compact.

Portability

USING is supported by most SQL products, but not all. For example, DB2 does not support it.

Outer Joins

The only join type discussed so far has been the inner join. However, the additional advantages of this type are limited. It is helpful to indicate more explicitly that the statement performs a join, but it is not a huge improvement. For the other join types, such as left outer join, however, statements become considerably clearer, more powerful and shorter.

We discuss the left outer, the right outer, and the full outer join, respectively.

The Left Outer Join

We start with an example.

Example 7.17. For all the players, find the player number, the name, and the penalties incurred by him or her; order the result by player number.

To answer this question, many of us would use the following SELECT statement:

SELECT   PLAYERS.PLAYERNO, NAME, AMOUNT
FROM     PLAYERS, PENALTIES
WHERE    PLAYERS.PLAYERNO = PENALTIES.PLAYERNO
ORDER BY 1

The result is:

PLAYERNO  NAME       AMOUNT
--------  ---------  ------
       6  Parmenter  100.00
       8  Newcastle   25.00
      27  Collins    100.00
      27  Collins     70.00
      44  Baker       75.00
      44  Baker       25.00
      44  Baker       30.00
     104  Moorman     50.00

However, the result is incomplete because all players who have no penalties are missing.

The intention of this question is to get all the players in the result. To get the missing players in the result as well, a so-called left outer join must be specified:

SELECT   PLAYERS.PLAYERNO, NAME, AMOUNT
FROM     PLAYERS LEFT OUTER JOIN PENALTIES
         ON PLAYERS.PLAYERNO = PENALTIES.PLAYERNO
ORDER BY 1

and the result is:

PLAYERNO  NAME       AMOUNT
--------  ---------  ------
       2  Everett         ?
       6  Parmenter  100.00
       7  Wise            ?
       8  Newcastle   25.00
      27  Collins    100.00
      27  Collins     75.00
      28  Collins         ?
      39  Bishop          ?
      44  Baker       75.00
      44  Baker       25.00
      44  Baker       30.00
      57  Brown           ?
      83  Hope            ?
      95  Miller          ?
     100  Parmenter       ?
     104  Moorman     50.00
     112  Bailey          ?

ExplanationIn the FROM clause, the join type is specified between the two tables—in this case, a left outer join. In addition, the join condition is specified after the word ON. When the join is specified in this way, SQL knows that all rows from the PLAYERS table must appear in the intermediate result of the FROM clause. The columns in the SELECT clause that belong to the PENALTIES table are filled automatically with NULL values for all those players for whom no penalty was paid.

Note that with all outer joins, the term OUTER can be omitted without any effect on the end result. Whether outer joins are necessary depends, as mentioned before, on the question and on the relationship between the join columns. Between the populations PLAYERS.PLAYERNO and PENALTIES.PLAYERNO, there is a subset relationship: The population of PENALTIES.PLAYERNO is a subset of the population PLAYERS.PLAYERNO. So, a left outer join is useful. The other way would make no sense; see the following example.

Example 7.18. For each penalty, get the payment number and the name of the player.

SELECT   PAYMENTNO, NAME
FROM     PENALTIES LEFT OUTER JOIN PLAYERS
         ON PENALTIES.PLAYERNO = PLAYERS.PLAYERNO
ORDER BY 1

The result is:

PAYMENTNO  NAME
---------  ---------
        1  Parmenter
        2  Baker
        3  Collins
        4  Moorman
        5  Baker
        6  Newcastle
        7  Baker
        8  Collins

Explanation: In this statement, PENALTIES is the left table. Because there are no penalties that do not belong to a specific player, no penalties are left out. In other words, a left outer join in this example is superfluous. An inner join would have returned the same result.

Example 7.19. Find, for each player, the player number, the name and numbers, and divisions of the teams that he or she captains; order the result by player number.

SELECT   P.PLAYERNO, NAME, TEAMNO, DIVISION
FROM     PLAYERS AS P LEFT OUTER JOIN TEAMS AS T
         ON P.PLAYERNO = T.PLAYERNO
ORDER BY P.PLAYERNO

The result is:

PLAYERNO  NAME       TEAMNO  DIVISION
--------  ---------  ------  --------
       2  Everett         ?  ?
       6  Parmenter       1  first
       7  Wise            ?  ?
       8  Newcastle       ?  ?
      27  Collins         2  second
      28  Collins         ?  ?
      39  Bishop          ?  ?
      44  Baker           ?  ?
      57  Brown           ?  ?
      83  Hope            ?  ?
      95  Miller          ?  ?
     100  Parmenter       ?  ?
     104  Moorman         ?  ?
     112  Bailey          ?  ?

Example 7.20. For each player born in Inglewood, find the player number, the name, the list of penalties, and the list of teams for which he or she has played a match.

SELECT   PLAYERS.PLAYERNO, NAME, AMOUNT, TEAMNO
FROM     PLAYERS LEFT OUTER JOIN PENALTIES
         ON PLAYERS.PLAYERNO = PENALTIES.PLAYERNO
            LEFT OUTER JOIN MATCHES
            ON PLAYERS.PLAYERNO = MATCHES.PLAYERNO
WHERE    TOWN = 'Inglewood'

The result is:

PLAYERNO  NAME       AMOUNT  TEAMNO
--------  ---------  ------  ------
       8  Newcastle   25.00       1
       8  Newcastle   25.00       2
      44  Baker       75.00       1
      44  Baker       25.00       1
      44  Baker       30.00       1

ExplanationFirst, the PLAYERS table is joined using a left outer join to the PENALTIES table. The result contains 17 rows consisting of two players from Inglewood: players 8 and 44. Player 8 has incurred only one penalty, and player 44 has three penalties. Then, the entire result is joined with the MATCHES table. Because player 8 played for two teams, he appears twice in the result.

Summarizing: A left outer join is useful only if there can exist values in the join column of the left table that do not appear in the join column of the right table.

The Right Outer Join

The right outer join is the mirror image of the left outer join. With the left outer join, it is guaranteed that all rows from the left table appear in the intermediate result of the FROM clause. With the right outer join, this guarantee is given for the right table.

Example 7.21. For all players, get the player number, the name, and the numbers of the teams for which they are the captain.

SELECT   PLAYERS.PLAYERNO, NAME, TEAMNO
FROM     TEAMS RIGHT OUTER JOIN PLAYERS
         ON TEAMS.PLAYERNO = PLAYERS.PLAYERNO

The result is:

PLAYERNO  NAME       TEAMNO
--------  ---------  ------
       2  Everett         ?
       6  Parmenter       1
       7  Wise            ?
       8  Newcastle       ?
      27  Collins         2
      28  Collins         ?
      39  Bishop          ?
      44  Baker           ?
      57  Brown           ?
      83  Hope            ?
      95  Miller          ?
     100  Parmenter       ?
     104  Moorman         ?
     112  Bailey          ?2

ExplanationIt is obvious that players such as 2, 7, and 8 have been included in the result even though they are not captains. If a player was captain of two teams, he would appear twice in this result.

The Full Outer Join

What happens when we join two columns that are conjoint regarding their populations? Or in other words, both columns have values that possibly do not appear in the other column. If we want all the values from both columns to appear in the end result, a full outer join is required.

Portability

MySQL does not support the full outer join. However, because many other products do support full outer joins, we discuss it.

Example 7.22. Get the result of the full outer join of the MATCHES table with the COMMITTEE_MEMBERS table.

SELECT   DISTINCT MATCHES.MATCHNO,
         MATCHES.PLAYERNO AS MATCH_PNO,
         COMMITTEE_MEMBERS.PLAYERNO AS COMMITTEE_PNO
FROM     MATCHES FULL OUTER JOIN COMMITTEE_MEMBERS
         ON MATCHES.PLAYERNO = COMMITTEE_MEMBERS.PLAYERNO
ORDER BY 1, 2, 3

The result is:

MATCHNO  MATCH_PNO  COMMITTEE_PNO
-------  ---------  -------------
       1         6              6
       2         6              6
       3         6              6
       4        44              ?
       5        83              ?
       6         2              2
       7        57             57
       8         8              8
       9        27             27
      10       104              ?
      11       112            112
      12       112            112
      13         8              8
       ?         ?             95

ExplanationRows from both tables appear in the result that, without the full outer join, would not have appeared. The last row in the result has been added because there is a row in the COMMITTEE_MEMBERS table for player 95, and this player does not appear in the MATCHES table. Additionally, the matches with numbers 4, 5, and 10 have been added. The two join columns are really conjoint sets.

You could state that the result of a full outer join (if no duplicate rows appear in the result) is equal to the UNION of the left and right outer joins of the same tables.

Exercise 7.15:

Get for all players the player number and the list with penalties incurred by them.

Exercise 7.16:

Get for all players the player number and a list with numbers of teams for which they have ever played.

Exercise 7.17:

Get for all players the player number, the list with penalties incurred by them, and the list with numbers of team for which they have ever played.

Exercise 7.18:

Which of the following FROM clauses would be useful, and which would not?

  1. FROM     PENALTIES AS PEN LEFT OUTER JOIN PLAYERS AS P
             ON PEN.PLAYERNO = P.PLAYERNO
    
  2. FROM     PENALTIES AS PEN LEFT OUTER JOIN PLAYERS AS P
             ON PEN.PLAYERNO > P.PLAYERNO
    
  3. FROM     TEAMS AS T RIGHT OUTER JOIN MATCHES AS M
             ON T.TEAMNO = M.TEAMNO
    
  4. FROM     PENALTIES AS PEN FULL OUTER JOIN TEAMS AS T
             ON PEN.PLAYERNO = T.PLAYERNO
    

Exercise 7.19:

Determine the results of the following SELECT statements given the tables T1, T2, T3, and T4. Each of these tables has only one column.

T1  C        T2  C        T3  C        T4  C
-----        -----        -----        -----
    1            2            ?            ?
    2            3            2            2
    3            4                         3
  1. SELECT   T1.C, T2.C
    FROM     T1 INNER JOIN T2 ON T1.C = T2.C
    
  2. SELECT   T1.C, T2.C
    FROM     T1 LEFT OUTER JOIN T2 ON T1.C = T2.C
    
  3. SELECT    T1.C, T2.C
    FROM      T1 RIGHT OUTER JOIN T2 ON T1.C = T2.C
    
  4. SELECT   T1.C, T2.C
    FROM     T1 RIGHT OUTER JOIN T2 ON T1.C > T2.C
    
  5. SELECT   T1.C, T3.C
    FROM     T1 RIGHT OUTER JOIN T3 ON T1.C = T3.C
    
  6. SELECT   T1.C, T3.C
    FROM     T1 LEFT OUTER JOIN T3 ON T1.C = T3.C
    
  7. SELECT   T3.C, T4.C
    FROM     T3 LEFT OUTER JOIN T4 ON T3.C = T4.C
    
  8. SELECT   T3.C, T4.C
    FROM     T3 RIGHT OUTER JOIN T4 ON T3.C = T4.C
    
  9. SELECT   T1.C, T2.C
    FROM     T1 FULL OUTER JOIN T2 ON T1.C = T2.C
    
  10. SELECT   T1.C, T2.C, T3.C
    FROM     (T1 LEFT OUTER JOIN T3 ON T1.C = T3.C)
             FULL OUTER JOIN T2 ON T3.C = T2.C
    

Exercise 7.20:

Which of the following statements are correct? Assume that the column C1 belongs to the table T1, and the column C2 to T2.

  1. If C1 is a subset of C2, the result of T1.C1 left outer join T2.C2 is equal to an inner join of the same columns.

  2. If C2 is a subset of C1, the result of T1.C1 left outer join T2.C2 is equal to an inner join of the same columns.

  3. The result of T1.C1 left outer join T1.C1 is equal to an inner join of the same columns.

  4. If the populations of C1 and C2 are equal, the result of T1.C1 full outer join T2.C2 is equal to an inner join of the same columns.

  5. If the populations of C1 and C2 are conjoint, the result of T1.C1 left outer join T2.C2 is equal to a full outer join of the same columns.

Additional Conditions in the Join Condition

The condition in the FROM clause is primarily meant to be used to join tables. Other conditions that do not actually belong to the join are allowed to be included here. However, you should realize that moving a condition from the WHERE clause to the join condition can actually affect the result. The following statement shows that distinction.

Example 7.23. The next SELECT statement contains a left outer join plus an additional condition in the WHERE clause.

SELECT   TEAMS.PLAYERNO, TEAMS.TEAMNO, PENALTIES.PAYMENTNO
FROM     TEAMS LEFT OUTER JOIN PENALTIES
         ON TEAMS.PLAYERNO = PENALTIES.PLAYERNO
WHERE    DIVISION = 'second'

The result is:

PLAYERNO  TEAMNO  PAYMENTNO
--------  ------  ---------
      27       2          3
      27       2          8

ExplanationThe intermediate result of the FROM clause contains all the rows of the TEAMS table of which the captain appears in the PENALTIES table. If teams disappear from this join, they are brought back again because of the left outer join. In other words, that intermediate result looks as follows (on the left are the columns of the TEAMS table, and on the right those of the PENALTIES):

TEAMNO  PLAYERNO  DIVISION  PAYNO  PLAYERNO  PAYMENT_DATE  AMOUNT
------  --------  --------  -----  --------  ------------  ------
     1         6  first         1         6  1980-12-08    100.00
     2        27  second        3        27  1983-09-10    100.00
     2        27  second        8        27  1984-11-12     75.00

Next, the WHERE clause is processed, and that means that only the last two rows are passed on to the SELECT clause.

If we move the condition to the join condition, the following statement arises:

SELECT   TEAMS.PLAYERNO, TEAMS.TEAMNO, PENALTIES.PAYMENTNO
FROM     TEAMS LEFT OUTER JOIN PENALTIES
         ON TEAMS.PLAYERNO = PENALTIES.PLAYERNO
         AND DIVISION = 'second'

This statement has a result that differs from the previous statement:

PLAYERNO  TEAMNO  PAYMENTNO
--------  ------  ---------
       6       1          ?
      27       2          3
      27       2          8

Now, team 1 does appear in the result, but how did that happen? SQL processes the explicit join in two steps. During the first step, the join is processed as if no outer join has to be executed, but an inner join does. So, first a Cartesian product is created, and subsequently all conditions are processed, including the condition on the DIVISION column. This leads to the following result:

TEAMNO  PLAYERNO  DIVISION  PAYNO  PLAYERNO  PAYMENT_DATE  AMOUNT
------  --------  --------  -----  --------  ------------  ------
     2        27  second        3        27  1983-09-10    100.00
     2        27  second        8        27  1984-11-12     75.00

Team 1 does not appear in this intermediate result because it does not play in the second division. During the second step, SQL checks whether rows from the TEAMS table (because that is the table on the left of the left outer join) have disappeared from this intermediate result. Those rows have to be brought back again. As a result, team 1 will be added again:

TEAMNO  PLAYERNO  DIVISION  PAYNO  PLAYERNO  PAYMENT_DATE  AMOUNT
------  --------  --------  -----  --------  ------------  ------
     2        27  second        3        27  1983-09-10    100.00
     2        27  second        8        27  1984-11-12     75.00
     1         6  first         ?         ?           ?         ?

Because of the absence of a WHERE clause, all these rows are passed on to the SELECT clause, which means that the end result will differ from that of the first statement.

Example 7.24. The next SELECT statement contains a full outer join plus an additional condition in the WHERE clause.

SELECT   TEAMS.PLAYERNO, TEAMS.TEAMNO, PENALTIES.PAYMENTNO
FROM     TEAMS FULL OUTER JOIN PENALTIES
         ON TEAMS.PLAYERNO = PENALTIES.PLAYERNO
         AND TEAMS.PLAYERNO > 1000

The result is:

PLAYERNO  TEAMNO  PAYMENTNO
--------  ------  ---------
       ?       ?          3
       ?       ?          8
       ?       ?          1
       ?       ?          6
       ?       ?          2
       ?       ?          5
       ?       ?          7
       ?       ?          4
       6       1          ?
      27       2          ?

ExplanationAfter step 1 of the join has been processed, the intermediate result is empty. The reason is that there are no player numbers greater than 1000. Then, during step 2, SQL checks whether there are rows in the tables TEAMS and PENALTIES that do not appear in the result. That involves all the teams and all the penalties, so they are added again and a somewhat strange end result occurs.

Conclusion: If an outer join is used, it absolutely matters where certain conditions are placed: in the join condition or in the WHERE clause. Therefore, consider carefully where you want to place them. This does not apply to the inner join; work out why for yourself.

The Cross Join

This section discusses the cross join. We deal with this topic just briefly because the practical value of this join is restricted and these operators suffer heavily from criticism; see, among others, [DATE97].

With the cross join, we can explicitly ask for a Cartesian product of tables. Usually, we create a Cartesian product as follows:

SELECT   TEAMS.*, PENALTIES.*
FROM     TEAMS, PENALTIES

This statement couples each row from the TEAMS table with all rows from the PENALTIES table. The following statement, in which we use the cross join, generates the same result:

SELECT   *
FROM     TEAMS CROSS JOIN PENALTIES

Of course, it is not necessary to include a join condition with the cross join. If we used one, it would not result in a Cartesian product. Therefore, specifying that condition as part of the join is not permitted. However, you can include the condition in a WHERE clause.

Portability

Not all products support the cross join. MySQL is one that does.

The Union Join and the Natural Join

Two join types have been implemented by just a few SQL products: the union and the natural join. However, for the sake of completeness, we discuss them briefly.

Portability

The union and the natural join are supported by just a few SQL products. MySQL does not support them, either.

The union join is difficult to explain. We try to do this on the basis of an example.

Example 7.25. Get the union join of the TEAMS and the PENALTIES table.

SELECT   *
FROM     TEAMS UNION JOIN PENALTIES

The result is:

TEAMNO  PLAYERNO  DIVISION  PAYMENTNO  PLAYERNO  PAYMENT_DATE  AMOUNT
------  --------  --------  ---------  --------  ------------  ------
     1         6  first             ?         ?  ?                  ?
     2        27  second            ?         ?  ?                  ?
     ?         ?  ?                 1         6  1980-12-08    100.00
     ?         ?  ?                 2        44  1981-05-05     75.00
     ?         ?  ?                 3        27  1983-09-10    100.00
     ?         ?  ?                 4       104  1984-12-08     50.00
     ?         ?  ?                 5        44  1980-12-08     25.00
     ?         ?  ?                 6         8  1980-12-08     25.00
     ?         ?  ?                 7        44  1982-12-30     30.00
     ?         ?  ?                 8        27  1984-11-12     75.00

You can see that the TEAMS table is at the top left of the result and the PENALTIES table is at the bottom right. Each row from the TEAMS table occurs only once in the result, which also applies to each row of the PENALTIES table. This result contains all rows that form the difference between a full outer join and an inner join of the same two tables.

In the relational model, the concept of natural join has been defined. It has also been included in the SQL2 standard. Here is a statement as an example:

SELECT   T.PLAYERNO, T.TEAMNO, T.DIVISION,
         PEN.PAYMENTNO, PEN.PAYMENT_DATE, PEN.AMOUNT
FROM     TEAMS AS T INNER JOIN PENALTIES AS PEN
         ON T.PLAYERNO = PEN.PLAYERNO
WHERE    DIVISION = 'first'

The previous statement can have been formulated with a natural join, as follows:

SELECT   *
FROM     TEAMS NATURAL INNER JOIN PENALTIES
WHERE    DIVISION = 'first'

In this example, we do not have to indicate explicitly which columns must be joined. SQL examines whether the two tables have columns with identical names and assumes that those must be used in the join condition. Also, only one join column is included in the join condition. An ON or USING clause would be superfluous here and, therefore, is not allowed. For each join type (except for the cross join), NATURAL can be specified.

Equi Joins and Theta Joins

The concepts equi and theta join are frequently mentioned in the relational model. However, we have not yet seen these concepts applied in SQL; indeed, this is not likely to happen. However, if in the join condition the equal to operator is used, we refer to it as an equi join. So, the following table expressions contain an equi join:

SELECT   *
FROM     PLAYERS, TEAMS
WHERE    PLAYERS.PLAYERNO = TEAMS.PLAYERNO

and

SELECT   *
FROM     PLAYERS LEFT OUTER JOIN TEAMS
         ON PLAYERS.PLAYERNO = TEAMS.PLAYERNO

If we simply refer to a join in this book, we imply an equi join. Other joins also exist, such as the greater than join (see the example below) and the less than join. The term that is used for joins when the join condition does not contain the equal to operator is non-equi join.

SELECT   *
FROM     PLAYERS, TEAMS
WHERE    PLAYERS.PLAYERNO > TEAMS.PLAYERNO

The general join, or theta join, takes the following form in SQL; the question mark stands for any comparison operator:

SELECT   *
FROM     PLAYERS, TEAMS
WHERE    PLAYERS.PLAYERNO ? TEAMS.PLAYERNO

All equi and non-equi joins together form the set of theta joins.

For the sake of clarity, we note the following: The indication equi or non-equi is unrelated to whether a join is an inner, left outer, or full outer join. We can speak of an equi left outer join, an non-equi full outer join, or a greater than inner join.

The FROM Clause with Table Expressions

In Section 6.6, in Chapter 6, “SELECT Statements, Table Expressions, and Subqueries,” we mentioned that the FROM clause itself can contain a table expression. The table expression within the FROM clause is called a table subquery. In this section, we extend the definition of the FROM clause with that table subquery. Next, we present various examples to illustrate the extensive possibilities of table subqueries.

<from clause> ::=
   FROM <table reference> [ { , <table reference> 
The FROM Clause with Table Expressions}... ]

<table reference> ::=
   { <table specification> |
     <join specification>  |
     <table subquery>      }
   [ [ AS ] <pseudonym> ]

<table subquery> ::= ( <table expression> )

Example 7.26. Get the numbers of the players resident in Stratford.

SELECT   PLAYERNO
FROM    (SELECT   *
         FROM     PLAYERS
         WHERE    TOWN = 'Stratford') AS STRATFORDERS

Explanation: A table expression in the form of a table subquery is specified in the FROM clause. This subquery returns all the column values of all players from Stratford. The resulting table is named STRATFORDERS and is passed to the other clauses. The other clauses cannot see that the table, which they receive as input, has been generated with a subquery. This statement could have been formulated in the classical way, but we have used this formulation just to start with a simple example.

Example 7.27. Get the number of each player who is captain of a team playing in the first division.

SELECT   SMALL_TEAMS.PLAYERNO
FROM    (SELECT   PLAYERNO, DIVISION
         FROM     TEAMS) AS SMALL_TEAMS
WHERE    SMALL_TEAMS.DIVISION = 'first'

The result is:

SMALL_TEAMS.PLAYERNO
--------------------
                   6

ExplanationWith the table expression in the FROM clause, the following intermediate result is created:

PLAYERNO  DIVISION
--------  --------
       6  first
      27  second

This intermediate table gets the name SMALL_TEAMS. Next, the condition SMALL_TEAMS.DIVISION = 'first' is executed on this table, after which only the PLAYERNO column is retrieved.

Table expressions can be used, for example, to prevent a repeat of complex scalar expressions.

Example 7.28. Get the match number and the difference between the total number of sets won and the total number of sets lost for each match where that difference is greater than 2.

SELECT   MATCHNO, DIFFERENCE
FROM    (SELECT   MATCHNO,
                  ABS(WON – LOST) AS DIFFERENCE
         FROM     MATCHES) AS M
WHERE    DIFFERENCE > 2

The result is:

MATCHNO  DIFFERENCE
-------  ----------
      3           3
      5           3
      7           3
      8           3
     13           3

Explanation: The subquery in the FROM clause returns for each match the match number and the difference between the WON and LOST column. In the main query, a condition is executed on this difference. To refer to that calculation in the main query, a column name has to be introduced in the subquery. For the first time, we have an example in which the specification of a column name is of more use than just to improve the readability of the result.

A special variant of the table expression is the one in which only the SELECT clause is used. This variant can also be used as table subquery.

Example 7.29. Create a virtual table called TOWNS.

SELECT   *
FROM    (SELECT 'Stratford' AS TOWN, 4 AS NUMBER
         UNION
         SELECT 'Plymouth', 6
         UNION
         SELECT 'Inglewood', 1
         UNION
         SELECT 'Douglas', 2) AS TOWNS
ORDER BY TOWN

The result is:

TOWN       NUMBER
---------  ------
Douglas         2
Inglewood       1
Plymouth        6
Stratford       4

Explanation: In this FROM clause, a table is created consisting of two columns (the first an alphanumeric one and the second a numeric one) and four rows. This table is named TOWNS. The first column has the name TOWN and contains the name of a town. The second is named NUMBER and contains a relative indication of the number of residents in that city. Note that an end result is created here without so much as querying one of the existing tables.

The table that is created as a result is a normal table to all the other clauses. For example, a WHERE clause does not know whether the intermediate result from the FROM clause is the contents of a “real” table, a subquery, a view, or temporarily created table. So, we can use all the other operations on this temporarily created table.

Example 7.30. Find for each player the number, the name, the town, and the number of residents living in that town.

SELECT   PLAYERNO, NAME, PLAYERS.TOWN, NUMBER * 1000
FROM     PLAYERS,
        (SELECT 'Stratford' AS TOWN, 4 AS NUMBER
         UNION
         SELECT 'Plymouth', 6
         UNION
         SELECT 'Inglewood', 1
         UNION
         SELECT 'Douglas', 2) AS TOWNS
WHERE    PLAYERS.TOWN = TOWNS.TOWN
ORDER BY 1

The result is:

PLAYERNO  NAME       TOWN       NUMBER
--------  ---------  ---------  ------
       2  Everett    Stratford    4000
       6  Parmenter  Stratford    4000
       7  Wise       Stratford    4000
       8  Newcastle  Inglewood    1000
      39  Bishop     Stratford    4000
      44  Baker      Inglewood    1000
      57  Brown      Stratford    4000
      83  Hope       Stratford    4000
      95  Miller     Douglas      2000
     100  Parmenter  Stratford    4000
     112  Bailey     Plymouth     6000

Explanation: The PLAYERS table is joined with the TOWNS table. Because an inner join is used, we lose all the players who live in towns that do not appear in the TOWNS table. The next table expression makes sure that we do not lose players from the result:

SELECT   PLAYERNO, NAME, PLAYERS.TOWN, NUMBER
FROM     PLAYERS LEFT OUTER JOIN
        (SELECT 'Stratford' AS TOWN, 4 AS NUMBER
         UNION
         SELECT 'Plymouth', 6
         UNION
         SELECT 'Inglewood', 1
         UNION
         SELECT 'Douglas', 2) AS TOWNS
         ON PLAYERS.TOWN = TOWNS.TOWN
ORDER BY 1

Example 7.31. Find the numbers of the players who live in a town with a population indicator greater than 2.

SELECT   PLAYERNO
FROM     PLAYERS LEFT OUTER JOIN
        (SELECT 'Stratford' AS TOWN, 4 AS NUMBER
         UNION
         SELECT 'Plymouth', 6
         UNION
         SELECT 'Inglewood', 1
         UNION
         SELECT 'Douglas', 2) AS TOWNS
         ON PLAYERS.TOWN = TOWNS.TOWN
WHERE    TOWNS.NUMBER > 2

The result is:

PLAYERNO
--------
       2
       6
       7
      39
      57
      83
     100
     112

Example 7.32. Get all combinations possible of the first names John, Mark, and Arnold, and the last names Berg, Johnson, and Williams.

SELECT   *
FROM    (SELECT 'John' AS FIRST_NAME
         UNION
         SELECT 'Mark'
         UNION
         SELECT 'Arnold') AS FIRST_NAMES,
        (SELECT 'Berg' AS LAST_NAME
         UNION
         SELECT 'Johnson'
         UNION
         SELECT 'Williams') AS LAST_NAMES

The result is:

FIRST_NAME  LAST_NAME
----------  ---------
John        Berg
Mark        Berg
Arnold      Berg
John        Johnson
Mark        Johnson
Arnold      Johnson
John        Williams
Mark        Williams
Arnold      Williams

Example 7.33. For the numbers 10 to 19, find the value to the power of three. However, if the result is greater than 4,000, it should not be included in the result.

SELECT   NUMBER, POWER(NUMBER,3)
FROM    (SELECT 10 NUMBER UNION SELECT 11 UNION SELECT 12
         UNION
         SELECT 13 UNION SELECT 14 UNION SELECT 15
         UNION
         SELECT 16 UNION SELECT 17 UNION SELECT 18
         UNION
         SELECT 19) AS NUMBERS
WHERE    POWER(NUMBER,3) <= 4000

The result is:

NUMBER  POWER(NUMBER)
------  -------------
    10           1000
    11           1331
    12           1728
    13           2197
    14           2744
    15           3375

This statement works well if the numbers are limited. When we want to do the same with a hundred or more numbers, the statement would not be as simple. In that case, we could try to avoid the problem by generating a long list of numbers in a more creative way.

Example 7.34. Generate the numbers 0 up to and including 999.

SELECT   NUMBER
FROM    (SELECT   CAST(CONCAT(DIGIT1.DIGIT,
                  CONCAT(DIGIT2.DIGIT,
                  DIGIT3.DIGIT)) AS UNSIGNED INTEGER)
                  AS NUMBER
         FROM    (SELECT '0' DIGIT UNION SELECT '1' UNION
                  SELECT '2' UNION SELECT '3' UNION
                  SELECT '4' UNION SELECT '5' UNION
                  SELECT '6' UNION SELECT '7' UNION
                  SELECT '8' UNION SELECT '9') AS DIGIT1,
                 (SELECT '0' DIGIT UNION SELECT '1' UNION
                  SELECT '2' UNION SELECT '3' UNION
                  SELECT '4' UNION SELECT '5' UNION
                  SELECT '6' UNION SELECT '7' UNION
                  SELECT '8' UNION SELECT '9') AS DIGIT2,
                 (SELECT '0' DIGIT UNION SELECT '1' UNION
                  SELECT '2' UNION SELECT '3' UNION
                  SELECT '4' UNION SELECT '5' UNION
                  SELECT '6' UNION SELECT '7' UNION
                  SELECT '8' UNION SELECT '9') AS DIGIT3)
                  AS NUMBERS
ORDER BY 1

The result is:

NUMBER
------
     0
     1
     2
     :
   998
   999

Example 7.35. Find the squares of whole numbers between 0 and 999.

SELECT   NUMBER AS SQUARE, ROUND(SQRT(NUMBER)) AS BASIS
FROM    (SELECT    CAST(CONCAT(DIGIT1.DIGIT,
                   CONCAT(DIGIT2.DIGIT,
                   DIGIT3.DIGIT)) AS UNSIGNED INTEGER)
                   AS NUMBER
         FROM     (SELECT '0' DIGIT UNION SELECT '1' UNION
                   SELECT '2' UNION SELECT '3' UNION
                   SELECT '4' UNION SELECT '5' UNION
                   SELECT '6' UNION SELECT '7' UNION
                   SELECT '8' UNION SELECT '9') AS DIGIT1,
                  (SELECT '0' DIGIT UNION SELECT '1' UNION
                   SELECT '2' UNION SELECT '3' UNION
                   SELECT '4' UNION SELECT '5' UNION
                   SELECT '6' UNION SELECT '7' UNION
                   SELECT '8' UNION SELECT '9') AS DIGIT2,
                  (SELECT '0' DIGIT UNION SELECT '1' UNION
                   SELECT '2' UNION SELECT '3' UNION
                   SELECT '4' UNION SELECT '5' UNION
                   SELECT '6' UNION SELECT '7' UNION
                   SELECT '8' UNION SELECT '9') AS DIGIT3)
                   AS NUMBERS
WHERE    SQRT(NUMBER) = ROUND(SQRT(NUMBER))
ORDER BY 1

The result is:

SQUARE  BASIS
------  -----
     0      0
     1      1
     4      2
     :      :
   900     30
   961     31

Exercise 7.21:

For each player, get the difference between the year they joined the club and the year in which they were born, but return only those players of which that difference is greater than 20.

Exercise 7.22:

Get a list of all combinations of three letters that you can make with the letters a, b, c, and d.

Exercise 7.23:

Find 10 random integer numbers between 0 and 1000.

Answers

7.1

Both tables have a column called PLAYERNO.

The SELECT clause refers to the PLAYERS table even though it is not specified in the FROM clause.

7.2

The question: “Get the name of each player who is captain of a team.”

The FROM clause:

TEAMNO  PLAYERNO  DIVISION  PLAYERNO  NAME       ...
------  --------  --------  --------  ---------  ---
     1         6  first            6  Parmenter  ...
     1         6  first           44  Baker      ...
     1         6  first           83  Hope       ...
     1         6  first            2  Everett    ...
     1         6  first           27  Collins    ...
     1         6  first          104  Moorman    ...
     1         6  first            7  Wise       ...
     1         6  first           57  Brown      ...
     1         6  first           39  Bishop     ...
     1         6  first          112  Bailey     ...
     1         6  first            8  Newcastle  ...
     1         6  first          100  Parmenter  ...
     1         6  first           28  Collins    ...
     1         6  first           95  Miller     ...
     2        27  second           6  Parmenter  ...
     2        27  second          44  Baker      ...
     2        27  second          83  Hope       ...
     2        27  second           2  Everett    ...
     2        27  second          27  Collins    ...
     2        27  second         104  Moorman    ...
     2        27  second           7  Wise       ...
     2        27  second          57  Brown      ...
     2        27  second          39  Bishop     ...
     2        27  second         112  Bailey     ...
     2        27  second           8  Newcastle  ...
     2        27  second         100  Parmenter  ...
     2        27  second          28  Collins    ...
     2        27  second          95  Miller     ...

The WHERE clause:

TEAMNO  PLAYERNO  DIVISION  PLAYERNO  NAME       ...
------  --------  --------  --------  ---------  ---
     1         6  first            6  Parmenter  ...
     2        27  second          27  Collins    ...

The SELECT clause and also the end result:

NAME
---------
Parmenter
Collins

7.3

SELECT   PAYMENTNO, AMOUNT, PLAYERS.PLAYERNO, NAME
FROM     PENALTIES, PLAYERS
WHERE    PENALTIES.PLAYERNO = PLAYERS.PLAYERNO

7.4

SELECT   PAYMENTNO, NAME
FROM     PENALTIES, PLAYERS, TEAMS
WHERE    PENALTIES.PLAYERNO = TEAMS.PLAYERNO
AND      TEAMS.PLAYERNO = PLAYERS.PLAYERNO

7.5

SELECT   T.TEAMNO, P.NAME
FROM     TEAMS AS T, PLAYERS AS P
WHERE    T.PLAYERNO = P.PLAYERNO

7.6

SELECT   M.MATCHNO, P.NAME, T.DIVISION
FROM     MATCHES AS M, PLAYERS AS P, TEAMS AS T
WHERE    M.PLAYERNO = P.PLAYERNO
AND      M.TEAMNO = T.TEAMNO

7.7

SELECT   P.PLAYERNO, P.NAME
FROM     PLAYERS AS P, COMMITTEE_MEMBERS AS C
WHERE    P.PLAYERNO = C.PLAYERNO
AND      B.POSITION = 'Chairman'

7.8

SELECT   DISTINCT CM.PLAYERNO
FROM     COMMITTEE_MEMBERS AS CM, PENALTIES AS PEN
WHERE    CM.PLAYERNO = PEN.PLAYERNO
AND      CM.BEGIN_DATE = PEN.PAYMENT_DATE

7.9

SELECT   P.PLAYERNO, P.NAME
FROM     PLAYERS AS P, PLAYERS AS P27
WHERE    P.TOWN = P27.TOWN
AND      P27.PLAYERNO = 27
AND      P.PLAYERNO <> 27

7.10

SELECT   DISTINCT P.PLAYERNO AS PLAYER_PLAYERNO,
         P.NAME AS PLAYER_NAME,
         CAP.PLAYERNO AS CAPTAIN_PLAYERNO,
         CAP.NAME AS CAPTAIN_NAME
FROM     PLAYERS AS P, PLAYERS AS CAP,
         MATCHES AS M, TEAMS AS T
WHERE    M.PLAYERNO = P.PLAYERNO
AND      T.TEAMNO = M.TEAMNO
AND      M.PLAYERNO <> T.PLAYERNO
AND      CAP.PLAYERNO = T.PLAYERNO

7.11

SELECT   PEN1.PAYMENTNO, PEN1.PLAYERNO
FROM     PENALTIES AS PEN1, PENALTIES AS PEN2
WHERE    PEN1.AMOUNT = PEN2.AMOUNT
AND      PEN2.PLAYERNO = 44
AND      PEN1.PLAYERNO <> 44

7.12

SELECT   T.TEAMNO, P.NAME
FROM     TEAMS AS T INNER JOIN PLAYERS AS P
         ON T.PLAYERNO = P.PLAYERNO

7.13

SELECT   P.PLAYERNO, P.NAME
FROM     PLAYERS AS P INNER JOIN PLAYERS AS P27
         ON P.TOWN = P27.TOWN
AND      P27.PLAYERNO = 27
AND      P.PLAYERNO <> 27

7.14

SELECT   M.MATCHNO, P.NAME, T.DIVISION
FROM    (MATCHES AS M INNER JOIN PLAYERS AS P
         ON M.PLAYERNO = P.PLAYERNO)
         INNER JOIN TEAMS AS T
         ON M.TEAMNO = T.TEAMNO

7.15

SELECT   PLAYERS.PLAYERNO, PENALTIES.AMOUNT
FROM     PLAYERS LEFT OUTER JOIN PENALTIES
         ON PLAYERS.PLAYERNO = PENALTIES.PLAYERNO

7.16

SELECT   P.PLAYERNO, M.TEAMNO
FROM     PLAYERS AS P LEFT OUTER JOIN MATCHES AS M
         ON P.PLAYERNO = M.PLAYERNO

7.17

SELECT   P.PLAYERNO, M.TEAMNO
FROM    (PLAYERS AS P LEFT OUTER JOIN MATCHES AS M
         ON P.PLAYERNO = M.PLAYERNO)
         LEFT OUTER JOIN PENALTIES AS PEN
         ON P.PLAYERNO = PEN.PLAYERNO

7.18

  1. The left outer join indicates that all rows that possibly disappear from the left table (the PENALTIES table) still have to be included in the end result. But there are no rows in the PENALTIES table of which the player number does not appear in the PLAYERS table. So, the outer join in this FROM clause has no use; an inner join would return the same result.

  2. The left outer join indicates that all rows that possibly disappear from the left table (the PENALTIES table) still have to be included in the end result. In this example, rows could disappear because a greater than operator is used in the join condition. Therefore, this FROM clause serves a purpose.

  3. The right outer join indicates that all rows that possibly disappear from the right table (the MATCHES table) still have to be included in the end result. But there are no rows in the MATCHES table of which the team number does not appear in the TEAMS table. So, this FROM clause has no use; an inner join would give a similar result.

  4. The full outer join indicates that all rows that possibly disappear from the left table (the PENALTIES table) and the right table (the TEAM table) still have to be included in the end result. In this situation, it is indeed possible. So, this FROM clause is useful.

7.19

  1. T1.C  T2.C
    ----  ----
       2     3
       2     3
    
  2. T1.C  T2.C
    ----  ----
       1     ?
       2     2
       3     3
    
  3. T1.C  T2.C
    ----  ----
       2     2
       3     3
       ?     4
    
  4. T1.C  T2.C
    ----  ----
       3     2
       ?     3
       ?     4
    
  5. T1.C  T3.C
    ----  ----
       2     2
       ?     ?
    
  6. T1.C  T3.C
    ----  ----
       1     ?
       2     2
       3     ?
    
  7. T3.C  T4.C
    ----  ----
       ?     ?
       2     2
    
  8. T3.C  T4.C
    ----  ----
       ?     ?
       2     2
       ?     3
    
  9. T1.C  T2.C
    ----  ----
       1     ?
       2     2
       3     3
       ?     4
    
  10. T1.C  T2.C  T3.C
    ----  ----  ----
       1     ?     ?
       2     2     2
       3     ?     ?
       ?     3     ?
       ?     4     ?
    

7.20

Correct.

Incorrect.

Correct.

Correct.

Incorrect.

7.21

SELECT   PLAYERNO, DIFFERENCE
FROM    (SELECT   PLAYERNO,
                  JOINED - YEAR(BIRTH_DATE) AS DIFFERENCE
         FROM     PLAYERS) AS DIFFERENCES
WHERE    DIFFERENCE > 20

7.22

SELECT   LETTER1 || LETTER2 || LETTER3
FROM    (SELECT 'a' AS LETTER1 UNION SELECT 'b'
         UNION SELECT 'c' UNION SELECT 'd') AS LETTERS2,
        (SELECT 'a' AS LETTER3 UNION SELECT 'b'
         UNION SELECT 'c' UNION SELECT 'd') AS LETTERS3

7.23

SELECT   ROUND(RAND() * 1000)
FROM    (SELECT 0 NUMBER UNION SELECT 1 UNION SELECT 2
         UNION
         SELECT 3 UNION SELECT 4 UNION SELECT 5
         UNION
         SELECT 6 UNION SELECT 7 UNION SELECT 8
         UNION
         SELECT 9) AS NUMBERS
..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
3.16.130.201