Writing SQL queries to answer questions: The basics

Search in book...
Toggle Font Controls
Create new playlist

Name your new playlist

Playlist description (optional)
Sign In

Email address

Password

Forgot Password?

or

Continue with Facebook

Continue with Google
Sign Up

Full Name

Email address

Confirm Email Address

Password

or

Continue with Facebook

Continue with Google

Writing SQL queries to answer questions: The basics

You can retrieve data by using the SQL statement SELECT to specify a result table that can be derived from one or more tables. This section provides examples of SQL statements that illustrate how to code and use each clause of the SELECT statement to query a table. Examples of more advanced queries explain how to fine-tune your queries by using functions and expressions and how to query multiple tables with more complex queries that include unions, joins, and subqueries.

The best way to learn SQL is to develop SQL statements similar to these examples and then execute them dynamically by using a tool such as DB2 QMF for Windows.

The data that is retrieved through SQL is always in the form of a table. “Chapter 2. DB2 concepts” introduces some examples of result tables. Like the tables from which you retrieve the data, a result table has rows and columns. A program fetches this data one or more rows at a time.

Example: Consider this SELECT statement:

SELECT LASTNAME, FIRSTNME
 FROM EMP
 WHERE DEPT = 'D11'
 ORDER BY LASTNAME;

This SELECT statement returns the following result table:

LASTNAME     FIRSTNME
========     ========
BROWN        DAVID
LUTZ         JENNIFER
STERN        IRVING

The result tables in this book are displayed like this table.

Example tables

The examples in this chapter, unless otherwise noted, are based on the following four tables that represent information about the employees of a computer company:

The department (DEPT) table contains information about the departments to which the employees report.
The employee (EMP) table contains information about each employee.
The project (PROJ) table contains information about projects that employees work on.
The employee project activity (EMPPROJACT) table contains information about employee participation in a project.

Figure 5.6 shows the DEPT table. Each row in the DEPT table contains data for a single department: its number, its name, the employee number of its manager, and its administrative department.

Figure 5.6. Example DEPT table

Figure 5.7 shows the EMP table. Each row in the EMP table contains data for a single employee: employee number, first name, last name, the department to which the employee reports, the employee's hire date, job title, education level, salary, and commission.

Figure 5.7. Example EMP table

[View full size image]

Figure 5.8 shows the PROJ table. Each row in the PROJ table contains data for a single project: the project number and name, the department and employee who is responsible for the project, and the name of the major project that includes the project.

Figure 5.8. Example PROJ table

Figure 5.9 shows the EMPPROJACT table. Each row in the EMPPROJACT table contains data for the employee who performs an activity for a project: the employee number, project number, and the start and end dates of project activity.

Figure 5.9. Example EMPPROJACT table

Selecting data from columns: SELECT

You have several options for selecting columns from a database for your result tables. This section describes the various techniques that you can use for selecting columns.

Selecting all columns: SELECT *

You do not need to know the column names to select DB2 data. Use an asterisk (*) in the SELECT clause to indicate all columns from each selected row of the named table. DB2 selects the columns in the order in which the columns are declared in that table.

Example: Consider this query:

SELECT *
 FROM DEPT;

The result table looks like this:

DEPTNO       DEPTNAME                MGRNO         ADMRDEPT
======       =====================   ======        ========
A00          CHAIRMANS OFFICE        000010        A00
B01          PLANNING                000020        A00
C01          INFORMATION CENTER      000030        A00
D11          MANUFACTURING SYSTEMS   000060        D11
E21          SOFTWARE SUPPORT        ------        D11

This SELECT statement retrieves data from each column (SELECT *) of each retrieved row of the DEPT table. Because the example does not specify a WHERE clause, the statement retrieves data from all rows. “Filtering the number of returned rows: WHERE” on page 137 explains how to use the WHERE clause to narrow your selection.

In this example, the fifth row contains a null value because no manager is identified for this department. All examples of output in this book display dashes for null values. “Selecting rows that have null values” on page 138 has more information about null values.

SELECT * is most appropriate when used with dynamic SQL and view definitions.

Recommendation: Avoid using SELECT * in static SQL. You write static SQL applications when you know the number of columns that your application will return. That number can change outside your application. If a change occurs to the table, you need to update the application to reflect the changed number of columns in the table.

You can read more about static and dynamic SQL in “Chapter 6. Writing an application program.”

Selecting some columns: SELECT column-name

Select the columns that you want by naming each column. All columns appear in the order that you specify, not in their order in the table.

Example: Notice that the DEPT table contains the DEPTNO column before the MGRNO column. Consider this query:

SELECT MGRNO, DEPTNO
FROM DEPT;

The result table looks like this:

MGRNO       DEPTNO
======      ======
000010      A00
000020      B01
000030      C01
000060      D11
------      E21

This SELECT statement retrieves data that is contained in the two named columns of each row in the DEPT table. You can select data from 1 column or as many as 750 columns with a single SELECT statement.

Selecting derived columns: SELECT expression

You can select columns that are derived from a constant, an expression, or a function.

Example: Consider this query, which contains an expression:

SELECT EMPNO, (SALARY + COMM)
FROM EMP;

The result table looks like this:

EMPNO
======              ========
000010              56970.00
000020              44550.00
000030              41310.00
000060              34830.00
.
.
.

This query selects data from all rows in the EMP table, calculates the result of the expression, and returns the columns in the order that the SELECT statement indicates. In the result table, the derived columns, such as (SALARY + COMM), do not have names. You can use the AS clause to give names to unnamed columns. “Naming result columns: AS” on page 127 has information about the AS clause.

To order the rows in the result table by the values in a derived column, specify a name for the column by using the AS clause and use that name in the ORDER BY clause, which is described in “Putting the rows in order: ORDER BY” on page 145.

Eliminating duplicate rows: DISTINCT

The DISTINCT keyword removes redundant duplicate rows from the result table so that each row contains unique data.

Example: Consider the following query:

SELECT ADMRDEPT
 FROM DEPT;

The result table looks like this:

ADMRDEPT
========
A00
A00
A00
D11
D11

When you omit the DISTINCT keyword, the ADMRDEPT column value of each selected row is returned, even though the result table includes several duplicate rows.

Example: Compare the previous example with the following query, which uses the DISTINCT keyword to list the department numbers of the administrative departments:

SELECT DISTINCT ADMRDEPT
 FROM DEPT;

The result table looks like this:

ADMRDEPT
========
A00
D11

You can use more than one DISTINCT keyword in a single query.

Naming result columns: AS

With AS, you can name result columns in a SELECT clause. This keyword is particularly useful for a column that is derived from an expression or a function.

Example: In the following query, the expression SALARY+COMM is named TOTAL_SAL:

SELECT EMPNO, SALARY + COMM AS TOTAL_SAL
 FROM EMP
 ORDER BY TOTAL_SAL;

The result table looks like this:

EMPNO        TOTAL_SAL
======       =========
000320        21546.00
200340        25747.00
000330        27400.00
000200        29957.00
.
.
.

Notice how this result differs from the result of a similar query that didn't use an AS clause, shown in “Selecting derived columns: SELECT expression” on page 125.

Processing a SELECT statement

SELECT statements (and, in fact, SQL statements in general) are made up of a series of clauses that are defined by SQL as being executed in a certain order. You have already seen examples of the SELECT, FROM, and ORDER BY clauses. The following clause list is in its order of processing:

FROM
WHERE
GROUP BY
HAVING
SELECT
ORDER BY

In addition:

Subselects are processed from the innermost to the outermost subselect. A subselect in a WHERE clause or a HAVING clause of another SQL statement is called a subquery.
The ORDER BY clause can appear only in the outermost SELECT statement.
If you use an AS clause to define a name in the outermost SELECT clause, only the ORDER BY clause can refer to that name. If you use an AS clause in a subselect, you can refer to the name that it defines outside the subselect.

Example: Consider this SELECT statement, which is not valid:

SELECT EMPNO, (SALARY + COMM) AS TOTAL_SAL
 FROM EMP
 WHERE TOTAL_SAL > 50000;

The WHERE clause is not valid, because DB2 does not process the AS TOTAL_SAL portion of the statement until after the WHERE clause is processed. Therefore, DB2 does not recognize the name TOTAL_SAL that the AS clause defines.

The following SELECT statement, however, is valid because the ORDER BY clause refers to the column name TOTAL_SAL that the AS clause defines:

SELECT EMPNO, (SALARY + COMM) AS TOTAL_SAL
 FROM EMP
 ORDER BY TOTAL_SAL;

Later sections in this chapter examine each clause in more detail and introduce more advanced topics.

Accessing DB2 data that is not in a table

You can access DB2 data that is not in a table by returning the value of an SQL expression (which does not include a column of a table) in a host variable in the following ways.

Set the contents of a host variable to the value of an expression by using the SET host-variable assignment statement.
Example:
```
EXEC SQL SET :HVRANDVAL = RAND(:HVRAND);
```
In addition, you can use the VALUES INTO statement to return the value of an expression in a host variable.
Example:
```
EXEC SQL VALUES RAND(:HVRAND)
INTO :HVRANDVAL;
```

“Accessing data with host variables, variable arrays, and structures” on page 187 provides more information about host variables.

Using functions and expressions

You can use functions and expressions to control the appearance and values of rows and columns in your result tables. This section introduces you to some of the many built-in functions that DB2 offers, including aggregate functions and scalar functions. A built-in function is a function that is supplied with DB2 UDB for z/OS.

You can also read about user-defined functions and CASE expressions in this section.

Concatenating strings: CONCAT

You can concatenate strings by using the CONCAT operator or the CONCAT built-in function. When the operands of two strings are concatenated, the result of the expression is a string. The operands of concatenation must be compatible strings.

Example: Consider this query:

SELECT LASTNAME CONCAT ',' CONCAT FIRSTNME
 FROM EMP;

This SELECT statement concatenates the last name, a comma, and the first name of each result row. The result table looks like this:

================
HAAS,CHRISTINE
THOMPSON,MICHAEL
KWAN,SALLY
STERN,IRVING
.
.
.

Alternative syntax for the SELECT statement shown above is as follows:

SELECT LASTNAME CONCAT(CONCAT(LASTNAME,','),FIRSTNME)
 FROM EMP;

In this case, the SELECT statement concatenates the last name and then concatenates that result to the first name.

Calculating values in a column or across columns

You can perform calculations on numeric or datetime data. The numeric data types are binary integer, floating-point, and decimal. The datetime data types are date, time, and timestamp. “Choosing a data type for the column” on page 224 has more information about data types.

You can retrieve calculated values, just as you display column values, for selected rows.

Example: Consider this query:

SELECT EMPNO,
 SALARY / 12 AS MONTHLY_SAL,
 SALARY / 52 AS WEEKLY_SAL
 FROM EMP
 WHERE DEPT = 'A00';

The result table looks like this:

EMPNO            MONTHLY_SAL    WEEKLY_SAL
======         ============= =============
000010         4395.83333333 1014.42307692
000120         2437.50000000  562.50000000
200010         3875.00000000  894.23076923

The result table displays the monthly and weekly salaries of employees in department A00. If you prefer results with only two digits to the right of the decimal point, you can use the DECIMAL function (described in “Returning a single value from a single value: Scalar functions” on page 132).

Example: To retrieve the department number, employee number, salary, and commission for those employees whose combined salary and commission is greater than $45,000, write the query as follows:

SELECT DEPT, EMPNO, SALARY, COMM
 FROM EMP
 WHERE SALARY + COMM > 45000;

The result table looks like this:

DEPT     EMPNO         SALARY         COMM
====     ======      ========      =======
A00      000010      52750.00      4220.00
A00      200010      46500.00      4220.00

Calculating aggregate values: Aggregate functions

An aggregate function is an operation that derives its result by using values from one or more rows. An aggregate function is also known as a column function. The argument of an aggregate function is a set of values that are derived from an expression. You can use the SQL aggregate functions to calculate values based on entire columns of data. The calculated values are from only selected rows (all rows that satisfy the WHERE clause).

You can use the following aggregate functions:

SUM

Returns the total value.

MIN

Returns the minimum value.

AVG

Returns the average value.

MAX

Returns the maximum value.

COUNT

Returns the number of selected rows.

XMLAGG

Returns a concatenation of XML elements from a collection of XML elements.

Example: This query calculates, for department D11, the sum of employee salaries, the minimum, average, and maximum salary, and the count of employees in the department:

SELECT SUM(SALARY) AS SUMSAL,
 MIN(SALARY) AS MINSAL,
 AVG(SALARY) AS AVGSAL,
 MAX(SALARY) AS MAXSAL,
 COUNT(*) AS CNTSAL
FROM EMP
WHERE DEPT = 'A00';

The result table looks like this:

   SUMSAL      MINSAL            AVGSAL      MAXSAL   CNTSAL
=========    ========    ==============    ========   ======
128500.00    29250.00    42833.33333333    52750.00        3

You can use (*) in the COUNT and COUNT_BIG functions. In this example, COUNT(*) returns the rows that DB2 processes according to the WHERE clause.

You can use DISTINCT with the SUM, AVG, and COUNT functions. DISTINCT means that the selected function operates on only the unique values in a column.

Recommendation: Do not use DISTINCT with the MAX and MIN functions because using it does not affect the result of those functions.

You can use SUM and AVG only with numbers. You can use MIN, MAX, and COUNT with any built-in data type.

Example: This query counts the number of employees that are described in the EMP table:

SELECT COUNT(*)
 FROM EMP;

Example: This query counts the different jobs in the EMP table:

SELECT COUNT(DISTINCT JOB)
 FROM EMP;

Aggregate functions like COUNT ignore nulls in the values on which they operate. The preceding example counts job values that are not null.

Returning a single value from a single value: Scalar functions

Like an aggregate function, a scalar function produces a single value. Unlike the argument of an aggregate function, an argument of a scalar function is a single value.

Example: This query, which uses the YEAR scalar function, returns the year in which each employee in a particular department was hired:

SELECT YEAR(HIREDATE) AS HIREYEAR
 FROM EMP
 WHERE DEPT = 'A00';

The result table looks like this:

HIREYEAR
========
    1975
    1990
    1985

The scalar function YEAR produces a single scalar value for each row of EMP that satisfies the search condition. In this example, three rows satisfy the search condition, so YEAR results in three scalar values.

DB2 offers almost 100 different scalar functions. In addition to YEAR, this section provides examples of the CHAR, DECIMAL, and NULLIF scalar functions:

CHAR

The CHAR function returns a string representation of the input value.

DECIMAL

The DECIMAL function returns a decimal representation of the input value.

NULLIF

The NULLIF function returns a null value if the two arguments of the function are equal. If the arguments are not equal, NULLIF returns the value of the first argument.

Example: CHAR: The following SQL statement sets the host variable AVERAGE to the character string representation of the average employee salary:

SELECT CHAR(AVG(SALARY))
 INTO :AVERAGE
 FROM EMP;

Example: DECIMAL: Assume that you want to change the decimal data type to return a value with a precision and scale that you prefer. The following example represents the average salary of employees as an eight-digit decimal number (the precision) with two of these digits to the right of the decimal point (the scale):

SELECT DECIMAL(AVG(SALARY),8,2)
 FROM EMP;

The result table looks like this:

=========
 32602.30

Example: NULLIF: Suppose that you want to calculate the average earnings of all employees who are eligible to receive a commission. All eligible employees have a commission of greater than 0, and other employees have a value of 0 for commission:

SELECT AVG(SALARY+NULLIF(COMM,0))
 AS "AVERAGE EARNINGS"
 FROM EMP;

The result table looks like this:

AVERAGE EARNINGS
================
   35248.8461538

Specifying a simple expression for the sum of the salary and commission in the select list would include all employees in the calculation of the average. To avoid including in the average those employees who do not earn a commission, you can use the NULLIF function to return a null value instead. The result of adding a null value for the commission to SALARY is itself a null value, and aggregate functions, like AVG, ignore null values. Therefore, this use of NULLIF inside AVG causes the query to exclude each row in which the employee is not eligible for a commission.

Nesting aggregate and scalar functions

You can nest functions in the following ways:

Scalar functions within scalar functions
Example: Suppose that you want to know the month and day of hire for a particular employee in department D11. Suppose also that you want the result in USA format (mm/dd/yyyy). (You can read about the different date formats in Table 7.4 on page 229.) Use this query:
```
SELECT SUBSTR((CHAR(HIREDATE, USA)),1,5)
 FROM EMP
 WHERE LASTNAME = 'BROWN' AND DEPT = 'D11';
```
The result table looks like this:
```
=====
03/03
```
Scalar functions within aggregate functions
In some cases, you might need to invoke a scalar function from within an aggregate function.
Example: Suppose that you want to know the average number of years of employment for employees in department A00. Use this query:
```
SELECT AVG(DECIMAL(YEAR(CURRENT DATE - HIREDATE)))
 FROM EMP
 WHERE DEPT = 'A00';
```
The result table looks like this:
```
=======
20.6666
```

The actual form of the result, 20.6666, depends on how you define the host variable to which you assign the result.

Aggregate functions within scalar functions
Example: Suppose that you want to know the year in which the last employee was hired in department E21. Use this query:
```
SELECT YEAR(MAX(HIREDATE))
 FROM EMP
 WHERE DEPT = 'E21';
```
The result table looks like this:
```
====
2002
```

Using user-defined functions

With DB2, you can define your own functions. For example, if DB2 scalar functions don't meet your needs, you can define and write a user-defined function to perform that operation. User-defined functions are small programs that you explicitly create by using a CREATE FUNCTION statement. You name the function and specify its semantics so that the function satisfies your specific programming needs. You can use a user-defined function wherever you can use a built-in function.

Example: Assume that you define a distinct type called US_DOLLAR (as in the example in “Defining and using distinct types” on page 232). You might want to allow instances of US_DOLLAR to be added. You can create a user-defined function that uses a built-in addition operation and takes instances of US_DOLLAR as input. This kind of function, called a sourced function, requires no application coding. Or, you might create a more complex user-defined function that can take a US_DOLLAR instance as input and then convert from U.S. dollars to another currency.

“Defining user-defined functions” on page 278 has information about implementing user-defined functions.

Using CASE expressions

With a CASE expression, an SQL expression can be executed in several different ways, depending on the value of a search condition.

One use of a CASE expression is to replace the values in a result table with more meaningful values.

Example: Suppose that you want to display the employee number, name, and education level of all field representatives in the EMP table. Education levels are stored in the EDL column as small integers, but you want to replace the values in this column with more descriptive phrases. Use a query like this:

SELECT EMPNO, FIRSTNME, LASTNAME,
CASE
 WHEN EDL<=12 THEN 'HIGH SCHOOL OR LESS'
 WHEN EDL>12 AND EDL<=14 THEN 'JUNIOR COLLEGE'
 WHEN EDL>14 AND EDL<=17 THEN 'FOUR-YEAR COLLEGE'
 WHEN EDL>17 THEN 'GRADUATE SCHOOL'
 ELSE 'UNKNOWN'
END
AS EDUCATION
FROM EMP
WHERE JOB='FLD';

The result table looks like this:

EMPNO         FIRSTNME        LASTNAME     EDUCATION
======        ========        ========     =================
000320        RAMLAL          MEHTA        FOUR-YEAR COLLEGE
000330        WING            LEE          JUNI0R COLLEGE
200340        ROY             ALONZO       FOUR-YEAR COLLEGE

The CASE expression replaces each small integer value of EDL with a description of the amount of each field representative's education. If the value of EDL is null, the CASE expression substitutes the word UNKNOWN.

Another use of a CASE expression is to prevent undesirable operations, such as division by zero, from being performed on column values.

Example: If you want to determine the ratio of employees' commissions to their salaries, you could execute this query:

SELECT EMPNO, DEPT,
 COMM/SALARY AS "COMMISSION/SALARY",
 FROM EMP;

This SELECT statement has a problem, however. If an employee has not earned any salary, a division-by-zero error occurs. By modifying the following SELECT statement with a CASE expression, you can avoid division by zero:

SELECT EMPNO, DEPT,
 (CASE WHEN SALARY=0 THEN NULL
  ELSE COMM/SALARY
 END) AS "COMMISSION/SALARY"
FROM EMP;

The CASE expression determines the ratio of commission to salary only if the salary is not zero. Otherwise, DB2 sets the ratio to a null value.

Filtering the number of returned rows: WHERE

Use a WHERE clause to select the rows that are of interest to you. For example, suppose you want to select only the rows that represent the employees who earn a salary greater than $40,000. A WHERE clause specifies a search condition. A search condition is the criteria that DB2 uses to select rows. For any given row, the result of a search condition is true, false, or unknown. If the search condition evaluates to true, the row qualifies for additional processing. In other words, that row can become a row of the result table that the query returns. If the condition evaluates to false or unknown, the row does not qualify for additional processing.

A search condition consists of one or more predicates that are combined through the use of the logical operators AND, OR, and NOT. An individual predicate specifies a test that you want DB2 to apply to each row, for example, SALARY > 40000. When DB2 evaluates a predicate for a row, it evaluates to true, false, or unknown. Results are unknown only if a value (called an operand) of the predicate is null. If a particular employee's salary is not known (and is set to null), the result of the predicate SALARY > 40000 is unknown.

The sections that follow explain different comparison operators that you can use in the predicate of a WHERE clause. Table 5.1 lists the comparison operators.

Table 5.1. Comparison operators used in conditions
Type of comparison	Specified with...	Example of predicate with comparison
Equal to null	IS NULL	COMM IS NULL
Equal to	=	DEPTNO = 'X01'
Not equal to	<>	DEPTNO <> 'X01'
Less than	<	AVG(SALARY) < 30000
Less than or equal to	<=	SALARY <= 50000
Greater than	>	SALARY > 25000
Greater than or equal to	>=	SALARY >= 50000
Similar to another value	LIKE	NAME LIKE '%STERN%' or STATUS LIKE 'N_'
At least one of two predicates	OR	HIREDATE < '2000-01-01' OR SALARY < 40000
Both of two predicates	AND	HIREDATE < '2000-01-01' AND SALARY < 40000
Between two values	BETWEEN	SALARY BETWEEN 20000 AND 40000
Equals a value in a set	IN (X, Y, Z)	DEPTNO IN ('B01', 'C01', 'D11')
Note: Another predicate, EXISTS, tests for the existence of certain rows. The result of the predicate is true if the result table that is returned by the subselect contains at least one row. Otherwise, the result is false.

You can also search for rows that do not satisfy one of the predicates in Table 5.1 by using the NOT keyword before the specified predicate. “Using the NOT keyword with comparison operators” on page 139 has more information about using the NOT keyword.

Selecting rows that have null values

A null value indicates the absence of a column value in a row. A null value is not the same as zero or all blanks.

Example: You can use a WHERE clause to retrieve rows that contain a null value in a specific column. Specify:

WHERE column-name IS NULL

Example: You can also use a predicate to exclude null values. Specify:

WHERE column-name IS NOT NULL

You cannot use the equal sign to retrieve rows that contain a null value. (WHERE column-name = NULL is not allowed.)

Selecting rows by using equalities and inequalities

You can use an equal sign (=), various inequality symbols, and the NOT keyword to specify search conditions in the WHERE clause.

Testing for equality

You can use an equal sign (=) to select rows for which a specified column contains a specified value.

Example: To select only the rows in which the department number is A00, use WHERE DEPT = 'A00' in your SELECT statement:

SELECT FIRSTNME, LASTNAME
 FROM EMP
 WHERE DEPT = 'A00';

This query retrieves the first and last name of each employee in department A00.

Testing for inequalities

You can use the following inequalities to specify search conditions:

<>  <  <=  >  >=

Example: To select all employees that were hired before January 1, 2001, you can use this query:

SELECT HIREDATE, FIRSTNME, LASTNAME
 FROM EMP
 WHERE HIREDATE < '2001-01-01';

This SELECT statement retrieves the hire date and name for each employee that was hired before 2001.

Testing for equality or inequality in a set of columns

You can also use the equal operator or the not equal operator to test whether a set of columns is equal or not equal to a set of values.

Example: To select the rows in which the department number is A00 and the education level is 14, you can use this query:

SELECT FIRSTNME, LASTNAME
 FROM EMP
 WHERE (DEPT, EDL) = ('A00', 14);

Example: To select the rows in which the department number is not A00 and the education level is not 14, you can use this query:

SELECT FIRSTNME, LASTNAME
 FROM EMP
 WHERE (DEPT, EDL) <> ('A00', 14);

Using the NOT keyword with comparison operators

You can use the NOT keyword to select all rows for which the predicate is false (but not rows for which the predicate is unknown). The NOT keyword must precede the predicate.

Example: To select all managers whose compensation is not greater than $40,000, use this query:

SELECT DEPT, EMPNO
 FROM EMP
 WHERE NOT (SALARY + COMM) > 40000 AND JOB = 'MGR'
 ORDER BY DEPT;

Table 5.2 contrasts WHERE clauses that use a NOT keyword with comparison operators and WHERE clauses that use only comparison operators. The WHERE clauses are equivalent.

Table 5.2. Equivalent WHERE clauses
Using NOT	Equivalent clause without NOT
WHERE NOT DEPTNO = 'A00'	WHERE DEPTNO <> 'A00'
WHERE NOT DEPTNO < 'A00'	WHERE DEPTNO >= 'A00'
WHERE NOT DEPTNO > 'A00'	WHERE DEPTNO <= 'A00'
WHERE NOT DEPTNO <> 'A00'	WHERE DEPTNO = 'A00'
WHERE NOT DEPTNO <= 'A00'	WHERE DEPTNO > 'A00'
WHERE NOT DEPTNO >= 'A00'	WHERE DEPTNO < 'A00'

You cannot use the NOT keyword directly preceding equality and inequality comparison operators.

Example: The following WHERE clause results in an error:

Wrong:

WHERE DEPT NOT = 'A00'

Example: The following two clauses are equivalent:

Correct:

WHERE MGRNO NOT IN ('000010', '000020')
WHERE NOT MGRNO IN ('000010', '000020')

Selecting values similar to a character string

Use LIKE to specify a character string that is similar to the column value of rows that you want to select:

Use a percent sign (%) to indicate any string of zero or more characters.
Use an underscore (_) to indicate any single character.

A LIKE pattern must match the character string in its entirety.

You can also use NOT LIKE to specify a character string that is not similar to the column value of rows that you want to select.

Selecting values similar to a string of unknown characters

The percent sign (%) means “any string or no string.”

Example: The following query selects data from each row for employees with the initials D B:

SELECT FIRSTNME, LASTNAME, DEPT
 FROM EMP
 WHERE FIRSTNME LIKE 'D%' AND LASTNAME LIKE 'B%';

Example: The following query selects data from each row of the department table, where the department name contains “CENTER” anywhere in its name:

SELECT DEPTNO, DEPTNAME
 FROM DEPT
 WHERE DEPTNAME LIKE '%CENTER%';

Example: Assume that the DEPTNO column is a three-character column of fixed length. You can use the following search condition to return rows with department numbers that begin with E and end with 1:

...WHERE DEPTNO LIKE 'E%1';

In this example, if E1 is a department number, its third character is a blank and does not match the search condition. If you define the DEPTNO column as a three-character column of varying length instead of fixed length, department E1 would match the search condition. Varying-length columns can have any number of characters, up to and including the maximum number that was specified when the column was created. “String data types” on page 225 has more information about varying-length and fixed-length columns.

Example: The following query selects data from each row of the department table, where the department number starts with an E and contains a 1:

SELECT DEPTNO, DEPTNAME
 FROM DEPT
 WHERE DEPTNO LIKE 'E%1%';

Selecting a value similar to a single unknown character

The underscore (_) means “any single character.”

Example: Consider the following query:

SELECT DEPTNO, DEPTNAME
 FROM DEPT
 WHERE DEPTNO LIKE 'E_1';

In this example, 'E_1' means E, followed by any character, followed by 1. (Be careful: '_' is an underscore character, not a hyphen.) 'E_1' selects only three-character department numbers that begin with E and end with 1; it does not select the department number 'E1'.

Selecting rows that meet more than one condition

You can use AND and OR to combine predicates. Use AND to specify that a search must satisfy both of the conditions. Use OR to specify that the search must satisfy at least one of the conditions.

Example: This query retrieves the employee number, hire date, and salary for each employee who was hired before 1998 and earns a salary of less than $35,000 per year:

SELECT EMPNO, HIREDATE, SALARY
 FROM EMP
 WHERE HIREDATE < '1998-01-01' AND SALARY < 35000;

Example: This query retrieves the employee number, hire date, and salary for each employee who either was hired before 1998 or earns a salary less than $35,000 per year or both:

SELECT EMPNO, HIREDATE, SALARY
 FROM EMP
 WHERE HIREDATE < '1998-01-01' OR SALARY < 35000;

Using parentheses with AND and OR

If you use more than two conditions with AND or OR, you can use parentheses to specify the order in which you want DB2 to evaluate the search conditions. If you move the parentheses, the meaning of the WHERE clause can change significantly.

Example: This query retrieves the row of each employee that satisfies at least one of the following conditions:

The employee's hire date is before 1998 and salary is less than $40,000.
The employee's education level is less than 18.

SELECT EMPNO
 FROM EMP
 WHERE (HIREDATE < '1998-01-01' AND SALARY < 40000) OR (EDL < 18);

Example: This query retrieves the row of each employee that satisfies both of the following conditions:

The employee's hire date is before 1998.
The employee's salary is less than $40,000 or the employee's education level is less than 18.

SELECT EMPNO
 FROM EMP
 WHERE HIREDATE < '1998-01-01' AND (SALARY < 40000 OR EDL < 18);

Example: This query retrieves the employee number of each employee that satisfies one of the following conditions:

Hired before 1998 and salary is less than $40,000.
Hired after January 1, 1998, and salary is greater than $40,000.

SELECT EMPNO
 FROM EMP
 WHERE (HIREDATE < '1998-01-01' AND SALARY < 40000)
 OR (HIREDATE > '1998-01-01' AND SALARY > 40000);

Using NOT with AND and OR

When you use NOT with AND and OR, the placement of the parentheses is important.

Example: This query retrieves the employee number, education level, and job title of each employee who satisfies both of the following conditions:

The employee's salary is less than $50,000.
The employee's education level is less than 18.

SELECT EMPNO, EDL, JOB
 FROM EMP
 WHERE NOT (SALARY >= 50000) AND (EDL < 18);

In this query, NOT affects only the first search condition (SALARY >= 50000).

Example: This query retrieves the employee number, education level, and job title of each employee who satisfies at least one of the following conditions:

The employee's salary is less than or equal to $50,000.
The employee's education level is less than or equal to 18.

SELECT EMPNO, EDL, JOB
 FROM EMP
 WHERE NOT (SALARY > 50000 AND EDL > 18);

To negate a set of predicates, enclose the entire set in parentheses and precede the set with the NOT keyword.

Using BETWEEN to specify ranges to select

You can use BETWEEN to select rows in which a column has a value within two limits.

Specify the lower boundary of the BETWEEN predicate first, and then specify the upper boundary. The limits are inclusive.

Example: Suppose that you specify the following WHERE clause in which the value of the column-name column is an integer:

WHERE column-name BETWEEN 6 AND 8

DB2 selects all rows whose column-name value is 6, 7, or 8. If you specify a range from a larger number to a smaller number (for example, BETWEEN 8 AND 6), the predicate is always false.

Example: This query retrieves the department number and manager number of each department whose number is between C00 and D31:

SELECT DEPTNO, MGRNO
 FROM DEPT
 WHERE DEPTNO BETWEEN 'C00' AND 'D31';

You can also use NOT BETWEEN to select rows in which a column has a value that is outside the two limits.

Using IN to specify values in a list

You can use the IN predicate to select each row that has a column value that is equal to one of several listed values.

In the values list after the IN predicate, the order of the items is not important and does not affect the ordering of the result. Enclose the entire list of values in parentheses, and separate items by commas; the blanks are optional.

Example: This query retrieves the department number and manager number for departments B01, C01, and D11:

SELECT DEPTNO, MGRNO
 FROM DEPT
 WHERE DEPTNO IN ('B01', 'C01', 'D11'),

Using the IN predicate gives the same results as a much longer set of conditions that are separated by the OR keyword.

Example: You can alternatively code the WHERE clause in the SELECT statement in the previous example as:

WHERE DEPTNO = 'B01' OR DEPTNO = 'C01' OR DEPTNO = 'D11';

However, the IN predicate saves coding time and is easier to understand.

Example: This query finds the projects that do not include employees in department C01 or E21:

SELECT PROJNO, PROJNAME, RESPEMP
 FROM PROJ
 WHERE DEPTNO NOT IN ('C01', 'E21'),

Putting the rows in order: ORDER BY

To retrieve rows in a specific order, use the ORDER BY clause. Using ORDER BY is the only way to guarantee that your rows are in the sequence in which you want them. The following sections show you how to use the ORDER BY clause.

Specifying the sort key

The order of the selected rows depends on the sort keys that you identify in the ORDER BY clause. A sort key can be a column name, an integer that represents the number of a column in the result table, or an expression. You can identify more than one column.

You can list the rows in ascending or descending order. Null values appear last in an ascending sort and first in a descending sort.

DB2 sorts strings in the collating sequence associated with the encoding scheme of the table. DB2 sorts numbers algebraically and sorts datetime values chronologically.

Listing rows in ascending order

To retrieve the result in ascending order, specify ASC.

Example: This query retrieves the employee numbers, last names, and hire dates of employees in department A00 in ascending order of hire dates:

SELECT EMPNO, LASTNAME, HIREDATE
 FROM EMP
 WHERE DEPT = 'A00'
 ORDER BY HIREDATE ASC;

The result table looks like this:

EMPNO        LASTNAME      HIREDATE
======       =========     ==========
000010       HAAS          1975-01-01
200010       HEMMINGER     1985-01-01
000120       CONNOR        1990-12-05

This SELECT statement shows the seniority of employees. ASC is the default sorting order.

Listing rows in descending order

To put the rows in descending order, specify DESC.

Example: This query retrieves department numbers, last names, and employee numbers in descending order of department number:

SELECT DEPT, LASTNAME, EMPNO
 FROM EMP
 WHERE JOB = 'SLS'
 ORDER BY DEPT DESC;

The result table looks like this:

DEPT  LASTNAME    EMPNO
====  =========   ======
C01   NICHOLLS    000140
A00   HEMMINGER   200010
A00   CONNOR      000120

Ordering with more than one column as the sort key

To order rows by more than one column's values, you can specify more than one column name in the ORDER BY clause.

When several rows have the same first ordering column value, those rows are in order of the second column that you identify in the ORDER BY clause, and then in the third ordering column, and so on.

Example: Consider this query:

SELECT JOB, EDL, LASTNAME
 FROM EMP
 WHERE DEPT = 'A00'
 ORDER BY JOB, EDL;

The result table looks like this:

JOB   EDL    LASTNAME
====  ===    ==========
PRES  18     HAAS
SLS   14     CONNOR
SLS   18     HEMMMINGER

Ordering with an expression as the sort key

In addition to specifying a column name or integer, you can specify an expression with operators as the sort key for the result table of a SELECT statement. The query to which ordering is applied must be a subselect to use this form of the sort key for the ORDER BY clause.

Example: This query is a part of a subselect. The query retrieves the employee numbers, salaries, commissions, and total compensation (salary plus commission) for employees with a total compensation greater than $40,000. Order the results by total compensation:

SELECT EMPNO, SALARY, COMM, SALARY+COMM AS "TOTAL COMP"
 FROM EMP
 WHERE SALARY+COMM > 40000
 ORDER BY SALARY+COMM;

The result table looks like this:

EMPNO          SALARY           COMM       TOTAL COMP
======       ========        =======       ==========
000030       38250.00        3060.00         41310.00
000020       41250.00        3300.00         44550.00
200010       46500.00        4220.00         50720.00
000010       52750.00        4220.00         56970.00

Summarizing group values: GROUP BY

Use GROUP BY to group rows by the values of one or more columns. You can then apply aggregate functions to each group. In Version 8, you can use an expression in the GROUP BY clause to specify how to group the rows.

Except for the columns that are named in the GROUP BY clause, the SELECT statement must specify any other selected columns as an operand of one of the aggregate functions.

Example: This query lists, for each department, the lowest and highest education level within that department:

SELECT DEPT, MIN(EDL), MAX(EDL)
 FROM EMP
 GROUP BY DEPT;

The result table looks like this:

DEPT
====   ==     ==
A00    14     18
B01    18     18
C01    18     20
D11    16     18
E21    14     16

If a column that you specify in the GROUP BY clause contains null values, DB2 considers those null values to be equal, and all nulls form a single group.

Within the SELECT statement, the GROUP BY clause follows the FROM clause and any WHERE clause and it precedes the HAVING and ORDER BY clauses.

You can also group the rows by the values of more than one column.

Example: This query finds the average salary for employees with the same job in departments D11 and E21:

SELECT DEPT, JOB, AVG(SALARY) AS AVG_SALARY
 FROM EMP
 WHERE DEPT IN ('D11', 'E21')
 GROUP BY DEPT, JOB;

The result table looks like this:

DEPT   JOB    AVG_SALARY
====   ===    ==============
D11    DES    28790.00000000
D11    MGR    32250.00000000
E21    FLD    23053.33333333

In this example, DB2 groups the rows first by department number and next (within each department) by job before deriving the average salary value for each group.

Example: This query finds the average salary for all employees that were hired in the same year. You can use the following subselect to group the rows by the year of hire:

SELECT AVG(SALARY), YEAR(HIREDATE)
 FROM EMP
 GROUP BY YEAR(HIREDATE);

Subjecting groups to conditions: HAVING

Use HAVING to specify a search condition that each retrieved group must satisfy. The HAVING clause acts like a WHERE clause for groups, and it can contain the same kind of search conditions that you can specify in a WHERE clause. The search condition in the HAVING clause tests properties of each group rather than properties of individual rows in the group.

Example: Consider this query:

SELECT DEPT, AVG(SALARY) AS AVG_SALARY
 FROM EMP
 GROUP BY DEPT
 HAVING COUNT(*) > 1
 ORDER BY DEPT;

The result table looks like this:

DEPT             AVG_SALARY
====         ==============
A00          42833.33333333
C01          31696.66666666
D11          29943.33333333
E21          23053.33333333

Compare the preceding example with the first example in “Summarizing group values: GROUP BY” on page 148. The HAVING COUNT(*) > 1 clause ensures that only departments with more than one member are displayed. (In this case, department B01 is not displayed because it consists of only one employee.)

Example: You can use the HAVING clause to retrieve the average salary and minimum education level of employees that were hired after 1990 and that report to departments in which the education level of all employees is greater than or equal to 14. Assuming that you want results only from departments A00 and D11, the following SQL statement tests the group property, MIN(EDL):

SELECT DEPT, AVG(SALARY) AS AVG_SALARY,
 MIN(EDL) AS MIN_EDL
 FROM EMP
 WHERE HIREDATE >= '1990-01-01' AND DEPT IN ('A00', 'D11')
 GROUP BY DEPT
 HAVING MIN(EDL) >= 14;

The result table looks like this:

DEPT  AVG_SALARY       MIN_EDL
====  ==============   =======
A00   29250.00000000   14
D11   29943.33333333   16

When you specify both GROUP BY and HAVING, the HAVING clause must follow the GROUP BY clause in the syntax. A function in a HAVING clause can include multiple occurrences of the DISTINCT clause. You can also connect multiple predicates in a HAVING clause with AND and OR, and you can use NOT for any predicate of a search condition. “Selecting rows that meet more than one condition” on page 142 provides many examples of statements that use AND, OR, and NOT.

Merging lists of values: UNION

A union is an SQL operation that combines the results of two SELECT statements to form a single result table. When DB2 encounters the UNION keyword, it processes each SELECT statement to form an interim result table. DB2 then combines the interim result table of each statement. If you use UNION to combine two columns with the same name, the corresponding column of the result table inherits that name.

You can use the UNION keyword to obtain distinct rows in the result table of a union, or you can use UNION with the optional keyword ALL to obtain all rows, including duplicates.

Eliminating duplicates

Use UNION to eliminate duplicates when merging lists of values that are obtained from several tables. The following example combines values from the EMP table and the EMPPROJACT table.

Example: List the employee numbers of all employees for which either of the following statements is true:

The employee's department number begins with 'D'.
The employee is assigned to projects whose project numbers begin with 'MA'.

SELECT EMPNO FROM EMP
 WHERE DEPT LIKE 'D%'
 UNION
 SELECT EMPNO FROM EMPPROJACT
 WHERE PROJNO LIKE 'MA%';

The result table looks like this:

EMPNO
======
000010
000020
000060
000200
000220

The result is the union of two result tables, one formed from the EMP table, the other formed from the EMPPROJACT table. The result, a one-column table, is a list of employee numbers. The entries in the list are distinct.

Keeping duplicates

If you want to keep duplicates in the result of a union, specify the optional keyword ALL after the UNION keyword.

Example: Replace the UNION keyword in the previous example with UNION ALL:

SELECT EMPNO FROM EMP
 WHERE DEPT LIKE 'D%'
 UNION ALL
 SELECT EMPNO FROM EMPPROJACT
 WHERE PROJNO LIKE 'MA%';

The result table looks like this:

EMPNO
======
000220
000200
000060
000010
000020
000010

Now, 000010 appears in the list more than once because this employee works in a department that begins with 'D' and also works on a project that begins with 'MA'.

Joining data from more than one table

Sometimes the information that you want to see is not in a single table. To form a row of the result table, you might want to retrieve some column values from one table and some column values from another table. You can use a SELECT statement to retrieve and join column values from two or more tables into a single row. A join operation allows retrieval of data from up to 255 tables. Retrieval is based on a specified condition, usually of matching column values.

This section provides examples to distinguish the different types of joins.

Example tables

The majority of examples in this section use two example tables: the parts table (PARTS) and the products table (PRODUCTS), both of which consist of hardware supplies.

Figure 5.10 shows that each row in the PARTS table contains data for a single part: the part name, the part number, and the supplier of the part.

Figure 5.10. Example PARTS table

Figure 5.11 shows that each row in the PRODUCTS table contains data for a single product: the product number, name, and price.

Figure 5.11. Example PRODUCTS table

Overview of joins

The main ingredient of a join is, typically, matching column values in rows of each table that participates in the join. The result of a join associates rows from one table with rows from another table. Depending on the type of join operation, some rows might be formed that contain column values in one table that do not match column values in another table.

A joined table specifies an intermediate result table that is the result of either an inner join or an outer join. The table is derived by applying one of the join operators—INNER, LEFT OUTER, or RIGHT OUTER—to its operands.

DB2 supports inner joins and outer joins (left, right, and full).

Inner join: Combines each row of the left table with each row of the right table, keeping only the rows in which the join condition is true.
Outer join: Includes the rows that are produced by the inner join, plus the missing rows, depending on the type of outer join:
- Left outer join: Includes the rows from the left table that were missing from the inner join.

- Right outer join: Includes the rows from the right table that were missing from the inner join.

- Full outer join: Includes the rows from both tables that were missing from the inner join.

Figure 5.12 shows the ways to combine the PARTS and PRODUCTS tables by using outer join functions. The illustration is based on a subset of columns in each table.

Figure 5.12. Outer joins of two tables. Each join is on column PROD#.

[View full size image]

An inner join consists of rows that are formed from the PARTS and PRODUCTS tables, based on matching the equality of column values between the PROD# column in the PARTS table and the PROD# column in the PRODUCTS table. The inner join does not contain any rows that are formed from unmatched columns when the PROD# columns are not equal.

You can specify joins in the FROM clause of a query. Data from the rows that satisfy the search conditions are joined from all the tables to form the result table.

The result columns of a join have names if the outermost SELECT list refers to base columns. However, if you use a function (such as COALESCE) to build a column of the result, that column does not have a name unless you use the AS clause in the SELECT list. (You can read about the COALESCE function later in this section.)

Inner join

To request an inner join, run a SELECT statement in which you specify the tables that you want to join in the FROM clause and specify a WHERE clause or an ON clause to indicate the join condition. The join condition can be any simple or compound search condition that does not contain a subquery reference. (You can read more about subqueries in “Using subqueries” on page 160.)

In the simplest type of inner join, the join condition is column1=column2.

Example: You can join the PARTS and PRODUCTS tables on the PROD# column to form a table of parts with their suppliers and the products that use the parts.

Consider the two following SELECT statements:

SELECT PART, SUPPLIER, PARTS.PROD#, PRODUCT
 FROM PARTS, PRODUCTS
 WHERE PARTS.PROD# = PRODUCTS.PROD#;

SELECT PART, SUPPLIER, PARTS.PROD#, PRODUCT
 FROM PARTS INNER JOIN PRODUCTS
 ON PARTS.PROD# = PRODUCTS.PROD#;

Either of these statements gives this result:

PART            SUPPLIER            PROD#    PRODUCT
=======         ============        =====    =========
WIRE            ACWF                10       GENERATOR
MAGNETS         BATEMAN             10       GENERATOR
BLADES          ACE_STEEL           205      SAW
PLASTIC         PLASTIK_CORP        30       ELAY

Notice three things about this example:

One part in the PARTS table (OIL) has a product number (160) that is not in the PRODUCTS table. One product (505, SCREWDRIVER) has no parts listed in the PARTS table. Neither OIL nor SCREWDRIVER appears in the result of the join.
Explicit syntax expresses that this join is an inner join. You can use INNER JOIN in the FROM clause instead of the comma. ON (rather than WHERE) specifies the join condition when you explicitly join tables in the FROM clause.
If you do not specify a WHERE clause in the first form of the query, the result table contains all possible combinations of rows for the tables that are identified in the FROM clause. You can obtain the same result by specifying a join condition that is always true in the second form of the query.

Example: Consider this query:

SELECT PART, SUPPLIER, PARTS.PROD#, PRODUCT
 FROM PARTS INNER JOIN PRODUCTS
 ON 1=1;

The number of rows in the result table is the product of the number of rows in each table:

PART    SUPPLIER        PROD#         PRODUCT
====    ============    =====         ===========
WIRE    ACWF            10            SCREWDRIVER
WIRE    ACWF            10            RELAY
WIRE    ACWF            10            SAW
WIRE    ACWF            10            GENERATOR
OIL     WESTERN_CHEM    160           SCREWDRIVER
OIL     WESTERN_CHEM    160           RELAY
OIL     WESTERN_CHEM    160           SAW
OIL     WESTERN_CHEM    160           GENERATOR
.
.
.

You can specify more complicated join conditions to obtain different sets of results.

Example: To eliminate the suppliers that begin with the letter A from the table of parts, suppliers, product numbers, and products, write a query like this:

SELECT PART, SUPPLIER, PARTS.PROD#, PRODUCT
 FROM PARTS INNER JOIN PRODUCTS
  ON PARTS.PROD# = PRODUCTS.PROD#
  AND SUPPLIER NOT LIKE 'A%';

The result of the query is all rows that do not have a supplier that begins with A:

PART          SUPPLIER            PROD#    PRODUCT
=======       ============        ====     =========
MAGNETS       BATEMAN             10       GENERATOR
PLASTIC       PLASTIK_CORP        30       RELAY

Example: This example joins the PROJ table (shown on page 123) to itself by using an inner join. The query returns the number and name of each major project, followed by the number and name of the project that is part of it:

SELECT A.PROJNO, A.PROJNAME, B.PROJNO, B.PROJNAME
 FROM PROJ A, PROJ B
 WHERE A.PROJNO = B.MAJPROJ;

In this example, A indicates the first instance of table PROJ, and B indicates a second instance of this table. The join condition is such that the value in column PROJNO in table PROJ A must be equal to a value in column MAJPROJ in table PROJ B.

The result table looks like this:

PROJNO  PROJNAME         PROJNO     PROJNAME
======  ===============  ======     ===========
IF2000  USER EDUCATION   MA2100     DOCUMENTATION
MA2100  DOCUMENTATION    MA2110     SYSTEM PROGRAMMING
OP2011  SYSTEMS SUPPORT  OP2012     APPLICATIONS SUPPORT

In this example, the comma in the FROM clause implicitly specifies an inner join, and it acts the same as if the INNER JOIN keywords had been used. When you use the comma for an inner join, you must specify the join condition in the WHERE clause. When you use the INNER JOIN keywords, you must specify the join condition in the ON clause.

Left outer join

When you specify the LEFT OUTER JOIN clause, the results include rows from the table that is specified before LEFT OUTER JOIN. (This is PARTS in the example that follows.) The results include only those rows that have no matching values in the table that is specified after the LEFT OUTER JOIN clause. (This is PRODUCTS in the example that follows.)

As in an inner join, the join condition of a left outer join can be any simple or compound search condition that does not contain a subquery reference. (You can read about subqueries in “Using subqueries” on page 160.)

Example: To include rows from the PARTS table that have no matching values in the PRODUCTS table and to include prices that exceed $10.00, run this query:

SELECT PART, SUPPLIER, PARTS.PROD#, PRODUCT, PRICE
 FROM PARTS LEFT OUTER JOIN PRODUCTS
 ON PARTS.PROD# = PRODUCTS.PROD#
 AND PRODUCTS.PRICE>10.00;

The result table looks like this:

PART     SUPPLIER           PROD#      PRODUCT     PRICE
=======  =================  =====      =========  ======
WIRE     ACWF               10         GENERATOR   45.75
MAGNETS  BATEMAN            10         GENERATOR   45.75
OIL      WESTERN_CHEM       160        -----       -----
BLADES   ACE_STEEL          205        SAW         18.90
PLASTIC  PLASTIK_CORP       30         -----       -----

Because the PARTS table can have rows that are not matched by values in the joined columns and because the PRICE column is not in the PARTS table, rows in which the PRICE value does not exceed $10.00 are included in the result of the join but the PRICE value is set to null.

In this result table, the row for PROD# 160 has null values on the right two columns because PROD# 160 does not match another product number. PROD# 30 has null values on the right two columns because the price of PROD# 30 is less than $10.00.

Right outer join

When you specify the RIGHT OUTER JOIN clause, the results include rows from the table that is specified after RIGHT OUTER JOIN. (This is PRODUCTS in the example that follows.) The results include only those rows that have matching values in the table that is specified before the RIGHT OUTER JOIN clause. (This is PARTS in the example that follows.)

As in an inner join, the join condition of a right outer join can be any simple or compound search condition that does not contain a subquery reference.

Example: To include rows from the PRODUCTS table that have no matching values in the PARTS table and to include only prices that exceed $10.00, run this query:

SELECT PART, SUPPLIER, PRODUCTS.PROD#, PRODUCT, PRICE
 FROM PARTS RIGHT OUTER JOIN PRODUCTS
 ON PARTS.PROD# = PRODUCTS.PROD#
 WHERE PRODUCTS.PRICE>10.00;

The result table looks like this:

PART      SUPPLIER      PROD#   PRODUCT      PRICE
=======   ============  =====   ==========   =====
MAGNETS   BATEMAN       10      GENERATOR    45.75
WIRE      ACWF          10      GENERATOR    45.75
BLADES    ACE_STEEL     205     SAW          18.90

Because the PRODUCTS table cannot have rows that are not matched by values in the joined columns and because the PRICE column is in the PRODUCTS table, rows in which the PRICE value does not exceed $10.00 are not included in the result of the join.

Full outer join

The FULL OUTER JOIN clause results in the inclusion of rows from both tables. If a value is missing when rows are joined, that value is null in the result table.

The join condition for a full outer join must be a search condition that compares two columns. The predicates of the search condition can only be combined with AND. Each predicate must have the form 'expression = expression'.

Example: This query performs a full outer join of the PARTS and PRODUCTS tables:

SELECT PART, SUPPLIER, PARTS.PROD#, PRODUCT
 FROM PARTS FULL OUTER JOIN PRODUCTS
 ON PARTS.PROD# = PRODUCTS.PROD#;

The result table looks like this:

PART         SUPPLIER           PROD#         PRODUCT
========     ============       =====         ===========
WIRE         ACWF               10            GENERATOR
MAGNETS      BATEMAN            10            GENERATOR
OIL          WESTERN_CHEM       160           -----
BLADES       ACE_STEEL          205           SAW
PLASTIC      PLASTIK_CORP       30            RELAY
-----        -----              -----         SCREWDRIVER

Using COALESCE

This function can be particularly useful in full outer join operations because it returns the first nonnull value. For example, notice that the result in the example above is null for SCREWDRIVER, even though the PRODUCTS table contains a product number for SCREWDRIVER. If you select PRODUCTS.PROD# instead, PROD# is null for OIL. If you select both PRODUCTS.PROD# and PARTS.PROD#, the result contains two columns, and both columns contain some null values.

Example: You can merge data from both columns into a single column, eliminating the null values, by using the COALESCE function. Consider this query with the same PARTS and PRODUCTS tables:

SELECT PART, SUPPLIER,
 COALESCE(PARTS.PROD#, PRODUCTS.PROD#) AS
 PRODNUM,PRODUCT
 FROM PARTS FULL OUTER JOIN PRODUCTS
 ON PARTS.PROD# = PRODUCTS.PROD#;

This statement gives this result:

PART          SUPPLIER           PRODNUM      PRODUCT
=======       ============       =======      ===========
WIRE          ACWF               10           GENERATOR
MAGNETS       BATEMAN            10           GENERATOR
OIL           WESTERN_CHEM       160          -----
BLADES        ACE_STEEL          205          SAW
PLASTIC       PLASTIK_CORP       30           RELAY
-----         -----              505          SCREWDRIVER

The AS clause AS PRODNUM provides a name for the result of the COALESCE function.

Using subqueries

A subquery is a nested SQL statement, or subselect, that contains a SELECT statement within the WHERE or HAVING clause of another SQL statement. This section describes some basic uses for subqueries. You can also code more complex subqueries, such as correlated subqueries and subqueries with quantified predicates.

You can use a subquery when you need to narrow your search condition that is based on information in an interim table. For example, you might want to find all employee numbers in one table that also exist for a given project in a second table.

Example: Suppose that you want a list of the employee numbers, names, and commissions of all employees that work on a particular project, such as project number IF2000. The first part of the SELECT statement is easy to write:

SELECT EMPNO, LASTNAME, COMM
 FROM EMP
 WHERE EMPNO
 .
 .
 .

However, you cannot go further because the EMP table does not include project number data. You do not know which employees are working on project IF2000 without issuing another SELECT statement against the EMPPROJACT table.

You can use a subselect to solve this problem. The SELECT statement that surrounds the subquery is the outer SELECT.

Example: This query expands the SELECT statement that started in the previous example to include a subquery:

SELECT EMPNO, LASTNAME, COMM
 FROM EMP
 WHERE EMPNO IN
  (SELECT EMPNO
    FROM EMPPROJACT
    WHERE PROJNO = 'IF2000'),

To better understand what results from this SQL statement, imagine that DB2 goes through the following process:

1.	DB2 evaluates the subquery to obtain a list of EMPNO values: (SELECT EMPNO FROM EMPPROJACT WHERE PROJNO = 'IF2000'), The result is the following interim result table: EMPNO ====== 000140 000140 000030
2.	The interim result table then serves as a list in the search condition of the outer SELECT. Effectively, DB2 runs this SELECT statement: SELECT EMPNO, LASTNAME, COMM FROM EMP WHERE EMPNO IN ('000140', '000030'), The result table looks like this: EMPNO LASTNAME COMM ===== ======== ======= 000140 NICHOLLS 2274.00 000030 KWAN 3060.00

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.

Table of Contents for Writing SQL queries to answer questions: The basics

Create new playlist

Sign In

Sign Up

Writing SQL queries to answer questions: The basics

Example tables

Figure 5.6. Example DEPT table

Figure 5.7. Example EMP table

Figure 5.8. Example PROJ table

Figure 5.9. Example EMPPROJACT table

Selecting data from columns: SELECT

Selecting all columns: SELECT *

Selecting some columns: SELECT column-name

Selecting derived columns: SELECT expression

Eliminating duplicate rows: DISTINCT

Naming result columns: AS

Processing a SELECT statement

Accessing DB2 data that is not in a table

Using functions and expressions

Concatenating strings: CONCAT

Calculating values in a column or across columns

Calculating aggregate values: Aggregate functions

Returning a single value from a single value: Scalar functions

Nesting aggregate and scalar functions

Using user-defined functions

Using CASE expressions

Filtering the number of returned rows: WHERE

Selecting rows that have null values

Selecting rows by using equalities and inequalities

Testing for equality

Testing for inequalities

Testing for equality or inequality in a set of columns

Using the NOT keyword with comparison operators

Selecting values similar to a character string

Selecting values similar to a string of unknown characters

Selecting a value similar to a single unknown character

Selecting rows that meet more than one condition

Using parentheses with AND and OR

Using NOT with AND and OR

Using BETWEEN to specify ranges to select

Using IN to specify values in a list

Putting the rows in order: ORDER BY

Specifying the sort key

Listing rows in ascending order

Listing rows in descending order

Ordering with more than one column as the sort key

Ordering with an expression as the sort key

Summarizing group values: GROUP BY

Subjecting groups to conditions: HAVING

Merging lists of values: UNION

Eliminating duplicates

Keeping duplicates

Joining data from more than one table

Example tables

Figure 5.10. Example PARTS table

Figure 5.11. Example PRODUCTS table

Overview of joins

Figure 5.12. Outer joins of two tables. Each join is on column PROD#.

Inner join

Left outer join

Right outer join

Full outer join

Using COALESCE

Using subqueries

Table of Contents for
Writing SQL queries to answer questions: The basics