Hour 14. Using Subqueries to Define Unknown Data


What You’ll Learn in This Hour:

What a subquery is

The justifications of using subqueries

Examples of subqueries in regular database queries

Using subqueries with data manipulation commands

Embedded subqueries


In this hour, you are introduced to the concept of subqueries. Using subqueries enables you to more easily preform complex queries.

What Is a Subquery?

A subquery, also known as a nested query, is a query embedded within the WHERE clause of another query to further restrict data returned by the query. A subquery returns data that is used in the main query as a condition to further restrict the data to be retrieved. Subqueries are employed with the SELECT, INSERT, UPDATE, and DELETE statements.

You can use a subquery in some cases in place of a join operation by indirectly linking data between the tables based on one or more conditions. When you have a subquery in a query, the subquery is resolved first, and then the main query is resolved according to the condition(s) resolved by the subquery. The results of the subquery process expressions in the WHERE clause of the main query. You can use the subquery either in the WHERE clause or the HAVING clause of the main query. You can use logical and relational operators, such as =, >, <, <>,!=, IN, NOT IN, AND, OR, and so on, within the subquery as well as to evaluate a subquery in the WHERE or HAVING clause.


By the Way: The Rules of Using Subqueries

The same rules that apply to standard queries also apply to subqueries. You can use join operations, functions, conversions, and other options within a subquery.



Did You Know?: Use Indentation for Neater Statement Syntax

Notice the use of indentation in our examples. The use of indentation is merely for readability. The neater your statements are, the easier it is to read and find syntax errors.


Subqueries must follow a few rules:

• Subqueries must be enclosed within parentheses.

• A subquery can have only one column in the SELECT clause, unless multiple columns are in the main query for the subquery to compare its selected columns.

• You cannot use an ORDER BY clause in a subquery, although the main query can use an ORDER BY clause. You can use the GROUP BY clause to perform the same function as the ORDER BY clause in a subquery.

• You can only use subqueries that return more than one row with multiple value operators, such as the IN operator.

• The SELECT list cannot include references to values that evaluate to a BLOB, ARRAY, CLOB, or NCLOB.

• You cannot immediately enclose a subquery in a SET function.

• You cannot use the BETWEEN operator with a subquery; however, you can use the BETWEEN operator within the subquery.

The basic syntax for a subquery is as follows:

SELECT COLUMN_NAME
FROM TABLE
WHERE COLUMN_NAME = (SELECT COLUMN_NAME
                     FROM TABLE
                     WHERE CONDITIONS);

The following examples show how you can and cannot use the BETWEEN operator with a subquery. Here is an example of a correct use of BETWEEN in the subquery:

SELECT COLUMN_NAME
FROM TABLE_A

WHERE COLUMN_NAME OPERATOR (SELECT COLUMN_NAME
                            FROM TABLE_B)
                            WHERE VALUE BETWEEN VALUE)

You cannot use BETWEEN as an operator outside the subquery. The following is an example of an illegal use of BETWEEN with a subquery:

SELECT COLUMN_NAME
FROM TABLE_A
WHERE COLUMN_NAME BETWEEN VALUE AND (SELECT COLUMN_NAME
                                     FROM TABLE_B)

Subqueries with the SELECT Statement

Subqueries are most frequently used with the SELECT statement, although you can use them within a data manipulation statement as well. The subquery, when employed with the SELECT statement, retrieves data for the main query to use.

The basic syntax is as follows:

SELECT COLUMN_NAME [, COLUMN_NAME ]
FROM TABLE1 [, TABLE2 ]
WHERE COLUMN_NAME OPERATOR
                  (SELECT COLUMN_NAME [, COLUMN_NAME ]
                   FROM TABLE1 [, TABLE2 ]
                  [ WHERE ])

The following is an example:

SELECT E.EMP_ID, E.LAST_NAME, E.FIRST_NAME, EP.PAY_RATE
FROM EMPLOYEE_TBL E, EMPLOYEE_PAY_TBL EP
WHERE E.EMP_ID = EP.EMP_ID
AND EP.PAY_RATE < (SELECT PAY_RATE
                   FROM EMPLOYEE_PAY_TBL
                   WHERE EMP_ID = '443679012'),

The preceding SQL statement returns the employee identification, last name, first name, and pay rate for all employees who have a pay rate greater than that of the employee with the identification 443679012. In this case, you do not necessarily know (or care) what the exact pay rate is for this particular employee; you only care about the pay rate for the purpose of getting a list of employees who bring home more than the employee specified in the subquery.


Did You Know?: Using Subqueries for Unknown Values

Subqueries are frequently used to place conditions on a query when the exact conditions are unknown. The pay rate for 220984332 was unknown, but the subquery was designed to do the footwork for you.


The next query selects the pay rate for a particular employee. This query is used as the subquery in the following example.

SELECT PAY_RATE
FROM EMPLOYEE_PAY_TBL
WHERE EMP_ID = '220984332';

  PAY_RATE
----------
11

1 row selected.

The previous query is used as a subquery in the WHERE clause of the following query:

SELECT E.EMP_ID, E.LAST_NAME, E.FIRST_NAME, EP.PAY_RATE
FROM EMPLOYEE_TBL E, EMPLOYEE_PAY_TBL EP
WHERE E.EMP_ID = EP.EMP_ID
  AND EP.PAY_RATE > (SELECT PAY_RATE
                     FROM EMPLOYEE_PAY_TBL
                     WHERE EMP_ID = '220984332'),

EMP_ID         LAST_NAME             FIRST_NAME             PAY_RATE
---------      ---------             -----------            ---------
442346889      PLEW                  LINDA                  14.75
443679012      SPURGEON              TIFFANY                15

2 rows selected.

The result of the subquery is 11 (shown in the last example), so the last condition of the WHERE clause is evaluated as

AND EP.PAY_RATE > 11

You did not know the value of the pay rate for the given individual when you executed the query. However, the main query was able to compare each individual’s pay rate to the subquery results.

Subqueries with the INSERT Statement


By the Way: Always Remember to COMMIT Your DML

Remember to use the COMMIT and ROLLBACK commands when using DML commands such as the INSERT statement.


You can also use subqueries in conjunction with Data Manipulation Language (DML) statements. The INSERT statement is the first instance you examine. It uses the data returned from the subquery to insert into another table. You can modify the selected data in the subquery with any of the character, date, or number functions.

The basic syntax is as follows:

INSERT INTO TABLE_NAME [ (COLUMN1 [, COLUMN2 ]) ]
SELECT [ *|COLUMN1 [, COLUMN2 ]
FROM TABLE1 [, TABLE2 ]
[ WHERE VALUE OPERATOR ]

The following is an example of the INSERT statement with a subquery:

INSERT INTO RICH_EMPLOYEES
SELECT E.EMP_ID, E.LAST_NAME, E.FIRST_NAME, EP.PAY_RATE
FROM EMPLOYEE_TBL E, EMPLOYEE_PAY_TBL EP
WHERE E.EMP_ID = EP.EMP_ID
  AND EP.PAY_RATE > (SELECT PAY_RATE
                     FROM EMPLOYEE_PAY_TBL
                     WHERE EMP_ID = '220984332'),

2 rows created.

This INSERT statement inserts the EMP_ID, LAST_NAME, FIRST_NAME, and PAY_RATE into a table called RICH_EMPLOYEES for all records of employees who have a pay rate greater than the pay rate of the employee with identification 220984332.

Subqueries with the UPDATE Statement

You can use subqueries in conjunction with the UPDATE statement to update single or multiple columns in a table. The basic syntax is as follows:

UPDATE TABLE
SET COLUMN_NAME [, COLUMN_NAME) ] =
    (SELECT ]COLUMN_NAME [, COLUMN_NAME) ]
    FROM TABLE
    [ WHERE ]

Examples showing the use of the UPDATE statement with a subquery follow. The first query returns the employee identification of all employees who reside in Indianapolis. You can see that four individuals meet this criterion.

SELECT EMP_ID
FROM EMPLOYEE_TBL
WHERE CITY = 'INDIANAPOLIS';

EMP_ID
---------
442346889
313782439
220984332

443679012

4 rows selected.

The first query is used as the subquery in the following statement; it proves how many employee identifications are returned by the subquery. The following is the UPDATE with the subquery:

UPDATE EMPLOYEE_PAY_TBL
SET PAY_RATE = PAY_RATE * 1.1
WHERE EMP_ID IN (SELECT EMP_ID
                 FROM EMPLOYEE_TBL
                 WHERE CITY = 'INDIANAPOLIS'),

4 rows updated.

As expected, four rows are updated. One important thing to notice is that, unlike the example in the first section, this subquery returns multiple rows of data. Because you expect multiple rows to be returned, you use the IN operator instead of the equal sign. Remember that IN compares an expression to values in a list. If you had used the equal sign, an error would have been returned.

Subqueries with the DELETE Statement

You can also use subqueries in conjunction with the DELETE statement. The basic syntax is as follows:

DELETE FROM TABLE_NAME
[ WHERE OPERATOR [ VALUE ]
                (SELECT COLUMN_NAME
                FROM TABLE_NAME)
                [ WHERE) ]

In the following example, you delete the BRANDON GLASS record from EMPLOYEE_PAY_TBL. You do not know Brandon’s employee identification number, but you can use a subquery to get his identification number from EMPLOYEE_TBL, which contains the FIRST_NAME and LAST_NAME columns.

DELETE FROM EMPLOYEE_PAY_TBL
WHERE EMP_ID = (SELECT EMP_ID
                FROM EMPLOYEE_TBL
                WHERE LAST_NAME = 'GLASS'
                  AND FIRST_NAME = 'BRANDON'),

1 row deleted.

Embedded Subqueries


By the Way: Check the Limits of Your System

You must check your particular implementation for limits on the number of subqueries, if any, that you can use in a single statement. It might differ between vendors.


You can embed a subquery within another subquery, just as you can embed the subquery within a regular query. When a subquery is used, that subquery is resolved before the main query. Likewise, the lowest level subquery is resolved first in embedded or nested subqueries, working out to the main query.

The basic syntax for embedded subqueries is as follows:

SELECT COLUMN_NAME [, COLUMN_NAME ]
FROM TABLE1 [, TABLE2 ]
WHERE COLUMN_NAME OPERATOR (SELECT COLUMN_NAME
                            FROM TABLE
                            WHERE COLUMN_NAME OPERATOR
                                    (SELECT COLUMN_NAME
                                    FROM TABLE
                                    [ WHERE COLUMN_NAME OPERATOR VALUE ]))

The following example uses two subqueries, one embedded within the other. You want to find out what customers have placed orders in which the quantity multiplied by the cost of a single order is greater than the sum of the cost of all products.

SELECT CUST_ID, CUST_NAME
FROM CUSTOMER_TBL
WHERE CUST_ID IN (SELECT O.CUST_ID
                  FROM ORDERS_TBL O, PRODUCTS_TBL P
                  WHERE O.PROD_ID = P.PROD_ID
                    AND O.QTY + P.COST < (SELECT SUM(COST)
                                          FROM
                                              PRODUCTS_TBL));

CUST_ID        CUST_NAME
----------     ------------------
090            WENDY WOLF
232            LESLIE GLEASON
287            GAVINS PLACE
43             SCHYLERS NOVELTIES
432            SCOTTYS MARKET
560            ANDYS CANDIES

6 rows selected.


Watch Out!: Always Use a WHERE Clause

Do not forget the use of the WHERE clause with the UPDATE and DELETE statements. All rows are updated or deleted from the target table if the WHERE clause is not used. You can utilize a SELECT statement with the WHERE clause first to ensure that you are modifying the correct rows. See Hour 5, “Manipulating Data.”


Six rows that meet the criteria of both subqueries were selected.

The following two examples show the results of each of the subqueries to aid your understanding of how the main query was resolved:

SELECT SUM(COST) FROM PRODUCTS_TBL;

SUM(COST)
----------
138.08

1 row selected.

SELECT O.CUST_ID
FROM ORDERS_TBL O, PRODUCTS_TBL P
WHERE O.PROD_ID = P.PROD_ID
  AND O.QTY + P.COST > 138.08;

CUST_ID
-------
43
287

2 rows selected.

In essence, the main query, after the substitution of the second subquery, is evaluated as shown in the following example:

SELECT CUST_ID, CUST_NAME
FROM CUSTOMER_TBL
WHERE CUST_ID IN (SELECT O.CUST_ID
                  FROM ORDERS_TBL O, PRODUCTS_TBL P
                  WHERE O.PROD_ID = P.PROD_ID
                    AND O.QTY + P.COST > 138.08);

The following shows how the main query is evaluated after the substitution of the first subquery:

SELECT CUST_ID, CUST_NAME
FROM CUSTOMER_TBL
WHERE CUST_ID IN (287,43);

The following is the final result:

CUST_ID        CUST_NAME
----------     ------------------

43             SCHYLERS NOVELTIES
287            GAVINS PLACE

2 rows selected.


Watch Out!: Multiple Subqueries Can Cause Problems

The use of multiple subqueries results in slower response time and might result in reduced accuracy of the results due to possible mistakes in the statement coding.


Correlated Subqueries

Correlated subqueries are common in many SQL implementations. The concept of correlated subqueries is discussed as an ANSI-standard SQL topic and is covered briefly in this hour. A correlated subquery is a subquery that is dependent upon information in the main query. This means that tables in a subquery can be related to tables in the main query.

In the following example, the table join between CUSTOMER_TBL and ORDERS_TBL in the subquery is dependent on the alias for CUSTOMER_TBL (C) in the main query. This query returns the name of all customers who have ordered more than 10 units of one or more items.

SELECT C.CUST_NAME
FROM CUSTOMER_TBL C
WHERE 10 < (SELECT SUM(O.QTY)
            FROM ORDERS_TBL O
            WHERE O.CUST_ID = C.CUST_ID);

CUST_NAME
------------------

SCOTTYS MARKET
SCHYLERS NOVELTIES
MARYS GIFT SHOP

3 rows selected.

You can extract and slightly modify the subquery from the previous statement in the next statement to show you the total quantity of units ordered for each customer, allowing the previous results to be verified:

SELECT C.CUST_NAME, SUM(O.QTY)
FROM CUSTOMER_TBL C,
     ORDERS_TBL O
WHERE C.CUST_ID = O.CUST_ID
GROUP BY C.CUST_NAME;


CUST_NAME                         SUM(O.QTY)
-----------------------           ----------
ANDYS CANDIES                     1
GAVINS PLACE                      10
LESLIE GLEASON                    1
MARYS GIFT SHOP                   100
SCHYLERS NOVELTIES                25
SCOTTYS MARKET                    20
WENDY WOLF                        2

7 rows selected.

The GROUP BY clause in this example is required because another column is being selected with the aggregate function SUM. This gives you a sum for each customer. In the original subquery, a GROUP BY clause is not required because SUM achieves a total for the entire query, which is run against the record for each customer.

Subquery Performance

Subqueries do have performance implications when used within a query. You must consider those implications prior to implementing them in a production environment. Consider that a subquery must be evaluated prior to the main part of the query, so the time that it takes to execute the subquery has a direct effect on the time it takes for the main query to execute. Let’s look at our previous example:


By the Way: Proper Use of Correlated Subqueries

In the case of a correlated subquery, you must reference the table in the main query before you can resolve the subquery.


SELECT CUST_ID, CUST_NAME
FROM CUSTOMER_TBL
WHERE CUST_ID IN (SELECT O.CUST_ID
                  FROM ORDERS_TBL O, PRODUCTS_TBL P
                  WHERE O.PROD_ID = P.PROD_ID
                    AND O.QTY + P.COST < (SELECT SUM(COST)
                                          FROM
                                              PRODUCTS_TBL));

Imagine what would happen if PRODUCTS_TBL contained a couple thousand product lines and ORDERS_TBL contained a few million lines of customer orders. The resulting effect of having to do a SUM across PRODUCTS_TBL and then join it with ORDERS_TBL could slow the query down quite considerably. So always remember to evaluate the effect that using a subquery has on performance when deciding on a course of action to take for getting information out of the database.

Summary

By simple definition and general concept, a subquery is a query that is performed within another query to place further conditions on a query. You can use a subquery in an SQL statement’s WHERE clause or HAVING clause. Queries are typically used within other queries (Data Query Language), but you can also use them in the resolution of DML statements such as INSERT, UPDATE, and DELETE. All basic rules for DML apply when using subqueries with DML commands.

The subquery’s syntax is virtually the same as that of a standalone query, with a few minor restrictions. One of these restrictions is that you cannot use the ORDER BY clause within a subquery; you can use a GROUP BY clause, however, which renders virtually the same effect. Subqueries are used to place conditions that are not necessarily known for a query, providing more power and flexibility with SQL.

Q&A

Q. In the examples of subqueries, I noticed quite a bit of indentation. Is this necessary in the syntax of a subquery?

A. Absolutely not. The indentation is used merely to break the statement into separate parts, making the statement more readable and easier to follow.

Q. Is there a limit on the number of embedded subqueries that can be used in a single query?

A. Limitations such as the number of embedded subqueries allowed and the number of tables joined in a query are specific to each implementation. Some implementations might not have limits, although the use of too many embedded subqueries could drastically hinder SQL statement performance. Most limitations are affected by the actual hardware, CPU speed, and system memory available, although there are many other considerations.

Q. It seems that debugging a query with subqueries can prove to be confusing, especially with embedded subqueries. What is the best way to debug a query with subqueries?

A. The best way to debug a query with subqueries is to evaluate the query in sections. First evaluate the lowest-level subquery, and then work your way to the main query (the same way the database evaluates the query). When you evaluate each subquery individually, you can substitute the returned values for each subquery to check your main query’s logic. An error with a subquery often results from the use of the operator that evaluates the subquery, such as (=), IN, >, <, and so on.

Workshop

The following workshop is composed of a series of quiz questions and practical exercises. The quiz questions are designed to test your overall understanding of the current material. The practical exercises are intended to afford you the opportunity to apply the concepts discussed during the current hour, as well as build upon the knowledge acquired in previous hours of study. Please take time to complete the quiz questions and exercises before continuing. Refer to Appendix C, “Answers to Quizzes and Exercises,” for answers.

Quiz

1. What is the function of a subquery when used with a SELECT statement?

2. Can you update more than one column when using the UPDATE statement in conjunction with a subquery?

3. Do the following have the correct syntax? If not, what is the correct syntax?

a. SELECT CUST_ID, CUST_NAME
        FROM CUSTOMER_TBL
        WHERE CUST_ID =
                       (SELECT CUST_ID
                               FROM ORDERS_TBL
                               WHERE ORD_NUM = '16C17'),

b. SELECT EMP_ID, SALARY
       FROM EMPLOYEE_PAY_TBL
       WHERE SALARY BETWEEN '20000'
                    AND (SELECT SALARY
                         FROM EMPLOYEE_ID
                         WHERE SALARY = '40000'),

c. UPDATE PRODUCTS_TBL
   SET COST = 1.15
   WHERE CUST_ID =
                  (SELECT CUST_ID
                   FROM ORDERS_TBL
                   WHERE ORD_NUM = '32A132'),

4. What would happen if you ran the following statement?

DELETE FROM EMPLOYEE_TBL
WHERE EMP_ID IN
              (SELECT EMP_ID
              FROM EMPLOYEE_PAY_TBL);

Exercises

1. Write the SQL code for the requested subqueries, and compare your results to ours. Use the following tables to complete the exercises:

EMPLOYEE_TBL
EMP_ID        VARCHAR(9)     NOT NULL     primary key
LAST_NAME     VARCHAR(15)    NOT NULL
FIRST_NAME    VARCHAR(15)    NOT NULL
MIDDLE_NAME   VARCHAR(15)
ADDRESS       VARCHAR(30)    NOT NULL
CITY          VARCHAR(15)    NOT NULL
STATE         VARCHAR(2)     NOT NULL
ZIP           INTEGER(5)     NOT NULL
PHONE         VARCHAR(10)
PAGER         VARCHAR(10)

EMPLOYEE_PAY_TBL
EMP_ID            VARCHAR(9)    NOT NULL     primary key
POSITION          VARCHAR(15)   NOT NULL
DATE_HIRE         DATETIME
PAY_RATE          DECIMAL(4,2)  NOT NULL
DATE_LAST_RAISE   DATETIME
CONSTRAINT EMP_FK  FOREIGN KEY (EMP_ID_ REFERENCES
EMPLOYEE_TBL (EMP_ID)

CUSTOMER_TBL
CUST_ID       VARCHAR(10)    NOT NULL     primary key
CUST_NAME     VARCHAR(30)    NOT NULL
CUST_ADDRESS  VARCHAR(20)    NOT NULL
CUST_CITY     VARCHAR(15)    NOT NULL
CUST_STATE    VARCHAR(2)     NOT NULL
CUST_ZIP      INTEGER(5)     NOT NULL
CUST_PHONE    INTEGER(10)
CUST_FAX      INTEGER(10)


ORDERS_TBL
ORD_NUM       VARCHAR(10)    NOT NULL      primary key
CUST_ID       VARCHAR(10)    NOT NULL
PROD_ID       VARCHAR(10)    NOT NULL
QTY           INTEGER(6)     NOT NULL
ORD_DATE      DATETIME

PRODUCTS_TBL
PROD_ID       VARCHAR(10)    NOT NULL     primary key
PROD_DESC     VARCHAR(40)    NOT NULL
COST          DECIMAL(6,2)   NOT NULL

2. Using a subquery, write an SQL statement to update CUSTOMER_TBL. Find the customer with the order number 23E934, contained in the field ORD_NUM, and change the customer name to DAVIDS MARKET.

3. Using a subquery, write a query that returns the names of all employees who have a pay rate greater than JOHN DOE, whose employee identification number is 343559876.

4. Using a subquery, write a query that lists all products that cost more than the average cost of all products.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
3.144.29.148