Hour 9. Summarizing Data Results from a Query


What You’ll Learn in This Hour:

What functions are

How functions are used

When to use functions

Using aggregate functions

Summarizing data with aggregate functions

Results from using functions


In this hour, you learn about SQL’s aggregate functions. You can perform a variety of useful tasks with aggregate functions, such as finding the highest sale total or counting the number of orders processed on a given day. The real power of aggregate functions is discussed in the next hour, when we tackle the GROUP BY clause.

What Are Aggregate Functions?

Functions are keywords in SQL used to manipulate values within columns for output purposes. A function is a command normally used in conjunction with a column name or expression that processes the incoming data to produce a result. SQL contains several types of functions. This hour covers aggregate functions. An aggregate function provides summarization information for an SQL statement, such as counts, totals, and averages.

The basic aggregate functions discussed in this hour are

COUNT

SUM

MAX

MIN

AVG

The following queries show the data used for most of this hour’s examples:

SELECT * FROM PRODUCTS_TBL;

PROD_ID    PROD_DESC                       COST
------------------------------------------------
11235      WITCH COSTUME                  29.99
222        PLASTIC PUMPKIN 18 INCH         7.75
13         FALSE PARAFFIN TEETH            1.1
90         LIGHTED LANTERNS               14.5
15         ASSORTED COSTUMES              10
9          CANDY CORN                      1.35
6          PUMPKIN CANDY                   1.45
87         PLASTIC SPIDERS                 1.05
119        ASSORTED MASKS                  4.95
1234       KEY CHAIN                       5.95
2345       OAK BOOKSHELF                  59.99

11 rows selected.

The following query lists the employee information from the EMPLOYEE_TBL table. Note that some of the employees do not have pager numbers assigned.

SELECT EMP_ID, LAST_NAME, FIRST_NAME, PAGER
FROM EMPLOYEE_TBL;

EMP_ID    LAST_NAM FIRST_NA PAGER
--------------------------------------
311549902 STEPHENS TINA
442346889 PLEW     LINDA
213764555 GLASS    BRANDON  3175709980
313782439 GLASS    JACOB    8887345678
220984332 WALLACE  MARIAH
443679012 SPURGEON TIFFANY

6 rows selected.

COUNT

You use the COUNT function to count rows, or to count the values in a column that are not NULL. When used within a query, the COUNT function returns a numeric value. You can also use the COUNT function with the DISTINCT command to count only the distinct values of a column. ALL (the opposite of DISTINCT) is the default; it is not necessary to include ALL in the syntax. Duplicate values are counted if DISTINCT is not specified. One other option with the COUNT function is to use it with an asterisk. COUNT(*) counts all the rows of a table, including duplicates, whether or not a column contains a NULL value.


By the Way: DISTINCT Can Only Be Used in Certain Circumstances

You cannot use the DISTINCT command with COUNT(*), only with COUNT(column_name).


The syntax for the COUNT function is as follows:

COUNT ( * | [ DISTINCT | ALL ] COLUMN NAME )

This example counts all employee IDs:

SELECT COUNT(EMPLOYEE_ID) FROM EMPLOYEE_PAY_TBL

This example counts only the distinct rows:

SELECT COUNT(DISTINCT SALARY) FROM EMPLOYEE_PAY_TBL

This example counts all rows for SALARY:

SELECT COUNT(ALL SALARY) FROM EMPLOYEE_PAY_TBL

This final example counts all rows of the EMPLOYEE_TBL table:

SELECT COUNT(*) FROM EMPLOYEE_TBL

COUNT(*) is used in the following example to get a count of all records in the EMPLOYEE_TBL table. There are six employees.

SELECT COUNT(*)
FROM EMPLOYEE_TBL;

COUNT(*)
---------
6


Watch Out!: COUNT(*) Is Different from Other Versions

COUNT(*) can produce a different result than the other COUNT variations. When the COUNT function is used with the asterisk, it counts every row in the returned result set, including duplicates and rows that contain NULL values, whereas COUNT(column_name) skips NULLs. This is an important distinction. If you need a count of a particular column that includes its NULLs, use a function such as COALESCE (or ISNULL in SQL Server) to replace the NULL values first.


COUNT(EMP_ID) is used in the next example to get a count of all the employee IDs that exist in the table. The returned count is the same as in the last query because all employees have an identification number.

SELECT COUNT(EMP_ID)
FROM EMPLOYEE_TBL;

COUNT(EMP_ID)
-------------
6

COUNT(PAGER) is used in the following example to get a count of all the employee records that have a pager number. Only two employees had pager numbers.

SELECT COUNT(PAGER)
FROM EMPLOYEE_TBL;

COUNT(PAGER)
------------
2
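The three COUNT forms above can be compared side by side. The following is a minimal sketch, assuming Python's built-in sqlite3 module with an in-memory SQLite database standing in for your SQL implementation. The rows mirror the EMPLOYEE_TBL listing shown earlier; the TEXT column types and the 'NO PAGER' placeholder are assumptions made only for illustration.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for your SQL implementation.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE EMPLOYEE_TBL "
    "(EMP_ID TEXT, LAST_NAME TEXT, FIRST_NAME TEXT, PAGER TEXT)"
)
# Rows copied from the EMPLOYEE_TBL listing; None is stored as NULL.
conn.executemany(
    "INSERT INTO EMPLOYEE_TBL VALUES (?, ?, ?, ?)",
    [
        ("311549902", "STEPHENS", "TINA", None),
        ("442346889", "PLEW", "LINDA", None),
        ("213764555", "GLASS", "BRANDON", "3175709980"),
        ("313782439", "GLASS", "JACOB", "8887345678"),
        ("220984332", "WALLACE", "MARIAH", None),
        ("443679012", "SPURGEON", "TIFFANY", None),
    ],
)

all_rows, with_pager, nulls_replaced = conn.execute(
    """
    SELECT COUNT(*),                           -- every row
           COUNT(PAGER),                       -- only non-NULL pagers
           COUNT(COALESCE(PAGER, 'NO PAGER'))  -- NULLs replaced, so every row counts
    FROM EMPLOYEE_TBL
    """
).fetchone()

print(all_rows, with_pager, nulls_replaced)  # prints: 6 2 6
```

Replacing the NULLs inside the COUNT argument leaves the stored data untouched; only the values seen by the function change.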

The ORDERS_TBL table is shown next:

SELECT *
FROM ORDERS_TBL;

ORD_NUM    CUST_ID    PROD_ID           QTY ORD_DATE_
-----------------------------------------------------
56A901     232        11235               1 22-OCT-99
56A917     12         907               100 30-SEP-99
32A132     43         222                25 10-OCT-99
16C17      090        222                 2 17-OCT-99
18D778     287        90                 10 17-OCT-99
23E934     432        13                 20 15-OCT-99
90C461     560        1234                2

7 rows selected.

This example obtains a count of all distinct product identifications in the ORDERS_TBL table.

SELECT COUNT(DISTINCT PROD_ID )
FROM ORDERS_TBL;

COUNT(DISTINCT PROD_ID )
------------------------
                       6

The PROD_ID 222 has two entries in the table, thus reducing the distinct values from 7 to 6.
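The effect of DISTINCT on the count can be checked with a short sketch. It assumes Python's built-in sqlite3 module and an in-memory table that holds only the ORD_NUM and PROD_ID columns from the listing above; storing both as TEXT is an assumption.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for your SQL implementation.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ORDERS_TBL (ORD_NUM TEXT, PROD_ID TEXT)")
# Order numbers and product IDs copied from the ORDERS_TBL listing.
conn.executemany(
    "INSERT INTO ORDERS_TBL VALUES (?, ?)",
    [
        ("56A901", "11235"), ("56A917", "907"), ("32A132", "222"),
        ("16C17", "222"), ("18D778", "90"), ("23E934", "13"),
        ("90C461", "1234"),
    ],
)

# Without DISTINCT, ALL is implied and the duplicate 222 is counted twice.
count_all, count_distinct = conn.execute(
    "SELECT COUNT(PROD_ID), COUNT(DISTINCT PROD_ID) FROM ORDERS_TBL"
).fetchone()
print(count_all, count_distinct)  # prints: 7 6

# Listing the distinct values shows exactly what COUNT(DISTINCT ...) counted.
distinct_ids = [
    row[0] for row in conn.execute("SELECT DISTINCT PROD_ID FROM ORDERS_TBL")
]
print(sorted(distinct_ids))
```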


By the Way: Data Types Do Not COUNT

Because the COUNT function counts the rows, data types do not play a part. The rows can contain columns with any data type.


SUM

The SUM function returns a total on the values of a column for a group of rows. You can also use the SUM function in conjunction with DISTINCT.

When you use SUM with DISTINCT, only the distinct values are totaled, which rarely serves a purpose. The total is usually not the one you want, because duplicate values are left out.

The syntax for the SUM function is as follows:

SUM ([ DISTINCT ] COLUMN NAME)


Watch Out!: SUM Must Be Numeric

The value of an argument must be numeric to use the SUM function. You cannot use the SUM function on columns having a data type other than numeric, such as character or date.


This example totals the salaries:

SELECT SUM(SALARY) FROM EMPLOYEE_PAY_TBL

This example totals the distinct salaries:

SELECT SUM(DISTINCT SALARY) FROM EMPLOYEE_PAY_TBL

In the following query, the sum, or total amount, of all cost values is being retrieved from the PRODUCTS_TBL table:

SELECT SUM(COST)
FROM PRODUCTS_TBL;

SUM(COST)
-----------
163.07

Observe the way the DISTINCT command in the following example skews the previous results. This is why it is rarely useful:

SELECT SUM(DISTINCT COST)
FROM PRODUCTS_TBL;


SUM(COST)
----------
72.14
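To see why SUM with DISTINCT changes the total, it helps to use a table small enough to add up by hand. The sketch below assumes Python's built-in sqlite3 module; DEMO_COSTS and its three rows are hypothetical and are not part of this hour's sample database.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for your SQL implementation.
conn = sqlite3.connect(":memory:")
# Hypothetical table: two of the three items share the same cost.
conn.execute("CREATE TABLE DEMO_COSTS (ITEM TEXT, COST INTEGER)")
conn.executemany(
    "INSERT INTO DEMO_COSTS VALUES (?, ?)",
    [("MASK", 10), ("CAPE", 10), ("CANDY", 5)],
)

total, distinct_total = conn.execute(
    "SELECT SUM(COST), SUM(DISTINCT COST) FROM DEMO_COSTS"
).fetchone()

# SUM adds 10 + 10 + 5; SUM(DISTINCT ...) drops the repeated 10 and adds 10 + 5.
print(total, distinct_total)  # prints: 25 15
```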

The following query shows that the numeric requirement applies to the values being summed rather than to the declared data type of the column. The PAGER column of the EMPLOYEE_TBL table holds character data, but because every non-NULL value consists only of digits, an implementation that supports implicit conversion of CHAR data to a numeric type can still total it:

SELECT SUM(PAGER)
FROM EMPLOYEE_TBL;

SUM(PAGER)
-----------
12063055658

When the data cannot be implicitly converted to a numeric type, as with the LAST_NAME column, the result is not meaningful. The implementation used here returns 0; other implementations reject the query with an error.

SELECT SUM(LAST_NAME)
FROM EMPLOYEE_TBL;

SUM(LAST_NAME)
----------
0
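Implicit conversion differs between implementations, so it is worth trying on your own system. The following sketch assumes Python's built-in sqlite3 module; SQLite happens to convert digit-only strings and to treat other strings as 0, which may not match your database. The explicit CAST is offered as an alternative that states the intent rather than relying on the implicit behavior.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for your SQL implementation.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE EMPLOYEE_TBL (LAST_NAME TEXT, PAGER TEXT)")
# Last names and pager numbers copied from the EMPLOYEE_TBL listing.
conn.executemany(
    "INSERT INTO EMPLOYEE_TBL VALUES (?, ?)",
    [
        ("STEPHENS", None),
        ("PLEW", None),
        ("GLASS", "3175709980"),
        ("GLASS", "8887345678"),
        ("WALLACE", None),
        ("SPURGEON", None),
    ],
)

# Implicit conversion: digit-only strings are treated as numbers, NULLs are skipped.
pager_total = conn.execute("SELECT SUM(PAGER) FROM EMPLOYEE_TBL").fetchone()[0]

# Strings that do not look like numbers contribute 0 in SQLite.
name_total = conn.execute("SELECT SUM(LAST_NAME) FROM EMPLOYEE_TBL").fetchone()[0]

# Explicit conversion makes the intent visible in the query itself.
cast_total = conn.execute(
    "SELECT SUM(CAST(PAGER AS INTEGER)) FROM EMPLOYEE_TBL"
).fetchone()[0]

print(pager_total, name_total, cast_total)
```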

AVG

The AVG function finds the average value for a given group of rows. When used with the DISTINCT command, the AVG function returns the average of the distinct values. The syntax for the AVG function is as follows:

AVG ([ DISTINCT ] COLUMN NAME)


By the Way: AVG Must Be Numeric

The value of the argument must be numeric for the AVG function to work.


This example returns the average salary:

SELECT AVG(SALARY) FROM EMPLOYEE_PAY_TBL

This example returns the distinct average salary:

SELECT AVG(DISTINCT SALARY) FROM EMPLOYEE_PAY_TBL

The average value for all values in the PRODUCTS_TBL table’s COST column is being retrieved in the following example:

SELECT AVG(COST)
FROM PRODUCTS_TBL;

AVG(COST)
----------
13.5891667


Watch Out!: Sometimes Your Data Is Truncated

In some implementations, the results of your query might be truncated to the precision of the data type.


The next example uses two aggregate functions in the same query. Because some employees are paid hourly and others are on salary, you want to retrieve the average value for both PAY_RATE and SALARY.

SELECT AVG(PAY_RATE), AVG(SALARY)
FROM EMPLOYEE_PAY_TBL;

AVG(PAY_RATE)         AVG(SALARY)
-------------         ---------
13.5833333           30000
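Because hourly employees have no SALARY and salaried employees have no PAY_RATE, each average above is taken only over the rows that actually hold a value. The sketch below illustrates this with Python's built-in sqlite3 module; the four rows and the E1 to E4 identifiers are hypothetical and were chosen so the arithmetic is easy to follow.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for your SQL implementation.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE EMPLOYEE_PAY_TBL (EMP_ID TEXT, PAY_RATE REAL, SALARY REAL)"
)
# Hypothetical rows: two hourly employees and two salaried employees.
conn.executemany(
    "INSERT INTO EMPLOYEE_PAY_TBL VALUES (?, ?, ?)",
    [
        ("E1", 10.0, None),
        ("E2", 20.0, None),
        ("E3", None, 30000.0),
        ("E4", None, 40000.0),
    ],
)

avg_rate, avg_salary, salary_per_row = conn.execute(
    """
    SELECT AVG(PAY_RATE),          -- (10 + 20) / 2, NULLs skipped
           AVG(SALARY),            -- (30000 + 40000) / 2, NULLs skipped
           SUM(SALARY) / COUNT(*)  -- 70000 / 4, every row counted
    FROM EMPLOYEE_PAY_TBL
    """
).fetchone()

print(avg_rate, avg_salary, salary_per_row)  # prints: 15.0 35000.0 17500.0
```

The third column shows what the figure would be if the rows without a salary were counted as well, which is rarely what is meant by an average salary.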

MAX

The MAX function returns the maximum value from the values of a column in a group of rows. NULL values are ignored when using the MAX function. The DISTINCT command is an option. However, because the maximum value for all the rows is the same as the distinct maximum value, DISTINCT is useless.

The syntax for the MAX function is

MAX([ DISTINCT ] COLUMN NAME)

This example returns the highest salary:

SELECT MAX(SALARY) FROM EMPLOYEE_PAY_TBL

This example returns the highest distinct salary:

SELECT MAX(DISTINCT SALARY) FROM EMPLOYEE_PAY_TBL

The following example returns the maximum value for the COST column in the PRODUCTS_TBL table:

SELECT MAX(COST)
FROM PRODUCTS_TBL;


MAX(COST)
----------
29.99

SELECT MAX(DISTINCT COST)
FROM PRODUCTS_TBL;

MAX(COST)
----------
29.99

You can also use aggregate functions such as MAX and MIN on character data. For these values, the collation of your database comes into play again. Most commonly your database collation is set to a dictionary order, so the results are ranked according to that. For example, say we performed a MAX on the PROD_DESC column of the PRODUCTS_TBL table:

SELECT MAX(PROD_DESC)
FROM PRODUCTS_TBL;

MAX(PROD_DESC)
-------------------
WITCH COSTUME

In this instance, the function returned the largest value according to a dictionary ordering of the data in the column.

MIN

The MIN function returns the minimum value of a column for a group of rows. NULL values are ignored when using the MIN function. The DISTINCT command is an option. However, because the minimum value for all rows is the same as the minimum value for distinct rows, DISTINCT is useless.

The syntax for the MIN function is

MIN([ DISTINCT ] COLUMN NAME)

This example returns the lowest salary:

SELECT MIN(SALARY) FROM EMPLOYEE_PAY_TBL

This example returns the lowest distinct salary:

SELECT MIN(DISTINCT SALARY) FROM EMPLOYEE_PAY_TBL

The following example returns the minimum value for the COST column in the PRODUCTS_TBL table:

SELECT MIN(COST)
FROM PRODUCTS_TBL;


MIN(COST)
----------
      1.05

SELECT MIN(DISTINCT COST)
FROM PRODUCTS_TBL;

MIN(COST)
----------
1.05


By the Way: DISTINCT and Aggregate Functions Don’t Always Mix

One important thing to keep in mind when using aggregate functions with the DISTINCT command is that your query might not return the desired results. The purpose of aggregate functions is to return summarized data based on all rows of data in a table, and DISTINCT removes duplicate values before the function is applied.


As with the MAX function, the MIN function can work against character data and returns the minimum value according to the dictionary ordering of the data.

SELECT MIN(PROD_DESC)
FROM PRODUCTS_TBL;

MIN(PROD_DESC)
-------------------
ASSORTED COSTUMES
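Dictionary ordering has a consequence that is easy to overlook: numbers stored as character data are compared character by character, not by numeric value. The sketch below assumes Python's built-in sqlite3 module and assumes that PROD_ID is stored as character data; the values are copied from the PRODUCTS_TBL listing, and SQLite's default collation stands in for your database's.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for your SQL implementation.
conn = sqlite3.connect(":memory:")
# Assumption: PROD_ID is character data, so it is declared TEXT here.
conn.execute("CREATE TABLE PRODUCTS_TBL (PROD_ID TEXT, PROD_DESC TEXT)")
conn.executemany(
    "INSERT INTO PRODUCTS_TBL VALUES (?, ?)",
    [
        ("11235", "WITCH COSTUME"),
        ("222", "PLASTIC PUMPKIN 18 INCH"),
        ("13", "FALSE PARAFFIN TEETH"),
        ("90", "LIGHTED LANTERNS"),
        ("15", "ASSORTED COSTUMES"),
        ("9", "CANDY CORN"),
        ("6", "PUMPKIN CANDY"),
        ("87", "PLASTIC SPIDERS"),
        ("119", "ASSORTED MASKS"),
        ("1234", "KEY CHAIN"),
        ("2345", "OAK BOOKSHELF"),
    ],
)

min_desc, max_desc = conn.execute(
    "SELECT MIN(PROD_DESC), MAX(PROD_DESC) FROM PRODUCTS_TBL"
).fetchone()

# Compared as text, '11235' sorts before '6' because '1' comes before '6'.
min_id_text, max_id_text = conn.execute(
    "SELECT MIN(PROD_ID), MAX(PROD_ID) FROM PRODUCTS_TBL"
).fetchone()

# Casting to a number first restores the numeric ordering.
min_id_num, max_id_num = conn.execute(
    "SELECT MIN(CAST(PROD_ID AS INTEGER)), MAX(CAST(PROD_ID AS INTEGER)) "
    "FROM PRODUCTS_TBL"
).fetchone()

print(min_desc, "|", max_desc)   # prints: ASSORTED COSTUMES | WITCH COSTUME
print(min_id_text, max_id_text)  # prints: 11235 90
print(min_id_num, max_id_num)    # prints: 6 11235
```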

The final example combines aggregate functions with the use of arithmetic operators:

SELECT COUNT(ORD_NUM), SUM(QTY),
       SUM(QTY) / COUNT(ORD_NUM) AVG_QTY
FROM ORDERS_TBL;

COUNT(ORD_NUM)   SUM(QTY)    AVG_QTY
--------------   --------    ---------
7               160         22.857143

You have performed a count on all order numbers, figured the sum of all quantities ordered, and, by dividing the two figures, derived the average quantity of an item per order. You also created a column alias for the computation—AVG_QTY.
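One caution about dividing aggregates: some implementations perform integer division when both operands are integers, so the same expression can return 22 instead of 22.857143. SQLite is one such implementation. The sketch below assumes Python's built-in sqlite3 module and uses the order numbers and quantities from the ORDERS_TBL listing; multiplying by 1.0 is one common way to force a decimal result, and AVG is shown for comparison.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for your SQL implementation.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ORDERS_TBL (ORD_NUM TEXT, QTY INTEGER)")
# Order numbers and quantities copied from the ORDERS_TBL listing.
conn.executemany(
    "INSERT INTO ORDERS_TBL VALUES (?, ?)",
    [
        ("56A901", 1), ("56A917", 100), ("32A132", 25), ("16C17", 2),
        ("18D778", 10), ("23E934", 20), ("90C461", 2),
    ],
)

int_div, dec_div, avg_qty = conn.execute(
    """
    SELECT SUM(QTY) / COUNT(ORD_NUM),        -- integer / integer in SQLite
           SUM(QTY) * 1.0 / COUNT(ORD_NUM),  -- forced to a decimal result
           AVG(QTY)                          -- built-in average
    FROM ORDERS_TBL
    """
).fetchone()

print(int_div)            # prints: 22
print(round(dec_div, 6))  # prints: 22.857143
print(round(avg_qty, 6))  # prints: 22.857143
```

AVG(QTY) matches the manual calculation here only because no QTY value is NULL; with NULLs present, COUNT(ORD_NUM) and the number of values that AVG averages over could differ.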

Summary

Aggregate functions can be useful and are quite simple to use. You have learned how to count values in columns, count rows of data in a table, get the maximum and minimum values for a column, figure the sum of the values in a column, and figure the average value for values in a column. Remember that NULL values are not considered when using aggregate functions, except when using the COUNT function in the format COUNT(*).

Aggregate functions are the first functions in SQL that you have learned, but more follow. You can also apply aggregate functions to groups of values, which are discussed during the next hour. As you learn about other functions, you will see that the syntaxes of most functions are similar to one another and that their concepts of use are relatively easy to understand.

Q&A

Q. Why are NULL values ignored when using the MAX or MIN function?

A. A NULL value means that nothing is there.

Q. Why don’t data types matter when using the COUNT function?

A. The COUNT function only counts rows.

Workshop

The following workshop is composed of a series of quiz questions and practical exercises. The quiz questions are designed to test your overall understanding of the current material. The practical exercises are intended to afford you the opportunity to apply the concepts discussed during the current hour, as well as build upon the knowledge acquired in previous hours of study. Please take time to complete the quiz questions and exercises before continuing. Refer to Appendix C, “Answers to Quizzes and Exercises,” for answers.

Quiz

1. True or false: The AVG function returns an average of all rows from a SELECT column, including any NULL values.

2. True or false: The SUM function adds column totals.

3. True or false: The COUNT(*) function counts all rows in a table.

4. Will the following SELECT statements work? If not, what fixes the statements?

a. SELECT COUNT *
FROM EMPLOYEE_PAY_TBL;

b. SELECT COUNT(EMPLOYEE_ID), SALARY
FROM EMPLOYEE_PAY_TBL;

c. SELECT MIN(BONUS), MAX(SALARY)
FROM EMPLOYEE_PAY_TBL
WHERE SALARY > 20000;

d. SELECT COUNT(DISTINCT PROD_ID) FROM PRODUCTS_TBL;

e. SELECT AVG(LAST_NAME) FROM EMPLOYEE_TBL;

f. SELECT AVG(PAGER) FROM EMPLOYEE_TBL;

Exercises

1. Use EMPLOYEE_PAY_TBL to construct SQL statements to solve the following exercises:

A. What is the average salary?

B. What is the maximum bonus?

C. What are the total salaries?

D. What is the minimum pay rate?

E. How many rows are in the table?

2. Write a query to determine how many employees are in the company whose last names begin with a G.

3. Write a query to determine the total dollar amount for all the orders in the system. Rewrite the query to determine the total dollar amount if we set the price of each item as $10.00.

4. Write two sets of queries to find the first employee name and last employee name when they are listed in alphabetical order.

5. Write a query to perform an AVG function on the employee names. Does the statement work? Determine why it is that you got that result.
