Using Aggregate Functions

It is often necessary to summarize data without actually retrieving it all, and SQL provides special functions for this purpose. Using these functions, SQL queries are often used to retrieve data for analysis and reporting purposes. Examples of this type of retrieval are

  • Determining the number of rows in a table (or the number of rows that meet some condition or contain a specific value).

  • Obtaining the sum of a set of rows in a table.

  • Finding the highest, lowest, and average values in a table column (either for all rows or for specific rows).

In each of these examples, you want a summary of the data in a table, not the actual data itself. Therefore, returning the actual table data would be a waste of time and processing resources (not to mention bandwidth). To repeat, all you really want is the summary information.

To facilitate this type of retrieval, SQL features a set of five aggregate functions, which are listed in Table 9.1. These functions enable you to perform all the types of retrieval just enumerated. You'll be relieved to know that unlike the data manipulation functions in the last lesson, SQL's aggregate functions are supported pretty consistently by the major SQL implementations.

Note

Aggregate Functions Functions that operate on a set of rows to calculate and return a single value.


Table 9.1. SQL Aggregate Functions
FunctionDescription
AVG()Returns a column's average value
COUNT()Returns the number of rows in a column
MAX()Returns a column's highest value
MIN()Returns a column's lowest value
SUM()Returns the sum of a column's values

The use of each of these functions is explained in the following sections.

The AVG() Function

AVG() is used to return the average value of a specific column by counting both the number of rows in the table and the sum of their values. AVG() can be used to return the average value of all columns or of specific columns or rows.

This first example uses AVG() to return the average price of all the products in the Products table:

SELECT AVG(prod_price) AS avg_price
FROM Products;

avg_price
-----------
6.0614

The SELECT statement above returns a single value, avg_price that contains the average price of all products in the Products table. avg_price is an alias as explained in Lesson 7, "Creating Calculated Fields."

AVG() can also be used to determine the average value of specific columns or rows. The following example returns the average price of products offered by a specific vendor:

SELECT AVG(prod_price) AS avg_price
FROM Products
WHERE vend_id = 'dll01';

avg_price
-----------
3.8650

This SELECT statement differs from the previous one only in that this one contains a WHERE clause. The WHERE clause filters only products with a vendor_id of DLL01, and, therefore, the value returned in avg_price is the average of just that vendor's products.

Caution

Individual Columns Only AVG() may only be used to determine the average of a specific numeric column, and that column name must be specified as the function parameter. To obtain the average value of multiple columns, multiple AVG() functions must be used.


Tip

NULL Values Columns containing NULL values are ignored by the AVG() function.


The COUNT() Function

COUNT() does just that: It counts. Using COUNT(), you can determine the number of rows in a table or the number of rows that match a specific criteria.

COUNT() can be used two ways:

  • Use COUNT(*) to count the number of rows in a table, whether columns contain values or NULL values.

  • Use COUNT(column) to count the number of rows that have values in a specific column, ignoring NULL values.

This first example returns the total number of customers in the Customers table:

SELECT COUNT(*) AS num_cust
FROM Customers;

num_cust
--------
5

In this example, COUNT(*) is used to count all rows, regardless of values. The count is returned in num_cust.

The following example counts just the customers with an email address:

SELECT COUNT(cust_email) AS num_cust
FROM Customers;

num_cust
--------
3

This SELECT statement uses COUNT(cust_email) to count only rows with a value in the cust_email column. In this example, cust_email is 3 (meaning that only 3 of the 5 customers have email addresses).

Tip

NULL Values Columns with NULL values in them are ignored by the COUNT() function if a column name is specified, but not if the asterisk (*) is used.


The MAX() Functions

MAX() returns the highest value in a specified column. MAX() requires that the column name be specified, as seen here:

SELECT MAX(prod_price) AS max_price
FROM Products;

max_price
----------
11.9900

Here MAX() returns the price of the most expensive item in Products table.

Tip

Using MAX() With Non-Numeric Data Although MAX() is usually used to find the highest numeric or date values, many (but not all) DBMSs allow it to be used to return the highest value in any columns including textual columns. When used with textual data, MAX() returns the row that would be the last if the data were sorted by that column.


Tip

NULL Values Columns with NULL values in them are ignored by the MAX() function.


The MIN() Function

MIN() does the exact opposite of MAX();, it returns the lowest value in a specified column. Like MAX(), MIN() requires that the column name be specified, as seen here:

SELECT MIN(prod_price) AS min_price
FROM Products;

min_price
----------
3.4900

Here MIN() returns the price of the least expensive item in Products table.

Tip

Using MIN() With Non Numeric Data Although MIN() is usually used to find the lowest numeric or date values, many (but not all) DBMSs allow it to be used to return the lowest value in any columns including textual columns. When used with textual data, MIN() will return the row that would be first if the data were sorted by that column.


Tip

NULL Values Columns with NULL values in them are ignored by the MIN() function.


The SUM() Function

SUM() is used to return the sum (total) of the values in a specific column.

Here is an example to demonstrate this. The OrderItems table contains the actual items in an order. Each item has item_price and a quantity, so the expanded price for each item is item_price*quantity (the item price multiplied by quantity orders). The total order cost is the sum of all the expanded prices in that order. Look at the following statement:

SELECT SUM(item_price*quantity) AS total_price
FROM OrderItems
WHERE order_num = 20005;

total_price
----------
1648.0000

The function SUM(item_price*quantity) returns the sum of all the expanded prices in an order, and the WHERE clause ensures that just the right order items are included.

Tip

Performing Calculations on Multiple Columns All the aggregate functions can be used to perform calculations on multiple columns using the standard mathematical operators, as shown in the example.


Tip

NULL Values Columns with NULL values in them are ignored by the SUM() function.


..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
3.145.56.28