We begin this section by covering the basics
of using the SELECT
statement. We then introduce
the WHERE
clause for selecting data that matches a
condition. The section concludes with an introduction to the more
advanced features of SELECT
statements.
The SELECT
statement is used to query a database
and for all output operations in SQL. Consider an example query:
SELECT surname, firstname FROM customer;
This outputs the values of the attributes surname
and firstname
from all rows, or
records,
in the customer table. Assuming we previously
inserted four rows when we created the winestore
database, the output from the MySQL command interpreter is:
+-----------+-----------+ | surname | firstname | +-----------+-----------+ | Marzalla | Dimitria | | LaTrobe | Anthony | | Fong | Nicholas | | Stribling | James | +-----------+-----------+ 4 rows in set (0.04 sec)
Any attributes of a table may be listed in a
SELECT
statement by separating each with a comma.
If all attributes are required, the shortcut of an asterisk character
(*)
can be used. Consider the statement:
SELECT * FROM region;
This outputs all the data from the table region:
+-----------+---------------------+-------------+------+ | region_id | region_name | description | map | +-----------+---------------------+-------------+------+ | 1 | Goulburn Valley | NULL | NULL | | 2 | Rutherglen | NULL | NULL | | 3 | Coonawarra | NULL | NULL | | 4 | Upper Hunter Valley | NULL | NULL | +-----------+---------------------+-------------+------+ 4 rows in set (0.07 sec)
SELECT
statements can also output data that
isn’t from a database. Consider the following
example:
SELECT curtime( );
This example runs a function that displays the current time:
+-----------+ | curtime( ) | +-----------+ | 08:41:50 | +-----------+ 1 row in set (0.02 sec)
The SELECT
statement can even be used as a simple
calculator, using the mathematical functions described in the later
section Section 3.9:
SELECT log(100)*4*pi( );
This outputs:
+-----------------+ | log(100)*4*pi( ) | +-----------------+ | 57.870275 | +-----------------+ 1 row in set (0.19 sec)
A WHERE
clause is used
as part of most SELECT
queries; it limits
retrieval to those rows that match a condition.
Consider this grape-growing region table containing the details of nine regions:
SELECT * from region; +-----------+---------------------+-------------+------+ | region_id | region_name | description | map | +-----------+---------------------+-------------+------+ | 1 | Goulburn Valley | NULL | NULL | | 2 | Rutherglen | NULL | NULL | | 3 | Coonawarra | NULL | NULL | | 4 | Upper Hunter Valley | NULL | NULL | | 5 | Lower Hunter Valley | NULL | NULL | | 6 | Barossa Valley | NULL | NULL | | 7 | Riverland | NULL | NULL | | 8 | Margaret River | NULL | NULL | | 9 | Swan Valley | NULL | NULL | +-----------+---------------------+-------------+------+ 9 rows in set (0.00 sec)
It is possible to select only a few rows with a
SELECT
statement by adding a
WHERE
clause. For example, to show only the first
three regions, you can issue the following statement:
SELECT * FROM region WHERE region_id<=3;
This outputs all attributes for only the first three region rows:
+-----------+------------------+-------------+------+ | region_id | region_name | description | map | +-----------+------------------+-------------+------+ | 1 | Goulburn Valley | NULL | NULL | | 2 | Rutherglen | NULL | NULL | | 3 | Coonawarra | NULL | NULL | +-----------+------------------+-------------+------+ 3 rows in set (0.00 sec)
You can combine the attribute and row restrictions and select only
the region_name
and region_id
attributes for the first three regions:
SELECT region_id, region_name FROM region WHERE region_id <= 3; +-----------+------------------+ | region_id | region_name | +-----------+------------------+ | 1 | Goulburn Valley | | 2 | Rutherglen | | 3 | Coonawarra | +-----------+------------------+ 3 rows in set (0.00 sec)
More complex WHERE
clauses use the Boolean
operators AND
and OR
, as well
as the functions described later in Section 3.9. The Boolean operators
AND
and OR
have the same
function as the PHP &&
and
||
operators introduced in Chapter 2.
Consider an example query that uses the Boolean operators:
SELECT * FROM customer WHERE surname='Marzalla' AND firstname='Dimitria';
This retrieves rows that match both criteria, that is, those
customers with a surname
Marzalla and a
firstname
Dimitria.
Consider a more complex example:
SELECT cust_id FROM customer WHERE (surname='Marzalla' AND firstname LIKE 'M%') OR email='[email protected]';
This finds rows with either the surname
Marzalla
and a firstname
beginning with M, or customers
with the email address [email protected]. The OR
operator isn’t exclusive, so an answer can have an
email
of [email protected], a
surname
of Marzalla, and a
firstname
beginning with M. This query, when run
on the winestore database, returns:
+---------+ | cust_id | +---------+ | 440 | | 493 | +---------+ 2 rows in set (0.01 sec)
SELECT
queries are often sophisticated and a long
WHERE
clause may include many
AND
and OR
operators. More
complex examples of queries are shown in the later section Section 3.7.
The WHERE
clause is also a common component of
UPDATE
and
DELETE
statements, and we have shown simple
examples of using WHERE
with these earlier in this
chapter. Consider another example of an UPDATE
with a WHERE
clause:
UPDATE wine SET winery_id = 298 WHERE winery_id = 299;
In this case, for wines that are made by the winery with
winery_id=299
, the winery_id
is
changed to winery_id=298
.
The WHERE
clause can be used similarly in a
DELETE
. Consider an example:
DELETE FROM wine WHERE winery_id = 299;
This removes only selected rows based on a condition; here the wines
made by the winery with winery_id=299
are
deleted.
We will now discuss techniques to manage the order and grouping of the output.
The ORDER BY
clause sorts the data after the query
has been evaluated. Consider an example:
SELECT surname, firstname FROM customer WHERE title='Mr' AND city = 'Portsea' ORDER by surname;
This query finds all customers who have a title
Mr
and live in Portsea. It then presents the results sorted
alphabetically by ascending surname
:
+-----------+-----------+ | surname | firstname | +-----------+-----------+ | Dalion | Anthony | | Galti | Jim | | Keisling | Mark | | Leramonth | James | | Mellili | Derryn | | Mockridge | James | | Nancarral | Joshua | | Ritterman | James | +-----------+-----------+ 8 rows in set (0.01 sec)
Sorting can be on multiple attributes. For example:
SELECT surname, firstname, initial FROM customer WHERE zipcode='3001' OR zipcode='3000' ORDER BY surname, firstname, initial;
This presents a list of customers in areas with
zipcode='3000'
or
zipcode='3001'
, sorted first by ascending
surname
, then (for those customers with the same
surname) by firstname
, and (for those customers
with the same surname and first name), by initial
.
So, for example, the output may be:
+-----------+-----------+---------+ | surname | firstname | initial | +-----------+-----------+---------+ | Keisling | Belinda | C | | Leramonth | Hugh | D | | Leramonth | Joshua | H | | Leramonth | Joshua | R | | Young | Bob | A | +-----------+-----------+---------+ 5 rows in set (0.11 sec)
By default, the ORDER
BY
clause
sorts in ascending order, or ASC
. To sort in
reverse or descending order, DESC
can be used.
Consider an example:
SELECT * FROM customer WHERE city='Melbourne' ORDER BY surname DESC;
The GROUP
BY
clause is
different from ORDER
BY
because
it doesn’t sort the data for output. Instead, it
sorts the data early in the query process, for the purpose of
grouping or
aggregation
.
An example shows the difference:
SELECT city, COUNT(*) FROM customer GROUP BY city;
This query outputs a sorted list of cities and, for each city, the
COUNT
of the number of customers who live in that
city. The effect of COUNT(*)
is to count the
number of rows per group. In this example, it
doesn’t matter what is counted;
COUNT(surname)
has exactly the same result.
Here are the first few lines output by the query:
+--------------+----------+ | city | COUNT(*) | +--------------+----------+ | Alexandra | 14 | | Armidale | 7 | | Athlone | 9 | | Bauple | 6 | | Belmont | 11 | | Bentley | 10 | | Berala | 9 | | Broadmeadows | 11 |
The query aggregates or groups all the rows for each city into sets,
and the COUNT(*)
operation counts the number in
each set. So, for example, there are 14 customers who live in
Alexandra.
The GROUP
BY
clause can find
different properties of the aggregated rows. Here’s
an example:
SELECT city, MAX(salary) FROM customer GROUP BY city;
This query first groups the rows by city and then shows the maximum salary in each city. The first few rows of the output are as follows:
+-----------+-------------+ | city | MAX(salary) | +-----------+-------------+ | Alexandra | 109000 | | Armidale | 75000 | | Athlone | 84000 | | Bauple | 33000 |
The GROUP
BY
clause should be
used only when the query is designed to find a characteristic of a
group of rows, not the details of individual rows.
There are several functions that can be used in aggregation with the
GROUP
BY
clause. Five
particularly useful functions are:
AVG( )
Finds the average value of a numeric attribute in a set
MIN( )
Finds a minimum value of a string or numeric attribute in a set
MAX( )
Finds a maximum value of a string or numeric attribute in a set
SUM( )
Finds the sum total of a numeric attribute
COUNT( )
Counts the number of rows in a set
The SQL standard places a constraint on the GROUP
BY
clause that MySQL doesn’t
enforce. In the standard, all attributes that are selected (i.e.,
appear after the SELECT
statement) must appear in
the GROUP
BY
clause. Most
examples in this chapter don’t meet this unnecessary
constraint of the SQL standard.
The HAVING
clause permits conditional aggregation
of data into groups. For example, consider the following query:
SELECT city, count(*), max(salary) FROM customer GROUP BY city HAVING count(*) > 10;
The query groups rows by city
, but only for cities
that have more than 10 resident customers. For those groups, the
city
, count( )
of customers,
and maximum salary
of a customer in that city is
output. Cities with less than 10 customers are omitted from the
result set. The first few rows of the output are as follows:
+--------------+----------+-------------+ | city | count(*) | max(salary) | +--------------+----------+-------------+ | Alexandra | 14 | 109000 | | Belmont | 11 | 71000 | | Broadmeadows | 11 | 51000 | | Doveton | 13 | 77000 | | Eleker | 11 | 97000 | | Gray | 12 | 77000 |
The HAVING
clause must contain an attribute or
expression from the SELECT
clause.
The DISTINCT
operator presents only one example of
each row from a query. Consider an example:
SELECT DISTINCT surname FROM customer;
This shows one example of each different customer surname in the customer table. This example has exactly the same effect as:
SELECT surname FROM customer GROUP BY surname;
The DISTINCT
clause is usually slow to run, much
like the GROUP
BY
clause. We
discuss how indexes and query optimization can speed queries later in
this chapter.
An additional operator is available in MySQL that limits the size of the result sets. For example, the following query returns only the first five rows from the wine table:
SELECT * FROM wine LIMIT 5;
This saves query evaluation time and reduces the size of the result
set that must be buffered by the DBMS. The LIMIT
operator is MySQL-specific.
18.116.67.22