What is actually the sequence in which the rows in the result of a SELECT
statement are presented? If the SELECT
statement has no ORDER BY
clause, the sequence is unpredictable. When working through the examples or exercises, you might have found once or twice that the sequence of the rows in your result is different from the one in the book. The addition of an ORDER BY
clause at the end of a SELECT
statement is the only guarantee that the rows in the end result will be sorted in a certain way.
<order by clause> ::= ORDER BY <sorting> [ { , <sorting> }... ] <sorting> ::= <scalar expression> [ <sort direction> ] | <sequence number> [<sort direction> ] | <column heading> [ <sort direction> ] <sort direction> ::= ASC | DESC |
Sorting on one column is the simplest method. In this case, the sorting consists of one column specification. You are allowed to sort on each column specified in the SELECT
clause.
Example 12.1. Find the payment number and the player number of each penalty incurred; sort the result by player number.
SELECT PAYMENTNO, PLAYERNO FROM PENALTIES ORDER BY PLAYERNO
The result is:
PAYMENTNO PLAYERNO --------- -------- 1 6 6 8 3 27 8 27 5 44 2 44 7 44 4 104
Explanation: The rows are sorted based upon the values in the PLAYERNO column, with the lowest value first and the highest value last.
You are allowed to sort on more than one column. This could be relevant if the first column consists of duplicate values. For example, the PLAYERNO
column in the PENALTIES
table contains duplicate values. If you sort on one column only, SQL is allowed to determine itself how the rows with duplicate player numbers are sorted. When you add another column for sorting, you explicitly indicate how the duplicate values must be sorted.
Example 12.2. Find all player numbers and penalty amounts; sort the result on both columns.
SELECT PLAYERNO, AMOUNT FROM PENALTIES ORDER BY PLAYERNO, AMOUNT
The result is:
PLAYERNO AMOUNT -------- ------ 6 100.00 8 25.00 27 75.00 27 100.00 44 25.00 44 30.00 44 75.00 104 50.00
Explanation. The result shows that if the player numbers are equal, the penalty amount is used to sort. Two sort keys are needed to get the rows in the desired sequence.
In most cases, a sorting is specified on columns and expressions that also appear in the SELECT
clause. However, this is not a necessity. The ORDER BY
clause can contain expressions that do not appear in the SELECT
clause.
Example 12.3. Get all penalty amounts, and sort the result on player number and penalty amount.
SELECT AMOUNT FROM PENALTIES ORDER BY PLAYERNO, AMOUNT
The result is:
AMOUNT ------ 100.00 25.00 75.00 100.00 25.00 30.00 75.00 50.00
Explanation. When the previous result is compared to the result of Example 12.2, we can see that the rows are indeed sorted on player number, even though this column does not appear in the SELECT
clause.
Besides sorting on column names, a sorting can consist of scalar expressions.
Example 12.4. Get for all players the last name, the initials, and the player number, and sort the result on the first letter of the last name.
SELECT NAME, INITIALS, PLAYERNO FROM PLAYERS ORDER BY SUBSTR(NAME, 1, 1)
The result is:
NAME INITIALS PLAYERNO --------- -------- -------- Bishop D 39 Baker E 44 Brown M 57 Bailey IP 112 Collins DD 27 Collins C 28 Everett R 2 Hope PK 83 Miller P 95 Moorman D 104 Newcastle B 8 Parmenter R 6 Parmenter P 100 Wise GWS 7
Explanation: Because several names begin with the same letter, SQL can decide for itself the sequence in which the rows with equal letters are presented.
The expressions in the ORDER BY
clause can even contain subqueries.
Example 12.5. Get the player number and the amount of all penalties, and sort the result on the difference between the amount and the average penalty amount.
Explanation: The value of the subquery is calculated first. Next, the value of the scalar expression is calculated for each row individually, and the result is sorted on that.
Subqueries that are used in the ORDER BY
clause can even be correlated.
Example 12.6. Get the player number and the amount of all penalties, and sort the result on the average penalty amount of each player.
SELECT PLAYERNO, AMOUNT FROM PENALTIES AS P1 ORDER BY (SELECT AVG(AMOUNT) FROM PENALTIES AS P2 WHERE P1.PLAYERNO = P2.PLAYERNO)
The result is:
PLAYERNO AMOUNT -------- ------ 8 25.00 44 75.00 44 25.00 44 30.00 104 50.00 27 100.00 27 75.00 6 100.00
Explanation. The average penalty amount of player 8 is $25, so this amount comes first, followed by the penalties of player 44 because his average penalty amount is $43.33. The average of player 104 is $50, that of player 27 is $87.50 and, finally, the average penalty amount of player 6 is $100.
In the ORDER BY
clause, we can replace a sorting consisting of column names or expressions with sequence numbers. A sequence number assigns a number to the expression in the SELECT
clause on which sorting is performed. This next two statements are, therefore, equivalent:
SELECT PAYMENTNO, PLAYERNO FROM PENALTIES ORDER BY PLAYERNO
and
SELECT PAYMENTNO, PLAYERNO FROM PENALTIES ORDER BY 2
The sequence number 2
stands for the second expression in the SELECT
clause. It is not essential to use sequence numbers, but this can simplify the formulation of a statement.
Example 12.7. For each player who has incurred at least one penalty, get the total penalty amount; sort the result on this total.
SELECT PLAYERNO, SUM(AMOUNT) FROM PENALTIES GROUP BY PLAYERNO ORDER BY 2
The result is:
PLAYERNO SUM(AMOUNT) -------- ----------- 8 25.00 104 50.00 6 100.00 44 130.00 27 175.00
Example 12.8. For each player, get the player number, the last name, and the sum of his penalties; sort the result on this sum.
SELECT PLAYERNO, NAME, (SELECT SUM(AMOUNT) FROM PENALTIES AS PEN WHERE PEN.PLAYERNO=P.PLAYERNO) FROM PLAYERS AS P ORDER BY 3
The result is:
PLAYERNO NAME SELECT SUM -------- --------- ---------- 2 Everett ? 100 Parmenter ? 95 Miller ? 83 Hope ? 57 Brown ? 112 Bailey ? 39 Bishop ? 28 Collins ? 7 Wise ? 8 Newcastle 25.00 104 Moorman 50.00 6 Parmenter 100.00 44 Baker 130.00 27 Collins 175.00
Your question might be: Isn’t a sequence number a form of an expression as well? The answer is, no! In the ORDER BY
clause, the sequence number is not considered to be an expression consisting of one literal. A sequence number is regarded an exception here.
The previous problem can also be solved by using column headings. The specification of column headings was introduced in Section 5.4. Column headings can also be used to sort rows. The next statement is equivalent, then, to the previous one:
SELECT PLAYERNO, NAME, (SELECT SUM(AMOUNT) FROM PENALTIES AS PEN WHERE PEN.PLAYERNO=P.PLAYERNO) AS TOTAL FROM PLAYERS AS P ORDER BY TOTAL
If you do not specify anything after a sorting, SQL sorts the result in ascending order. The same result can be achieved by explicitly specifying ASC
(ascending) after the sorting. If you specify DESC
(descending), the rows in the result are presented in descending order. Sorting values in a descending order always returns the reverse presentation of sorting in an ascending order, regardless of the data type of the values.
Example 12.9. For each penalty, get the player number and the penalty amount; sort the result in descending order on player number and in ascending order on penalty amount.
Sorting numeric values in ascending order is obvious. It means that the lowest value is presented first and the highest is presented last. Sorting on dates, times, and timestamps is also obvious. An ascending sort of dates means that dates are presented in chronological order. The same applies to time and timestamp values.
Sorting alphanumeric values in ascending order is the same as alphabetical sorting of words (such as in a dictionary). First come the words beginning with the letter A, then those with the letter B, and so on. Sorting alphanumeric values is, nevertheless, not as simple as it seems. For example, does the lowercase letter a come before or after the uppercase A, and do digits come before or after letters? And what do we do with symbols such as ë, é, and è? And let us not forget ç, œ, b, and æ? How letters and digits are sorted depends on the character set with which you work. In a character set, an internal value is defined for each character. Well-known character sets are ASCII (American Standard Code for Information Interchange), EBCDIC (Extended Binary Coded Decimal Interchange Code), and Unicode. A given operating system usually works with a specific character set. Modern versions of Windows, for example, use the Unicode character set, while the classic IBM mainframes support the EBCDIC character set. The sequence also depends on the so-called collating sequences. In Chapter 17, “Character Sets and Collating Sequences,” we discuss character sets and collating sequences in detail.
In this book, we assume that you work with the Unicode character set. Under Windows, it is simple to examine the Unicode character set with the program Character Map, which is one of the accessories of Windows; see Figure 12.1. This figure shows that all uppercase letters come before the lowercase letters, and that digits come before uppercase letters.
Example 12.10. Create the following PEOPLE
table, add the six rows, and see how the different values are sorted.
CREATE TABLE CODES (CODE CHAR(4) NOT NULL) INSERT INTO CODES VALUES ('abc') INSERT INTO CODES VALUES ('ABC') INSERT INTO CODES VALUES ('-abc') INSERT INTO CODES VALUES ('a bc') INSERT INTO CODES VALUES ('ab') INSERT INTO CODES VALUES ('9abc')
The SELECT
statement:
SELECT * FROM CODES ORDER BY CODE
The result is:
CODE ---- -abc 9abc a bc ab abc ABC
Explanation. This result clearly shows that digits come before letters, that the hyphen comes before the digits, and that short values are placed before long values. We can also see that uppercase letters come after lowercase letters.
NULL
values introduce a problem with sorting, and the various SQL products handle the ordering of NULL
values in different ways. You should consult the relevant SQL manuals for more details. Four options exist:
NULL
values are always presented first, regardless of whether the ordering is ascending or descending.
NULL
values are always presented last, regardless of whether the ordering is ascending or descending.
NULL
values are seen as the lowest values.
NULL
values are seen as the highest values.
MySQL treats NULL
values as the lowest values in a column. Therefore, they are always placed at the bottom of the result if the order is descending and at the top if the order is ascending; see the following example and the accompanying result.
Example 12.11. Get the different league numbers, and sort the result in descending order.
SELECT DISTINCT LEAGUENO FROM PLAYERS ORDER BY 1 DESC
The result is:
LEAGUENO -------- 8467 7060 6524 6409 2983 2513 2411 1608 1319 1124 ?
3.128.78.30