It is important to understand that GROUP BY and ORDER BY are very different, even though they often accomplish the same thing. Table 10.1 summarizes the differences between them.
ORDER BY | GROUP BY |
---|---|
Sorts generated output | Groups rows. The output might not be in group order, however. |
Any columns (even columns not selected) or may be used | Only selected columns or expressions or may be used, and every selected column expression must be used. |
Never required | Required if using columns (or expressions) with aggregate functions. |
The first difference listed in Table 10.1 is extremely important. More often than not, you will find that data grouped using GROUP BY will indeed be output in group order. But that is not always the case, and it is not actually required by the SQL specifications. Furthermore, even if your particular DBMS does, in fact, always sort the data by the specified GROUP BY clause, you might actually want it sorted differently. Just because you group data one way (to obtain group specific aggregate values) does not mean that you want the output sorted that same way. You should always provide an explicit ORDER BY clause as well, even if it is identical to the GROUP BY clause.
Tip
Don't Forget ORDER BY As a rule, anytime you use a GROUP BY clause, you should also specify an ORDER BY clause. That is the only way to ensure that data will be sorted properly. Never rely on GROUP BY to sort your data.
To demonstrate the use of both GROUP BY and ORDER BY, let's look at an example. The following SELECT statement is similar to the ones seen previously. It retrieves the order number and number of items ordered for all orders containing three or more items:
SELECT order_num, COUNT(*) AS items FROM OrderItems GROUP BY order_num HAVING COUNT(*) >= 3;
order_num items ---------- ----- 20006 3 20007 5 20008 5 20009 3
To sort the output by number of items ordered, all you need to do is add an ORDER BY clause, as follows:
SELECT order_num, COUNT(*) AS items FROM OrderItems GROUP BY order_num HAVING COUNT(*) >= 3 ORDER BY items, order_num;
order_num items ---------- ----- 20006 3 20009 3 20007 5 20008 5
In this example, the GROUP BY clause is used to group the data by order number (the order_num column) so that the COUNT(*) function can return the number of items in each order. The HAVING clause filters the data so that only orders with three or more items are returned. Finally, the output is sorted using the ORDER BY clause.
3.138.105.255