GROUP BY looks straightforward, but it's caused its share of headaches for SQL users. Here, for example, is a query that seems reasonable but won't work on most systems:
SQL (variant)select pub_id, type, count(title) from titles group by pub_id
Since the table is divided into sets by publisher (group by pub_id) and there are three publishers, the query must return no more than three rows, and each SELECT list item must have a single value for the set. Unfortunately, there is more than one type per publisher, so most systems find the query impossible to answer. You could solve this problem by adding type to the GROUP BY clause, as shown in Groups Within Groups.
SQL VARIANTSOne system that handles this nonconforming GROUP BY query is Adaptive Server Enterprise. It returns results like the following, rather than an error: Adaptive Server Enterpriseselect pub_id, type, count(title) from titles group by pub_id pub_id type ------ ---- ----------- 1389 popular_comp 6 1389 business 6 0736 psychology 6 0736 psychology 6 1389 business 6 0877 mod_cook 6 0877 trad_cook 6 0877 trad_cook 6 1389 popular_comp 6 0736 business 6 0736 psychology 6 0736 psychology 6 0877 mod_cook 6 0877 trad_cook 6 0877 NULL 6 1389 business 6 0736 psychology 6 1389 popular_comp 6 (18 rows affected) In this case, ASE finds the number of books for each group (6 for 1389, 6 for 0877, 6 for 0736). It then combines these set results with row results for the whole table. The results erroneously imply that there are six books in each pub_id-type group or 18 * 6 (108) books. |
Another limitation in some SQL implementations concerns expressions. You're always safe with vanilla column names in the GROUP BY clause. Aggregate expressions, display labels, and SELECT list position numbers may or may not be allowed, depending on the SQL dialect you're using.
How about multiple summary values for multiple levels of grouping? Let's say you're grouping by pub_id and type. You want to see the total number of books for each publisher and the total number of books for each type of book the publisher carries. You might think something like the following query would do the trick, but the results show you're on the wrong track:
SQLselect pub_id, count(title_id), type, count(title_id) from titles group by pub_id, type pub_id count(titles.title_id) type count(titles.title_id) ====== ====================== ============ ====================== 1389 3 popular_comp 3 1389 3 business 3 0736 5 psychology 5 0877 2 mod_cook 2 0877 3 trad_cook 3 0736 1 business 1 0877 1 (NULL) 1 [7 rows]
The number of books is the same for publisher and for type, and the results don't make much sense in light of what you are trying to get. What you're seeing (in two columns) is the number of books for each publisher/type combination because that is the bottom level of the group. To get the results you want, you need to run two queries: the first one grouped by publisher (to give the publisher totals) and the second grouped by publisher and then by type within each publisher (to give the publisher/type totals).
SQLselect pub_id, count(title_id) from titles group by pub_id pub_id count(titles.title_id) ====== ====================== 1389 6 0736 6 0877 6 [3 rows]
Each publisher has six books. The next query finds the total number of books for each publisher/type combination.
SQLselect pub_id, type, count(title_id) from titles group by pub_id, type pub_id type count(titles.title_id) ====== ============ ====================== 1389 popular_comp 3 1389 business 3 0736 psychology 5 0877 mod_cook 2 0877 trad_cook 3 0736 business 1 0877 (NULL) 1 [7 rows]
If you group by type alone, the query has a different meaning, and you'll get different results:
SQLselect type, count(title_id) from titles group by type type count(titles.title_id) ============ ====================== popular_comp 3 business 4 psychology 5 mod_cook 2 trad_cook 3 (NULL) 1 [6 rows]
This last set of results shows the number of books of each type, regardless of publisher. There are four books in the business category. However, two publishers sell this type of book: One has a single business book, and the other has three. The query results don't show how many are in each publisher/type combination.
Because the need to see multiple levels of summary values is so pervasive, many database vendors provide a report generator of some type. However, the report generator usually is an application program rather than a part of SQL because report results with multiple levels of summary values cannot be represented as relational tables.
SQL VARIANTSTransact-SQL provides an extension that represents a SQL-based approach to the problem. Here's an example: Transact-SQLselect pub_id, type, title_id from titles where type like '%cook%' order by pub_id, type compute count(title_id) by pub_id, type compute count(title_id) by pub_id pub_id type title_id ------ ---- -------- 0877 mod_cook MC2222 0877 mod_cook MC3021 count -------- 2 0877 trad_cook TC3218 0877 trad_cook TC4203 0877 trad_cook TC7777 count -------- 3 count -------- 5 These results show row values, subgroup totals, and group totals in one set of results. Note that these results, unlike the results of other SQL queries, are not a relation and cannot be further manipulated by other SQL statements. Oracle SQL Plus provides tools for creating similar reports. |
3.144.151.126