Cautions and Restrictions

GROUP BY looks straightforward, but it's caused its share of headaches for SQL users. Here, for example, is a query that seems reasonable but won't work on most systems:

SQL (variant)
select pub_id, type, count(title)
from titles
group by pub_id
					

Since the table is divided into sets by publisher (group by pub_id) and there are three publishers, the query must return no more than three rows, and each SELECT list item must have a single value for the set. Unfortunately, there is more than one type per publisher, so most systems find the query impossible to answer. You could solve this problem by adding type to the GROUP BY clause, as shown in Groups Within Groups.

SQL VARIANTS

One system that handles this nonconforming GROUP BY query is Adaptive Server Enterprise. It returns results like the following, rather than an error:

Adaptive Server Enterprise
select pub_id, type, count(title)
from titles
group by pub_id
pub_id type
------ ----         -----------
1389   popular_comp           6
1389   business               6
0736   psychology             6
0736   psychology             6
1389   business               6
0877   mod_cook               6
0877   trad_cook              6
0877   trad_cook              6
1389   popular_comp           6
0736   business               6
0736   psychology             6
0736   psychology             6
0877   mod_cook               6
0877   trad_cook              6
0877   NULL                   6
1389   business               6
0736   psychology             6
1389   popular_comp           6
(18 rows affected)

In this case, ASE finds the number of books for each group (6 for 1389, 6 for 0877, 6 for 0736). It then combines these set results with row results for the whole table. The results erroneously imply that there are six books in each pub_id-type group or 18 * 6 (108) books.


Using Complex Expressions

Another limitation in some SQL implementations concerns expressions. You're always safe with vanilla column names in the GROUP BY clause. Aggregate expressions, display labels, and SELECT list position numbers may or may not be allowed, depending on the SQL dialect you're using.

Multiple Summary Values for Multiple Levels of Grouping

How about multiple summary values for multiple levels of grouping? Let's say you're grouping by pub_id and type. You want to see the total number of books for each publisher and the total number of books for each type of book the publisher carries. You might think something like the following query would do the trick, but the results show you're on the wrong track:

SQL
select pub_id, count(title_id), type, count(title_id)
from titles
group by pub_id, type
pub_id count(titles.title_id) type         count(titles.title_id)
====== ====================== ============ ======================
1389                        3 popular_comp                      3
1389                        3 business                          3
0736                        5 psychology                        5
0877                        2 mod_cook                          2
0877                        3 trad_cook                         3
0736                        1 business                          1
0877                        1 (NULL)                            1
[7 rows]

The number of books is the same for publisher and for type, and the results don't make much sense in light of what you are trying to get. What you're seeing (in two columns) is the number of books for each publisher/type combination because that is the bottom level of the group. To get the results you want, you need to run two queries: the first one grouped by publisher (to give the publisher totals) and the second grouped by publisher and then by type within each publisher (to give the publisher/type totals).

SQL
select pub_id, count(title_id)
from titles
group by pub_id
pub_id count(titles.title_id)
====== ======================
1389                        6
0736                        6
0877                        6
[3 rows]

Each publisher has six books. The next query finds the total number of books for each publisher/type combination.

SQL
select pub_id, type, count(title_id)
from titles
group by pub_id, type
pub_id type         count(titles.title_id)
====== ============ ======================
1389   popular_comp                      3
1389   business                          3
0736   psychology                        5
0877   mod_cook                          2
0877   trad_cook                         3
0736   business                          1
0877   (NULL)                            1
[7 rows]

If you group by type alone, the query has a different meaning, and you'll get different results:

SQL
select type, count(title_id)
from titles
group by type
type         count(titles.title_id)
============ ======================
popular_comp                      3
business                          4
psychology                        5
mod_cook                          2
trad_cook                         3
(NULL)                            1
[6 rows]

This last set of results shows the number of books of each type, regardless of publisher. There are four books in the business category. However, two publishers sell this type of book: One has a single business book, and the other has three. The query results don't show how many are in each publisher/type combination.

Because the need to see multiple levels of summary values is so pervasive, many database vendors provide a report generator of some type. However, the report generator usually is an application program rather than a part of SQL because report results with multiple levels of summary values cannot be represented as relational tables.

SQL VARIANTS

Transact-SQL provides an extension that represents a SQL-based approach to the problem. Here's an example:

Transact-SQL
select pub_id, type, title_id
from titles
where type like '%cook%'
order by pub_id, type
compute count(title_id) by pub_id, type
								compute count(title_id) by pub_id
pub_id type         title_id
------ ----         --------
								0877   mod_cook     MC2222
								0877   mod_cook     MC3021
								                    count
								                    --------
								                           2
								0877   trad_cook    TC3218
								0877   trad_cook    TC4203
								0877   trad_cook    TC7777
								                    count
								                    --------
								                           3
								                    count
								                    --------
								                           5
							

These results show row values, subgroup totals, and group totals in one set of results. Note that these results, unlike the results of other SQL queries, are not a relation and cannot be further manipulated by other SQL statements.

Oracle SQL Plus provides tools for creating similar reports.


..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
3.144.151.126