You want to calculate group summaries but display the results only for those groups that match certain criteria.
You’re familiar with the use of WHERE
to specify conditions that individual
rows must satisfy to be selected by a statement. It’s natural,
therefore, to use
WHERE
to write
conditions that involve summary values. The only trouble is that it
doesn’t work. If you want to identify drivers in the driver_log
table who drove more than three
days, you’d probably first think to write the statement like
this:
mysql>SELECT COUNT(*), name
->FROM driver_log
->WHERE COUNT(*) > 3
->GROUP BY name;
ERROR 1111 (HY000): Invalid use of group function
The problem here is that WHERE
specifies the initial constraints that
determine which rows to select, but the value of COUNT()
can be determined only after the rows have
been selected. The solution is to put the COUNT()
expression in a HAVING
clause instead. HAVING
is analogous to
WHERE
, but it applies to group
characteristics rather than to single rows. That is, HAVING
operates on the
already-selected-and-grouped set of rows, applying additional
constraints based on aggregate function results that aren’t known
during the initial selection process. The preceding query therefore
should be written like this:
mysql>SELECT COUNT(*), name
->FROM driver_log
->GROUP BY name
->HAVING COUNT(*) > 3;
+----------+-------+ | COUNT(*) | name | +----------+-------+ | 5 | Henry | +----------+-------+
When you use HAVING
, you can
still include a WHERE
clause—but
only to select rows, not to test summary values.
HAVING
can refer to aliases,
so the previous query can be rewritten like this:
mysql>SELECT COUNT(*) AS count, name
->FROM driver_log
->GROUP BY name
->HAVING count > 3;
+-------+-------+ | count | name | +-------+-------+ | 5 | Henry | +-------+-------+
3.15.144.170