aggregate function that
computes a single value over the entire group. More information
about the available aggregate functions can be found in <xref
linkend="functions-aggregate"/>.
</para>
<tip>
<para>
Grouping without aggregate expressions effectively calculates the
set of distinct values in a column. This can also be achieved
using the <literal>DISTINCT</literal> clause (see <xref
linkend="queries-distinct"/>).
</para>
</tip>
<para>
Here is another example: it calculates the total sales for each
product (rather than the total sales of all products):
<programlisting>
SELECT product_id, p.name, (sum(s.units) * p.price) AS sales
FROM products p LEFT JOIN sales s USING (product_id)
GROUP BY product_id, p.name, p.price;
</programlisting>
In this example, the columns <literal>product_id</literal>,
<literal>p.name</literal>, and <literal>p.price</literal> must be
in the <literal>GROUP BY</literal> clause since they are referenced in
the query select list (but see below). The column
<literal>s.units</literal> does not have to be in the <literal>GROUP
BY</literal> list since it is only used in an aggregate expression
(<literal>sum(...)</literal>), which represents the sales
of a product. For each product, the query returns a summary row about
all sales of the product.
</para>
<indexterm><primary>functional dependency</primary></indexterm>
<para>
If the products table is set up so that, say,
<literal>product_id</literal> is the primary key, then it would be
enough to group by <literal>product_id</literal> in the above example,
since name and price would be <firstterm>functionally
dependent</firstterm> on the product ID, and so there would be no
ambiguity about which name and price value to return for each product
ID group.
</para>
<para>
In strict SQL, <literal>GROUP BY</literal> can only group by columns of
the source table but <productname>PostgreSQL</productname> extends
this to also allow <literal>GROUP BY</literal> to group by columns in the
select list. Grouping by value expressions instead of simple
column names is also allowed.
</para>
<indexterm>
<primary>HAVING</primary>
</indexterm>
<para>
If a table has been grouped using <literal>GROUP BY</literal>,
but only certain groups are of interest, the
<literal>HAVING</literal> clause can be used, much like a
<literal>WHERE</literal> clause, to eliminate groups from the result.
The syntax is:
<synopsis>
SELECT <replaceable>select_list</replaceable> FROM ... <optional>WHERE ...</optional> GROUP BY ... HAVING <replaceable>boolean_expression</replaceable>
</synopsis>
Expressions in the <literal>HAVING</literal> clause can refer both to
grouped expressions and to ungrouped expressions (which necessarily
involve an aggregate function).
</para>
<para>
Example:
<screen>
<prompt>=></prompt> <userinput>SELECT x, sum(y) FROM test1 GROUP BY x HAVING sum(y) > 3;</userinput>
x | sum
---+-----
a | 4
b | 5
(2 rows)
<prompt>=></prompt> <userinput>SELECT x, sum(y) FROM test1 GROUP BY x HAVING x < 'c';</userinput>
x | sum
---+-----
a | 4
b | 5
(2 rows)
</screen>
</para>
<para>
Again, a more realistic example:
<programlisting>
SELECT product_id, p.name, (sum(s.units) * (p.price - p.cost)) AS profit
FROM products p LEFT JOIN sales s USING (product_id)
WHERE s.date > CURRENT_DATE - INTERVAL '4 weeks'
GROUP BY product_id, p.name, p.price, p.cost
HAVING sum(p.price * s.units) > 5000;
</programlisting>
In the example above, the <literal>WHERE</literal> clause is selecting
rows by a column that is not grouped (the expression is only true for
sales during the last four weeks), while the <literal>HAVING</literal>
clause restricts the output to groups with total gross sales over
5000. Note that