<literal>fdt</literal>. Notice the use of scalar subqueries as
value expressions. Just like any other query, the subqueries can
employ complex table expressions. Notice also how
<literal>fdt</literal> is referenced in the subqueries.
Qualifying <literal>c1</literal> as <literal>fdt.c1</literal> is only necessary
if <literal>c1</literal> is also the name of a column in the derived
input table of the subquery. But qualifying the column name adds
clarity even when it is not needed. This example shows how the column
naming scope of an outer query extends into its inner queries.
</para>
</sect2>
<sect2 id="queries-group">
<title>The <literal>GROUP BY</literal> and <literal>HAVING</literal> Clauses</title>
<indexterm zone="queries-group">
<primary>GROUP BY</primary>
</indexterm>
<indexterm zone="queries-group">
<primary>grouping</primary>
</indexterm>
<para>
After passing the <literal>WHERE</literal> filter, the derived input
table might be subject to grouping, using the <literal>GROUP BY</literal>
clause, and elimination of group rows using the <literal>HAVING</literal>
clause.
</para>
<synopsis>
SELECT <replaceable>select_list</replaceable>
FROM ...
<optional>WHERE ...</optional>
GROUP BY <replaceable>grouping_column_reference</replaceable> <optional>, <replaceable>grouping_column_reference</replaceable></optional>...
</synopsis>
<para>
The <link linkend="sql-groupby"><literal>GROUP BY</literal></link> clause is
used to group together those rows in a table that have the same
values in all the columns listed. The order in which the columns
are listed does not matter. The effect is to combine each set
of rows having common values into one group row that
represents all rows in the group. This is done to
eliminate redundancy in the output and/or compute aggregates that
apply to these groups. For instance:
<screen>
<prompt>=></prompt> <userinput>SELECT * FROM test1;</userinput>
x | y
---+---
a | 3
c | 2
b | 5
a | 1
(4 rows)
<prompt>=></prompt> <userinput>SELECT x FROM test1 GROUP BY x;</userinput>
x
---
a
b
c
(3 rows)
</screen>
</para>
<para>
In the second query, we could not have written <literal>SELECT *
FROM test1 GROUP BY x</literal>, because there is no single value
for the column <literal>y</literal> that could be associated with each
group. The grouped-by columns can be referenced in the select list since
they have a single value in each group.
</para>
<para>
In general, if a table is grouped, columns that are not
listed in <literal>GROUP BY</literal> cannot be referenced except in aggregate
expressions. An example with aggregate expressions is:
<screen>
<prompt>=></prompt> <userinput>SELECT x, sum(y) FROM test1 GROUP BY x;</userinput>
x | sum
---+-----
a | 4
b | 5
c | 2
(3 rows)
</screen>
Here <literal>sum</literal> is an aggregate function that
computes a single value over the entire group. More information
about the available aggregate functions can be found in <xref
linkend="functions-aggregate"/>.
</para>
<tip>
<para>
Grouping without aggregate expressions effectively calculates the
set of distinct values in a column. This can also be achieved
using the <literal>DISTINCT</literal> clause (see <xref
linkend="queries-distinct"/>).
</para>
</tip>
<para>
Here is another example: it calculates the total sales for each
product (rather than the total sales of all products):
<programlisting>
SELECT product_id, p.name, (sum(s.units) * p.price) AS sales
FROM products p LEFT JOIN sales s USING (product_id)
GROUP BY product_id, p.name, p.price;
</programlisting>
In this example, the columns <literal>product_id</literal>,
<literal>p.name</literal>, and <literal>p.price</literal> must be
in the <literal>GROUP BY</literal> clause since