SQL GROUP BY and HAVING Clauses

<literal>fdt</literal>. Notice the use of scalar subqueries as value expressions. Just like any other query, the subqueries can employ complex table expressions. Notice also how <literal>fdt</literal> is referenced in the subqueries. Qualifying <literal>c1</literal> as <literal>fdt.c1</literal> is only necessary if <literal>c1</literal> is also the name of a column in the derived input table of the subquery. But qualifying the column name adds clarity even when it is not needed. This example shows how the column naming scope of an outer query extends into its inner queries. </para> </sect2> <sect2 id="queries-group"> <title>The <literal>GROUP BY</literal> and <literal>HAVING</literal> Clauses</title> <indexterm zone="queries-group"> <primary>GROUP BY</primary> </indexterm> <indexterm zone="queries-group"> <primary>grouping</primary> </indexterm> <para> After passing the <literal>WHERE</literal> filter, the derived input table might be subject to grouping, using the <literal>GROUP BY</literal> clause, and elimination of group rows using the <literal>HAVING</literal> clause. </para> <synopsis> SELECT <replaceable>select_list</replaceable> FROM ... <optional>WHERE ...</optional> GROUP BY <replaceable>grouping_column_reference</replaceable> <optional>, <replaceable>grouping_column_reference</replaceable></optional>... </synopsis> <para> The <link linkend="sql-groupby"><literal>GROUP BY</literal></link> clause is used to group together those rows in a table that have the same values in all the columns listed. The order in which the columns are listed does not matter. The effect is to combine each set of rows having common values into one group row that represents all rows in the group. This is done to eliminate redundancy in the output and/or compute aggregates that apply to these groups. For instance: <screen> <prompt>=></prompt> <userinput>SELECT * FROM test1;</userinput> x | y ---+--- a | 3 c | 2 b | 5 a | 1 (4 rows) <prompt>=></prompt> <userinput>SELECT x FROM test1 GROUP BY x;</userinput> x --- a b c (3 rows) </screen> </para> <para> In the second query, we could not have written <literal>SELECT * FROM test1 GROUP BY x</literal>, because there is no single value for the column <literal>y</literal> that could be associated with each group. The grouped-by columns can be referenced in the select list since they have a single value in each group. </para> <para> In general, if a table is grouped, columns that are not listed in <literal>GROUP BY</literal> cannot be referenced except in aggregate expressions. An example with aggregate expressions is: <screen> <prompt>=></prompt> <userinput>SELECT x, sum(y) FROM test1 GROUP BY x;</userinput> x | sum ---+----- a | 4 b | 5 c | 2 (3 rows) </screen> Here <literal>sum</literal> is an aggregate function that computes a single value over the entire group. More information about the available aggregate functions can be found in <xref linkend="functions-aggregate"/>. </para> <tip> <para> Grouping without aggregate expressions effectively calculates the set of distinct values in a column. This can also be achieved using the <literal>DISTINCT</literal> clause (see <xref linkend="queries-distinct"/>). </para> </tip> <para> Here is another example: it calculates the total sales for each product (rather than the total sales of all products): <programlisting> SELECT product_id, p.name, (sum(s.units) * p.price) AS sales FROM products p LEFT JOIN sales s USING (product_id) GROUP BY product_id, p.name, p.price; </programlisting> In this example, the columns <literal>product_id</literal>, <literal>p.name</literal>, and <literal>p.price</literal> must be in the <literal>GROUP BY</literal> clause since

This section explains the GROUP BY and HAVING clauses in SQL queries. The GROUP BY clause is used to group rows with the same values in specified columns, eliminating redundancy and allowing for aggregate computations on these groups. It demonstrates how grouped columns can be referenced in the SELECT list, while non-grouped columns can only be used in aggregate expressions. The text provides examples of GROUP BY usage, including one that calculates distinct values and another that computes total sales by product. It also mentions the HAVING clause, which is used to filter group rows after grouping has occurred, though detailed explanation of HAVING is not provided in this excerpt.