Home Explore Blog CI



postgresql

15th chunk of `doc/src/sgml/queries.sgml`
98da10e8b78f75f158f04d6471eb8374aff9f4bb021458f70000000100000fa3
 aggregate function that
    computes a single value over the entire group.  More information
    about the available aggregate functions can be found in <xref
    linkend="functions-aggregate"/>.
   </para>

   <tip>
    <para>
     Grouping without aggregate expressions effectively calculates the
     set of distinct values in a column.  This can also be achieved
     using the <literal>DISTINCT</literal> clause (see <xref
     linkend="queries-distinct"/>).
    </para>
   </tip>

   <para>
    Here is another example:  it calculates the total sales for each
    product (rather than the total sales of all products):
<programlisting>
SELECT product_id, p.name, (sum(s.units) * p.price) AS sales
    FROM products p LEFT JOIN sales s USING (product_id)
    GROUP BY product_id, p.name, p.price;
</programlisting>
    In this example, the columns <literal>product_id</literal>,
    <literal>p.name</literal>, and <literal>p.price</literal> must be
    in the <literal>GROUP BY</literal> clause since they are referenced in
    the query select list (but see below).  The column
    <literal>s.units</literal> does not have to be in the <literal>GROUP
    BY</literal> list since it is only used in an aggregate expression
    (<literal>sum(...)</literal>), which represents the sales
    of a product.  For each product, the query returns a summary row about
    all sales of the product.
   </para>

   <indexterm><primary>functional dependency</primary></indexterm>

   <para>
    If the products table is set up so that, say,
    <literal>product_id</literal> is the primary key, then it would be
    enough to group by <literal>product_id</literal> in the above example,
    since name and price would be <firstterm>functionally
    dependent</firstterm> on the product ID, and so there would be no
    ambiguity about which name and price value to return for each product
    ID group.
   </para>

   <para>
    In strict SQL, <literal>GROUP BY</literal> can only group by columns of
    the source table but <productname>PostgreSQL</productname> extends
    this to also allow <literal>GROUP BY</literal> to group by columns in the
    select list.  Grouping by value expressions instead of simple
    column names is also allowed.
   </para>

   <indexterm>
    <primary>HAVING</primary>
   </indexterm>

   <para>
    If a table has been grouped using <literal>GROUP BY</literal>,
    but only certain groups are of interest, the
    <literal>HAVING</literal> clause can be used, much like a
    <literal>WHERE</literal> clause, to eliminate groups from the result.
    The syntax is:
<synopsis>
SELECT <replaceable>select_list</replaceable> FROM ... <optional>WHERE ...</optional> GROUP BY ... HAVING <replaceable>boolean_expression</replaceable>
</synopsis>
    Expressions in the <literal>HAVING</literal> clause can refer both to
    grouped expressions and to ungrouped expressions (which necessarily
    involve an aggregate function).
   </para>

   <para>
    Example:
<screen>
<prompt>=&gt;</prompt> <userinput>SELECT x, sum(y) FROM test1 GROUP BY x HAVING sum(y) &gt; 3;</userinput>
 x | sum
---+-----
 a |   4
 b |   5
(2 rows)

<prompt>=&gt;</prompt> <userinput>SELECT x, sum(y) FROM test1 GROUP BY x HAVING x &lt; 'c';</userinput>
 x | sum
---+-----
 a |   4
 b |   5
(2 rows)
</screen>
   </para>

   <para>
    Again, a more realistic example:
<programlisting>
SELECT product_id, p.name, (sum(s.units) * (p.price - p.cost)) AS profit
    FROM products p LEFT JOIN sales s USING (product_id)
    WHERE s.date &gt; CURRENT_DATE - INTERVAL '4 weeks'
    GROUP BY product_id, p.name, p.price, p.cost
    HAVING sum(p.price * s.units) &gt; 5000;
</programlisting>
    In the example above, the <literal>WHERE</literal> clause is selecting
    rows by a column that is not grouped (the expression is only true for
    sales during the last four weeks), while the <literal>HAVING</literal>
    clause restricts the output to groups with total gross sales over
    5000.  Note that

Title: SQL Grouping and Aggregate Functions
Summary
This section delves deeper into SQL's GROUP BY clause and its interaction with aggregate functions. It explains that GROUP BY is used to group rows with identical values in specified columns, allowing for aggregate calculations on these groups. The text provides examples of using GROUP BY with aggregate functions like SUM, and discusses how non-grouped columns can only be used in aggregate expressions. It also introduces the concept of functional dependency in grouping and mentions PostgreSQL's extension of the GROUP BY clause. The HAVING clause is explained as a way to filter grouped results, similar to the WHERE clause but applied after grouping. Several examples demonstrate the practical use of GROUP BY and HAVING in complex queries, including calculating sales and profits for products.