Home Explore Blog CI



postgresql

21th chunk of `doc/src/sgml/syntax.sgml`
cf1123eb44e9b300244ca562e42ccd8ea6e5cfc06ce118130000000100000fa7
 </para>

   <para>
    When dealing with multiple-argument aggregate functions, note that the
    <literal>ORDER BY</literal> clause goes after all the aggregate arguments.
    For example, write this:
<programlisting>
SELECT string_agg(a, ',' ORDER BY a) FROM table;
</programlisting>
    not this:
<programlisting>
SELECT string_agg(a ORDER BY a, ',') FROM table;  -- incorrect
</programlisting>
    The latter is syntactically valid, but it represents a call of a
    single-argument aggregate function with two <literal>ORDER BY</literal> keys
    (the second one being rather useless since it's a constant).
   </para>

   <para>
    If <literal>DISTINCT</literal> is specified with an
    <replaceable>order_by_clause</replaceable>, <literal>ORDER
    BY</literal> expressions can only reference columns in the
    <literal>DISTINCT</literal> list.  For example:
<programlisting>
WITH vals (v) AS ( VALUES (1),(3),(4),(3),(2) )
SELECT array_agg(DISTINCT v ORDER BY v DESC) FROM vals;
 array_agg
-----------
 {4,3,2,1}
</programlisting>
   </para>

   <para>
    Placing <literal>ORDER BY</literal> within the aggregate's regular argument
    list, as described so far, is used when ordering the input rows for
    general-purpose and statistical aggregates, for which ordering is
    optional.  There is a
    subclass of aggregate functions called <firstterm>ordered-set
    aggregates</firstterm> for which an <replaceable>order_by_clause</replaceable>
    is <emphasis>required</emphasis>, usually because the aggregate's computation is
    only sensible in terms of a specific ordering of its input rows.
    Typical examples of ordered-set aggregates include rank and percentile
    calculations.  For an ordered-set aggregate,
    the <replaceable>order_by_clause</replaceable> is written
    inside <literal>WITHIN GROUP (...)</literal>, as shown in the final syntax
    alternative above.  The expressions in
    the <replaceable>order_by_clause</replaceable> are evaluated once per
    input row just like regular aggregate arguments, sorted as per
    the <replaceable>order_by_clause</replaceable>'s requirements, and fed
    to the aggregate function as input arguments.  (This is unlike the case
    for a non-<literal>WITHIN GROUP</literal> <replaceable>order_by_clause</replaceable>,
    which is not treated as argument(s) to the aggregate function.)  The
    argument expressions preceding <literal>WITHIN GROUP</literal>, if any, are
    called <firstterm>direct arguments</firstterm> to distinguish them from
    the <firstterm>aggregated arguments</firstterm> listed in
    the <replaceable>order_by_clause</replaceable>.  Unlike regular aggregate
    arguments, direct arguments are evaluated only once per aggregate call,
    not once per input row.  This means that they can contain variables only
    if those variables are grouped by <literal>GROUP BY</literal>; this restriction
    is the same as if the direct arguments were not inside an aggregate
    expression at all.  Direct arguments are typically used for things like
    percentile fractions, which only make sense as a single value per
    aggregation calculation.  The direct argument list can be empty; in this
    case, write just <literal>()</literal> not <literal>(*)</literal>.
    (<productname>PostgreSQL</productname> will actually accept either spelling, but
    only the first way conforms to the SQL standard.)
   </para>

   <para>
    <indexterm>
     <primary>median</primary>
     <seealso>percentile</seealso>
    </indexterm>
    An example of an ordered-set aggregate call is:

<programlisting>
SELECT percentile_cont(0.5) WITHIN GROUP (ORDER BY income) FROM households;
 percentile_cont
-----------------
           50489
</programlisting>

   which obtains the 50th percentile, or median, value of
   the <structfield>income</structfield> column from table <structname>households</structname>.
   Here, <literal>0.5</literal> is a direct argument; it would make no sense
   for the percentile

Title: ORDER BY Clause in Aggregate Functions: DISTINCT and WITHIN GROUP
Summary
This section explains the usage of the ORDER BY clause with DISTINCT in aggregate functions, where ORDER BY expressions can only reference columns in the DISTINCT list. It then introduces ordered-set aggregates, which require an ORDER BY clause within WITHIN GROUP. The expressions in this clause are evaluated per input row and sorted. Arguments preceding WITHIN GROUP are called direct arguments, evaluated once per aggregate call, while those within are aggregated arguments. An example of an ordered-set aggregate call using percentile_cont is provided.