8th chunk of `doc/src/sgml/xaggr.sgml`
 aggregated input(s).  As with normal
   aggregates, <literal>finalfunc_extra</literal> is only really useful if the
   aggregate is polymorphic; then the extra dummy argument(s) are needed
   to connect the final function's result type to the aggregate's input
   type(s).
  </para>

  <para>
   Currently, ordered-set aggregates cannot be used as window functions,
   and therefore there is no need for them to support moving-aggregate mode.
  </para>

 </sect2>

  <sect2 id="xaggr-partial-aggregates">
  <title>Partial Aggregation</title>

  <indexterm>
   <primary>aggregate function</primary>
   <secondary>partial aggregation</secondary>
  </indexterm>

  <para>
   Optionally, an aggregate function can support <firstterm>partial
   aggregation</firstterm>.  The idea of partial aggregation is to run the aggregate's
   state transition function over different subsets of the input data
   independently, and then to combine the state values resulting from those
   subsets to produce the same state value that would have resulted from
   scanning all the input in a single operation.  This mode can be used for
   parallel aggregation by having different worker processes scan different
   portions of a table.  Each worker produces a partial state value, and at
   the end those state values are combined to produce a final state value.
   (In the future this mode might also be used for purposes such as combining
   aggregations over local and remote tables; but that is not implemented
   yet.)
  </para>
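
  <para>
   As an informal illustration, the idea can be imitated at the SQL level.
   The sketch below assumes a hypothetical table
   <structname>measurements</structname> with integer columns
   <structfield>sensor_id</structfield> and <structfield>reading</structfield>;
   neither is part of <productname>PostgreSQL</productname>.  The second query
   aggregates several arbitrary subsets of the rows separately and then
   combines the per-subset results, giving a total numerically equal to that
   of the single scan in the first query:
<programlisting>
-- One pass over all the input rows
SELECT sum(reading) FROM measurements;

-- Independent partial sums over arbitrary subsets, then a combining step
SELECT sum(partial_sum)
FROM (SELECT sum(reading) AS partial_sum
      FROM measurements
      GROUP BY sensor_id % 4) AS parts;
</programlisting>
   This only mimics the principle.  Real partial aggregation combines
   aggregate state values rather than finished results, and it is the planner,
   not the query author, that decides how the input is divided.
  </para>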

  <para>
   To support partial aggregation, the aggregate definition must provide
   a <firstterm>combine function</firstterm>, which takes two values of the
   aggregate's state type (representing the results of aggregating over two
   subsets of the input rows) and produces a new value of the state type,
   representing what the state would have been after aggregating over the
   combination of those sets of rows.  It is unspecified what the relative
   order of the input rows from the two sets would have been.  This means
   that it's usually impossible to define a useful combine function for
   aggregates that are sensitive to input row order.
  </para>
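
  <para>
   A minimal sketch of an aggregate with a combine function follows.  All of
   the names (<function>my_avg</function>,
   <function>my_avg_trans</function>, <function>my_avg_combine</function>,
   <function>my_avg_final</function>) are hypothetical, as is the choice of a
   two-element array holding the row count and the running sum as the state
   value:
<programlisting>
-- Assumed state layout: ARRAY[number of rows, sum of inputs]
CREATE FUNCTION my_avg_trans(float8[], float8) RETURNS float8[]
    LANGUAGE sql IMMUTABLE STRICT PARALLEL SAFE
    AS 'SELECT ARRAY[$1[1] + 1, $1[2] + $2]';

-- Combine function: merges two state values element by element
CREATE FUNCTION my_avg_combine(float8[], float8[]) RETURNS float8[]
    LANGUAGE sql IMMUTABLE STRICT PARALLEL SAFE
    AS 'SELECT ARRAY[$1[1] + $2[1], $1[2] + $2[2]]';

-- Final function: returns null when no rows were aggregated
CREATE FUNCTION my_avg_final(float8[]) RETURNS float8
    LANGUAGE sql IMMUTABLE STRICT PARALLEL SAFE
    AS 'SELECT CASE WHEN $1[1] = 0 THEN NULL ELSE $1[2] / $1[1] END';

CREATE AGGREGATE my_avg (float8) (
    SFUNC = my_avg_trans,
    STYPE = float8[],
    FINALFUNC = my_avg_final,
    COMBINEFUNC = my_avg_combine,
    INITCOND = '{0,0}',
    PARALLEL = SAFE
);
</programlisting>
   Because addition of counts and sums does not depend on which subset is
   taken first, this combine function is insensitive to the unspecified
   relative order of the two sets of rows (apart from floating-point rounding
   in the sum).
  </para>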

  <para>
   As simple examples, <literal>MAX</literal> and <literal>MIN</literal> aggregates can be
   made to support partial aggregation by specifying the combine function as
   the same greater-of-two or lesser-of-two comparison function that is used
   as their transition function.  <literal>SUM</literal> aggregates just need an
   addition function as combine function.  (Again, this is the same as their
   transition function, unless the state value is wider than the input data
   type.)
  </para>
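
  <para>
   Definitions along these lines might look as follows.  The aggregate names
   <function>my_max</function> and <function>my_sum</function> are
   hypothetical; <function>int4larger</function> and
   <function>int4pl</function> are the built-in greater-of-two and addition
   functions for <type>integer</type>:
<programlisting>
CREATE AGGREGATE my_max (integer) (
    SFUNC = int4larger,
    STYPE = integer,
    COMBINEFUNC = int4larger,   -- same function as SFUNC
    PARALLEL = SAFE
);

CREATE AGGREGATE my_sum (integer) (
    SFUNC = int4pl,
    STYPE = integer,
    COMBINEFUNC = int4pl,       -- same function as SFUNC
    PARALLEL = SAFE
);
</programlisting>
   In both sketches the state type is the same as the input type, so one
   function can serve in both roles.  Note that this
   <function>my_sum</function> keeps an <type>integer</type> state and can
   therefore fail with an overflow error on large inputs.  The aggregates are
   also marked <literal>PARALLEL = SAFE</literal>, since an aggregate that is
   not so marked will not be considered for parallel aggregation.
  </para>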

  <para>
   The combine function is treated much like a transition function that
   happens to take a value of the state type, not of the underlying input
   type, as its second argument.  In particular, the rules for dealing
   with null values and strict functions are similar.  Also, if the aggregate
   definition specifies a non-null <literal>initcond</literal>, keep in mind that
   it will be used not only as the initial state for each partial
   aggregation run, but also as the initial state for the combine function,
   which will be called to combine each partial result into that state.
  </para>
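
  <para>
   The following sketch, using the hypothetical aggregate name
   <function>sum_plus_100</function>, shows why this matters.  Its
   <literal>initcond</literal> is not a neutral value for the combine
   function:
<programlisting>
CREATE AGGREGATE sum_plus_100 (integer) (
    SFUNC = int4pl,
    STYPE = integer,
    COMBINEFUNC = int4pl,
    INITCOND = '100',           -- not an identity value for int4pl
    PARALLEL = SAFE
);
</programlisting>
   Without partial aggregation the result is the sum of the inputs plus 100.
   With partial aggregation, each partial run starts from 100 and the
   combining step starts from 100 as well, so with two partial states the
   result would be the sum of the inputs plus 300.  An
   <literal>initcond</literal> of <literal>'0'</literal> avoids the
   discrepancy, because adding zero leaves the combined state unchanged.
  </para>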

  <para>
   If the aggregate's state type is declared as <type>internal</type>, the
   combine function is responsible for ensuring that its result is allocated in
   the correct memory context for aggregate state values.  This means in
   particular that when the first input is <literal>NULL</literal> it's invalid
   to simply return the second input, as that value will be in the wrong
   context and will not have sufficient lifespan.
  </para>

  <para>
   When the aggregate's state type is declared as <type>internal</type>,

Title: Partial Aggregation in Aggregate Functions
Summary
Partial aggregation lets an aggregate function process different subsets of the input data independently and then combine the resulting state values into the same state value that a single scan of all the input would have produced. It requires a combine function that takes two state values and returns a new state value. It is used for parallel aggregation and might in the future be used for combining aggregations over local and remote tables.