
7th chunk of `doc/src/sgml/parallel.sgml`

  </para>

 </sect2>

 <sect2 id="parallel-append">
  <title>Parallel Append</title>

  <para>
    Whenever <productname>PostgreSQL</productname> needs to combine rows
    from multiple sources into a single result set, it uses an
    <literal>Append</literal> or <literal>MergeAppend</literal> plan node.
    This commonly happens when implementing <literal>UNION ALL</literal> or
    when scanning a partitioned table.  Such nodes can be used in parallel
    plans just as they can in any other plan.  However, in a parallel plan,
    the planner may instead use a <literal>Parallel Append</literal> node.
  </para>
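
  <para>
    As a rough illustration of the two cases mentioned above, both of the
    following queries would typically be planned with an
    <literal>Append</literal> node.  The table names
    <structname>events_2023</structname>, <structname>events_2024</structname>,
    and <structname>events</structname> are hypothetical and used only for
    this sketch; the last is assumed to be a partitioned table:
<programlisting>
-- Combining two result sets with UNION ALL.
EXPLAIN
SELECT * FROM events_2023
UNION ALL
SELECT * FROM events_2024;

-- Scanning a partitioned table; each remaining partition becomes a child plan.
EXPLAIN
SELECT * FROM events;
</programlisting>
    Whether the resulting plan is parallel at all, and whether it uses
    <literal>Append</literal> or <literal>Parallel Append</literal>, depends
    on the planner's cost estimates for the tables involved.
  </para>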

  <para>
    When an <literal>Append</literal> node is used in a parallel plan, each
    process will execute the child plans in the order in which they appear,
    so that all participating processes cooperate to execute the first child
    plan until it is complete and then move to the second plan at around the
    same time.  When a <literal>Parallel Append</literal> is used instead, the
    executor will spread out the participating processes as evenly as
    possible across its child plans, so that multiple child plans are executed
    simultaneously.  This avoids contention, and also avoids paying the startup
    cost of a child plan in those processes that never execute it.
  </para>

  <para>
    Also, unlike a regular <literal>Append</literal> node, which can only have
    partial children when used within a parallel plan, a <literal>Parallel
    Append</literal> node can have both partial and non-partial child plans.
    Non-partial children will be scanned by only a single process, since
    scanning them more than once would produce duplicate results.  Plans that
    involve appending multiple result sets can therefore achieve
    coarse-grained parallelism even when efficient partial plans are not
    available.  For example, consider a query against a partitioned table
    that can only be implemented efficiently by using an index that does
    not support parallel scans.  The planner might choose a <literal>Parallel
    Append</literal> of regular <literal>Index Scan</literal> plans; each
    individual index scan would have to be executed to completion by a single
    process, but different scans could be performed at the same time by
    different processes.
  </para>

  <para>
    <xref linkend="guc-enable-parallel-append" /> can be used to disable
    this feature.
  </para>
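
  <para>
    As a minimal sketch, the setting can be turned off for the current
    session in order to compare plans with and without
    <literal>Parallel Append</literal>.  The table
    <structname>measurement</structname> is a hypothetical partitioned table
    used only for illustration:
<programlisting>
-- Plan with Parallel Append available (the default).
EXPLAIN SELECT count(*) FROM measurement;

-- Disable Parallel Append for this session only, then compare.
SET enable_parallel_append = off;
EXPLAIN SELECT count(*) FROM measurement;

-- Restore the default.
RESET enable_parallel_append;
</programlisting>
    With the setting off, the planner can still use a regular
    <literal>Append</literal> node inside a parallel plan.
  </para>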
 </sect2>

 <sect2 id="parallel-plan-tips">
  <title>Parallel Plan Tips</title>

  <para>
    If a query that is expected to do so does not produce a parallel plan,
    you can try reducing <xref linkend="guc-parallel-setup-cost"/> or
    <xref linkend="guc-parallel-tuple-cost"/>.  Of course, the resulting
    parallel plan may turn
    out to be slower than the serial plan that the planner preferred, but
    this will not always be the case.  If you don't get a parallel
    plan even with very small values of these settings (e.g., after setting
    them both to zero), there may be some reason why the query planner is
    unable to generate a parallel plan for your query.  See
    <xref linkend="when-can-parallel-query-be-used"/> and
    <xref linkend="parallel-safety"/> for information on why this may be
    the case.
  </para>
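
  <para>
    The experiment described above might look like the following in a single
    session.  This is only a sketch: the table
    <structname>big_table</structname> and its column
    <structfield>payload</structfield> are hypothetical.
<programlisting>
-- Make parallel plans look as cheap as possible, for this session only.
SET parallel_setup_cost = 0;
SET parallel_tuple_cost = 0;

-- Check whether the plan now contains a Gather or Gather Merge node.
EXPLAIN
SELECT count(*) FROM big_table WHERE payload LIKE '%error%';

-- Restore the defaults afterwards.
RESET parallel_setup_cost;
RESET parallel_tuple_cost;
</programlisting>
    Because <command>SET</command> affects only the current session, other
    sessions keep the server's configured values.
  </para>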

  <para>
    When executing a parallel plan, you can use <literal>EXPLAIN (ANALYZE,
    VERBOSE)</literal> to display per-worker statistics for each plan node.
    This may be useful in determining whether the work is being evenly
    distributed among the workers at each plan node and, more generally, in
    understanding the performance characteristics of the plan.
  </para>
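
  <para>
    For instance, a query could be inspected as follows.  The table
    <structname>big_table</structname> is hypothetical and stands in for
    any table large enough for the planner to choose a parallel plan:
<programlisting>
-- ANALYZE executes the query, so wrap data-modifying statements
-- in a transaction that is rolled back.
EXPLAIN (ANALYZE, VERBOSE)
SELECT count(*) FROM big_table;
</programlisting>
    In the output, nodes below the <literal>Gather</literal> or
    <literal>Gather Merge</literal> node carry additional lines for each
    worker that participated, which can be compared to see how rows and
    execution time were divided among them.
  </para>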

 </sect2>
 </sect1>

 <sect1 id="parallel-safety">
  <title>Parallel Safety</title>

  <para>
    The planner classifies operations involved in a query as either
    <firstterm>parallel safe</firstterm>, <firstterm>parallel restricted</firstterm>,
    or <firstterm>parallel unsafe</firstterm>.  A parallel safe operation is one that
    does not conflict with the

Title: Parallel Query Execution in PostgreSQL
Summary
PostgreSQL's parallel append feature allows for combining rows from multiple sources into a single result set, using a Parallel Append node to spread out processes across child plans and avoid contention. The planner can also use non-partial child plans to achieve coarse-grained parallelism. Additionally, tips are provided for troubleshooting parallel plan generation, including reducing setup and tuple costs, and using EXPLAIN (ANALYZE, VERBOSE) to display per-worker statistics.