4th chunk of `doc/src/sgml/parallel.sgml`
 use a parallel plan. This is a
        limitation of the current implementation, but it may not be desirable
        to remove this limitation, since it could result in a single query
        using a very large number of processes.
      </para>
    </listitem>
  </itemizedlist>

  <para>
    Even when a parallel query plan is generated for a particular query, there
    are several circumstances under which that plan cannot be run in parallel
    at execution time.  If this occurs, the leader
    will execute the portion of the plan below the <literal>Gather</literal>
    node entirely by itself, almost as if the <literal>Gather</literal> node were
    not present.  This will happen if any of the following conditions is met:
  </para>

  <itemizedlist>
    <listitem>
      <para>
        No background workers can be obtained because of the limitation that
        the total number of background workers cannot exceed
        <xref linkend="guc-max-worker-processes"/>.
      </para>
    </listitem>

    <listitem>
      <para>
        No background workers can be obtained because of the limitation that
        the total number of background workers launched for purposes of
        parallel query cannot exceed <xref linkend="guc-max-parallel-workers"/>.
      </para>
    </listitem>

    <listitem>
      <para>
        The client sends an Execute message with a non-zero fetch count.
        See the discussion of the
        <link linkend="protocol-flow-ext-query">extended query protocol</link>.
        Since <link linkend="libpq">libpq</link> currently provides no way to
        send such a message, this can only occur when using a client that
        does not rely on libpq.  If this is a frequent
        occurrence, it may be a good idea to set
        <xref linkend="guc-max-parallel-workers-per-gather"/> to zero in
        sessions where it is likely, so as to avoid generating query plans
        that may be suboptimal when run serially.
      </para>
    </listitem>
  </itemizedlist>
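
  <para>
    As a minimal sketch of the workaround described in the last item above,
    a session whose client is expected to send Execute messages with a
    non-zero fetch count could disable parallel plans for itself before
    issuing queries.  Only the setting name comes from the text above;
    applying it per session with <command>SET</command> is an assumption
    that may or may not suit a given application:
<programlisting>
-- Stop the planner from generating parallel plans in this session only.
SET max_parallel_workers_per_gather = 0;

-- Later, return to the configured default if desired.
RESET max_parallel_workers_per_gather;
</programlisting>
  </para>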
 </sect1>

 <sect1 id="parallel-plans">
  <title>Parallel Plans</title>

  <para>
    Because each worker executes the parallel portion of the plan to
    completion, it is not possible to simply take an ordinary query plan
    and run it using multiple workers.  Each worker would produce a full
    copy of the output result set, so the query would not run any faster
    than normal but would produce incorrect results.  Instead, the parallel
    portion of the plan must be what is known internally to the query
    optimizer as a <firstterm>partial plan</firstterm>; that is, it must be constructed
    so that each process that executes the plan will generate only a
    subset of the output rows in such a way that each required output row
    is guaranteed to be generated by exactly one of the cooperating processes.
    Generally, this means that the scan on the driving table of the query
    must be a parallel-aware scan.
  </para>
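
  <para>
    One way to check whether the planner chose a partial plan is to inspect
    the <command>EXPLAIN</command> output and look at the nodes beneath
    <literal>Gather</literal>.  The table name
    <literal>some_large_table</literal> below is hypothetical, and whether a
    parallel plan is chosen at all depends on the table size and the
    current settings:
<programlisting>
-- some_large_table is a placeholder; substitute any sufficiently large table.
EXPLAIN (COSTS OFF) SELECT count(*) FROM some_large_table;
</programlisting>
    If a parallel plan is chosen, the scan of the driving table should appear
    as a parallel-aware node, such as <literal>Parallel Seq Scan</literal>,
    below the <literal>Gather</literal> node.
  </para>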

 <sect2 id="parallel-scans">
  <title>Parallel Scans</title>

  <para>
    The following types of parallel-aware table scans are currently supported.

  <itemizedlist>
    <listitem>
      <para>
        In a <emphasis>parallel sequential scan</emphasis>, the table's blocks will
        be divided into ranges and shared among the cooperating processes.  Each
        worker process will complete the scanning of its given range of blocks before
        requesting an additional range of blocks.
      </para>
    </listitem>
    <listitem>
      <para>
        In a <emphasis>parallel bitmap heap scan</emphasis>, one process is chosen
        as the leader.  That process performs a scan of one or more indexes
        and builds a bitmap indicating which table blocks need to be visited.
        These blocks are then divided among the cooperating processes as in
        a parallel sequential scan.  In other words, the heap scan is performed
        in parallel, but the underlying index scan is not.
      </para>
