Home Explore Blog CI



postgresql

3rd chunk of `doc/src/sgml/perform.sgml`
dc4bb99d13fcb221b028809762d8dfce8c2db7856ba3bbbf0000000100000fa1
 transmit them to the client, which
    could be important factors in the real elapsed time; but the planner
    ignores those costs because it cannot change them by altering the
    plan.  (Every correct plan will output the same row set, we trust.)
   </para>

   <para>
    The <literal>rows</literal> value is a little tricky because it is
    not the number of rows processed or scanned by the
    plan node, but rather the number emitted by the node.  This is often
    less than the number scanned, as a result of filtering by any
    <literal>WHERE</literal>-clause conditions that are being applied at the node.
    Ideally the top-level rows estimate will approximate the number of rows
    actually returned, updated, or deleted by the query.
   </para>

   <para>
    Returning to our example:

<screen>
EXPLAIN SELECT * FROM tenk1;

                         QUERY PLAN
-------------------------------------------------------------
 Seq Scan on tenk1  (cost=0.00..445.00 rows=10000 width=244)
</screen>
   </para>

   <para>
    These numbers are derived very straightforwardly.  If you do:

<programlisting>
SELECT relpages, reltuples FROM pg_class WHERE relname = 'tenk1';
</programlisting>

    you will find that <classname>tenk1</classname> has 345 disk
    pages and 10000 rows.  The estimated cost is computed as (disk pages read *
    <xref linkend="guc-seq-page-cost"/>) + (rows scanned *
    <xref linkend="guc-cpu-tuple-cost"/>).  By default,
    <varname>seq_page_cost</varname> is 1.0 and <varname>cpu_tuple_cost</varname> is 0.01,
    so the estimated cost is (345 * 1.0) + (10000 * 0.01) = 445.
   </para>

   <para>
    Now let's modify the query to add a <literal>WHERE</literal> condition:

<screen>
EXPLAIN SELECT * FROM tenk1 WHERE unique1 &lt; 7000;

                         QUERY PLAN
------------------------------------------------------------
 Seq Scan on tenk1  (cost=0.00..470.00 rows=7000 width=244)
   Filter: (unique1 &lt; 7000)
</screen>

    Notice that the <command>EXPLAIN</command> output shows the <literal>WHERE</literal>
    clause being applied as a <quote>filter</quote> condition attached to the Seq
    Scan plan node.  This means that
    the plan node checks the condition for each row it scans, and outputs
    only the ones that pass the condition.
    The estimate of output rows has been reduced because of the
    <literal>WHERE</literal> clause.
    However, the scan will still have to visit all 10000 rows, so the cost
    hasn't decreased; in fact it has gone up a bit (by 10000 * <xref
    linkend="guc-cpu-operator-cost"/>, to be exact) to reflect the extra CPU
    time spent checking the <literal>WHERE</literal> condition.
   </para>

   <para>
    The actual number of rows this query would select is 7000, but the <literal>rows</literal>
    estimate is only approximate.  If you try to duplicate this experiment,
    you may well get a slightly different estimate; moreover, it can
    change after each <command>ANALYZE</command> command, because the
    statistics produced by <command>ANALYZE</command> are taken from a
    randomized sample of the table.
   </para>

   <para>
    Now, let's make the condition more restrictive:

<screen>
EXPLAIN SELECT * FROM tenk1 WHERE unique1 &lt; 100;

                                  QUERY PLAN
-------------------------------------------------------------------&zwsp;-----------
 Bitmap Heap Scan on tenk1  (cost=5.06..224.98 rows=100 width=244)
   Recheck Cond: (unique1 &lt; 100)
   -&gt;  Bitmap Index Scan on tenk1_unique1  (cost=0.00..5.04 rows=100 width=0)
         Index Cond: (unique1 &lt; 100)
</screen>

    Here the planner has decided to use a two-step plan: the child plan
    node visits an index to find the locations of rows matching the index
    condition, and then the upper plan node actually fetches those rows
    from the table itself.  Fetching rows separately is much more
    expensive than reading them sequentially, but because not all the pages
    of

Title: Understanding Query Plans with EXPLAIN
Summary
This section explains how to use the EXPLAIN command to understand query plans, including how costs are estimated, how the planner chooses between different plan nodes, and how to interpret the output of EXPLAIN, using examples to illustrate the effects of adding WHERE clauses and index conditions on query plans.