Home Explore Blog CI



postgresql

19th chunk of `doc/src/sgml/perform.sgml`
782a71e9184f5b3f6aa311caa0f15b929e31175b485243d50000000100000fb2
 mentioned for <literal>LIMIT</literal>.
    Also, if the outer (first) child contains rows with duplicate key values,
    the inner (second) child is backed up and rescanned for the portion of its
    rows matching that key value.  <command>EXPLAIN ANALYZE</command> counts these
    repeated emissions of the same inner rows as if they were real additional
    rows.  When there are many outer duplicates, the reported actual row count
    for the inner child plan node can be significantly larger than the number
    of rows that are actually in the inner relation.
   </para>

   <para>
    BitmapAnd and BitmapOr nodes always report their actual row counts as zero,
    due to implementation limitations.
   </para>

   <para>
    Normally, <command>EXPLAIN</command> will display every plan node
    created by the planner.  However, there are cases where the executor
    can determine that certain nodes need not be executed because they
    cannot produce any rows, based on parameter values that were not
    available at planning time.  (Currently this can only happen for child
    nodes of an Append or MergeAppend node that is scanning a partitioned
    table.)  When this happens, those plan nodes are omitted from
    the <command>EXPLAIN</command> output and a <literal>Subplans
    Removed: <replaceable>N</replaceable></literal> annotation appears
    instead.
   </para>
  </sect2>

 </sect1>

 <sect1 id="planner-stats">
  <title>Statistics Used by the Planner</title>

  <indexterm zone="planner-stats">
   <primary>statistics</primary>
   <secondary>of the planner</secondary>
  </indexterm>

  <sect2 id="planner-stats-single-column">
   <title>Single-Column Statistics</title>
  <para>
   As we saw in the previous section, the query planner needs to estimate
   the number of rows retrieved by a query in order to make good choices
   of query plans.  This section provides a quick look at the statistics
   that the system uses for these estimates.
  </para>

  <para>
   One component of the statistics is the total number of entries in
   each table and index, as well as the number of disk blocks occupied
   by each table and index.  This information is kept in the table
   <link linkend="catalog-pg-class"><structname>pg_class</structname></link>,
   in the columns <structfield>reltuples</structfield> and
   <structfield>relpages</structfield>.  We can look at it with
   queries similar to this one:

<screen>
SELECT relname, relkind, reltuples, relpages
FROM pg_class
WHERE relname LIKE 'tenk1%';

       relname        | relkind | reltuples | relpages
----------------------+---------+-----------+----------
 tenk1                | r       |     10000 |      345
 tenk1_hundred        | i       |     10000 |       11
 tenk1_thous_tenthous | i       |     10000 |       30
 tenk1_unique1        | i       |     10000 |       30
 tenk1_unique2        | i       |     10000 |       30
(5 rows)
</screen>

   Here we can see that <structname>tenk1</structname> contains 10000
   rows, as do its indexes, but the indexes are (unsurprisingly) much
   smaller than the table.
  </para>

  <para>
   For efficiency reasons, <structfield>reltuples</structfield>
   and <structfield>relpages</structfield> are not updated on-the-fly,
   and so they usually contain somewhat out-of-date values.
   They are updated by <command>VACUUM</command>, <command>ANALYZE</command>, and a
   few DDL commands such as <command>CREATE INDEX</command>.  A <command>VACUUM</command>
   or <command>ANALYZE</command> operation that does not scan the entire table
   (which is commonly the case) will incrementally update the
   <structfield>reltuples</structfield> count on the basis of the part
   of the table it did scan, resulting in an approximate value.
   In any case, the planner
   will scale the values it finds in <structname>pg_class</structname>
   to match the current physical table size, thus obtaining a closer
   approximation.
  </para>

  <indexterm>
   <primary>pg_statistic</primary>

Title: Statistics Used by the Planner
Summary
The query planner uses statistics to estimate the number of rows retrieved by a query, including total entries in tables and indexes, and disk block occupancy, which are stored in the pg_class catalog and updated by VACUUM, ANALYZE, and DDL commands, providing a basis for the planner to make informed decisions about query plans.