880 Ramp+
| | | Sp Railroad +
| | | I- 580 +
| | | I- 680 Ramp+
| | | I- 80 Ramp+
| | | 14th St +
| | | I- 880 +
| | | Mac Arthur Blvd+
| | | Mission Blvd+
...
name | t | -0.5125 | I- 580 Ramp+
| | | I- 880 Ramp+
| | | I- 580 +
| | | I- 680 Ramp+
| | | I- 80 Ramp+
| | | Sp Railroad +
| | | I- 880 +
| | | State Hwy 13 Ramp+
| | | I- 80 +
| | | State Hwy 24 Ramp+
...
thepath | f | 0 |
thepath | t | 0 |
(4 rows)
</screen>
Note that two rows are displayed for the same column, one corresponding
to the complete inheritance hierarchy starting at the
<literal>road</literal> table (<literal>inherited</literal>=<literal>t</literal>),
and another one including only the <literal>road</literal> table itself
(<literal>inherited</literal>=<literal>f</literal>).
(For brevity, we have only shown the first ten most-common values for
the <literal>name</literal> column.)
</para>
<para>
The amount of information stored in <structname>pg_statistic</structname>
by <command>ANALYZE</command>, in particular the maximum number of entries in the
<structfield>most_common_vals</structfield> and <structfield>histogram_bounds</structfield>
arrays for each column, can be set on a
column-by-column basis using the <command>ALTER TABLE SET STATISTICS</command>
command, or globally by setting the
<xref linkend="guc-default-statistics-target"/> configuration variable.
The default limit is presently 100 entries. Raising the limit
might allow more accurate planner estimates to be made, particularly for
columns with irregular data distributions, at the price of consuming
more space in <structname>pg_statistic</structname> and slightly more
time to compute the estimates. Conversely, a lower limit might be
sufficient for columns with simple data distributions.
</para>
<para>
Further details about the planner's use of statistics can be found in
<xref linkend="planner-stats-details"/>.
</para>
</sect2>
<sect2 id="planner-stats-extended">
<title>Extended Statistics</title>
<indexterm zone="planner-stats-extended">
<primary>statistics</primary>
<secondary>of the planner</secondary>
</indexterm>
<indexterm>
<primary>correlation</primary>
<secondary>in the query planner</secondary>
</indexterm>
<indexterm>
<primary>pg_statistic_ext</primary>
</indexterm>
<indexterm>
<primary>pg_statistic_ext_data</primary>
</indexterm>
<para>
It is common to see slow queries running bad execution plans because
multiple columns used in the query clauses are correlated.
The planner normally assumes that multiple conditions
are independent of each other,
an assumption that does not hold when column values are correlated.
Regular statistics, because of their per-individual-column nature,
cannot capture any knowledge about cross-column correlation.
However, <productname>PostgreSQL</productname> has the ability to compute
<firstterm>multivariate