1000)
-> Bitmap Index Scan on tenk1_unique1 (cost=0.00..23.80 rows=1007 width=0)
Index Cond: (unique1 < 1000)
</programlisting>
The planner examines the <literal>WHERE</literal> clause condition
and looks up the selectivity function for the operator
<literal><</literal> in <structname>pg_operator</structname>.
This is held in the column <structfield>oprrest</structfield>,
and the entry in this case is <function>scalarltsel</function>.
The <function>scalarltsel</function> function retrieves the histogram for
<structfield>unique1</structfield> from
<structname>pg_statistic</structname>. For manual queries it is more
convenient to look in the simpler <structname>pg_stats</structname>
view:
<programlisting>
SELECT histogram_bounds FROM pg_stats
WHERE tablename='tenk1' AND attname='unique1';
histogram_bounds
------------------------------------------------------
{0,993,1997,3050,4040,5036,5957,7057,8029,9016,9995}
</programlisting>
Next the fraction of the histogram occupied by <quote>< 1000</quote>
is worked out. This is the selectivity. The histogram divides the range
into equal frequency buckets, so all we have to do is locate the bucket
that our value is in and count <emphasis>part</emphasis> of it and
<emphasis>all</emphasis> of the ones before. The value 1000 is clearly in
the second bucket (993–1997). Assuming a linear distribution of
values inside each bucket, we can calculate the selectivity as:
<programlisting>
selectivity = (1 + (1000 - bucket[2].min)/(bucket[2].max - bucket[2].min))/num_buckets
= (1 + (1000 - 993)/(1997 - 993))/10
= 0.100697
</programlisting>
that is, one whole bucket plus a linear fraction of the second, divided by
the number of buckets. The estimated number of rows can now be calculated as
the product of the selectivity and the cardinality of
<structname>tenk1</structname>:
<programlisting>
rows = rel_cardinality * selectivity
= 10000 * 0.100697
= 1007 (rounding off)
</programlisting>
</para>
<para>
Next let's consider an example with an equality condition in its
<literal>WHERE</literal> clause:
<programlisting>
EXPLAIN SELECT * FROM tenk1 WHERE stringu1 = 'CRAAAA';
QUERY PLAN
----------------------------------------------------------
Seq Scan on tenk1 (cost=0.00..483.00 rows=30 width=244)
Filter: (stringu1 = 'CRAAAA'::name)
</programlisting>
Again the planner examines the <literal>WHERE</literal> clause condition
and looks up the selectivity function for <literal>=</literal>, which is
<function>eqsel</function>. For equality estimation the histogram is
not useful; instead the list of <firstterm>most
common values</firstterm> (<acronym>MCV</acronym>s) is used to determine the
selectivity. Let's have a look at the MCVs, with some additional columns
that will be useful later:
<programlisting>
SELECT null_frac, n_distinct, most_common_vals, most_common_freqs FROM pg_stats
WHERE tablename='tenk1' AND attname='stringu1';
null_frac | 0
n_distinct | 676
most_common_vals | {EJAAAA,BBAAAA,CRAAAA,FCAAAA,FEAAAA,GSAAAA,&zwsp;JOAAAA,MCAAAA,NAAAAA,WGAAAA}
most_common_freqs | {0.00333333,0.003,0.003,0.003,0.003,0.003,&zwsp;0.003,0.003,0.003,0.003}
</programlisting>
Since <literal>CRAAAA</literal> appears in the list of MCVs, the selectivity is
merely the corresponding entry in the list of most common frequencies
(<acronym>MCF</acronym>s):
<programlisting>
selectivity = mcf[3]
= 0.003
</programlisting>
As before, the estimated number of rows is just the product of this with the
cardinality of <structname>tenk1</structname>:
<programlisting>
rows = 10000 * 0.003
= 30
</programlisting>
</para>
<para>
Now consider the same query, but with a constant that is not in the
<acronym>MCV</acronym> list:
<programlisting>
EXPLAIN SELECT * FROM