signature length of 80 bits, with attributes
i1 and i2 mapped to 2 bits, and attribute i3 mapped to 4 bits. We could
have omitted the <literal>length</literal>, <literal>col1</literal>,
and <literal>col2</literal> specifications since those have the default values.
</para>
<para>
Here is a more complete example of bloom index definition and usage, as
well as a comparison with equivalent btree indexes. The bloom index is
considerably smaller than the btree index, and can perform better.
</para>
<programlisting>
=# CREATE TABLE tbloom AS
SELECT
(random() * 1000000)::int as i1,
(random() * 1000000)::int as i2,
(random() * 1000000)::int as i3,
(random() * 1000000)::int as i4,
(random() * 1000000)::int as i5,
(random() * 1000000)::int as i6
FROM
generate_series(1,10000000);
SELECT 10000000
</programlisting>
<para>
A sequential scan over this large table takes a long time:
<programlisting>
=# EXPLAIN ANALYZE SELECT * FROM tbloom WHERE i2 = 898732 AND i5 = 123451;
QUERY PLAN
-------------------------------------------------------------------&zwsp;-----------------------------------
Seq Scan on tbloom (cost=0.00..213744.00 rows=250 width=24) (actual time=357.059..357.059 rows=0.00 loops=1)
Filter: ((i2 = 898732) AND (i5 = 123451))
Rows Removed by Filter: 10000000
Buffers: shared hit=63744
Planning Time: 0.346 ms
Execution Time: 357.076 ms
(6 rows)
</programlisting>
</para>
<para>
Even with the btree index defined the result will still be a
sequential scan:
<programlisting>
=# CREATE INDEX btreeidx ON tbloom (i1, i2, i3, i4, i5, i6);
CREATE INDEX
=# SELECT pg_size_pretty(pg_relation_size('btreeidx'));
pg_size_pretty
----------------
386 MB
(1 row)
=# EXPLAIN ANALYZE SELECT * FROM tbloom WHERE i2 = 898732 AND i5 = 123451;
QUERY PLAN
-------------------------------------------------------------------&zwsp;-----------------------------------
Seq Scan on tbloom (cost=0.00..213744.00 rows=2 width=24) (actual time=351.016..351.017 rows=0.00 loops=1)
Filter: ((i2 = 898732) AND (i5 = 123451))
Rows Removed by Filter: 10000000
Buffers: shared hit=63744
Planning Time: 0.138 ms
Execution Time: 351.035 ms
(6 rows)
</programlisting>
</para>
<para>
Having the bloom index defined on the table is better than btree in
handling this type of search:
<programlisting>
=# CREATE INDEX bloomidx ON tbloom USING bloom (i1, i2, i3, i4, i5, i6);
CREATE INDEX
=# SELECT pg_size_pretty(pg_relation_size('bloomidx'));
pg_size_pretty
----------------
153 MB
(1 row)
=# EXPLAIN ANALYZE SELECT * FROM tbloom WHERE i2 = 898732 AND i5 = 123451;
QUERY PLAN
-------------------------------------------------------------------&zwsp;--------------------------------------------------
Bitmap Heap Scan on tbloom (cost=1792.00..1799.69 rows=2 width=24) (actual time=22.605..22.606 rows=0.00 loops=1)
Recheck Cond: ((i2 = 898732) AND (i5 = 123451))
Rows Removed by Index Recheck: 2300
Heap Blocks: exact=2256
Buffers: shared hit=21864
-> Bitmap Index Scan on bloomidx (cost=0.00..178436.00 rows=1 width=0) (actual time=20.005..20.005 rows=2300.00 loops=1)
Index Cond: ((i2 = 898732) AND (i5 = 123451))
Index Searches: 1
Buffers: shared hit=19608
Planning Time: 0.099 ms
Execution Time: 22.632 ms
(11 rows)
</programlisting>
</para>
<para>
Now, the main problem with the btree search is that btree is inefficient
when the search conditions do not constrain the leading index column(s).
A better