tuples, B-Tree posting list tuples
do not need to expand every time a new duplicate is inserted; they
are merely an alternative physical representation of the original
logical contents of the leaf page. This design prioritizes
consistent performance with mixed read-write workloads. Most
client applications will at least see a moderate performance
benefit from using deduplication. Deduplication is enabled by
default.
</para>
<para>
<command>CREATE INDEX</command> and <command>REINDEX</command>
apply deduplication to create posting list tuples, though the
strategy they use is slightly different. Each group of duplicate
ordinary tuples encountered in the sorted input taken from the
table is merged into a posting list tuple
<emphasis>before</emphasis> being added to the current pending leaf
page. Individual posting list tuples are packed with as many
<acronym>TID</acronym>s as possible. Leaf pages are written out in
the usual way, without any separate deduplication pass. This
strategy is well-suited to <command>CREATE INDEX</command> and
<command>REINDEX</command> because they are one-off batch
operations.
</para>
<para>
Write-heavy workloads that don't benefit from deduplication due to
having few or no duplicate values in indexes will incur a small,
fixed performance penalty (unless deduplication is explicitly
disabled). The <literal>deduplicate_items</literal> storage
parameter can be used to disable deduplication within individual
indexes. There is never any performance penalty with read-only
workloads, since reading posting list tuples is at least as
efficient as reading the standard tuple representation. Disabling
deduplication isn't usually helpful.
</para>
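<para>
As a minimal sketch of how the storage parameter might be used, the
following statements build two indexes on the same column of a
hypothetical table, one with deduplication left at its default and
one with it disabled, and then compare their on-disk sizes. The
table, column, and index names (<literal>dedup_demo</literal>,
<literal>val</literal>, and so on) are assumptions made purely for
illustration, and the sizes reported will vary with the data and the
installation.
</para>
<programlisting>
-- Hypothetical table with many duplicate values (only 100 distinct keys)
CREATE TABLE dedup_demo (val integer);
INSERT INTO dedup_demo
    SELECT i % 100 FROM generate_series(1, 100000) AS i;

-- Deduplication is enabled by default
CREATE INDEX dedup_demo_on_idx ON dedup_demo (val);

-- Same definition, with deduplication explicitly disabled
CREATE INDEX dedup_demo_off_idx ON dedup_demo (val)
    WITH (deduplicate_items = off);

-- Compare the size of the two indexes
SELECT pg_size_pretty(pg_relation_size('dedup_demo_on_idx')) AS with_dedup,
       pg_size_pretty(pg_relation_size('dedup_demo_off_idx')) AS without_dedup;

-- The parameter can also be changed on an existing index
ALTER INDEX dedup_demo_on_idx SET (deduplicate_items = off);
</programlisting>
<para>
Turning <literal>deduplicate_items</literal> off with
<command>ALTER INDEX</command> only prevents future insertions from
triggering deduplication. Existing posting list tuples keep their
representation until the index is rebuilt, for example with
<command>REINDEX</command>.
</para>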
<para>
It is sometimes possible for unique indexes (as well as unique
constraints) to use deduplication. This allows leaf pages to
temporarily <quote>absorb</quote> extra version churn duplicates.
Deduplication in unique indexes augments bottom-up index deletion,
especially in cases where a long-running transaction holds a
snapshot that blocks garbage collection. The goal is to buy time
for the bottom-up index deletion strategy to become effective
again. Delaying page splits until a single long-running
transaction naturally goes away can allow a bottom-up deletion pass
to succeed where an earlier deletion pass failed.
</para>
<tip>
<para>
A special heuristic is applied to determine whether a
deduplication pass in a unique index should take place.