read-only
workloads, since reading posting list tuples is at least as
efficient as reading the standard tuple representation. Disabling
deduplication isn't usually helpful.
</para>
<para>
It is sometimes possible for unique indexes (as well as unique
constraints) to use deduplication. This allows leaf pages to
temporarily <quote>absorb</quote> the extra duplicates created by
version churn.
Deduplication in unique indexes augments bottom-up index deletion,
especially in cases where a long-running transaction holds a
snapshot that blocks garbage collection. The goal is to buy time
for the bottom-up index deletion strategy to become effective
again. Delaying page splits until a single long-running
transaction naturally goes away can allow a bottom-up deletion pass
to succeed where an earlier deletion pass failed.
</para>
<tip>
<para>
A special heuristic determines whether a deduplication pass should
take place in a unique index. The heuristic often makes it possible
to skip straight to splitting a leaf page, which avoids wasting
cycles on unhelpful deduplication passes. If you're concerned about
the overhead of deduplication, consider setting
<literal>deduplicate_items = off</literal> selectively. Leaving
deduplication enabled in unique indexes has little downside.
</para>
</tip>
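<para>
As a minimal sketch of what setting the parameter selectively might
look like, the <literal>deduplicate_items</literal> storage parameter
can be given when an index is created, or changed afterwards with
<command>ALTER INDEX</command>. The table and index names used here
(<literal>orders</literal>, <literal>orders_customer_id_idx</literal>)
are hypothetical and serve only as an illustration. Changing the
setting on an existing index affects future insertions only; existing
posting list tuples are not rewritten.
</para>
<programlisting>
-- Hypothetical table and index names, for illustration only.
CREATE TABLE orders (order_id bigint, customer_id bigint);

-- Disable deduplication for this one index at creation time.
CREATE INDEX orders_customer_id_idx ON orders (customer_id)
    WITH (deduplicate_items = off);

-- Change the setting later; only future insertions are affected.
ALTER INDEX orders_customer_id_idx SET (deduplicate_items = on);
</programlisting>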
<para>
Deduplication cannot be used in all cases due to
implementation-level restrictions. Deduplication safety is
determined when <command>CREATE INDEX</command> or
<command>REINDEX</command> is run.
</para>
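<para>
One way to observe the outcome of this determination, assuming the
<filename>pageinspect</filename> extension is installed (its functions
are typically restricted to superusers), is to examine the
<structfield>allequalimage</structfield> field of the index metapage.
The following is only a sketch, and the table and index names are
hypothetical. The index on the <type>integer</type> column would be
expected to report <literal>true</literal>, while the index on the
<type>numeric</type> column would be expected to report
<literal>false</literal>.
</para>
<programlisting>
CREATE EXTENSION IF NOT EXISTS pageinspect;

-- Hypothetical table and index names, for illustration only.
CREATE TABLE dedup_demo (id integer, amount numeric);
CREATE INDEX dedup_demo_id_idx ON dedup_demo (id);
CREATE INDEX dedup_demo_amount_idx ON dedup_demo (amount);

-- allequalimage records whether deduplication was deemed safe
-- when the index was built.
SELECT allequalimage FROM bt_metap('dedup_demo_id_idx');
SELECT allequalimage FROM bt_metap('dedup_demo_amount_idx');
</programlisting>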
<para>
Deduplication is deemed unsafe, and cannot be used, in the following
cases, where datums that compare as equal can still differ in
semantically significant ways:
</para>
<para>
<itemizedlist>
<listitem>
<para>
<type>text</type>, <type>varchar</type>, and <type>char</type>
cannot use deduplication when a
<emphasis>nondeterministic</emphasis> collation is used. Case
and accent differences must be preserved among equal datums.
</para>
</listitem>
<listitem>
<para>
<type>numeric</type> cannot use deduplication. Numeric display
scale must be preserved among equal datums.
</para>
</listitem>
<listitem>
<para>
<type>jsonb</type> cannot use deduplication, since the
<type>jsonb</type> B-Tree operator class uses
<type>numeric</type> internally.
</para>
</listitem>
<listitem>
<para>
<type>float4</type> and <type>float8</type> cannot use
deduplication. These types have distinct representations for
<literal>-0</literal> and <literal>0</literal>, which are
nevertheless considered equal. This difference must be
preserved.
</para>
</listitem>
</itemizedlist>
</para>
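<para>
The cases listed above can be illustrated with a short sketch in
which two datums compare as equal but have different text output.
In each query the comparison returns true, while the two values are
displayed differently, which is the information that merging them
into a single posting list tuple would lose.
</para>
<programlisting>
-- Equal numeric values with different display scales.
SELECT 1.0::numeric = 1.00::numeric AS is_equal,
       1.0::numeric::text AS a,
       1.00::numeric::text AS b;

-- Equal float8 values with distinct representations of zero.
SELECT '-0'::float8 = '0'::float8 AS is_equal,
       '-0'::float8::text AS a,
       '0'::float8::text AS b;
</programlisting>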
<para>
There is one further implementation-level restriction that may be
lifted in a future version of
<productname>PostgreSQL</productname>:
</para>
<para>
<itemizedlist>
<listitem>
<para>
Container types (such as composite types, arrays, or range
types) cannot use deduplication.
</para>
</listitem>
</itemizedlist>
</para>
<para>
A final implementation-level restriction applies regardless of the
operator class or collation used:
</para>
<para>
<itemizedlist>
<listitem>
<para>
<literal>INCLUDE</literal> indexes can never use deduplication.
</para>
</listitem>
</itemizedlist>
</para>
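<para>
As a rough sketch of the last two restrictions, the following
definitions create an index on an array column and an index with an
<literal>INCLUDE</literal> clause. The table and index names are
hypothetical. Assuming the <filename>pageinspect</filename> extension
is installed, both indexes would be expected to report
<literal>false</literal> for
<structfield>allequalimage</structfield>, even though the key column
of the second index is a plain <type>integer</type>.
</para>
<programlisting>
CREATE EXTENSION IF NOT EXISTS pageinspect;

-- Hypothetical table and index names, for illustration only.
CREATE TABLE restriction_demo (id integer, tags integer[], note text);

-- Container type (array) as the key column.
CREATE INDEX restriction_demo_tags_idx ON restriction_demo (tags);

-- Non-key column added with INCLUDE.
CREATE UNIQUE INDEX restriction_demo_id_idx
    ON restriction_demo (id) INCLUDE (note);

SELECT allequalimage FROM bt_metap('restriction_demo_tags_idx');
SELECT allequalimage FROM bt_metap('restriction_demo_id_idx');
</programlisting>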
</sect3>
</sect2>
</sect1>