three support functions use
the opclass's indexed data type for the <literal>query</literal> argument, even
though the actual type might be something else depending on the operator.
</para>
</sect2>
<sect2 id="gin-implementation">
<title>Implementation</title>
<para>
Internally, a <acronym>GIN</acronym> index contains a B-tree index
constructed over keys, where each key is an element of one or more indexed
items (a member of an array, for example) and where each tuple in a leaf
page contains either a pointer to a B-tree of heap pointers (a
<quote>posting tree</quote>), or a simple list of heap pointers (a <quote>posting
list</quote>) when the list is small enough to fit into a single index tuple along
with the key value. <xref linkend="gin-internals-figure"/> illustrates
these components of a GIN index.
</para>
<para>
As of <productname>PostgreSQL</productname> 9.1, null key values can be
included in the index. Also, placeholder nulls are included in the index
for indexed items that are null or contain no keys according to
<function>extractValue</function>. This allows searches that should find empty
items to do so.
</para>
<para>
Multicolumn <acronym>GIN</acronym> indexes are implemented by building
a single B-tree over composite values (column number, key value). The
key values for different columns can be of different types.
</para>
<figure id="gin-internals-figure">
<title>GIN Internals</title>
<mediaobject>
<imageobject>
<imagedata fileref="images/gin.svg" format="SVG" width="100%"/>
</imageobject>
</mediaobject>
</figure>
<sect3 id="gin-fast-update">
<title>GIN Fast Update Technique</title>
<para>
Updating a <acronym>GIN</acronym> index tends to be slow because of the
intrinsic nature of inverted indexes: inserting or updating one heap row
can cause many inserts into the index (one for each key extracted
from the indexed item).
<acronym>GIN</acronym> is capable of postponing much of this work by inserting
new tuples into a temporary, unsorted list of pending entries.
When the table is vacuumed or autoanalyzed, or when
<function>gin_clean_pending_list</function> function is called, or if the
pending list becomes larger than
<xref linkend="guc-gin-pending-list-limit"/>, the entries are moved to the
main <acronym>GIN</acronym> data structure using the same bulk insert
techniques used during initial index creation. This greatly improves
<acronym>GIN</acronym> index update speed, even counting the additional
vacuum overhead. Moreover the overhead work can be done by a background
process instead of in foreground query processing.
</para>
<para>
The main disadvantage of this approach is that searches must scan the list
of pending entries in addition to searching the regular index, and so
a large list of pending entries will slow searches significantly.
Another disadvantage is that, while most updates are fast, an update
that causes the pending list to become <quote>too large</quote> will incur an
immediate cleanup cycle and thus be much slower than other updates.
Proper use of autovacuum can minimize both of these problems.
</para>
<para>
If consistent response time is more important than update speed,
use of pending entries can be disabled by turning off the
<literal>fastupdate</literal> storage parameter for a
<acronym>GIN</acronym> index. See <xref linkend="sql-createindex"/>
for details.
</para>
</sect3>
<sect3 id="gin-partial-match">
<title>Partial Match Algorithm</title>
<para>
GIN can support <quote>partial match</quote> queries, in which the query
does not determine an exact match for one or more keys, but the possible
matches fall within a reasonably narrow range of key values (within the
key sorting order determined by the <function>compare</function> support method).
The <function>extractQuery</function> method, instead of