postgresql

18th chunk of `doc/src/sgml/gist.sgml`
 fit in cache, a lot of random I/O will be
   needed.  <productname>PostgreSQL</productname> supports two alternative
   methods for initial build of a GiST index: <firstterm>sorted</firstterm>
   and <firstterm>buffered</firstterm> modes.
  </para>

  <para>
   The sorted method is only available if each of the opclasses used by the
   index provides a <function>sortsupport</function> function, as described
   in <xref linkend="gist-extensibility"/>.  If they do, this method is
   usually the best, so it is used by default.
  </para>

  <para>
   The buffered method does not insert each tuple directly into the index
   right away.  Instead, tuples are collected in buffers and moved into the
   index in batches.  This can dramatically reduce the amount of random I/O needed
   for non-ordered data sets.  For well-ordered data sets the benefit is
   smaller or non-existent, because only a small number of pages receive new
   tuples at a time, and those pages fit in cache even if the index as a
   whole does not.
  </para>

  <para>
   The buffered method needs to call the <function>penalty</function>
   function more often than the simple (non-buffered) method does, which
   consumes some extra CPU resources.  Also, the buffers need temporary disk space, up to
   the size of the resulting index. Buffering can also influence the quality
   of the resulting index, in both positive and negative directions. That
   influence depends on various factors, like the distribution of the input
   data and the operator class implementation.
  </para>

  <para>
   If sorting is not possible, then by default a GiST index build switches
   to the buffering method when the index size reaches
   <xref linkend="guc-effective-cache-size"/>.  Buffering can be manually
   forced or prevented by the <literal>buffering</literal> parameter to the
   <command>CREATE INDEX</command> command.  The default behavior is good for most cases, but
   turning buffering off might speed up the build somewhat if the input data
   is ordered.
  </para>
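
  <para>
   As an illustration, the following sketch sets the
   <literal>buffering</literal> parameter explicitly for individual builds.
   The table <literal>places</literal>, its column
   <literal>location</literal>, and the index names are hypothetical and
   chosen only for this example.
  </para>

<programlisting>
-- Hypothetical table, used only for illustration.
CREATE TABLE places (id integer, location point);

-- Force the buffered build, for example when loading a large data set
-- that arrives in no particular order.
CREATE INDEX places_location_buffered_idx ON places
    USING gist (location) WITH (buffering = on);

-- Prevent buffering, which might help somewhat if the input is ordered.
CREATE INDEX places_location_unbuffered_idx ON places
    USING gist (location) WITH (buffering = off);
</programlisting>

  <para>
   Omitting the parameter, or setting it to <literal>auto</literal>, leaves
   the choice to the default behavior described above.
  </para>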

 </sect3>
</sect2>

<sect2 id="gist-examples">
 <title>Examples</title>

 <para>
  The <productname>PostgreSQL</productname> source distribution includes
  several examples of index methods implemented using
  <acronym>GiST</acronym>.  The core system currently provides text search
  support (indexing for <type>tsvector</type> and <type>tsquery</type>) as well as
  R-Tree equivalent functionality for some of the built-in geometric data types
  (see <filename>src/backend/access/gist/gistproc.c</filename>).  The following
  <filename>contrib</filename> modules also contain <acronym>GiST</acronym>
  operator classes:

 <variablelist>
  <varlistentry>
   <term><filename>btree_gist</filename></term>
   <listitem>
    <para>B-tree equivalent functionality for several data types</para>
   </listitem>
  </varlistentry>

  <varlistentry>
   <term><filename>cube</filename></term>
   <listitem>
    <para>Indexing for multidimensional cubes</para>
   </listitem>
  </varlistentry>

  <varlistentry>
   <term><filename>hstore</filename></term>
   <listitem>
    <para>Module for storing (key, value) pairs</para>
   </listitem>
  </varlistentry>

  <varlistentry>
   <term><filename>intarray</filename></term>
   <listitem>
    <para>RD-Tree for one-dimensional arrays of int4 values</para>
   </listitem>
  </varlistentry>

  <varlistentry>
   <term><filename>ltree</filename></term>
   <listitem>
    <para>Indexing for tree-like structures</para>
   </listitem>
  </varlistentry>

  <varlistentry>
   <term><filename>pg_trgm</filename></term>
   <listitem>
    <para>Text similarity using trigram matching</para>
   </listitem>
  </varlistentry>

  <varlistentry>
   <term><filename>seg</filename></term>
   <listitem>
    <para>Indexing for <quote>float ranges</quote></para>
   </listitem>
  </varlistentry>
 </variablelist>
 </para>
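
 <para>
  As a brief sketch of how one of these operator classes is used, the
  following builds a <acronym>GiST</acronym> index with the
  <literal>gist_trgm_ops</literal> operator class from
  <filename>pg_trgm</filename>.  The table <literal>docs</literal>, the
  column <literal>body</literal>, and the index name are hypothetical and
  chosen only for this example.
 </para>

<programlisting>
-- Install the contrib module that provides the operator class.
CREATE EXTENSION pg_trgm;

-- Hypothetical table, used only for illustration.
CREATE TABLE docs (body text);

-- Build a GiST index over trigrams of the text column.
CREATE INDEX docs_body_trgm_idx ON docs USING gist (body gist_trgm_ops);

-- A similarity search of this form can use the index; whether it does
-- is up to the planner.
SELECT body FROM docs WHERE body % 'postgres';
</programlisting>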

</sect2>

</sect1>

Title: GiST Index Build Methods and Examples
Summary
This passage describes the two methods for the initial build of a GiST index, sorted and buffered, and their performance trade-offs. It also lists the GiST implementations shipped with PostgreSQL, including contrib modules such as btree_gist, cube, and pg_trgm.