8th chunk of `doc/src/sgml/gist.sgml`
 DatumGetDataType(ent[i].key);
        out = my_union_implementation(out, tmp);
    }

    PG_RETURN_DATA_TYPE_P(out);
}
</programlisting>
      </para>

      <para>
        This skeleton assumes a data type
        where <literal>union(X, Y, Z) = union(union(X, Y), Z)</literal>, so the
        result can be built up one entry at a time. Data types for which
        this does not hold can be supported too, by implementing the proper
        union algorithm in this <acronym>GiST</acronym> support method.
      </para>
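
      <para>
        The fold that the skeleton relies on can be illustrated outside the
        server with a minimal, self-contained sketch. It assumes a
        hypothetical key type <type>Range</type> (a closed integer interval)
        and hypothetical helper names <function>range_union2</function> and
        <function>range_union_all</function>; none of these are part of
        <productname>PostgreSQL</productname>, and plain C values stand in
        for <type>Datum</type>s.
      </para>

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical key type for illustration: a closed integer interval. */
typedef struct Range
{
    int         lo;
    int         hi;
} Range;

/* Pairwise union: the smallest interval covering both inputs. */
Range
range_union2(Range a, Range b)
{
    Range       out;

    out.lo = (a.lo < b.lo) ? a.lo : b.lo;
    out.hi = (a.hi > b.hi) ? a.hi : b.hi;
    return out;
}

/*
 * N-way union built by folding the pairwise union over the entries.
 * This is only valid because union(X, Y, Z) = union(union(X, Y), Z)
 * holds for intervals.
 */
Range
range_union_all(const Range *keys, size_t n)
{
    Range       out;
    size_t      i;

    assert(n > 0);
    out = keys[0];
    for (i = 1; i < n; i++)
        out = range_union2(out, keys[i]);
    return out;
}
```

      <para>
        For a type where the pairwise fold would give a different answer than
        a true N-way union, the loop body would have to be replaced by an
        algorithm that looks at all entries together.
      </para>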

      <para>
        The result of the <function>union</function> function must be a value of the
        index's storage type, whatever that is (it might or might not be
        different from the indexed column's type).  The <function>union</function>
        function should return a pointer to newly <function>palloc()</function>ed
        memory. You can't just return the input value as-is, even if there is
        no type change.
      </para>
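
      <para>
        The allocation rule can be sketched as follows. This is a standalone
        illustration under stated assumptions, not server code:
        <function>malloc()</function> stands in for
        <function>palloc()</function> so that it compiles on its own, and the
        <type>Range</type> type and the function name
        <function>range_union_alloc</function> are hypothetical. The point it
        shows is that even with a single input entry the result is a fresh
        copy, never the input pointer itself.
      </para>

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical key type for illustration: a closed integer interval. */
typedef struct Range
{
    int         lo;
    int         hi;
} Range;

/*
 * Union of n keys that always returns newly allocated memory.
 * malloc() is a stand-in for palloc(); in a real support function the
 * memory would come from palloc() instead.
 */
Range *
range_union_alloc(Range *const *keys, size_t n)
{
    Range      *out;
    size_t      i;

    assert(n > 0);

    out = malloc(sizeof(Range));
    if (out == NULL)
        return NULL;

    /* Copy the first key; never hand back the input pointer. */
    *out = *keys[0];

    for (i = 1; i < n; i++)
    {
        if (keys[i]->lo < out->lo)
            out->lo = keys[i]->lo;
        if (keys[i]->hi > out->hi)
            out->hi = keys[i]->hi;
    }
    return out;
}
```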

      <para>
       As shown above, the <function>union</function> function's
       first <type>internal</type> argument is actually
       a <structname>GistEntryVector</structname> pointer.  The second argument is a
       pointer to an integer variable, which can be ignored.  (It used to be
       required that the <function>union</function> function store the size of its
       result value into that variable, but this is no longer necessary.)
      </para>
     </listitem>
    </varlistentry>

    <varlistentry>
     <term><function>compress</function></term>
     <listitem>
      <para>
       Converts a data item into a format suitable for physical storage in
       an index page.
       If the <function>compress</function> method is omitted, data items are stored
       in the index without modification.
      </para>

      <para>
        The <acronym>SQL</acronym> declaration of the function must look like this:

<programlisting>
CREATE OR REPLACE FUNCTION my_compress(internal)
RETURNS internal
AS 'MODULE_PATHNAME'
LANGUAGE C STRICT;
</programlisting>

        And the matching code in the C module could then follow this skeleton:

<programlisting>
PG_FUNCTION_INFO_V1(my_compress);

Datum
my_compress(PG_FUNCTION_ARGS)
{
    GISTENTRY  *entry = (GISTENTRY *) PG_GETARG_POINTER(0);
    GISTENTRY  *retval;

    if (entry-&gt;leafkey)
    {
        /* replace entry-&gt;key with a compressed version */
        compressed_data_type *compressed_data = palloc(sizeof(compressed_data_type));

        /* fill *compressed_data from entry-&gt;key ... */

        retval = palloc(sizeof(GISTENTRY));
        gistentryinit(*retval, PointerGetDatum(compressed_data),
                      entry-&gt;rel, entry-&gt;page, entry-&gt;offset, false);
    }
    else
    {
        /* typically we needn't do anything with non-leaf entries */
        retval = entry;
    }

    PG_RETURN_POINTER(retval);
}
</programlisting>
      </para>

      <para>
       Replace <replaceable>compressed_data_type</replaceable> with the actual
       type that your leaf entries are converted to when they are
       compressed.
      </para>
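
      <para>
       As a concrete illustration of the leaf and non-leaf branches, here is
       a minimal standalone sketch of a lossy compression that stores only a
       bounding box for each polygon. Everything in it is an assumption made
       for the example: <type>Polygon</type>, <type>Box</type>, the
       simplified <type>Entry</type> struct (a stand-in for
       <structname>GISTENTRY</structname> that keeps only the fields used
       here), the <literal>MAX_PTS</literal> limit, and the function name
       <function>sketch_compress</function>. As before,
       <function>malloc()</function> replaces <function>palloc()</function>
       so that the code compiles outside the server.
      </para>

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

#define MAX_PTS 8               /* arbitrary limit for the sketch */

/* Hypothetical indexed type: a polygon as a small array of points. */
typedef struct Point
{
    double      x;
    double      y;
} Point;

typedef struct Polygon
{
    size_t      npts;
    Point       pts[MAX_PTS];
} Polygon;

/* Hypothetical storage type: the polygon's bounding box. */
typedef struct Box
{
    double      xmin;
    double      ymin;
    double      xmax;
    double      ymax;
} Box;

/* Simplified stand-in for GISTENTRY: only the fields used here. */
typedef struct Entry
{
    void       *key;
    bool        leafkey;
} Entry;

Entry *
sketch_compress(Entry *entry)
{
    const Polygon *poly;
    Box        *box;
    Entry      *retval;
    size_t      i;

    /* Non-leaf entries are already in storage form; pass them through. */
    if (!entry->leafkey)
        return entry;

    poly = entry->key;
    assert(poly->npts > 0 && poly->npts <= MAX_PTS);

    box = malloc(sizeof(Box));
    retval = malloc(sizeof(Entry));
    if (box == NULL || retval == NULL)
    {
        free(box);
        free(retval);
        return NULL;
    }

    /* Lossy step: keep only the extent of the polygon. */
    box->xmin = box->xmax = poly->pts[0].x;
    box->ymin = box->ymax = poly->pts[0].y;
    for (i = 1; i < poly->npts; i++)
    {
        if (poly->pts[i].x < box->xmin)
            box->xmin = poly->pts[i].x;
        if (poly->pts[i].x > box->xmax)
            box->xmax = poly->pts[i].x;
        if (poly->pts[i].y < box->ymin)
            box->ymin = poly->pts[i].y;
        if (poly->pts[i].y > box->ymax)
            box->ymax = poly->pts[i].y;
    }

    /* The new entry holds the compressed key and is no longer a leaf key. */
    retval->key = box;
    retval->leafkey = false;
    return retval;
}
```

      <para>
       Because the bounding box discards the vertices, the original polygon
       cannot be recovered from the stored key, which is what makes this
       compression lossy.
      </para>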
     </listitem>
    </varlistentry>

    <varlistentry>
     <term><function>decompress</function></term>
     <listitem>
      <para>
       Converts the stored representation of a data item into a format that
       can be manipulated by the other GiST methods in the operator class.
       If the <function>decompress</function> method is omitted, it is assumed that
       the other GiST methods can work directly on the stored data format.
       (<function>decompress</function> is not necessarily the reverse of
       the <function>compress</function> method; in particular,
       if <function>compress</function> is lossy then it's impossible
       for <function>decompress</function> to exactly reconstruct the original
       data.  <function>decompress</function> is not necessarily equivalent
       to <function>fetch</function>, either, since the other GiST methods

Title: GiST Index Support Methods: Union, Compress, and Decompress
Summary
This passage discusses the union, compress, and decompress methods in a GiST index, explaining how they are used to consolidate information in the tree, convert data items into a format suitable for physical storage, and convert the stored representation back into a format that can be manipulated by other GiST methods, providing example code and guidance on implementation and memory management.