GiST Index Support Methods: Decompress, Penalty, and Picksplit

the specific type you're converting to in order to compress your leaf nodes, of course.
      </para>
     </listitem>
    </varlistentry>

    <varlistentry>
     <term><function>decompress</function></term>
     <listitem>
      <para>
       Converts the stored representation of a data item into a format that
       can be manipulated by the other GiST methods in the operator class.
       If the <function>decompress</function> method is omitted, it is assumed
       that the other GiST methods can work directly on the stored data format.
       (<function>decompress</function> is not necessarily the reverse of
       the <function>compress</function> method; in particular,
       if <function>compress</function> is lossy then it's impossible
       for <function>decompress</function> to exactly reconstruct the original
       data.  <function>decompress</function> is not necessarily equivalent
       to <function>fetch</function>, either, since the other GiST methods
       might not require full reconstruction of the data.)
      </para>

      <para>
       The <acronym>SQL</acronym> declaration of the function must look like this:

<programlisting>
CREATE OR REPLACE FUNCTION my_decompress(internal)
RETURNS internal
AS 'MODULE_PATHNAME'
LANGUAGE C STRICT;
</programlisting>

       And the matching code in the C module could then follow this skeleton:

<programlisting>
PG_FUNCTION_INFO_V1(my_decompress);

Datum
my_decompress(PG_FUNCTION_ARGS)
{
    PG_RETURN_POINTER(PG_GETARG_POINTER(0));
}
</programlisting>

       The above skeleton is suitable for the case where no decompression
       is needed.  (But, of course, omitting the method altogether is even
       easier, and is recommended in such cases.)
      </para>
     </listitem>
    </varlistentry>

    <varlistentry>
     <term><function>penalty</function></term>
     <listitem>
      <para>
       Returns a value indicating the <quote>cost</quote> of inserting the new
       entry into a particular branch of the tree.  Items will be inserted
       down the path of least <function>penalty</function> in the tree.
       Values returned by <function>penalty</function> should be non-negative.
       If a negative value is returned, it will be treated as zero.
      </para>

      <para>
       The <acronym>SQL</acronym> declaration of the function must look like this:

<programlisting>
CREATE OR REPLACE FUNCTION my_penalty(internal, internal, internal)
RETURNS internal
AS 'MODULE_PATHNAME'
LANGUAGE C STRICT;  -- in some cases penalty functions need not be strict
</programlisting>

       And the matching code in the C module could then follow this skeleton:

<programlisting>
PG_FUNCTION_INFO_V1(my_penalty);

Datum
my_penalty(PG_FUNCTION_ARGS)
{
    GISTENTRY  *origentry = (GISTENTRY *) PG_GETARG_POINTER(0);
    GISTENTRY  *newentry = (GISTENTRY *) PG_GETARG_POINTER(1);
    float      *penalty = (float *) PG_GETARG_POINTER(2);
    data_type  *orig = DatumGetDataType(origentry->key);
    data_type  *new = DatumGetDataType(newentry->key);

    *penalty = my_penalty_implementation(orig, new);
    PG_RETURN_POINTER(penalty);
}
</programlisting>

       For historical reasons, the <function>penalty</function> function doesn't
       just return a <type>float</type> result; instead it has to store the value
       at the location indicated by the third argument.  The return
       value per se is ignored, though it's conventional to pass back the
       address of that argument.
      </para>

      <para>
       The <function>penalty</function> function is crucial to good performance
       of the index.  It'll get used at insertion time to determine which branch
       to follow when choosing where to add the new entry in the tree.  At
       query time, the more balanced the index, the quicker the lookup.
      </para>
     </listitem>
    </varlistentry>

    <varlistentry>
     <term><function>picksplit</function></term>
     <listitem>

This passage continues discussing GiST index support methods, covering the decompress method for converting stored data back into a manipulable format, the penalty method for determining the cost of inserting new entries into the tree, and introduces the picksplit method, providing example code and explanations for each method's purpose and implementation.
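
The penalty skeleton above leaves my_penalty_implementation undefined. The following is a minimal standalone sketch of what such a helper might compute, assuming a hypothetical key type that is a closed one-dimensional interval. The names Interval, union_length, and interval_penalty are illustrative only and are not part of PostgreSQL. The sketch uses the enlargement heuristic common in R-tree-style operator classes: the penalty is how much the existing key would have to grow to cover the new entry, clamped at zero so that it is never negative.

```c
#include <assert.h>

/* Hypothetical key type (an assumption for illustration): interval [lo, hi]. */
typedef struct Interval
{
    float       lo;
    float       hi;
} Interval;

/* Length of the smallest interval that covers both a and b. */
float
union_length(const Interval *a, const Interval *b)
{
    float       lo = (a->lo < b->lo) ? a->lo : b->lo;
    float       hi = (a->hi > b->hi) ? a->hi : b->hi;

    return hi - lo;
}

/*
 * Enlargement penalty: how much orig must grow to also cover new_key.
 * The result is zero when new_key already lies inside orig.
 */
float
interval_penalty(const Interval *orig, const Interval *new_key)
{
    float       growth;

    /* Both keys are expected to be well-formed intervals. */
    assert(orig->lo <= orig->hi);
    assert(new_key->lo <= new_key->hi);

    growth = union_length(orig, new_key) - (orig->hi - orig->lo);

    /* Clamp so that rounding can never produce a negative penalty. */
    return (growth > 0.0f) ? growth : 0.0f;
}
```

With an existing key of [0, 10], a new key of [12, 15] gets a penalty of 5, while a new key of [2, 3] gets a penalty of 0, so insertion would favor a branch whose key already covers the new entry. A real operator class would choose a measure suited to its own data type.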