Home Explore Blog CI



postgresql

10th chunk of `doc/src/sgml/gist.sgml`
c33030b5059c93db44e5aff635f0a500e96d3968afb4ecaf0000000100000fa8
 DatumGetDataType(origentry->key);
    data_type  *new = DatumGetDataType(newentry->key);

    *penalty = my_penalty_implementation(orig, new);
    PG_RETURN_POINTER(penalty);
}
</programlisting>

        For historical reasons, the <function>penalty</function> function doesn't
        just return a <type>float</type> result; instead it has to store the value
        at the location indicated by the third argument.  The return
        value per se is ignored, though it's conventional to pass back the
        address of that argument.
      </para>

      <para>
        The <function>penalty</function> function is crucial to good performance of
        the index. It'll get used at insertion time to determine which branch
        to follow when choosing where to add the new entry in the tree. At
        query time, the more balanced the index, the quicker the lookup.
      </para>
     </listitem>
    </varlistentry>

    <varlistentry>
     <term><function>picksplit</function></term>
     <listitem>
      <para>
       When an index page split is necessary, this function decides which
       entries on the page are to stay on the old page, and which are to move
       to the new page.
      </para>

      <para>
        The <acronym>SQL</acronym> declaration of the function must look like this:

<programlisting>
CREATE OR REPLACE FUNCTION my_picksplit(internal, internal)
RETURNS internal
AS 'MODULE_PATHNAME'
LANGUAGE C STRICT;
</programlisting>

        And the matching code in the C module could then follow this skeleton:

<programlisting>
PG_FUNCTION_INFO_V1(my_picksplit);

Datum
my_picksplit(PG_FUNCTION_ARGS)
{
    GistEntryVector *entryvec = (GistEntryVector *) PG_GETARG_POINTER(0);
    GIST_SPLITVEC *v = (GIST_SPLITVEC *) PG_GETARG_POINTER(1);
    OffsetNumber maxoff = entryvec-&gt;n - 1;
    GISTENTRY  *ent = entryvec-&gt;vector;
    int         i,
                nbytes;
    OffsetNumber *left,
               *right;
    data_type  *tmp_union;
    data_type  *unionL;
    data_type  *unionR;
    GISTENTRY **raw_entryvec;

    maxoff = entryvec-&gt;n - 1;
    nbytes = (maxoff + 1) * sizeof(OffsetNumber);

    v-&gt;spl_left = (OffsetNumber *) palloc(nbytes);
    left = v-&gt;spl_left;
    v-&gt;spl_nleft = 0;

    v-&gt;spl_right = (OffsetNumber *) palloc(nbytes);
    right = v-&gt;spl_right;
    v-&gt;spl_nright = 0;

    unionL = NULL;
    unionR = NULL;

    /* Initialize the raw entry vector. */
    raw_entryvec = (GISTENTRY **) malloc(entryvec-&gt;n * sizeof(void *));
    for (i = FirstOffsetNumber; i &lt;= maxoff; i = OffsetNumberNext(i))
        raw_entryvec[i] = &amp;(entryvec-&gt;vector[i]);

    for (i = FirstOffsetNumber; i &lt;= maxoff; i = OffsetNumberNext(i))
    {
        int         real_index = raw_entryvec[i] - entryvec-&gt;vector;

        tmp_union = DatumGetDataType(entryvec-&gt;vector[real_index].key);
        Assert(tmp_union != NULL);

        /*
         * Choose where to put the index entries and update unionL and unionR
         * accordingly. Append the entries to either v-&gt;spl_left or
         * v-&gt;spl_right, and care about the counters.
         */

        if (my_choice_is_left(unionL, curl, unionR, curr))
        {
            if (unionL == NULL)
                unionL = tmp_union;
            else
                unionL = my_union_implementation(unionL, tmp_union);

            *left = real_index;
            ++left;
            ++(v-&gt;spl_nleft);
        }
        else
        {
            /*
             * Same on the right
             */
        }
    }

    v-&gt;spl_ldatum = DataTypeGetDatum(unionL);
    v-&gt;spl_rdatum = DataTypeGetDatum(unionR);
    PG_RETURN_POINTER(v);
}
</programlisting>

       Notice that the <function>picksplit</function> function's result is delivered
       by modifying the passed-in <structname>v</structname> structure.  The return
       value per se is ignored, though it's conventional to pass back the
       address of <structname>v</structname>.

Title: GiST Index Support Methods: Penalty and Picksplit
Summary
This passage discusses the penalty function, which determines the cost of inserting new entries into a GiST index, and the picksplit function, which decides how to split index pages when necessary, providing example code and explanations for each method's purpose and implementation, highlighting their importance for good index performance.