postgresql

25th chunk of `doc/src/sgml/xfunc.sgml`
 write C-language functions, you need to know how
     <productname>PostgreSQL</productname> internally represents base
     data types and how they can be passed to and from functions.
     Internally, <productname>PostgreSQL</productname> regards a base
     type as a <quote>blob of memory</quote>.  The user-defined
     functions that you define over a type in turn define the way that
     <productname>PostgreSQL</productname> can operate on it.  That
     is, <productname>PostgreSQL</productname> will only store and
     retrieve the data from disk and use your user-defined functions
     to input, process, and output the data.
    </para>

    <para>
     Base types can have one of three internal formats:

     <itemizedlist>
      <listitem>
       <para>
        pass by value, fixed-length
       </para>
      </listitem>
      <listitem>
       <para>
        pass by reference, fixed-length
       </para>
      </listitem>
      <listitem>
       <para>
        pass by reference, variable-length
       </para>
      </listitem>
     </itemizedlist>
    </para>

    <para>
     By-value types can only be 1, 2, or 4 bytes in length
     (also 8 bytes, if <literal>sizeof(Datum)</literal> is 8 on your machine).
     You should be careful to define your types such that they will be the
     same size (in bytes) on all architectures.  For example, the
     <literal>long</literal> type is dangerous because it is 4 bytes on some
     machines and 8 bytes on others, whereas the <type>int</type> type is 4 bytes
     on most Unix machines.  A reasonable implementation of the
     <type>int4</type> type on Unix machines might be:

<programlisting>
/* 4-byte integer, passed by value */
typedef int int4;
</programlisting>

     (The actual PostgreSQL C code calls this type <type>int32</type>, because
     it is a convention in C that <type>int<replaceable>XX</replaceable></type>
     means <replaceable>XX</replaceable> <emphasis>bits</emphasis>.  Note
     therefore also that the C type <type>int8</type> is 1 byte in size.  The
     SQL type <type>int8</type> is called <type>int64</type> in C.  See also
     <xref linkend="xfunc-c-type-table"/>.)
    </para>
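
    <para>
     The size rules above can be sketched in standalone C.  This is a
     minimal illustration, not server code: the typedef names
     <literal>my_int8</literal>, <literal>my_int32</literal>, and
     <literal>my_int64</literal> and the helper
     <literal>fits_by_value</literal> are hypothetical, and the
     exact-width types from <literal>&lt;stdint.h&gt;</literal> stand in
     for the server's own typedefs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the server's C typedefs; XX counts bits. */
typedef int8_t  my_int8;    /* 1 byte; not the same as the SQL type int8 */
typedef int32_t my_int32;   /* 4 bytes; the size of the SQL type int4 */
typedef int64_t my_int64;   /* 8 bytes; the size of the SQL type int8 */

/* Compile-time checks that the sizes are the same on every architecture. */
_Static_assert(sizeof(my_int8) == 1, "my_int8 must be 1 byte");
_Static_assert(sizeof(my_int32) == 4, "my_int32 must be 4 bytes");
_Static_assert(sizeof(my_int64) == 8, "my_int64 must be 8 bytes");

/*
 * Hypothetical helper: returns 1 if a fixed-length type of type_size bytes
 * may be passed by value when a datum occupies datum_size bytes, else 0.
 * Sizes 1, 2, and 4 always qualify; size 8 qualifies only with 8-byte datums.
 */
int
fits_by_value(size_t type_size, size_t datum_size)
{
    if (type_size == 1 || type_size == 2 || type_size == 4)
        return 1;
    return type_size == 8 && datum_size == 8;
}
```

     Using exact-width types rather than <literal>long</literal> keeps the
     checks true regardless of the platform's native word size.
    </para>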

    <para>
     On the other hand, fixed-length types of any size can
     be passed by reference.  For example, here is a sample
     implementation of a <productname>PostgreSQL</productname> type:

<programlisting>
/* 16-byte structure, passed by reference */
typedef struct
{
    double  x, y;
} Point;
</programlisting>

     Only pointers to such types can be used when passing
     them in and out of <productname>PostgreSQL</productname> functions.
     To return a value of such a type, allocate the right amount of
     memory with <literal>palloc</literal>, fill in the allocated memory,
     and return a pointer to it.  (Also, if you just want to return the
     same value as one of your input arguments that's of the same data type,
     you can skip the extra <literal>palloc</literal> and just return the
     pointer to the input value.)
    </para>
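
    <para>
     The allocate, fill, and return pattern described above might look
     roughly like the following.  This is a sketch under stated
     assumptions: it uses the standard <literal>malloc</literal> as a
     stand-in for <literal>palloc</literal> so that it compiles outside the
     server, and the function names <literal>point_shift</literal> and
     <literal>point_identity</literal> are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

/* 16-byte structure, passed by reference */
typedef struct
{
    double  x, y;
} Point;

/*
 * Hypothetical function: returns a newly allocated Point equal to the
 * input shifted by (dx, dy).  ASSUMPTION: malloc stands in for palloc;
 * real server code would call palloc(sizeof(Point)) instead.
 */
Point *
point_shift(const Point *in, double dx, double dy)
{
    Point  *result = malloc(sizeof(Point));

    if (result == NULL)
        return NULL;

    /* Fill in the allocated memory, then hand back a pointer to it. */
    result->x = in->x + dx;
    result->y = in->y + dy;
    return result;
}

/*
 * Hypothetical function: the result equals the input argument and has the
 * same type, so no new allocation is needed; the input pointer is returned.
 */
Point *
point_identity(Point *in)
{
    return in;
}
```

     In this standalone form the caller owns the result of
     <literal>point_shift</literal> and must release it with
     <literal>free</literal>.
    </para>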

    <para>
     Finally, all variable-length types must also be passed
     by reference.  All variable-length types must begin
     with an opaque length field of exactly 4 bytes, which will be set
     by <symbol>SET_VARSIZE</symbol>; never set this field directly! All data to
     be stored within that type must be located in the memory
     immediately following that length field.  The
     length field contains the total length of the structure,
     that is, it includes the size of the length field
     itself.
    </para>
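
    <para>
     The layout rule (a 4-byte length field followed immediately by the
     data, with the length counting the field itself) can be illustrated
     with a standalone sketch.  Everything named here is hypothetical:
     <literal>VarBlob</literal>, <literal>HDR_SIZE</literal>,
     <literal>varblob_make</literal>, and
     <literal>varblob_payload_len</literal> are not part of the server.
     The sketch stores the length in a plain integer only to make the
     arithmetic visible; in real server code the field is opaque and must
     be set with <symbol>SET_VARSIZE</symbol>, and memory would come from
     <literal>palloc</literal> rather than <literal>calloc</literal>.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define HDR_SIZE 4              /* hypothetical name: size of the length field */

/* Hypothetical type: a 4-byte length field, then the payload bytes. */
typedef struct
{
    int32_t total_len;          /* stand-in for the opaque length field */
    char    data[];             /* payload starts right after the header */
} VarBlob;

/*
 * Builds a VarBlob holding n bytes copied from src.
 * ASSUMPTION: calloc stands in for palloc; it also zeroes the allocation.
 */
VarBlob *
varblob_make(const char *src, size_t n)
{
    size_t   total = HDR_SIZE + n;      /* length includes the header itself */
    VarBlob *b = calloc(1, total);

    if (b == NULL)
        return NULL;

    /* Illustration only: server code would call SET_VARSIZE(b, total). */
    b->total_len = (int32_t) total;
    if (n > 0)
        memcpy(b->data, src, n);
    return b;
}

/* Payload length is the total length minus the 4-byte header. */
size_t
varblob_payload_len(const VarBlob *b)
{
    return (size_t) b->total_len - HDR_SIZE;
}
```

     For a 3-byte payload the stored total length is 7, not 3, because the
     header's own 4 bytes are counted.
    </para>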

    <para>
     Another important point is to avoid leaving any uninitialized bits
     within data type values; for example, take care to zero out any
     alignment padding bytes that might be present in structs.  Without
     this, logically-equivalent constants of your data type might be
     seen as unequal by the planner, leading to inefficient (though not
     incorrect)

Title: Internal Representation of Base Data Types in PostgreSQL C-Language Functions
Summary
PostgreSQL treats base types as memory blobs, using user-defined functions for operations. Base types can be passed by value (1, 2, 4 or 8 bytes) or by reference (fixed or variable length). For fixed-length types passed by reference, pointers are used for input and output in PostgreSQL functions. Memory allocation is done with palloc for returning values. Variable-length types must also be passed by reference, starting with a 4-byte length field set by SET_VARSIZE. It is important to avoid uninitialized bits within data type values.