Home Explore Blog CI



postgresql

9th chunk of `doc/src/sgml/btree.sgml`
99d1e36761053867b8fde5c91a0f840c671eba2f4354bb7e0000000100000fa7
 options.  The options can be accessed from other support
     functions using the <literal>PG_HAS_OPCLASS_OPTIONS()</literal> and
     <literal>PG_GET_OPCLASS_OPTIONS()</literal> macros.
    </para>
    <para>
     Currently, no B-Tree operator class has an <function>options</function>
     support function.  B-tree doesn't allow flexible representation of keys
     like GiST, SP-GiST, GIN and BRIN do.  So, <function>options</function>
     probably doesn't have much application in the current B-tree index
     access method.  Nevertheless, this support function was added to B-tree
     for uniformity, and will probably find uses during further
     evolution of B-tree in <productname>PostgreSQL</productname>.
    </para>
   </listitem>
  </varlistentry>
  <varlistentry>
   <term><function>skipsupport</function></term>
   <listitem>
    <para>
     Optionally, a btree operator family may provide a <firstterm>skip
      support</firstterm> function, registered under support function number 6.
     These functions give the B-tree code a way to iterate through every
     possible value that can be represented by an operator class's underlying
     input type, in key space order.  This is used by the core code when it
     applies the skip scan optimization.  The APIs involved in this are
     defined in <filename>src/include/utils/skipsupport.h</filename>.
    </para>
    <para>
     Operator classes that do not provide a skip support function are still
     eligible to use skip scan.  The core code can still use its fallback
     strategy, though that might be suboptimal for some discrete types.  It
     usually doesn't make sense (and may not even be feasible) for operator
     classes on continuous types to provide a skip support function.
    </para>
    <para>
     It is not sensible for an operator family to register a cross-type
     <function>skipsupport</function> function, and attempting to do so will
     result in an error.  This is because determining the next indexable value
     must happen by incrementing a value copied from an index tuple.  The
     values generated must all be of the same underlying data type (the
     <quote>skipped</quote> index column's opclass input type).
    </para>
   </listitem>
  </varlistentry>
 </variablelist>

</sect2>

<sect2 id="btree-implementation">
 <title>Implementation</title>

 <para>
  This section covers B-Tree index implementation details that may be
  of use to advanced users.  See
  <filename>src/backend/access/nbtree/README</filename> in the source
  distribution for a much more detailed, internals-focused description
  of the B-Tree implementation.
 </para>
 <sect3 id="btree-structure">
  <title>B-Tree Structure</title>
  <para>
   <productname>PostgreSQL</productname> B-Tree indexes are
   multi-level tree structures, where each level of the tree can be
   used as a doubly-linked list of pages.  A single metapage is stored
   in a fixed position at the start of the first segment file of the
   index.  All other pages are either leaf pages or internal pages.
   Leaf pages are the pages on the lowest level of the tree.  All
   other levels consist of internal pages.  Each leaf page contains
   tuples that point to table rows.  Each internal page contains
   tuples that point to the next level down in the tree.  Typically,
   over 99% of all pages are leaf pages.  Both internal pages and leaf
   pages use the standard page format described in <xref
    linkend="storage-page-layout"/>.
  </para>
  <para>
   New leaf pages are added to a B-Tree index when an existing leaf
   page cannot fit an incoming tuple.  A <firstterm>page
    split</firstterm> operation makes room for items that originally
   belonged on the overflowing page by moving a portion of the items
   to a new page.  Page splits must also insert a new
   <firstterm>downlink</firstterm> to the new page in the parent page,
   which may cause the parent to split in turn.  Page splits
   <quote>cascade upwards</quote>

Title: Skip Support Function, B-Tree Implementation, and B-Tree Structure
Summary
This section discusses the optional 'skip support' function in B-tree operator families, used for iterating through all possible values of an operator class's input type in key space order, aiding the skip scan optimization. It notes that operator classes without this function can still use skip scan via a fallback strategy. The section emphasizes that cross-type skipsupport functions are not sensible and will result in an error. The section then transitions to the implementation details of B-Tree indexes in PostgreSQL, referencing the README file for internals-focused descriptions, and explains the multi-level tree structure, including the metapage, leaf pages, and internal pages. It describes leaf pages containing tuples that point to table rows, and internal pages containing tuples that point to the next level down in the tree. The section concludes by explaining how new leaf pages are added and how page splits cause new downlinks in the parent page.