Home Explore Blog CI



postgresql

3rd chunk of `doc/src/sgml/hash.sgml`
4fce6b158c970883a9e4e082aea3e8e2a97b974c9d79dfca0000000100000c7f
 minimizing the overflow chain. If an
  overflow page becomes empty, overflow pages can be recycled for reuse
  in other buckets, though we never return them to the operating system.
  There is currently no provision to shrink a hash index, other than by
  rebuilding it with REINDEX.
  There is no provision for reducing the number of buckets, either.
 </para>

 <para>
  Hash indexes may expand the number of bucket pages as the number of
  rows indexed grows. The hash key-to-bucket-number mapping is chosen so that
  the index can be incrementally expanded. When a new bucket is to be added to
  the index, exactly one existing bucket will need to be "split", with some of
  its tuples being transferred to the new bucket according to the updated
  key-to-bucket-number mapping.
 </para>

 <para>
  The expansion occurs in the foreground, which could increase execution
  time for user inserts. Thus, hash indexes may not be suitable for tables
  with rapidly increasing number of rows.
 </para>

</sect2>

<sect2 id="hash-implementation">
 <title>Implementation</title>

 <para>
  There are four kinds of pages in a hash index: the meta page (page zero),
  which contains statically allocated control information; primary bucket
  pages; overflow pages; and bitmap pages, which keep track of overflow
  pages that have been freed and are available for re-use. For addressing
  purposes, bitmap pages are regarded as a subset of the overflow pages.
 </para>

 <para>
  Both scanning the index and inserting tuples require locating the bucket
  where a given tuple ought to be located. To do this, we need the bucket
  count, highmask, and lowmask from the metapage; however, it's undesirable
  for performance reasons to have to have to lock and pin the metapage for
  every such operation. Instead, we retain a cached copy of the metapage
  in each backend's relcache entry. This will produce the correct bucket
  mapping as long as the target bucket hasn't been split since the last
  cache refresh.
 </para>

 <para>
  Primary bucket pages and overflow pages are allocated independently since
  any given index might need more or fewer overflow pages relative to its
  number of buckets. The hash code uses an interesting set of addressing
  rules to support a variable number of overflow pages while not having to
  move primary bucket pages around after they are created.
 </para>

 <para>
  Each row in the table indexed is represented by a single index tuple in
  the hash index. Hash index tuples are stored in bucket pages, and if
  they exist, overflow pages. We speed up searches by keeping the index entries
  in any one index page sorted by hash code, thus allowing binary search to be
  used within an index page. Note however that there is *no* assumption about
  the relative ordering of hash codes across different index pages of a bucket.
 </para>

 <para>
  The bucket splitting algorithms to expand the hash index are too complex to
  be worthy of mention here, though are described in more detail in
  <filename>src/backend/access/hash/README</filename>.
  The split algorithm is crash safe and can be restarted if not completed
  successfully.
 </para>

</sect2>

</sect1>

Title: Hash Index Implementation Details
Summary
A hash index consists of four types of pages: meta, primary bucket, overflow, and bitmap pages, with a cached copy of the metapage in each backend's relcache entry for performance, and uses a complex bucket splitting algorithm to expand the index, which is crash safe and can be restarted if not completed successfully.