Home Explore Blog CI



postgresql

2nd chunk of `doc/src/sgml/hash.sgml`
d07b58ecf027f11b9ddc103f6eb569f8bddace6e90bad4fd0000000100000890
 we need to
  scan through all of the overflow pages. Thus an unbalanced hash index
  might actually be worse than a B-tree in terms of number of block
  accesses required, for some data.
 </para>

 <para>
  As a result of the overflow cases, we can say that hash indexes are
  most suitable for unique, nearly unique data or data with a low number
  of rows per hash bucket.
  One possible way to avoid problems is to exclude highly non-unique
  values from the index using a partial index condition, but this may
  not be suitable in many cases.
 </para>

 <para>
  Like B-Trees, hash indexes perform simple index tuple deletion. This
  is a deferred maintenance operation that deletes index tuples that are
  known to be safe to delete (those whose item identifier's LP_DEAD bit
  is already set). If an insert finds no space is available on a page we
  try to avoid creating a new overflow page by attempting to remove dead
  index tuples. Removal cannot occur if the page is pinned at that time.
  Deletion of dead index pointers also occurs during VACUUM.
 </para>

 <para>
  If it can, VACUUM will also try to squeeze the index tuples onto as
  few overflow pages as possible, minimizing the overflow chain. If an
  overflow page becomes empty, overflow pages can be recycled for reuse
  in other buckets, though we never return them to the operating system.
  There is currently no provision to shrink a hash index, other than by
  rebuilding it with REINDEX.
  There is no provision for reducing the number of buckets, either.
 </para>

 <para>
  Hash indexes may expand the number of bucket pages as the number of
  rows indexed grows. The hash key-to-bucket-number mapping is chosen so that
  the index can be incrementally expanded. When a new bucket is to be added to
  the index, exactly one existing bucket will need to be "split", with some of
  its tuples being transferred to the new bucket according to the updated
  key-to-bucket-number mapping.
 </para>

 <para>
  The expansion occurs in the foreground, which could increase execution
  time for user inserts. Thus, hash indexes may not be suitable for tables
  with rapidly increasing number of rows.
 </para>

</sect2>

Title: Hash Index Maintenance and Limitations
Summary
Hash indexes in PostgreSQL have limitations such as potential performance issues with unbalanced data, and require maintenance operations like tuple deletion and vacuuming to remove dead tuples and minimize overflow pages, but may not be suitable for tables with rapidly increasing rows due to foreground expansion.