Details of Index Uniqueness Checks in PostgreSQL

<title>Index Uniqueness Checks</title>

<para>
 <productname>PostgreSQL</productname> enforces SQL uniqueness constraints
 using <firstterm>unique indexes</firstterm>, which are indexes that disallow
 multiple entries with identical keys. An access method that supports this
 feature sets <structfield>amcanunique</structfield> true.
 (At present, only b-tree supports it.) Columns listed in the
 <literal>INCLUDE</literal> clause are not considered when enforcing
 uniqueness.
</para>

<para>
 Because of MVCC, it is always necessary to allow duplicate entries to
 exist physically in an index: the entries might refer to successive
 versions of a single logical row. The behavior we actually want to
 enforce is that no MVCC snapshot could include two rows with equal
 index keys. This breaks down into the following cases that must be
 checked when inserting a new row into a unique index:

 <itemizedlist>
  <listitem>
   <para>
    If a conflicting valid row has been deleted by the current transaction,
    it's okay. (In particular, since an UPDATE always deletes the old row
    version before inserting the new version, this will allow an UPDATE on
    a row without changing the key.)
   </para>
  </listitem>
  <listitem>
   <para>
    If a conflicting row has been inserted by an as-yet-uncommitted
    transaction, the would-be inserter must wait to see if that transaction
    commits. If it rolls back then there is no conflict. If it commits
    without deleting the conflicting row again, there is a uniqueness
    violation. (In practice we just wait for the other transaction to
    end and then redo the visibility check in toto.)
   </para>
  </listitem>
  <listitem>
   <para>
    Similarly, if a conflicting valid row has been deleted by an
    as-yet-uncommitted transaction, the would-be inserter must wait
    for that transaction to commit or abort, and then repeat the test.
   </para>
  </listitem>
 </itemizedlist>
</para>

<para>
 Furthermore, immediately before reporting a uniqueness violation
 according to the above rules, the access method must recheck the
 liveness of the row being inserted. If it is committed dead then
 no violation should be reported. (This case cannot occur during the
 ordinary scenario of inserting a row that's just been created by
 the current transaction. It can happen during
 <command>CREATE UNIQUE INDEX CONCURRENTLY</command>, however.)
</para>

<para>
 We require the index access method to apply these tests itself, which
 means that it must reach into the heap to check the commit status of any
 row that is shown to have a duplicate key according to the index contents.
 This is without a doubt ugly and non-modular, but it saves redundant work:
 if we did a separate probe then the index lookup for a conflicting row
 would be essentially repeated while finding the place to insert the new
 row's index entry. What's more, there is no obvious way to avoid race
 conditions unless the conflict check is an integral part of insertion of
 the new index entry.
</para>

<para>
 If the unique constraint is deferrable, there is additional complexity:
 we need to be able to insert an index entry for a new row, but defer any
 uniqueness-violation error until end of statement or even later. To
 avoid unnecessary repeat searches of the index, the index access method
 should do a preliminary uniqueness check during the initial insertion.
 If this shows that there is definitely no conflicting live tuple, we
 are done. Otherwise, we schedule a recheck to occur when it is time to
 enforce the constraint. If, at the time of the recheck, both the inserted
 tuple and some other tuple with the same key are live, then the error
 must be reported. (Note that for this purpose, <quote>live</quote> actually means <quote>any tuple in the index

PostgreSQL enforces SQL uniqueness constraints using unique indexes. Because of MVCC, duplicate entries may exist physically in an index, but no MVCC snapshot may include two rows with equal index keys. When inserting into a unique index, the access method must therefore accept a conflicting row that the current transaction has deleted, wait for any as-yet-uncommitted transaction that inserted or deleted a conflicting row and then repeat the test, and recheck the liveness of the row being inserted before reporting a violation. The index access method applies these tests itself, reaching into the heap to check commit status. For deferrable unique constraints, a preliminary check is done during the initial insertion; unless it shows that there is definitely no conflicting live tuple, a recheck is scheduled for the time the constraint must be enforced.
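
The insertion-time cases described above can be sketched as a small decision function. This is a toy model in Python, not PostgreSQL's actual b-tree code: the names XactState, Outcome, RowVersion, and check_conflict are hypothetical, and each row version is reduced to the state of the transaction that inserted it and the one (if any) that deleted it. The sketch also assumes that a row whose inserter aborted, or whose deleter committed, is not a valid conflicting row.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class XactState(Enum):
    """Transaction state as seen by the would-be inserter (hypothetical names)."""
    CURRENT = "current"          # the inserting transaction itself
    IN_PROGRESS = "in_progress"  # another transaction that has not ended yet
    COMMITTED = "committed"
    ABORTED = "aborted"


class Outcome(Enum):
    NO_CONFLICT = "no_conflict"
    WAIT_AND_RETRY = "wait_and_retry"
    VIOLATION = "violation"


@dataclass(frozen=True)
class RowVersion:
    """A heap row version whose index key equals the key being inserted."""
    inserted_by: XactState
    deleted_by: Optional[XactState] = None  # None: nobody has deleted it


def check_conflict(existing: RowVersion,
                   new_row_committed_dead: bool = False) -> Outcome:
    """Classify one existing row version with an equal key (simplified model)."""
    # Assumption of this sketch: an aborted insert never produced a valid row.
    if existing.inserted_by is XactState.ABORTED:
        return Outcome.NO_CONFLICT
    # Inserted by an as-yet-uncommitted transaction: wait, then redo the check.
    if existing.inserted_by is XactState.IN_PROGRESS:
        return Outcome.WAIT_AND_RETRY
    # Deleted by the current transaction (for example the old version in an
    # UPDATE), or, by assumption, by a committed transaction: not a conflict.
    if existing.deleted_by in (XactState.CURRENT, XactState.COMMITTED):
        return Outcome.NO_CONFLICT
    # Deleted by an as-yet-uncommitted transaction: wait, then repeat the test.
    if existing.deleted_by is XactState.IN_PROGRESS:
        return Outcome.WAIT_AND_RETRY
    # The existing row is live. Before reporting a violation, recheck the
    # liveness of the row being inserted.
    if new_row_committed_dead:
        return Outcome.NO_CONFLICT
    return Outcome.VIOLATION


if __name__ == "__main__":
    # Old version of a row that the current transaction is updating in place.
    old_version = RowVersion(XactState.COMMITTED, deleted_by=XactState.CURRENT)
    print(check_conflict(old_version).value)
```

In this model WAIT_AND_RETRY stands for "wait for the other transaction to end, then run the whole check again", which mirrors the note that the visibility check is redone in full rather than resumed.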