INCLUDE Feature, Suffix Truncation, Expression Indexes and Partial Indexes in PostgreSQL

fail. In any case, non-key columns duplicate data from the index's table and bloat the size of the index, thus potentially slowing searches. And remember that there is little point in including payload columns in an index unless the table changes slowly enough that an index-only scan is likely to not need to access the heap. If the heap tuple must be visited anyway, it costs nothing more to get the column's value from there. Other restrictions are that expressions are not currently supported as included columns, and that only B-tree, GiST and SP-GiST indexes currently support included columns. </para> <para> Before <productname>PostgreSQL</productname> had the <literal>INCLUDE</literal> feature, people sometimes made covering indexes by writing the payload columns as ordinary index columns, that is writing <programlisting> CREATE INDEX tab_x_y ON tab(x, y); </programlisting> even though they had no intention of ever using <literal>y</literal> as part of a <literal>WHERE</literal> clause. This works fine as long as the extra columns are trailing columns; making them be leading columns is unwise for the reasons explained in <xref linkend="indexes-multicolumn"/>. However, this method doesn't support the case where you want the index to enforce uniqueness on the key column(s). </para> <para> <firstterm>Suffix truncation</firstterm> always removes non-key columns from upper B-Tree levels. As payload columns, they are never used to guide index scans. The truncation process also removes one or more trailing key column(s) when the remaining prefix of key column(s) happens to be sufficient to describe tuples on the lowest B-Tree level. In practice, covering indexes without an <literal>INCLUDE</literal> clause often avoid storing columns that are effectively payload in the upper levels. However, explicitly defining payload columns as non-key columns <emphasis>reliably</emphasis> keeps the tuples in upper levels small. </para> <para> In principle, index-only scans can be used with expression indexes. For example, given an index on <literal>f(x)</literal> where <literal>x</literal> is a table column, it should be possible to execute <programlisting> SELECT f(x) FROM tab WHERE f(x) < 1; </programlisting> as an index-only scan; and this is very attractive if <literal>f()</literal> is an expensive-to-compute function. However, <productname>PostgreSQL</productname>'s planner is currently not very smart about such cases. It considers a query to be potentially executable by index-only scan only when all <emphasis>columns</emphasis> needed by the query are available from the index. In this example, <literal>x</literal> is not needed except in the context <literal>f(x)</literal>, but the planner does not notice that and concludes that an index-only scan is not possible. If an index-only scan seems sufficiently worthwhile, this can be worked around by adding <literal>x</literal> as an included column, for example <programlisting> CREATE INDEX tab_f_x ON tab (f(x)) INCLUDE (x); </programlisting> An additional caveat, if the goal is to avoid recalculating <literal>f(x)</literal>, is that the planner won't necessarily match uses of <literal>f(x)</literal> that aren't in indexable <literal>WHERE</literal> clauses to the index column. It will usually get this right in simple queries such as shown above, but not in queries that involve joins. These deficiencies may be remedied in future versions of <productname>PostgreSQL</productname>. </para> <para> Partial indexes also have interesting interactions with index-only scans. Consider the partial index shown in <xref linkend="indexes-partial-ex3"/>: <programlisting> CREATE UNIQUE INDEX tests_success_constraint ON tests (subject, target) WHERE success; </programlisting> In principle, we could do an index-only

Before the INCLUDE feature, covering indexes were created by including payload columns as regular index columns. While this works, it doesn't support uniqueness on key columns. Suffix truncation removes non-key columns from upper B-Tree levels, reducing index size. INCLUDE reliably keeps upper-level tuples small. Index-only scans can be used with expression indexes, but the planner might not recognize the possibility unless all columns are available from the index; adding x as INCLUDE can help. The planner might not match uses of f(x) to the index column in complex queries. Partial indexes also interact with index-only scans, potentially allowing them for specific subsets of data.