Tips for Examining and Optimizing Index Usage

Usage</title> <indexterm zone="indexes-examine"> <primary>index</primary> <secondary>examining usage</secondary> </indexterm> <para> Although indexes in <productname>PostgreSQL</productname> do not need maintenance or tuning, it is still important to check which indexes are actually used by the real-life query workload. Examining index usage for an individual query is done with the <xref linkend="sql-explain"/> command; its application for this purpose is illustrated in <xref linkend="using-explain"/>. It is also possible to gather overall statistics about index usage in a running server, as described in <xref linkend="monitoring-stats"/>. </para> <para> It is difficult to formulate a general procedure for determining which indexes to create. There are a number of typical cases that have been shown in the examples throughout the previous sections. A good deal of experimentation is often necessary. The rest of this section gives some tips for that: </para> <itemizedlist> <listitem> <para> Always run <xref linkend="sql-analyze"/> first. This command collects statistics about the distribution of the values in the table. This information is required to estimate the number of rows returned by a query, which is needed by the planner to assign realistic costs to each possible query plan. In absence of any real statistics, some default values are assumed, which are almost certain to be inaccurate. Examining an application's index usage without having run <command>ANALYZE</command> is therefore a lost cause. See <xref linkend="vacuum-for-statistics"/> and <xref linkend="autovacuum"/> for more information. </para> </listitem> <listitem> <para> Use real data for experimentation. Using test data for setting up indexes will tell you what indexes you need for the test data, but that is all. </para> <para> It is especially fatal to use very small test data sets. While selecting 1000 out of 100000 rows could be a candidate for an index, selecting 1 out of 100 rows will hardly be, because the 100 rows probably fit within a single disk page, and there is no plan that can beat sequentially fetching 1 disk page. </para> <para> Also be careful when making up test data, which is often unavoidable when the application

PostgreSQL indexes don't require tuning, but it's crucial to check which indexes are used by the query workload. Use the EXPLAIN command to examine individual query index usage and gather statistics about overall index usage. It is difficult to make a general procedure for creating the best indexes, so experimentation is required. Some tips: Run ANALYZE first to collect statistics about data distribution, use real data for experimentation (test data is not useful), and avoid using very small test datasets.