Using EXPLAIN for Query Performance

<chapter id="performance-tips"> <title>Performance Tips</title> <indexterm zone="performance-tips"> <primary>performance</primary> </indexterm> <para> Query performance can be affected by many things. Some of these can be controlled by the user, while others are fundamental to the underlying design of the system. This chapter provides some hints about understanding and tuning <productname>PostgreSQL</productname> performance. </para> <sect1 id="using-explain"> <title>Using <command>EXPLAIN</command></title> <indexterm zone="using-explain"> <primary>EXPLAIN</primary> </indexterm> <indexterm zone="using-explain"> <primary>query plan</primary> </indexterm> <para> <productname>PostgreSQL</productname> devises a <firstterm>query plan</firstterm> for each query it receives. Choosing the right plan to match the query structure and the properties of the data is absolutely critical for good performance, so the system includes a complex <firstterm>planner</firstterm> that tries to choose good plans. You can use the <link linkend="sql-explain"><command>EXPLAIN</command></link> command to see what query plan the planner creates for any query. Plan-reading is an art that requires some experience to master, but this section attempts to cover the basics. </para> <para> Examples in this section are drawn from the regression test database after doing a <command>VACUUM ANALYZE</command>, using v18 development sources. You should be able to get similar results if you try the examples yourself, but your estimated costs and row counts might vary slightly because <command>ANALYZE</command>'s statistics are random samples rather than exact, and because costs are inherently somewhat platform-dependent. </para> <para> The examples use <command>EXPLAIN</command>'s default <quote>text</quote> output format, which is compact and convenient for humans to read. If you want to feed <command>EXPLAIN</command>'s output to a program for further analysis, you should use one of its machine-readable output formats (XML, JSON, or YAML) instead. </para> <sect2 id="using-explain-basics"> <title><command>EXPLAIN</command> Basics</title> <para> The structure of a query plan is a tree of <firstterm>plan nodes</firstterm>. Nodes at the bottom level of the tree are scan nodes: they return raw rows from a table. There are different types of scan nodes for different table access methods: sequential scans, index scans, and bitmap index scans. There are also non-table row sources, such as <literal>VALUES</literal> clauses and set-returning functions in <literal>FROM</literal>, which have their own scan node types. If the query requires joining, aggregation, sorting, or other operations on the raw rows, then there will be additional nodes above the scan nodes to perform these operations. Again, there is usually more than one possible way to do these operations, so different node types can appear here too. The output of <command>EXPLAIN</command> has one line for each node in the plan tree, showing the basic node type plus the cost estimates that the planner made for the execution of that plan node. Additional lines might appear, indented from the node's summary line, to show additional properties of the node. The very first line (the summary line for the topmost node) has the estimated total execution cost for the plan; it is this number that the planner seeks to minimize. </para> <para> Here is a trivial example, just to show what the output looks like: <screen> EXPLAIN SELECT * FROM tenk1; QUERY PLAN ------------------------------------------------------------- Seq Scan on tenk1 (cost=0.00..445.00 rows=10000 width=244) </screen> </para> <para>

This section explains how to use the EXPLAIN command in PostgreSQL to analyze and understand query plans, which is critical for good performance, and provides examples and basics of reading query plans.