Query Rewriting and the Planner/Optimizer

<itemizedlist> <listitem> <para> The first one worked using <firstterm>row level</firstterm> processing and was implemented deep in the <firstterm>executor</firstterm>. The rule system was called whenever an individual row had been accessed. This implementation was removed in 1995 when the last official release of the <productname>Berkeley Postgres</productname> project was transformed into <productname>Postgres95</productname>. </para> </listitem> <listitem> <para> The second implementation of the rule system is a technique called <firstterm>query rewriting</firstterm>. The <firstterm>rewrite system</firstterm> is a module that exists between the <firstterm>parser stage</firstterm> and the <firstterm>planner/optimizer</firstterm>. This technique is still implemented. </para> </listitem> </itemizedlist> </para> <para> The query rewriter is discussed in some detail in <xref linkend="rules"/>, so there is no need to cover it here. We will only point out that both the input and the output of the rewriter are query trees, that is, there is no change in the representation or level of semantic detail in the trees. Rewriting can be thought of as a form of macro expansion. </para> </sect1> <sect1 id="planner-optimizer"> <title>Planner/Optimizer</title> <para> The task of the <firstterm>planner/optimizer</firstterm> is to create an optimal execution plan. A given SQL query (and hence, a query tree) can be actually executed in a wide variety of different ways, each of which will produce the same set of results. If it is computationally feasible, the query optimizer will examine each of these possible execution plans, ultimately selecting the execution plan that is expected to run the fastest. </para> <note> <para> In some situations, examining each possible way in which a query can be executed would take an excessive amount of time and memory. In particular, this occurs when executing queries involving large numbers of join operations. In order to determine a reasonable (not necessarily optimal) query plan in a reasonable amount of time, <productname>PostgreSQL</productname> uses a <firstterm>Genetic Query Optimizer</firstterm> (see <xref linkend="geqo"/>) when the number of joins exceeds a threshold (see <xref linkend="guc-geqo-threshold"/>). </para> </note> <para> The planner's search procedure actually works with data structures called <firstterm>paths</firstterm>, which are simply cut-down representations of plans containing only as much information as the planner needs to make its decisions. After the cheapest path is determined, a full-fledged <firstterm>plan tree</firstterm> is built to pass to the executor. This represents the desired execution plan in sufficient detail for the executor to run it. In the rest of this section we'll ignore the distinction between paths and plans. </para> <sect2 id="planner-optimizer-generating-possible-plans"> <title>Generating Possible Plans</title> <para> The planner/optimizer starts by generating plans for scanning each individual relation (table) used in the query. The possible plans are determined by the available indexes on each relation. There is always the possibility of performing a sequential scan on a relation, so a sequential scan plan is always created. Assume an index is defined on a relation (for example a B-tree index) and a query contains the restriction <literal>relation.attribute OPR constant</literal>. If <literal>relation.attribute</literal> happens to match the key of the B-tree index and <literal>OPR</literal> is one of the operators listed in the index's <firstterm>operator class</firstterm>, another plan

This section elaborates on query rewriting in PostgreSQL, which operates between the parser and the planner/optimizer, manipulating query trees. It then discusses the role of the planner/optimizer in creating optimal execution plans for SQL queries. The planner evaluates various execution plans to select the fastest one. For complex queries with many joins, PostgreSQL employs a Genetic Query Optimizer to find a reasonable plan efficiently. The planner uses 'paths' to represent plans, eventually building a full 'plan tree' for the executor. The planner begins by generating plans for scanning individual relations based on available indexes, including the option of a sequential scan. The presence of indexes allows for alternative plans based on matching attributes and operators.