Home Explore Blog CI



postgresql

27th chunk of `doc/src/sgml/perform.sgml`
d2be9e381a2a770bffd38ae0615c790a0f2342e766cee4180000000100000fa0
 join orders grows
   exponentially as the number of tables expands.  Beyond ten or so input
   tables it's no longer practical to do an exhaustive search of all the
   possibilities, and even for six or seven tables planning might take an
   annoyingly long time.  When there are too many input tables, the
   <productname>PostgreSQL</productname> planner will switch from exhaustive
   search to a <firstterm>genetic</firstterm> probabilistic search
   through a limited number of possibilities.  (The switch-over threshold is
   set by the <xref linkend="guc-geqo-threshold"/> run-time
   parameter.)
   The genetic search takes less time, but it won't
   necessarily find the best possible plan.
  </para>

  <para>
   When the query involves outer joins, the planner has less freedom
   than it does for plain (inner) joins. For example, consider:
<programlisting>
SELECT * FROM a LEFT JOIN (b JOIN c ON (b.ref = c.id)) ON (a.id = b.id);
</programlisting>
   Although this query's restrictions are superficially similar to the
   previous example, the semantics are different because a row must be
   emitted for each row of A that has no matching row in the join of B and C.
   Therefore the planner has no choice of join order here: it must join
   B to C and then join A to that result.  Accordingly, this query takes
   less time to plan than the previous query.  In other cases, the planner
   might be able to determine that more than one join order is safe.
   For example, given:
<programlisting>
SELECT * FROM a LEFT JOIN b ON (a.bid = b.id) LEFT JOIN c ON (a.cid = c.id);
</programlisting>
   it is valid to join A to either B or C first.  Currently, only
   <literal>FULL JOIN</literal> completely constrains the join order.  Most
   practical cases involving <literal>LEFT JOIN</literal> or <literal>RIGHT JOIN</literal>
   can be rearranged to some extent.
  </para>

  <para>
   Explicit inner join syntax (<literal>INNER JOIN</literal>, <literal>CROSS
   JOIN</literal>, or unadorned <literal>JOIN</literal>) is semantically the same as
   listing the input relations in <literal>FROM</literal>, so it does not
   constrain the join order.
  </para>

  <para>
   Even though most kinds of <literal>JOIN</literal> don't completely constrain
   the join order, it is possible to instruct the
   <productname>PostgreSQL</productname> query planner to treat all
   <literal>JOIN</literal> clauses as constraining the join order anyway.
   For example, these three queries are logically equivalent:
<programlisting>
SELECT * FROM a, b, c WHERE a.id = b.id AND b.ref = c.id;
SELECT * FROM a CROSS JOIN b CROSS JOIN c WHERE a.id = b.id AND b.ref = c.id;
SELECT * FROM a JOIN (b JOIN c ON (b.ref = c.id)) ON (a.id = b.id);
</programlisting>
   But if we tell the planner to honor the <literal>JOIN</literal> order,
   the second and third take less time to plan than the first.  This effect
   is not worth worrying about for only three tables, but it can be a
   lifesaver with many tables.
  </para>

  <para>
   To force the planner to follow the join order laid out by explicit
   <literal>JOIN</literal>s,
   set the <xref linkend="guc-join-collapse-limit"/> run-time parameter to 1.
   (Other possible values are discussed below.)
  </para>

  <para>
   You do not need to constrain the join order completely in order to
   cut search time, because it's OK to use <literal>JOIN</literal> operators
   within items of a plain <literal>FROM</literal> list.  For example, consider:
<programlisting>
SELECT * FROM a CROSS JOIN b, c, d, e WHERE ...;
</programlisting>
   With <varname>join_collapse_limit</varname> = 1, this
   forces the planner to join A to B before joining them to other tables,
   but doesn't constrain its choices otherwise.  In this example, the
   number of possible join orders is reduced by a factor of 5.
  </para>

  <para>
   Constraining the planner's search in this way is a useful technique
   both for reducing planning time and for directing the planner

Title: Controlling Join Order in PostgreSQL
Summary
The PostgreSQL planner's join order can be controlled using explicit JOIN syntax, which can reduce planning time and improve performance, especially for complex queries with multiple tables, and can also handle outer joins with less freedom than inner joins, with options to force the planner to follow a specified join order using run-time parameters like join_collapse_limit.