Home Explore Blog CI



postgresql

7th chunk of `doc/src/sgml/perform.sgml`
6e7a16dcc7c06c9c3c389628e6d06361cdb50806820fbb7e0000000100000fa4
   at that node.
    The <literal>t1.unique2 = t2.unique2</literal> clause is not relevant yet,
    so it doesn't affect the row count of the outer scan.  The nested-loop
    join node will run its second,
    or <quote>inner</quote> child once for each row obtained from the outer child.
    Column values from the current outer row can be plugged into the inner
    scan; here, the <literal>t1.unique2</literal> value from the outer row is available,
    so we get a plan and costs similar to what we saw above for a simple
    <literal>SELECT ... WHERE t2.unique2 = <replaceable>constant</replaceable></literal> case.
    (The estimated cost is actually a bit lower than what was seen above,
    as a result of caching that's expected to occur during the repeated
    index scans on <literal>t2</literal>.)  The
    costs of the loop node are then set on the basis of the cost of the outer
    scan, plus one repetition of the inner scan for each outer row (10 * 7.90,
    here), plus a little CPU time for join processing.
   </para>

   <para>
    In this example the join's output row count is the same as the product
    of the two scans' row counts, but that's not true in all cases because
    there can be additional <literal>WHERE</literal> clauses that mention both tables
    and so can only be applied at the join point, not to either input scan.
    Here's an example:

<screen>
EXPLAIN SELECT *
FROM tenk1 t1, tenk2 t2
WHERE t1.unique1 &lt; 10 AND t2.unique2 &lt; 10 AND t1.hundred &lt; t2.hundred;

                                         QUERY PLAN
-------------------------------------------------------------------&zwsp;--------------------------
 Nested Loop  (cost=4.65..49.36 rows=33 width=488)
   Join Filter: (t1.hundred &lt; t2.hundred)
   -&gt;  Bitmap Heap Scan on tenk1 t1  (cost=4.36..39.38 rows=10 width=244)
         Recheck Cond: (unique1 &lt; 10)
         -&gt;  Bitmap Index Scan on tenk1_unique1  (cost=0.00..4.36 rows=10 width=0)
               Index Cond: (unique1 &lt; 10)
   -&gt;  Materialize  (cost=0.29..8.51 rows=10 width=244)
         -&gt;  Index Scan using tenk2_unique2 on tenk2 t2  (cost=0.29..8.46 rows=10 width=244)
               Index Cond: (unique2 &lt; 10)
</screen>

    The condition <literal>t1.hundred &lt; t2.hundred</literal> can't be
    tested in the <literal>tenk2_unique2</literal> index, so it's applied at the
    join node.  This reduces the estimated output row count of the join node,
    but does not change either input scan.
   </para>

   <para>
    Notice that here the planner has chosen to <quote>materialize</quote> the inner
    relation of the join, by putting a Materialize plan node atop it.  This
    means that the <literal>t2</literal> index scan will be done just once, even
    though the nested-loop join node needs to read that data ten times, once
    for each row from the outer relation.  The Materialize node saves the data
    in memory as it's read, and then returns the data from memory on each
    subsequent pass.
   </para>

   <para>
    When dealing with outer joins, you might see join plan nodes with both
    <quote>Join Filter</quote> and plain <quote>Filter</quote> conditions attached.
    Join Filter conditions come from the outer join's <literal>ON</literal> clause,
    so a row that fails the Join Filter condition could still get emitted as
    a null-extended row.  But a plain Filter condition is applied after the
    outer-join rules and so acts to remove rows unconditionally.  In an inner
    join there is no semantic difference between these types of filters.
   </para>

   <para>
    If we change the query's selectivity a bit, we might get a very different
    join plan:

<screen>
EXPLAIN SELECT *
FROM tenk1 t1, tenk2 t2
WHERE t1.unique1 &lt; 100 AND t1.unique2 = t2.unique2;

                                        QUERY PLAN
-------------------------------------------------------------------&zwsp;-----------------------
 Hash Join  (cost=226.23..709.73 rows=100 width=488)

Title: Query Planning with Nested-Loop Joins and Filters
Summary
This section explains how the planner optimizes queries with nested-loop joins, including the application of join filters, materialization of inner relations, and the use of index scans, as well as how the planner handles outer joins and adjusts the join plan based on the query's selectivity, including the potential use of hash joins instead of nested-loop joins.