Home Explore Blog CI



postgresql

31th chunk of `doc/src/sgml/queries.sgml`
f1c9e81978892e4b186d362a30ed20918bf8607b618021730000000100000fa5
 <structfield>path</structfield> to the loop-prone query:

<programlisting>
WITH RECURSIVE search_graph(id, link, data, depth, <emphasis>is_cycle, path</emphasis>) AS (
    SELECT g.id, g.link, g.data, 0,
      <emphasis>false,
      ARRAY[g.id]</emphasis>
    FROM graph g
  UNION ALL
    SELECT g.id, g.link, g.data, sg.depth + 1,
      <emphasis>g.id = ANY(path),
      path || g.id</emphasis>
    FROM graph g, search_graph sg
    WHERE g.id = sg.link <emphasis>AND NOT is_cycle</emphasis>
)
SELECT * FROM search_graph;
</programlisting>

   Aside from preventing cycles, the array value is often useful in its own
   right as representing the <quote>path</quote> taken to reach any particular row.
  </para>

  <para>
   In the general case where more than one field needs to be checked to
   recognize a cycle, use an array of rows.  For example, if we needed to
   compare fields <structfield>f1</structfield> and <structfield>f2</structfield>:

<programlisting>
WITH RECURSIVE search_graph(id, link, data, depth, <emphasis>is_cycle, path</emphasis>) AS (
    SELECT g.id, g.link, g.data, 0,
      <emphasis>false,
      ARRAY[ROW(g.f1, g.f2)]</emphasis>
    FROM graph g
  UNION ALL
    SELECT g.id, g.link, g.data, sg.depth + 1,
      <emphasis>ROW(g.f1, g.f2) = ANY(path),
      path || ROW(g.f1, g.f2)</emphasis>
    FROM graph g, search_graph sg
    WHERE g.id = sg.link <emphasis>AND NOT is_cycle</emphasis>
)
SELECT * FROM search_graph;
</programlisting>
  </para>

  <tip>
   <para>
    Omit the <literal>ROW()</literal> syntax in the common case where only one field
    needs to be checked to recognize a cycle.  This allows a simple array
    rather than a composite-type array to be used, gaining efficiency.
   </para>
  </tip>

  <para>
   There is built-in syntax to simplify cycle detection.  The above query can
   also be written like this:
<programlisting>
WITH RECURSIVE search_graph(id, link, data, depth) AS (
    SELECT g.id, g.link, g.data, 1
    FROM graph g
  UNION ALL
    SELECT g.id, g.link, g.data, sg.depth + 1
    FROM graph g, search_graph sg
    WHERE g.id = sg.link
) <emphasis>CYCLE id SET is_cycle USING path</emphasis>
SELECT * FROM search_graph;
</programlisting>
   and it will be internally rewritten to the above form.  The
   <literal>CYCLE</literal> clause specifies first the list of columns to
   track for cycle detection, then a column name that will show whether a
   cycle has been detected, and finally the name of another column that will track the
   path.  The cycle and path columns will implicitly be added to the output
   rows of the CTE.
  </para>

  <tip>
   <para>
    The cycle path column is computed in the same way as the depth-first
    ordering column show in the previous section.  A query can have both a
    <literal>SEARCH</literal> and a <literal>CYCLE</literal> clause, but a
    depth-first search specification and a cycle detection specification would
    create redundant computations, so it's more efficient to just use the
    <literal>CYCLE</literal> clause and order by the path column.  If
    breadth-first ordering is wanted, then specifying both
    <literal>SEARCH</literal> and <literal>CYCLE</literal> can be useful.
   </para>
  </tip>

  <para>
   A helpful trick for testing queries
   when you are not certain if they might loop is to place a <literal>LIMIT</literal>
   in the parent query.  For example, this query would loop forever without
   the <literal>LIMIT</literal>:

<programlisting>
WITH RECURSIVE t(n) AS (
    SELECT 1
  UNION ALL
    SELECT n+1 FROM t
)
SELECT n FROM t <emphasis>LIMIT 100</emphasis>;
</programlisting>

   This works because <productname>PostgreSQL</productname>'s implementation
   evaluates only as many rows of a <literal>WITH</literal> query as are actually
   fetched by the parent query.  Using this trick in production is not
   recommended, because other systems might work differently.  Also, it
   usually won't work if you make the outer query sort the recursive

Title: Advanced Cycle Detection in SQL Recursive Queries
Summary
This section delves deeper into cycle detection techniques for SQL recursive queries. It expands on the previous example by demonstrating how to handle cases where multiple fields need to be checked for cycle detection. The text introduces a method using arrays of rows to track multiple fields, showing how to modify the query structure to accommodate this. It also presents a built-in syntax for simplifying cycle detection, using the CYCLE clause. This clause automatically adds cycle detection and path tracking to the query without manually modifying the CTE structure. The section concludes with tips on efficiently combining SEARCH and CYCLE clauses, and a practical trick for testing potentially infinite recursive queries using LIMIT. The content emphasizes the importance of proper cycle detection in preventing infinite loops and providing useful path information in recursive query results.