Home Explore Blog CI



postgresql

30th chunk of `doc/src/sgml/queries.sgml`
448e3726021b46ce1b92982c210d00973a89fc75a1ab25fa0000000100000fa0
 it is perhaps unsound to rely on it.  The order of the rows within each
     level is certainly undefined, so some explicit ordering might be desired
     in any case.
    </para>
   </tip>

   <para>
    There is built-in syntax to compute a depth- or breadth-first sort column.
    For example:

<programlisting>
WITH RECURSIVE search_tree(id, link, data) AS (
    SELECT t.id, t.link, t.data
    FROM tree t
  UNION ALL
    SELECT t.id, t.link, t.data
    FROM tree t, search_tree st
    WHERE t.id = st.link
) <emphasis>SEARCH DEPTH FIRST BY id SET ordercol</emphasis>
SELECT * FROM search_tree ORDER BY ordercol;

WITH RECURSIVE search_tree(id, link, data) AS (
    SELECT t.id, t.link, t.data
    FROM tree t
  UNION ALL
    SELECT t.id, t.link, t.data
    FROM tree t, search_tree st
    WHERE t.id = st.link
) <emphasis>SEARCH BREADTH FIRST BY id SET ordercol</emphasis>
SELECT * FROM search_tree ORDER BY ordercol;
</programlisting>
    This syntax is internally expanded to something similar to the above
    hand-written forms.  The <literal>SEARCH</literal> clause specifies whether
    depth- or breadth first search is wanted, the list of columns to track for
    sorting, and a column name that will contain the result data that can be
    used for sorting.  That column will implicitly be added to the output rows
    of the CTE.
   </para>
  </sect3>

  <sect3 id="queries-with-cycle">
   <title>Cycle Detection</title>

  <para>
   When working with recursive queries it is important to be sure that
   the recursive part of the query will eventually return no tuples,
   or else the query will loop indefinitely.  Sometimes, using
   <literal>UNION</literal> instead of <literal>UNION ALL</literal> can accomplish this
   by discarding rows that duplicate previous output rows.  However, often a
   cycle does not involve output rows that are completely duplicate: it may be
   necessary to check just one or a few fields to see if the same point has
   been reached before.  The standard method for handling such situations is
   to compute an array of the already-visited values.  For example, consider again
   the following query that searches a table <structname>graph</structname> using a
   <structfield>link</structfield> field:

<programlisting>
WITH RECURSIVE search_graph(id, link, data, depth) AS (
    SELECT g.id, g.link, g.data, 0
    FROM graph g
  UNION ALL
    SELECT g.id, g.link, g.data, sg.depth + 1
    FROM graph g, search_graph sg
    WHERE g.id = sg.link
)
SELECT * FROM search_graph;
</programlisting>

   This query will loop if the <structfield>link</structfield> relationships contain
   cycles.  Because we require a <quote>depth</quote> output, just changing
   <literal>UNION ALL</literal> to <literal>UNION</literal> would not eliminate the looping.
   Instead we need to recognize whether we have reached the same row again
   while following a particular path of links.  We add two columns
   <structfield>is_cycle</structfield> and <structfield>path</structfield> to the loop-prone query:

<programlisting>
WITH RECURSIVE search_graph(id, link, data, depth, <emphasis>is_cycle, path</emphasis>) AS (
    SELECT g.id, g.link, g.data, 0,
      <emphasis>false,
      ARRAY[g.id]</emphasis>
    FROM graph g
  UNION ALL
    SELECT g.id, g.link, g.data, sg.depth + 1,
      <emphasis>g.id = ANY(path),
      path || g.id</emphasis>
    FROM graph g, search_graph sg
    WHERE g.id = sg.link <emphasis>AND NOT is_cycle</emphasis>
)
SELECT * FROM search_graph;
</programlisting>

   Aside from preventing cycles, the array value is often useful in its own
   right as representing the <quote>path</quote> taken to reach any particular row.
  </para>

  <para>
   In the general case where more than one field needs to be checked to
   recognize a cycle, use an array of rows.  For example, if we needed to
   compare fields <structfield>f1</structfield> and <structfield>f2</structfield>:

<programlisting>
WITH RECURSIVE search_graph(id, link, data,

Title: Cycle Detection in SQL Recursive Queries
Summary
This section discusses the importance of cycle detection in recursive SQL queries to prevent infinite loops. It introduces a method using arrays to track visited values, demonstrating how to modify a recursive query to include cycle detection. The technique involves adding columns for 'is_cycle' and 'path' to the query structure. The 'path' column stores an array of previously visited node IDs, while 'is_cycle' checks if the current node has been visited before. The text provides SQL examples for implementing this approach, including cases where multiple fields need to be checked for cycle detection. It emphasizes that this method not only prevents cycles but also provides useful information about the path taken to reach each row in the result set.