<structfield>path</structfield> to the loop-prone query:
<programlisting>
WITH RECURSIVE search_graph(id, link, data, depth, <emphasis>is_cycle, path</emphasis>) AS (
SELECT g.id, g.link, g.data, 0,
<emphasis>false,
ARRAY[g.id]</emphasis>
FROM graph g
UNION ALL
SELECT g.id, g.link, g.data, sg.depth + 1,
<emphasis>g.id = ANY(path),
path || g.id</emphasis>
FROM graph g, search_graph sg
WHERE g.id = sg.link <emphasis>AND NOT is_cycle</emphasis>
)
SELECT * FROM search_graph;
</programlisting>
Aside from preventing cycles, the array value is often useful in its own
right as representing the <quote>path</quote> taken to reach any particular row.
</para>
<para>
In the general case where more than one field needs to be checked to
recognize a cycle, use an array of rows. For example, if we needed to
compare fields <structfield>f1</structfield> and <structfield>f2</structfield>:
<programlisting>
WITH RECURSIVE search_graph(id, link, data, depth, <emphasis>is_cycle, path</emphasis>) AS (
SELECT g.id, g.link, g.data, 0,
<emphasis>false,
ARRAY[ROW(g.f1, g.f2)]</emphasis>
FROM graph g
UNION ALL
SELECT g.id, g.link, g.data, sg.depth + 1,
<emphasis>ROW(g.f1, g.f2) = ANY(path),
path || ROW(g.f1, g.f2)</emphasis>
FROM graph g, search_graph sg
WHERE g.id = sg.link <emphasis>AND NOT is_cycle</emphasis>
)
SELECT * FROM search_graph;
</programlisting>
</para>
<tip>
<para>
Omit the <literal>ROW()</literal> syntax in the common case where only one field
needs to be checked to recognize a cycle. This allows a simple array
rather than a composite-type array to be used, gaining efficiency.
</para>
</tip>
<para>
There is built-in syntax to simplify cycle detection. The above query can
also be written like this:
<programlisting>
WITH RECURSIVE search_graph(id, link, data, depth) AS (
SELECT g.id, g.link, g.data, 1
FROM graph g
UNION ALL
SELECT g.id, g.link, g.data, sg.depth + 1
FROM graph g, search_graph sg
WHERE g.id = sg.link
) <emphasis>CYCLE id SET is_cycle USING path</emphasis>
SELECT * FROM search_graph;
</programlisting>
and it will be internally rewritten to the above form. The
<literal>CYCLE</literal> clause specifies first the list of columns to
track for cycle detection, then a column name that will show whether a
cycle has been detected, and finally the name of another column that will track the
path. The cycle and path columns will implicitly be added to the output
rows of the CTE.
</para>
<tip>
<para>
The cycle path column is computed in the same way as the depth-first
ordering column show in the previous section. A query can have both a
<literal>SEARCH</literal> and a <literal>CYCLE</literal> clause, but a
depth-first search specification and a cycle detection specification would
create redundant computations, so it's more efficient to just use the
<literal>CYCLE</literal> clause and order by the path column. If
breadth-first ordering is wanted, then specifying both
<literal>SEARCH</literal> and <literal>CYCLE</literal> can be useful.
</para>
</tip>
<para>
A helpful trick for testing queries
when you are not certain if they might loop is to place a <literal>LIMIT</literal>
in the parent query. For example, this query would loop forever without
the <literal>LIMIT</literal>:
<programlisting>
WITH RECURSIVE t(n) AS (
SELECT 1
UNION ALL
SELECT n+1 FROM t
)
SELECT n FROM t <emphasis>LIMIT 100</emphasis>;
</programlisting>
This works because <productname>PostgreSQL</productname>'s implementation
evaluates only as many rows of a <literal>WITH</literal> query as are actually
fetched by the parent query. Using this trick in production is not
recommended, because other systems might work differently. Also, it
usually won't work if you make the outer query sort the recursive