WITH Queries and Recursive CTEs in SQL

statements for use in a larger query. These statements, which are often referred to as Common Table Expressions or <acronym>CTE</acronym>s, can be thought of as defining temporary tables that exist just for one query. Each auxiliary statement in a <literal>WITH</literal> clause can be a <command>SELECT</command>, <command>INSERT</command>, <command>UPDATE</command>, <command>DELETE</command>, or <command>MERGE</command>; and the <literal>WITH</literal> clause itself is attached to a primary statement that can also be a <command>SELECT</command>, <command>INSERT</command>, <command>UPDATE</command>, <command>DELETE</command>, or <command>MERGE</command>. </para> <sect2 id="queries-with-select"> <title><command>SELECT</command> in <literal>WITH</literal></title> <para> The basic value of <command>SELECT</command> in <literal>WITH</literal> is to break down complicated queries into simpler parts. An example is: <programlisting> WITH regional_sales AS ( SELECT region, SUM(amount) AS total_sales FROM orders GROUP BY region ), top_regions AS ( SELECT region FROM regional_sales WHERE total_sales > (SELECT SUM(total_sales)/10 FROM regional_sales) ) SELECT region, product, SUM(quantity) AS product_units, SUM(amount) AS product_sales FROM orders WHERE region IN (SELECT region FROM top_regions) GROUP BY region, product; </programlisting> which displays per-product sales totals in only the top sales regions. The <literal>WITH</literal> clause defines two auxiliary statements named <structname>regional_sales</structname> and <structname>top_regions</structname>, where the output of <structname>regional_sales</structname> is used in <structname>top_regions</structname> and the output of <structname>top_regions</structname> is used in the primary <command>SELECT</command> query. This example could have been written without <literal>WITH</literal>, but we'd have needed two levels of nested sub-<command>SELECT</command>s. It's a bit easier to follow this way. </para> </sect2> <sect2 id="queries-with-recursive"> <title>Recursive Queries</title> <para> <indexterm> <primary>RECURSIVE</primary> <secondary>in common table expressions</secondary> </indexterm> The optional <literal>RECURSIVE</literal> modifier changes <literal>WITH</literal> from a mere syntactic convenience into a feature that accomplishes things not otherwise possible in standard SQL. Using <literal>RECURSIVE</literal>, a <literal>WITH</literal> query can refer to its own output. A very simple example is this query to sum the integers from 1 through 100: <programlisting> WITH RECURSIVE t(n) AS ( VALUES (1) UNION ALL SELECT n+1 FROM t WHERE n < 100 ) SELECT sum(n) FROM t; </programlisting> The general form of a recursive <literal>WITH</literal> query is always a <firstterm>non-recursive term</firstterm>, then <literal>UNION</literal> (or <literal>UNION ALL</literal>), then a <firstterm>recursive term</firstterm>, where only the recursive term can contain a reference to the query's own output. Such a query is executed as follows: </para> <procedure> <title>Recursive Query Evaluation</title> <step performance="required"> <para> Evaluate the non-recursive term. For <literal>UNION</literal> (but not <literal>UNION ALL</literal>), discard duplicate rows. Include all remaining rows in the result of the recursive query, and also place them in a temporary <firstterm>working table</firstterm>. </para> </step> <step performance="required"> <para> So long as the working table is not empty, repeat these steps: </para> <substeps> <step performance="required"> <para> Evaluate the recursive term, substituting the current contents of the working table for the recursive self-reference. For <literal>UNION</literal>

This section explains the WITH clause in SQL, also known as Common Table Expressions (CTEs). It describes how WITH can be used with various SQL commands like SELECT, INSERT, UPDATE, DELETE, and MERGE. The text focuses on using SELECT within WITH clauses, demonstrating how it can simplify complex queries by breaking them into smaller parts. An example is provided showing how to calculate per-product sales totals in top sales regions. The section also introduces recursive queries using the RECURSIVE modifier, which allows a WITH query to refer to its own output. A simple example of summing integers from 1 to 100 is given, followed by a detailed explanation of how recursive queries are evaluated. This includes the process of evaluating non-recursive and recursive terms, handling duplicates, and using a working table for iterative computation.