🗄️ SQL Q66 / 127

What is recursive CTE?

AI-Powered Answer ✓ Answered

A Common Table Expression (CTE) is a temporary, named result set that you can reference within a SELECT, INSERT, UPDATE, or DELETE statement. A recursive CTE is a special type of CTE that can reference itself, allowing it to process hierarchical data or perform iterative calculations within a single SQL query.

What is a Recursive CTE?

A recursive CTE is a powerful SQL feature designed to handle hierarchical data structures (like organizational charts, bill of materials, or file systems) or to generate a series of values. It works by repeatedly executing a part of the query until a termination condition is met.

It consists of two main parts, combined with a UNION ALL operator:

  • Anchor Member (Base Case): This is the non-recursive part of the CTE. It defines the initial result set and the starting point for the recursion.
  • Recursive Member: This part references the CTE itself. It processes the results from the previous iteration (or the anchor member in the first iteration) and generates new rows. This member must include a condition that eventually terminates the recursion to prevent an infinite loop.

Structure of a Recursive CTE

sql
WITH RECURSIVE <cte_name> AS (
    -- Anchor Member (Base Case)
    SELECT column1, column2, ...
    FROM initial_table
    WHERE initial_condition

    UNION ALL

    -- Recursive Member
    SELECT t.column1, t.column2, ...
    FROM next_table t
    INNER JOIN <cte_name> ON t.parent_id = <cte_name>.id
    WHERE recursive_condition
)
SELECT *
FROM <cte_name>;

Let's break down the components:

  • WITH RECURSIVE: This keyword (or just WITH in some systems if the CTE name is referenced) explicitly declares a recursive CTE.
  • Anchor Member: The first SELECT statement. It's executed once and forms the base result set. It does not reference the CTE itself.
  • UNION ALL: Combines the results of the anchor member and the recursive member. UNION ALL is used because it doesn't remove duplicates, which is often desired in hierarchical traversals, and it performs better than UNION.
  • Recursive Member: The second SELECT statement. This is where the magic happens. It *must* reference the CTE name itself in its FROM clause. Each time it executes, it takes the rows generated by the previous iteration (or the anchor member initially), processes them, and generates new rows. Crucially, it must include a join condition or a WHERE clause that ensures the recursion eventually stops.

Example: Organizational Hierarchy

Imagine an employees table with employee_id, employee_name, and manager_id. We want to find all employees reporting up to a specific manager.

sql
CREATE TABLE employees (
    employee_id INT PRIMARY KEY,
    employee_name VARCHAR(50),
    manager_id INT
);

INSERT INTO employees (employee_id, employee_name, manager_id) VALUES
(1, 'Alice (CEO)', NULL),
(2, 'Bob (VP Sales)', 1),
(3, 'Charlie (VP Marketing)', 1),
(4, 'David (Sales Manager)', 2),
(5, 'Eve (Marketing Manager)', 3),
(6, 'Frank (Sales Rep)', 4),
(7, 'Grace (Sales Rep)', 4),
(8, 'Heidi (Marketing Analyst)', 5);

-- Find all employees reporting directly or indirectly to Alice (employee_id = 1)
WITH RECURSIVE OrganizationChart AS (
    -- Anchor Member: Start with the top-level manager (Alice)
    SELECT
        e.employee_id,
        e.employee_name,
        e.manager_id,
        0 AS level
    FROM
        employees e
    WHERE
        e.employee_id = 1 -- Alice's ID

    UNION ALL

    -- Recursive Member: Find employees whose manager is in the previous result set
    SELECT
        e.employee_id,
        e.employee_name,
        e.manager_id,
        oc.level + 1 AS level
    FROM
        employees e
    INNER JOIN
        OrganizationChart oc ON e.manager_id = oc.employee_id
)
SELECT
    employee_id,
    employee_name,
    manager_id,
    level
FROM
    OrganizationChart
ORDER BY
    level, employee_name;

Key Use Cases and Considerations

  • Hierarchical Queries: Traversing organizational structures, bill of materials, category trees, file systems.
  • Graph Traversal: Finding paths in networks (e.g., social networks, transportation routes).
  • Series Generation: Generating sequences of numbers, dates, or other iterative data.
  • Termination Condition: Absolutely critical to avoid infinite loops. The recursive member must eventually produce no new rows, causing the recursion to stop.
  • Performance: Recursive CTEs can be resource-intensive for very deep hierarchies or large datasets. Indexes on the manager_id (or equivalent foreign key) and employee_id (or primary key) are highly beneficial. Some systems (like SQL Server) have a MAXRECURSION option to limit recursion depth.