What is recursive CTE?
A Common Table Expression (CTE) is a temporary, named result set that you can reference within a SELECT, INSERT, UPDATE, or DELETE statement. A recursive CTE is a special type of CTE that can reference itself, allowing it to process hierarchical data or perform iterative calculations within a single SQL query.
What is a Recursive CTE?
A recursive CTE is a powerful SQL feature designed to handle hierarchical data structures (like organizational charts, bill of materials, or file systems) or to generate a series of values. It works by repeatedly executing a part of the query until a termination condition is met.
It consists of two main parts, combined with a UNION ALL operator:
- Anchor Member (Base Case): This is the non-recursive part of the CTE. It defines the initial result set and the starting point for the recursion.
- Recursive Member: This part references the CTE itself. It processes the results from the previous iteration (or the anchor member in the first iteration) and generates new rows. This member must include a condition that eventually terminates the recursion to prevent an infinite loop.
Structure of a Recursive CTE
WITH RECURSIVE <cte_name> AS (
-- Anchor Member (Base Case)
SELECT column1, column2, ...
FROM initial_table
WHERE initial_condition
UNION ALL
-- Recursive Member
SELECT t.column1, t.column2, ...
FROM next_table t
INNER JOIN <cte_name> ON t.parent_id = <cte_name>.id
WHERE recursive_condition
)
SELECT *
FROM <cte_name>;
Let's break down the components:
WITH RECURSIVE: This keyword (or justWITHin some systems if the CTE name is referenced) explicitly declares a recursive CTE.- Anchor Member: The first
SELECTstatement. It's executed once and forms the base result set. It does not reference the CTE itself. UNION ALL: Combines the results of the anchor member and the recursive member.UNION ALLis used because it doesn't remove duplicates, which is often desired in hierarchical traversals, and it performs better thanUNION.- Recursive Member: The second
SELECTstatement. This is where the magic happens. It *must* reference the CTE name itself in itsFROMclause. Each time it executes, it takes the rows generated by the previous iteration (or the anchor member initially), processes them, and generates new rows. Crucially, it must include a join condition or aWHEREclause that ensures the recursion eventually stops.
Example: Organizational Hierarchy
Imagine an employees table with employee_id, employee_name, and manager_id. We want to find all employees reporting up to a specific manager.
CREATE TABLE employees (
employee_id INT PRIMARY KEY,
employee_name VARCHAR(50),
manager_id INT
);
INSERT INTO employees (employee_id, employee_name, manager_id) VALUES
(1, 'Alice (CEO)', NULL),
(2, 'Bob (VP Sales)', 1),
(3, 'Charlie (VP Marketing)', 1),
(4, 'David (Sales Manager)', 2),
(5, 'Eve (Marketing Manager)', 3),
(6, 'Frank (Sales Rep)', 4),
(7, 'Grace (Sales Rep)', 4),
(8, 'Heidi (Marketing Analyst)', 5);
-- Find all employees reporting directly or indirectly to Alice (employee_id = 1)
WITH RECURSIVE OrganizationChart AS (
-- Anchor Member: Start with the top-level manager (Alice)
SELECT
e.employee_id,
e.employee_name,
e.manager_id,
0 AS level
FROM
employees e
WHERE
e.employee_id = 1 -- Alice's ID
UNION ALL
-- Recursive Member: Find employees whose manager is in the previous result set
SELECT
e.employee_id,
e.employee_name,
e.manager_id,
oc.level + 1 AS level
FROM
employees e
INNER JOIN
OrganizationChart oc ON e.manager_id = oc.employee_id
)
SELECT
employee_id,
employee_name,
manager_id,
level
FROM
OrganizationChart
ORDER BY
level, employee_name;
Key Use Cases and Considerations
- Hierarchical Queries: Traversing organizational structures, bill of materials, category trees, file systems.
- Graph Traversal: Finding paths in networks (e.g., social networks, transportation routes).
- Series Generation: Generating sequences of numbers, dates, or other iterative data.
- Termination Condition: Absolutely critical to avoid infinite loops. The recursive member must eventually produce no new rows, causing the recursion to stop.
- Performance: Recursive CTEs can be resource-intensive for very deep hierarchies or large datasets. Indexes on the
manager_id(or equivalent foreign key) andemployee_id(or primary key) are highly beneficial. Some systems (like SQL Server) have aMAXRECURSIONoption to limit recursion depth.