Recursive CTEs — Fundamentals
Recursive CTEs — Fundamentals
Recursive CTEs are one of SQL's most powerful features. They let you work with hierarchical, tree-like, or sequence data without needing procedural loops. Understanding them unlocks an entirely new class of problems you can solve in pure SQL.
How Recursion Works in SQL
A recursive CTE has two parts separated by UNION ALL:
WITH RECURSIVE cte_name AS (
-- 1. Anchor member (base case — runs once)
SELECT ...
FROM some_table
WHERE <starting_condition>
UNION ALL
-- 2. Recursive member (runs repeatedly until no rows returned)
SELECT ...
FROM some_table
JOIN cte_name ON <join_to_previous_result>
)
SELECT * FROM cte_name;
Execution model:
- The anchor runs once and produces an initial result set (call it iteration 0)
- The recursive member runs using iteration 0 as input, producing iteration 1
- If iteration 1 has rows, run again using iteration 1 → iteration 2
- Continue until the recursive member produces 0 rows
- Final result = UNION ALL of all iterations
The RECURSIVE keyword is required even if only the CTE references itself.
Example 1: Generate a Number Series
The simplest recursive CTE — generate numbers 1 through 10:
WITH RECURSIVE nums AS (
SELECT 1 AS n -- anchor: start at 1
UNION ALL
SELECT n + 1 -- recursive: add 1 each time
FROM nums
WHERE n < 10 -- termination: stop at 10
)
SELECT n FROM nums;
This pattern is incredibly useful. PostgreSQL has generate_series(), but understanding this pattern is the foundation for all recursive CTEs.
Example 2: Generate a Date Series
Generate every day in Q1 2024:
WITH RECURSIVE date_series AS (
SELECT DATE '2024-01-01' AS dt
UNION ALL
SELECT (dt + INTERVAL '1 day')::DATE
FROM date_series
WHERE dt < DATE '2024-03-31'
)
SELECT dt FROM date_series;
Why this matters: When you join this series to real data, you ensure every date appears in your output — even dates with no ec_orders, no ec_events, or no prices. Gaps become explicit zeros instead of missing rows.
Example 3: Fill Gaps in Daily Order Counts
Without a date series, days with no ec_orders simply don't appear in GROUP BY results. With one, you can show zeros:
WITH RECURSIVE date_series AS (
SELECT MIN(DATE_TRUNC('day', created_at))::DATE AS dt
FROM ec_orders
UNION ALL
SELECT dt + 1
FROM date_series
WHERE dt < (SELECT MAX(DATE_TRUNC('day', created_at))::DATE FROM ec_orders)
),
daily_orders AS (
SELECT
DATE_TRUNC('day', created_at)::DATE AS order_date,
COUNT(*) AS order_count,
SUM(oi.quantity * oi.unit_price) AS revenue
FROM ec_orders o
JOIN ec_order_items oi ON oi.order_id = o.order_id
GROUP BY 1
)
SELECT
ds.dt AS date,
COALESCE(d.order_count, 0) AS ec_orders,
COALESCE(d.revenue, 0) AS revenue
FROM date_series ds
LEFT JOIN daily_orders d ON d.order_date = ds.dt
ORDER BY ds.dt;
Example 4: Stock Price Calendar (StockMarket)
Fill in missing trading days for a specific stock with the last known price (forward-fill):
WITH RECURSIVE date_series AS (
SELECT
MIN(price_date) AS dt,
MAX(price_date) AS end_dt
FROM stock_prices
WHERE company_id = 1
UNION ALL
SELECT dt + 1, end_dt
FROM date_series
WHERE dt < end_dt
),
actuals AS (
SELECT price_date, close_price
FROM stock_prices
WHERE company_id = 1
)
SELECT
ds.dt,
a.close_price,
CASE WHEN a.close_price IS NULL THEN 'missing' ELSE 'actual' END AS data_type
FROM date_series ds
LEFT JOIN actuals a ON a.price_date = ds.dt
ORDER BY ds.dt;
Note: True forward-fill requires a window function in a follow-up CTE. This query marks gaps — the next lesson covers filling them.
Example 5: Fibonacci Sequence
A classic recursion example to illustrate multi-column recursion:
WITH RECURSIVE fib AS (
SELECT 0 AS a, 1 AS b, 1 AS step
UNION ALL
SELECT b, a + b, step + 1
FROM fib
WHERE step < 15
)
SELECT step, a AS fibonacci_number FROM fib;
The trick: carry two values (a and b) so the next row can compute a + b.
Cycle Protection
Recursive CTEs on real data (especially graphs or hierarchies) can loop forever if there are cycles. PostgreSQL 16 introduced CYCLE detection:
WITH RECURSIVE
org_chart (id, parent_id, name) AS (
VALUES
(1, NULL::INT, 'CEO'),
(2, 1, 'VP Engineering'),
(3, 1, 'VP Sales'),
(4, 2, 'Engineering Manager'),
(5, 2, 'Senior Engineer'),
(6, 3, 'Sales Manager')
),
hierarchy AS (
SELECT id, parent_id, name, ARRAY[id] AS path
FROM org_chart
WHERE parent_id IS NULL
UNION ALL
SELECT c.id, c.parent_id, c.name, h.path || c.id
FROM org_chart c
JOIN hierarchy h ON h.id = c.parent_id
WHERE NOT c.id = ANY(h.path) -- manual cycle guard via path array
)
SELECT * FROM hierarchy;
Alternatively in PostgreSQL 14+:
WITH RECURSIVE hierarchy AS (
...
) CYCLE id SET is_cycle USING path
SELECT * FROM hierarchy WHERE NOT is_cycle;