Recursive CTEs — Fundamentals

Recursive CTEs — Fundamentals

Recursive CTEs are one of SQL's most powerful features. They let you work with hierarchical, tree-like, or sequence data without needing procedural loops. Understanding them unlocks an entirely new class of problems you can solve in pure SQL.

How Recursion Works in SQL

A recursive CTE has two parts separated by UNION ALL:

WITH RECURSIVE cte_name AS (
    -- 1. Anchor member (base case — runs once)
    SELECT ...
    FROM some_table
    WHERE <starting_condition>

    UNION ALL

    -- 2. Recursive member (runs repeatedly until no rows returned)
    SELECT ...
    FROM some_table
    JOIN cte_name ON <join_to_previous_result>
)
SELECT * FROM cte_name;

Execution model:

  1. The anchor runs once and produces an initial result set (call it iteration 0)
  2. The recursive member runs using iteration 0 as input, producing iteration 1
  3. If iteration 1 has rows, run again using iteration 1 → iteration 2
  4. Continue until the recursive member produces 0 rows
  5. Final result = UNION ALL of all iterations

The RECURSIVE keyword is required even if only the CTE references itself.


Example 1: Generate a Number Series

The simplest recursive CTE — generate numbers 1 through 10:

WITH RECURSIVE nums AS (
    SELECT 1 AS n          -- anchor: start at 1

    UNION ALL

    SELECT n + 1           -- recursive: add 1 each time
    FROM nums
    WHERE n < 10           -- termination: stop at 10
)
SELECT n FROM nums;

This pattern is incredibly useful. PostgreSQL has generate_series(), but understanding this pattern is the foundation for all recursive CTEs.


Example 2: Generate a Date Series

Generate every day in Q1 2024:

WITH RECURSIVE date_series AS (
    SELECT DATE '2024-01-01' AS dt

    UNION ALL

    SELECT (dt + INTERVAL '1 day')::DATE
    FROM date_series
    WHERE dt < DATE '2024-03-31'
)
SELECT dt FROM date_series;

Why this matters: When you join this series to real data, you ensure every date appears in your output — even dates with no ec_orders, no ec_events, or no prices. Gaps become explicit zeros instead of missing rows.


Example 3: Fill Gaps in Daily Order Counts

Without a date series, days with no ec_orders simply don't appear in GROUP BY results. With one, you can show zeros:

WITH RECURSIVE date_series AS (
    SELECT MIN(DATE_TRUNC('day', created_at))::DATE AS dt
    FROM ec_orders

    UNION ALL

    SELECT dt + 1
    FROM date_series
    WHERE dt < (SELECT MAX(DATE_TRUNC('day', created_at))::DATE FROM ec_orders)
),

daily_orders AS (
    SELECT
        DATE_TRUNC('day', created_at)::DATE AS order_date,
        COUNT(*) AS order_count,
        SUM(oi.quantity * oi.unit_price) AS revenue
    FROM ec_orders o
    JOIN ec_order_items oi ON oi.order_id = o.order_id
    GROUP BY 1
)

SELECT
    ds.dt AS date,
    COALESCE(d.order_count, 0) AS ec_orders,
    COALESCE(d.revenue, 0)     AS revenue
FROM date_series ds
LEFT JOIN daily_orders d ON d.order_date = ds.dt
ORDER BY ds.dt;

Example 4: Stock Price Calendar (StockMarket)

Fill in missing trading days for a specific stock with the last known price (forward-fill):

WITH RECURSIVE date_series AS (
    SELECT
        MIN(price_date) AS dt,
        MAX(price_date) AS end_dt
    FROM stock_prices
    WHERE company_id = 1

    UNION ALL

    SELECT dt + 1, end_dt
    FROM date_series
    WHERE dt < end_dt
),

actuals AS (
    SELECT price_date, close_price
    FROM stock_prices
    WHERE company_id = 1
)

SELECT
    ds.dt,
    a.close_price,
    CASE WHEN a.close_price IS NULL THEN 'missing' ELSE 'actual' END AS data_type
FROM date_series ds
LEFT JOIN actuals a ON a.price_date = ds.dt
ORDER BY ds.dt;

Note: True forward-fill requires a window function in a follow-up CTE. This query marks gaps — the next lesson covers filling them.


Example 5: Fibonacci Sequence

A classic recursion example to illustrate multi-column recursion:

WITH RECURSIVE fib AS (
    SELECT 0 AS a, 1 AS b, 1 AS step

    UNION ALL

    SELECT b, a + b, step + 1
    FROM fib
    WHERE step < 15
)
SELECT step, a AS fibonacci_number FROM fib;

The trick: carry two values (a and b) so the next row can compute a + b.


Cycle Protection

Recursive CTEs on real data (especially graphs or hierarchies) can loop forever if there are cycles. PostgreSQL 16 introduced CYCLE detection:

WITH RECURSIVE
org_chart (id, parent_id, name) AS (
    VALUES
        (1, NULL::INT, 'CEO'),
        (2, 1, 'VP Engineering'),
        (3, 1, 'VP Sales'),
        (4, 2, 'Engineering Manager'),
        (5, 2, 'Senior Engineer'),
        (6, 3, 'Sales Manager')
),
hierarchy AS (
    SELECT id, parent_id, name, ARRAY[id] AS path
    FROM org_chart
    WHERE parent_id IS NULL

    UNION ALL

    SELECT c.id, c.parent_id, c.name, h.path || c.id
    FROM org_chart c
    JOIN hierarchy h ON h.id = c.parent_id
    WHERE NOT c.id = ANY(h.path)   -- manual cycle guard via path array
)
SELECT * FROM hierarchy;

Alternatively in PostgreSQL 14+:

WITH RECURSIVE hierarchy AS (
    ...
) CYCLE id SET is_cycle USING path
SELECT * FROM hierarchy WHERE NOT is_cycle;

Purchase this course to unlock the full lesson.

Sign up