MATERIALIZED CTEs and Performance
MATERIALIZED CTEs and Performance
CTEs look like they should cache results — you write the logic once and reference the name multiple times. But the default behavior may surprise you. Understanding when PostgreSQL materializes a CTE (and when it doesn't) is essential for writing performant queries.
The Default Behavior Has Changed
PostgreSQL < 12: CTEs were always materialized (evaluated once, result stored in memory, then referenced)
PostgreSQL 12+: CTEs are inlined by default — the optimizer merges the CTE into the outer query, as if you'd written a subquery
This is usually good! Inlining allows the optimizer to push conditions down into the CTE, use indexes, and apply other optimizations. But sometimes you need to force materialization.
When Inlining Helps
-- Without MATERIALIZED, the optimizer can push the WHERE clause inside
WITH recent_orders AS (
SELECT * FROM ec_orders WHERE created_at >= '2024-01-01'
)
SELECT * FROM recent_orders WHERE customer_id = 42;
-- The optimizer may rewrite this as:
-- SELECT * FROM ec_orders WHERE created_at >= '2024-01-01' AND customer_id = 42;
-- ...and use an index on (customer_id, created_at)
This is the optimizer doing its job — inlining is usually what you want.
When You Need MATERIALIZED
Use MATERIALIZED when:
- The CTE is referenced multiple times — prevents re-execution
- The CTE has side effects (rare in SELECT queries)
- You want a stable intermediate result — the optimizer shouldn't reshape it
- You're debugging — ensures each CTE step is evaluated independently
WITH expensive_aggregation AS MATERIALIZED (
SELECT
customer_id,
COUNT(*) AS order_count,
SUM(oi.quantity * oi.unit_price) AS lifetime_value,
AVG(oi.quantity * oi.unit_price) AS avg_order_value
FROM ec_orders o
JOIN ec_order_items oi ON oi.order_id = o.order_id
GROUP BY customer_id
)
-- Reference the CTE twice — with MATERIALIZED, computed only once
SELECT 'high_value' AS segment, customer_id, lifetime_value
FROM expensive_aggregation
WHERE lifetime_value > 10000
UNION ALL
SELECT 'at_risk', customer_id, lifetime_value
FROM expensive_aggregation
WHERE lifetime_value < 500 AND order_count < 2;
Without MATERIALIZED, PostgreSQL might execute expensive_aggregation twice.
NOT MATERIALIZED — Force Inlining
Sometimes older code or functions wrap CTEs in ways that prevent inlining. Force it explicitly:
WITH customer_summary AS NOT MATERIALIZED (
SELECT customer_id, COUNT(*) AS cnt
FROM ec_orders
GROUP BY customer_id
)
SELECT * FROM customer_summary WHERE cnt > 5;
Example 1: Reused CTE — When Materialization Matters
This CTE computes all order totals and is referenced twice:
WITH order_totals AS MATERIALIZED (
SELECT
o.order_id,
o.customer_id,
o.created_at,
SUM(oi.quantity * oi.unit_price) AS total
FROM ec_orders o
JOIN ec_order_items oi ON oi.order_id = o.order_id
GROUP BY o.order_id, o.customer_id, o.created_at
),
high_value AS (
SELECT customer_id, COUNT(*) AS hv_orders
FROM order_totals
WHERE total > 500
GROUP BY customer_id
),
all_orders AS (
SELECT customer_id, COUNT(*) AS total_orders
FROM order_totals -- second reference to the same CTE
GROUP BY customer_id
)
SELECT
c.name AS customer,
ao.total_orders,
COALESCE(hv.hv_orders, 0) AS high_value_orders,
ROUND(100.0 * COALESCE(hv.hv_orders, 0) / ao.total_orders, 1) AS hv_pct
FROM all_orders ao
LEFT JOIN high_value hv ON hv.customer_id = ao.customer_id
JOIN ec_customers c ON c.customer_id = ao.customer_id
ORDER BY hv_pct DESC;
Example 2: EXPLAIN to See the Difference
The best way to understand what PostgreSQL is actually doing:
-- Without MATERIALIZED
EXPLAIN
WITH order_summary AS (
SELECT customer_id, COUNT(*) AS cnt
FROM ec_orders
GROUP BY customer_id
)
SELECT * FROM order_summary WHERE cnt > 3;
-- With MATERIALIZED
EXPLAIN
WITH order_summary AS MATERIALIZED (
SELECT customer_id, COUNT(*) AS cnt
FROM ec_orders
GROUP BY customer_id
)
SELECT * FROM order_summary WHERE cnt > 3;
Look for CTE Scan in the output — that's when PostgreSQL materializes. Without it, the CTE is inlined.
Example 3: CTE as an Optimization Fence (Legacy Technique)
Before PostgreSQL 12, developers used CTEs intentionally as "optimization fences" — to prevent the planner from doing certain joins in certain ec_orders. This is now less relevant (use MATERIALIZED explicitly instead), but you'll see it in older code:
-- Old pattern (pre-PG12): CTE as fence to force sequential evaluation
WITH step1 AS (
SELECT * FROM ec_orders WHERE status = 'completed' LIMIT 1000
),
step2 AS (
SELECT * FROM step1 WHERE customer_id < 500
)
SELECT * FROM step2;
-- Modern equivalent (explicit):
WITH step1 AS MATERIALIZED (
SELECT * FROM ec_orders WHERE status = 'completed' LIMIT 1000
),
step2 AS (
SELECT * FROM step1 WHERE customer_id < 500
)
SELECT * FROM step2;