DML with CTEs — UPDATE, DELETE, INSERT

DML with CTEs — UPDATE, DELETE, INSERT

CTEs aren't limited to SELECT queries. PostgreSQL allows you to use CTEs with data modification statements — UPDATE, DELETE, and INSERT. This unlocks some very powerful patterns: conditional updates, bulk operations based on complex logic, and multi-step data pipelines in a single statement.

Syntax Overview

-- CTE used with UPDATE
WITH source AS (
    SELECT ...
)
UPDATE target_table
SET column = source.value
FROM source
WHERE target_table.id = source.id;

-- CTE used with DELETE
WITH to_delete AS (
    SELECT id FROM table WHERE some_complex_condition
)
DELETE FROM table WHERE id IN (SELECT id FROM to_delete);

-- CTE used with INSERT (upsert pattern)
WITH new_data AS (
    SELECT ...
)
INSERT INTO target SELECT * FROM new_data;

Example 1: UPDATE Based on Aggregated Logic

Recalculate and sync total_amount on each order using the actual sum from its line items — a common ETL correction pattern:

WITH order_totals AS (
    SELECT
        order_id,
        ROUND(SUM(quantity * unit_price), 2) AS calculated_total
    FROM ec_order_items
    GROUP BY order_id
)
UPDATE ec_orders o
SET total_amount = ot.calculated_total
FROM order_totals ot
WHERE o.order_id = ot.order_id
  AND o.total_amount IS DISTINCT FROM ot.calculated_total;  -- only update rows that differ

This computes the correct total in a CTE, then uses it to drive an UPDATE — all in one atomic statement. IS DISTINCT FROM avoids unnecessary writes when the value hasn't changed.


Example 2: DELETE with Complex Conditions

Delete test/bot ec_orders (zero-value ec_orders placed within 1 second of registration):

WITH test_orders AS (
    SELECT o.order_id
    FROM ec_orders o
    JOIN ec_customers c ON c.customer_id = o.customer_id
    LEFT JOIN ec_order_items oi ON oi.order_id = o.order_id
    WHERE oi.order_id IS NULL   -- ec_orders with no items
       OR o.created_at < c.created_at + INTERVAL '5 seconds'  -- placed instantly after signup
)
DELETE FROM ec_orders
WHERE order_id IN (SELECT order_id FROM test_orders);

Always test with SELECT first: Run SELECT * FROM test_orders before wrapping in DELETE.


Example 3: INSERT with RETURNING

Use a CTE to capture the IDs of newly inserted rows for downstream use:

WITH new_customers AS (
    INSERT INTO ec_customers (name, email, country, acquisition_channel, created_at)
    VALUES
        ('Alice Chen', 'alice.dml@example.com', 'US', 'organic', NOW()),
        ('Bob Smith', 'bob.dml@example.com', 'UK', 'paid', NOW())
    RETURNING customer_id, email
),

welcome_events AS (
    INSERT INTO ec_events (customer_id, event_type, created_at)
    SELECT customer_id, 'signup', NOW()
    FROM new_customers
    RETURNING event_id, customer_id
)

SELECT nc.email, we.event_id
FROM new_customers nc
JOIN welcome_events we ON we.customer_id = nc.customer_id;

RETURNING makes inserted IDs available to downstream CTEs in the same statement.


Example 4: Upsert (INSERT ... ON CONFLICT) with CTE

Update existing records, insert new ones — the classic upsert pattern. stock_prices has a UNIQUE(company_id, price_date) constraint, making it a natural fit:

-- Re-ingest today's price feed — insert if new, update if already loaded
WITH todays_feed(company_id, price_date, open_price, high_price, low_price, close_price, volume) AS (
    VALUES
        (1, CURRENT_DATE, 182.50, 185.00, 181.20, 183.75, 2100000),
        (2, CURRENT_DATE,  94.30,  96.10,  93.80,  95.50, 1850000),
        (3, CURRENT_DATE, 210.00, 213.50, 209.40, 212.10, 3200000)
)
INSERT INTO stock_prices (company_id, price_date, open_price, high_price, low_price, close_price, volume)
SELECT company_id, price_date, open_price, high_price, low_price, close_price, volume
FROM todays_feed
ON CONFLICT (company_id, price_date) DO UPDATE
    SET open_price  = EXCLUDED.open_price,
        high_price  = EXCLUDED.high_price,
        low_price   = EXCLUDED.low_price,
        close_price = EXCLUDED.close_price,
        volume      = EXCLUDED.volume;

Example 5: Multi-Step Data Pipeline in One Transaction

Cancel old pending orders and refund their customers — identify, update, and report in one atomic statement using writable CTEs:

WITH stale_pending AS (
    -- Identify orders stuck in 'pending' for over 30 days
    SELECT order_id, customer_id
    FROM ec_orders
    WHERE status = 'pending'
      AND created_at < NOW() - INTERVAL '30 days'
),

cancelled AS (
    -- Cancel them
    UPDATE ec_orders
    SET status = 'cancelled'
    WHERE order_id IN (SELECT order_id FROM stale_pending)
    RETURNING order_id, customer_id
)

-- Return a summary of what happened
SELECT
    COUNT(*)                        AS orders_cancelled,
    COUNT(DISTINCT customer_id)     AS customers_affected
FROM cancelled;

Critical: All CTEs in a single SQL statement run in the same transaction. If any part fails, the whole thing rolls back.


Example 6: Bulk UPDATE from Finance Dataset

Update invoice status based on payment receipts:

WITH payment_totals AS (
    SELECT
        invoice_id,
        SUM(amount) AS total_paid
    FROM payments
    GROUP BY invoice_id
),

invoice_amounts AS (
    SELECT
        invoice_id,
        SUM(quantity * unit_price) AS total_invoiced
    FROM invoice_items
    GROUP BY invoice_id
),

status_update AS (
    SELECT
        i.invoice_id,
        CASE
            WHEN pt.total_paid IS NULL THEN 'unpaid'
            WHEN pt.total_paid >= ia.total_invoiced THEN 'paid'
            WHEN pt.total_paid > 0 THEN 'partial'
            ELSE 'unpaid'
        END AS new_status
    FROM invoices i
    LEFT JOIN payment_totals pt ON pt.invoice_id = i.invoice_id
    LEFT JOIN invoice_amounts ia ON ia.invoice_id = i.invoice_id
)

UPDATE invoices inv
SET status = su.new_status
FROM status_update su
WHERE inv.invoice_id = su.invoice_id
  AND inv.status IS DISTINCT FROM su.new_status;  -- only update if status actually changed

Purchase this course to unlock the full lesson.

Sign up