ROW_NUMBER() — Unique Row Numbering

ROW_NUMBER() assigns a unique sequential integer to each row within a window partition. No ties — every row gets a different number, even if rows are identical. It's the foundation of deduplication, pagination, and top-N queries.


Basic syntax

-- Syntax:
-- ROW_NUMBER() OVER (
--     PARTITION BY partition_col   -- reset counter for each group (optional)
--     ORDER BY     sort_col        -- determines which row gets number 1
-- )

The simplest possible use — number every order by date:

SELECT
    order_id,
    customer_id,
    created_at,
    status,
    ROW_NUMBER() OVER (ORDER BY created_at) AS rn
FROM orders;

Every row gets a unique number 1, 2, 3… ordered by created_at. If two orders have identical timestamps, the tie is broken arbitrarily (but consistently within a single query execution).


PARTITION BY — resetting the counter per group

Add PARTITION BY to number rows independently within each group:

-- Number each customer's orders chronologically
SELECT
    o.order_id,
    c.name         AS customer,
    o.created_at,
    o.status,
    ROW_NUMBER() OVER (
        PARTITION BY o.customer_id
        ORDER BY o.created_at
    ) AS order_num
FROM orders o
JOIN customers c ON c.customer_id = o.customer_id
ORDER BY c.name, order_num;

Customer 1's orders get numbers 1, 2, 3… Customer 2's orders independently get 1, 2, 3… The counter resets at each partition boundary.


Use case: identifying each customer's first order

Wrap in a CTE and filter rn = 1:

WITH numbered AS (
    SELECT
        o.order_id,
        o.customer_id,
        o.created_at,
        o.status,
        ROW_NUMBER() OVER (
            PARTITION BY o.customer_id
            ORDER BY o.created_at
        ) AS rn
    FROM orders o
)
SELECT
    n.order_id,
    c.name       AS customer,
    n.created_at AS first_order_date,
    n.status
FROM numbered n
JOIN customers c ON c.customer_id = n.customer_id
WHERE n.rn = 1
ORDER BY n.created_at;

This is the canonical "first per group" pattern. It works for any definition of "first" — change ORDER BY created_at to ORDER BY total_amount DESC to get each customer's largest order instead.


Use case: deduplication

When a table has duplicate rows and you want to keep exactly one per group:

-- Simulate a table with duplicate order_items entries
WITH dupes AS (
    SELECT
        *,
        ROW_NUMBER() OVER (
            PARTITION BY order_id, product_id
            ORDER BY item_id
        ) AS rn
    FROM order_items
)
-- Keep only the first occurrence of each order+product combination
SELECT order_id, product_id, quantity, unit_price
FROM dupes
WHERE rn = 1;

This is the safest deduplication approach in PostgreSQL — it gives you full control over which duplicate survives (the one with the lowest item_id here).


Use case: pagination

-- Page 3, 20 rows per page (rows 10-60)
WITH paginated AS (
    SELECT
        p.name,
        p.category,
        p.price,
        SUM(oi.quantity * oi.unit_price) AS total_revenue,
        ROW_NUMBER() OVER (ORDER BY SUM(oi.quantity * oi.unit_price) DESC) AS rn
    FROM products p
    LEFT JOIN order_items oi ON oi.product_id = p.product_id
    GROUP BY p.product_id, p.name, p.category, p.price
)
SELECT name, category, price, total_revenue
FROM paginated
WHERE rn BETWEEN 10 AND 60;

In practice, OFFSET + LIMIT is simpler for basic pagination. ROW_NUMBER() becomes useful when you need stable, deterministic pagination on complex result sets or when you want to navigate to a specific row range after some transformation.


ROW_NUMBER() vs RANK() vs DENSE_RANK()

All three assign numbers to rows. The difference is in how they handle ties:

Purchase this course to unlock the full lesson.

Sign up