Top-N Per Group — Patterns and Performance

Retrieving the top N rows per group is one of the most frequently needed analytical queries — top 3 products per category, top 5 customers per country, the latest 2 orders per customer. There are several ways to solve it in PostgreSQL. Each has different readability and performance characteristics.


Method 1: Window function + CTE (most common)

-- Top 2 products by revenue in each category
WITH product_revenue AS (
    SELECT
        p.product_id,
        p.name,
        p.category,
        SUM(oi.quantity * oi.unit_price) AS revenue
    FROM order_items oi
    JOIN products p ON p.product_id = oi.product_id
    GROUP BY p.product_id, p.name, p.category
),
ranked AS (
    SELECT
        *,
        ROW_NUMBER() OVER (PARTITION BY category ORDER BY revenue DESC) AS rn
    FROM product_revenue
)
SELECT name, category, revenue
FROM ranked
WHERE rn <= 2
ORDER BY category, revenue DESC;

Use ROW_NUMBER() when you want exactly N rows even when there are ties — the tiebreaker is arbitrary. Use RANK() when ties should all be included (you might get more than N rows).


Method 2: DISTINCT ON (PostgreSQL-specific)

PostgreSQL's DISTINCT ON is a concise alternative for top-1 per group:

-- Highest-revenue product per category (exactly one per category)
SELECT DISTINCT ON (p.category)
    p.name,
    p.category,
    SUM(oi.quantity * oi.unit_price) AS revenue
FROM order_items oi
JOIN products p ON p.product_id = oi.product_id
GROUP BY p.product_id, p.name, p.category
ORDER BY p.category, SUM(oi.quantity * oi.unit_price) DESC;

DISTINCT ON (category) keeps one row per category — the first row after the ORDER BY within each category group. It's clean for top-1 but doesn't extend to top-N.

Purchase this course to unlock the full lesson.

Sign up