GROUPING SETS — Multiple GROUP BY Levels in One Query
GROUPING SETS — Multiple GROUP BY Levels in One Query
GROUPING SETS is one of the most powerful and underused features in PostgreSQL. It lets you compute multiple different GROUP BY combinations in a single query pass — producing subtotals, totals, and granular breakdowns all in one result set.
Without GROUPING SETS, building a report that shows revenue by year and quarter, year only, quarter only, and a grand total would require four separate queries joined with UNION ALL. With GROUPING SETS, it is one query. The database makes a single pass through the data and returns every grouping level simultaneously — which is both faster and more readable.
The problem GROUPING SETS solves
Imagine you need a sales report with three levels:
- Revenue per
(region, product_category)— the detailed rows - Revenue per
regiononly — regional subtotals - A grand total across everything
The naive approach:
-- Three separate queries — slow and repetitive
WITH sales(region, category, revenue) AS (
VALUES
('North', 'Electronics', 45200),
('North', 'Clothing', 28100),
('North', 'Electronics', 12000),
('South', 'Electronics', 31500),
('South', 'Clothing', 19800),
('South', 'Electronics', 8700),
('West', 'Electronics', 27300),
('West', 'Clothing', 15600),
('East', 'Electronics', 22100),
('East', 'Clothing', 11900)
)
SELECT region, category, SUM(revenue) FROM sales GROUP BY region, category
UNION ALL
SELECT region, NULL, SUM(revenue) FROM sales GROUP BY region
UNION ALL
SELECT NULL, NULL, SUM(revenue) FROM sales;
This is verbose, scans the sales table three times, and becomes unmaintainable if you add more dimensions. GROUPING SETS collapses this into one pass.
Example 1: Basic GROUPING SETS syntax
This query calculates revenue subtotals at three levels simultaneously: by region+category, by region alone, and a grand total. The NULL values in the result indicate that the column was "collapsed" for that grouping level — a region-level row has NULL for category because category was not part of that grouping.
WITH sales(region, category, revenue) AS (
VALUES
('North', 'Electronics', 45200),
('North', 'Clothing', 28100),
('North', 'Electronics', 12000),
('South', 'Electronics', 31500),
('South', 'Clothing', 19800),
('South', 'Electronics', 8700),
('West', 'Electronics', 27300),
('West', 'Clothing', 15600),
('East', 'Electronics', 22100),
('East', 'Clothing', 11900)
)
SELECT
region,
category,
ROUND(SUM(revenue), 2) AS total_revenue,
COUNT(*) AS transactions
FROM sales
GROUP BY GROUPING SETS (
(region, category), -- level 1: full detail
(region), -- level 2: regional subtotals
() -- level 3: grand total (empty set)
)
ORDER BY region NULLS LAST, category NULLS LAST;
What This Returns
Rows appear in three tiers:
- Detail rows: both
regionandcategoryare populated — e.g.North | Electronics | 45,200 - Subtotal rows:
categoryis NULL — e.g.North | NULL | 112,800(total for North across all categories) - Grand total row: both
regionandcategoryare NULL — e.g.NULL | NULL | 398,400
The NULLS LAST in ORDER BY pushes subtotal and grand total rows to the bottom within each group, which is the standard report format.
Example 2: The GROUPING() function — distinguishing NULLs
A critical problem: what if the region column itself can be NULL in the source data (unknown region)? You cannot tell whether a NULL in the output means "subtotal row" or "data was actually NULL". The GROUPING() function solves this — it returns 1 when a column was excluded from the current grouping, and 0 when it was included.
WITH sales(region, category, revenue) AS (
VALUES
('North', 'Electronics', 45200),
('North', 'Clothing', 28100),
('North', 'Electronics', 12000),
('South', 'Electronics', 31500),
('South', 'Clothing', 19800),
('South', 'Electronics', 8700),
('West', 'Electronics', 27300),
('West', 'Clothing', 15600),
('East', 'Electronics', 22100),
('East', 'Clothing', 11900)
)
SELECT
CASE WHEN GROUPING(region) = 1 THEN '(All Regions)' ELSE region END AS region,
CASE WHEN GROUPING(category) = 1 THEN '(All Categories)' ELSE category END AS category,
ROUND(SUM(revenue), 2) AS total_revenue,
GROUPING(region) AS is_region_subtotal,
GROUPING(category) AS is_category_subtotal,
GROUPING(region) + GROUPING(category) AS grouping_level
FROM sales
GROUP BY GROUPING SETS (
(region, category),
(region),
()
)
ORDER BY grouping_level, region, category;
What This Returns
| region | category | total_revenue | is_region_subtotal | is_category_subtotal | grouping_level |
|---|---|---|---|---|---|
| North | Electronics | 45,200 | 0 | 0 | 0 |
| North | Clothing | 28,100 | 0 | 0 | 0 |
| North | (All Categories) | 112,800 | 0 | 1 | 1 |
| (All Regions) | (All Categories) | 398,400 | 1 | 1 | 2 |
grouping_level = 0 means a fully detailed row. grouping_level = 2 means the grand total. This column is useful for ordering or filtering to specific levels of the report.
Example 3: GROUPING SETS with a real analytics table
Using the invoicing dataset — produce a single query that shows revenue broken down by (industry, country), by industry, by country, and as a grand total:
WITH invoice_revenue AS (
SELECT
c.industry,
c.country,
DATE_PART('year', i.issue_date) AS year,
SUM(ii.quantity * ii.unit_price) AS revenue
FROM clients c
JOIN invoices i ON i.client_id = c.client_id
JOIN invoice_items ii ON ii.invoice_id = i.invoice_id
GROUP BY c.industry, c.country, 3
)
SELECT
COALESCE(CASE WHEN GROUPING(industry) = 1 THEN '(ALL)' ELSE industry END, industry) AS industry,
COALESCE(CASE WHEN GROUPING(country) = 1 THEN '(ALL)' ELSE country END, country) AS country,
COALESCE(CASE WHEN GROUPING(year) = 1 THEN 0 ELSE year END, 0) AS year,
ROUND(SUM(revenue), 0) AS total_revenue,
GROUPING(industry, country, year) AS grouping_key
FROM invoice_revenue
GROUP BY GROUPING SETS (
(industry, country, year), -- full detail
(industry, year), -- per industry per year
(country, year), -- per country per year
(year), -- annual totals
() -- grand total
)
ORDER BY grouping_key, year, industry, country;
What This Returns
This single query produces five levels of aggregation simultaneously. GROUPING(industry, country, year) encodes all three grouping flags into a single integer using bit positions — it equals 0 for full-detail rows and 7 for the grand total. You can filter on it to extract only the level you need:
-- Example: Pull only the annual totals level by filtering on grouping key
-- (This WHERE clause must be inside a subquery or CTE, not at top level)
WITH invoice_revenue (industry, country, year, revenue) AS (
VALUES
('Technology', 'US', 2023, 450000),
('Technology', 'UK', 2023, 280000),
('Finance', 'US', 2023, 310000),
('Finance', 'UK', 2024, 195000),
('Technology', 'US', 2024, 520000)
)
SELECT * FROM (
SELECT
industry, country, year::INT AS year,
ROUND(SUM(revenue), 0) AS total_revenue,
GROUPING(industry, country, year) AS grouping_key
FROM invoice_revenue
GROUP BY GROUPING SETS (
(industry, country, year),
(industry, year),
(country, year),
(year),
()
)
) sub
WHERE grouping_key = 6 -- industry=1, country=1, year=0 in binary (annual totals)
Example 4: Combining GROUPING SETS with window functions
This advanced pattern adds running totals within each grouping level, allowing you to see both the absolute amount and the cumulative total at each detail row:
WITH gs_result AS (
SELECT
c.industry,
DATE_TRUNC('quarter', i.issue_date)::DATE AS quarter,
SUM(ii.quantity * ii.unit_price) AS revenue,
GROUPING(c.industry) AS g_industry,
GROUPING(DATE_TRUNC('quarter', i.issue_date)::DATE) AS g_quarter
FROM clients c
JOIN invoices i ON i.client_id = c.client_id
JOIN invoice_items ii ON ii.invoice_id = i.invoice_id
GROUP BY GROUPING SETS (
(c.industry, DATE_TRUNC('quarter', i.issue_date)::DATE),
(c.industry),
()
)
)
SELECT
CASE WHEN g_industry = 1 THEN 'GRAND TOTAL' ELSE industry END AS industry,
CASE WHEN g_quarter = 1 THEN 'ALL QUARTERS' ELSE quarter::TEXT END AS quarter,
ROUND(revenue, 2) AS revenue,
ROUND(SUM(revenue) OVER (
PARTITION BY industry, g_industry
ORDER BY quarter NULLS LAST
), 2) AS running_total
FROM gs_result
ORDER BY g_industry, g_quarter, industry, quarter;
The window function's PARTITION BY industry, g_industry ensures that the running total restarts for each industry group, and doesn't bleed across grouping levels.
Key Takeaway
GROUPING SETS eliminates the UNION ALL anti-pattern for multi-level reports. Use the GROUPING() function to distinguish aggregated NULLs from data NULLs, and GROUPING(col1, col2, ...) as a bit-encoded level indicator for filtering. Every BI tool output (subtotals, grand totals, cross-tabs) is ultimately producing what GROUPING SETS computes natively.