JSON Aggregation — row_to_json, json_agg, and Building API Responses
JSON Aggregation — row_to_json, json_agg, and Building API Responses
PostgreSQL can build complete JSON API responses entirely in SQL, without any application-layer serialisation. This is particularly useful for: REST API endpoints that return nested objects, data pipelines feeding document stores (MongoDB, Elasticsearch), and export jobs that need structured JSON output.
The key functions are row_to_json, json_agg / jsonb_agg, json_build_object, and json_strip_nulls.
Example 1: row_to_json — convert an entire row to JSON
row_to_json(row) converts a table row (or a sub-SELECT row) into a JSON object. Every column becomes a key.
-- Simple row to JSON
SELECT row_to_json(c.*) AS client_json
FROM clients c LIMIT 3;
-- Sub-SELECT allows you to shape the object (pick columns, rename)
SELECT row_to_json(t) AS shaped_json
FROM (
SELECT
c.client_id,
c.company_name AS name,
c.industry,
c.country,
COUNT(i.invoice_id) AS invoice_count,
ROUND(SUM(ii.quantity * ii.unit_price), 2) AS total_billed
FROM clients c
LEFT JOIN invoices i ON i.client_id = c.client_id
LEFT JOIN invoice_items ii ON ii.invoice_id = i.invoice_id
GROUP BY c.client_id, c.company_name, c.industry, c.country
) t;
Example 2: Nested JSON — one-to-many relationships
The most common API response pattern: a parent object with an embedded array of child objects. This is the "N+1 query" problem solved in one SQL query:
-- Build a client document with embedded invoices
SELECT
c.client_id,
jsonb_build_object(
'id', c.client_id,
'name', c.company_name,
'industry', c.industry,
'country', c.country,
'invoices', COALESCE(
jsonb_agg(
jsonb_build_object(
'invoice_no', i.invoice_no,
'issue_date', i.issue_date,
'due_date', i.due_date,
'status', i.status,
'status', i.status
)
ORDER BY i.issue_date DESC
) FILTER (WHERE i.invoice_id IS NOT NULL),
'[]'::jsonb
)
) AS client_document
FROM clients c
LEFT JOIN invoices i ON i.client_id = c.client_id
LEFT JOIN invoice_items ii ON ii.invoice_id = i.invoice_id
GROUP BY c.client_id, c.company_name, c.industry, c.country
ORDER BY c.company_name;
The COALESCE(..., '[]'::jsonb) ensures clients with no invoices return an empty array [] rather than NULL.
Example 3: Three-level nesting — client → invoices → items
Deep nesting requires CTEs to avoid complexity at each level:
WITH item_json AS (
-- Level 3: line items per invoice
SELECT
ii.invoice_id,
jsonb_agg(
jsonb_build_object(
'description', ii.description,
'quantity', ii.quantity,
'unit_price', ii.unit_price,
'line_total', ROUND(ii.quantity * ii.unit_price, 2)
) ORDER BY ii.item_id
) AS items_array,
ROUND(SUM(ii.quantity * ii.unit_price), 2) AS items_array_sum
FROM invoice_items ii
GROUP BY ii.invoice_id
),
invoice_json AS (
-- Level 2: invoices per client with embedded items
SELECT
i.client_id,
jsonb_agg(
jsonb_build_object(
'invoice_no', i.invoice_no,
'issue_date', i.issue_date,
'status', i.status,
'total', ROUND(COALESCE(ij.items_array_sum, 0), 2),
'items', COALESCE(ij.items_array, '[]'::jsonb)
) ORDER BY i.issue_date DESC
) AS invoices_array
FROM invoices i
LEFT JOIN item_json ij ON ij.invoice_id = i.invoice_id
LEFT JOIN LATERAL (
SELECT ROUND(SUM(quantity * unit_price), 2) AS items_array_sum
FROM invoice_items WHERE invoice_id = i.invoice_id
) s ON TRUE
GROUP BY i.client_id
)
-- Level 1: client with embedded invoices
SELECT
jsonb_strip_nulls(
jsonb_build_object(
'client_id', c.client_id,
'name', c.company_name,
'invoices', COALESCE(ij.invoices_array, '[]'::jsonb)
)
) AS full_document
FROM clients c
LEFT JOIN invoice_json ij ON ij.client_id = c.client_id
ORDER BY c.company_name
LIMIT 3;
jsonb_strip_nulls removes any keys with NULL values from the output — useful for clean API responses where you don't want "field": null in every document.
Example 4: JSON output for bulk export and data pipelines
Generating newline-delimited JSON (NDJSON) — the standard format for streaming to Elasticsearch, BigQuery, or S3:
-- Each row is a complete, standalone JSON document
-- Output one JSON object per line (NDJSON format)
SELECT
jsonb_build_object(
'_id', o.order_id,
'_index', 'orders',
'customer_id', o.customer_id,
'channel', c.acquisition_channel,
'country', c.country,
'created_at', o.created_at,
'status', o.status,
'total', o.total_amount,
'item_count', COUNT(oi.item_id),
'categories', jsonb_agg(DISTINCT to_jsonb(p.category)),
'products', jsonb_agg(DISTINCT to_jsonb(p.name))
)::TEXT AS ndjson_line
FROM ec_orders o
JOIN ec_customers c ON c.customer_id = o.customer_id
JOIN ec_order_items oi ON oi.order_id = o.order_id
JOIN ec_products p ON p.product_id = oi.product_id
WHERE o.created_at >= '2025-01-01'
GROUP BY o.order_id, o.customer_id, c.acquisition_channel, c.country,
o.created_at, o.status, o.total_amount
ORDER BY o.created_at DESC;
Pipe this query's output directly to a file with psql -c "SELECT ..." --no-align --tuples-only > export.ndjson for a complete data pipeline in one command.
Key Takeaway
PostgreSQL JSON aggregation builds nested documents natively — no ORM, no serialiser, no application code. The three-CTE pattern (level 3 → level 2 → level 1) cleanly handles nested objects. jsonb_agg aggregates into JSONB arrays; jsonb_strip_nulls cleans output; COALESCE(agg, '[]'::jsonb) ensures empty arrays rather than NULLs. This pattern produces identical output to what an ORM or application-layer serialiser would build, but with far less code and in a single database round trip.