JSON Aggregation — row_to_json, json_agg, and Building API Responses

PostgreSQL can build complete JSON API responses entirely in SQL, without any application-layer serialisation. This is particularly useful for: REST API endpoints that return nested objects, data pipelines feeding document stores (MongoDB, Elasticsearch), and export jobs that need structured JSON output.

The key functions are row_to_json, json_agg / jsonb_agg, json_build_object, and json_strip_nulls.

Example 1: row_to_json — convert an entire row to JSON

row_to_json(row) converts a table row (or a sub-SELECT row) into a JSON object. Every column becomes a key.

-- Simple row to JSON
SELECT row_to_json(c.*) AS client_json
FROM clients c LIMIT 3;

-- Sub-SELECT allows you to shape the object (pick columns, rename)
SELECT row_to_json(t) AS shaped_json
FROM (
    SELECT
        c.client_id,
        c.company_name AS name,
        c.industry,
        c.country,
        COUNT(i.invoice_id) AS invoice_count,
        ROUND(SUM(ii.quantity * ii.unit_price), 2) AS total_billed
    FROM clients c
    LEFT JOIN invoices i ON i.client_id = c.client_id
    LEFT JOIN invoice_items ii ON ii.invoice_id = i.invoice_id
    GROUP BY c.client_id, c.company_name, c.industry, c.country
) t;

Example 2: Nested JSON — one-to-many relationships

The most common API response pattern: a parent object with an embedded array of child objects. This is the "N+1 query" problem solved in one SQL query:

-- Build a client document with embedded invoices
SELECT
    c.client_id,
    jsonb_build_object(
        'id',           c.client_id,
        'name',         c.company_name,
        'industry',     c.industry,
        'country',      c.country,
        'invoices',     COALESCE(
            jsonb_agg(
                jsonb_build_object(
                    'invoice_no',  i.invoice_no,
                    'issue_date',  i.issue_date,
                    'due_date',    i.due_date,
                    'status',      i.status,
                    'status',      i.status
                )
                ORDER BY i.issue_date DESC
            ) FILTER (WHERE i.invoice_id IS NOT NULL),
            '[]'::jsonb
        )
    ) AS client_document
FROM clients c
LEFT JOIN invoices i ON i.client_id = c.client_id
LEFT JOIN invoice_items ii ON ii.invoice_id = i.invoice_id
GROUP BY c.client_id, c.company_name, c.industry, c.country
ORDER BY c.company_name;

The COALESCE(..., '[]'::jsonb) ensures clients with no invoices return an empty array [] rather than NULL.

Example 3: Three-level nesting — client → invoices → items

Deep nesting requires CTEs to avoid complexity at each level:

WITH item_json AS (
    -- Level 3: line items per invoice
    SELECT
        ii.invoice_id,
        jsonb_agg(
            jsonb_build_object(
                'description', ii.description,
                'quantity',    ii.quantity,
                'unit_price',  ii.unit_price,
                'line_total',  ROUND(ii.quantity * ii.unit_price, 2)
            ) ORDER BY ii.item_id
        ) AS items_array,
        ROUND(SUM(ii.quantity * ii.unit_price), 2) AS items_array_sum
    FROM invoice_items ii
    GROUP BY ii.invoice_id
),
invoice_json AS (
    -- Level 2: invoices per client with embedded items
    SELECT
        i.client_id,
        jsonb_agg(
            jsonb_build_object(
                'invoice_no', i.invoice_no,
                'issue_date', i.issue_date,
                'status',     i.status,
                'total',      ROUND(COALESCE(ij.items_array_sum, 0), 2),
                'items',      COALESCE(ij.items_array, '[]'::jsonb)
            ) ORDER BY i.issue_date DESC
        ) AS invoices_array
    FROM invoices i
    LEFT JOIN item_json ij ON ij.invoice_id = i.invoice_id
    LEFT JOIN LATERAL (
        SELECT ROUND(SUM(quantity * unit_price), 2) AS items_array_sum
        FROM invoice_items WHERE invoice_id = i.invoice_id
    ) s ON TRUE
    GROUP BY i.client_id
)
-- Level 1: client with embedded invoices
SELECT
    jsonb_strip_nulls(
        jsonb_build_object(
            'client_id', c.client_id,
            'name',      c.company_name,
            'invoices',  COALESCE(ij.invoices_array, '[]'::jsonb)
        )
    ) AS full_document
FROM clients c
LEFT JOIN invoice_json ij ON ij.client_id = c.client_id
ORDER BY c.company_name
LIMIT 3;

jsonb_strip_nulls removes any keys with NULL values from the output — useful for clean API responses where you don't want "field": null in every document.

Example 4: JSON output for bulk export and data pipelines

Generating newline-delimited JSON (NDJSON) — the standard format for streaming to Elasticsearch, BigQuery, or S3:

-- Each row is a complete, standalone JSON document
-- Output one JSON object per line (NDJSON format)
SELECT
    jsonb_build_object(
        '_id',          o.order_id,
        '_index',       'orders',
        'customer_id',  o.customer_id,
        'channel',      c.acquisition_channel,
        'country',      c.country,
        'created_at',   o.created_at,
        'status',       o.status,
        'total',        o.total_amount,
        'item_count',   COUNT(oi.item_id),
        'categories',   jsonb_agg(DISTINCT to_jsonb(p.category)),
        'products',     jsonb_agg(DISTINCT to_jsonb(p.name))
    )::TEXT AS ndjson_line
FROM ec_orders o
JOIN ec_customers c ON c.customer_id = o.customer_id
JOIN ec_order_items oi ON oi.order_id = o.order_id
JOIN ec_products p ON p.product_id = oi.product_id
WHERE o.created_at >= '2025-01-01'
GROUP BY o.order_id, o.customer_id, c.acquisition_channel, c.country,
         o.created_at, o.status, o.total_amount
ORDER BY o.created_at DESC;

Pipe this query's output directly to a file with psql -c "SELECT ..." --no-align --tuples-only > export.ndjson for a complete data pipeline in one command.

Key Takeaway

PostgreSQL JSON aggregation builds nested documents natively — no ORM, no serialiser, no application code. The three-CTE pattern (level 3 → level 2 → level 1) cleanly handles nested objects. jsonb_agg aggregates into JSONB arrays; jsonb_strip_nulls cleans output; COALESCE(agg, '[]'::jsonb) ensures empty arrays rather than NULLs. This pattern produces identical output to what an ORM or application-layer serialiser would build, but with far less code and in a single database round trip.

← Previous Next →