Querying and Filtering JSONB — Operators, Paths, and Indexes
Querying and Filtering JSONB — Operators, Paths, and Indexes
The power of JSONB comes from being able to query inside documents efficiently. Without indexes, every JSONB query requires a full sequential scan — examining every document on every row. With a GIN index, JSONB queries can use the index to find matching documents in milliseconds even across millions of rows.
The GIN index for JSONB
A GIN (Generalized Inverted Index) on a JSONB column indexes every key and value in every document. It makes the containment operator @> and key-existence operators ?, ?&, ?| index-accelerated.
-- Create a GIN index on the attributes column
CREATE INDEX IF NOT EXISTS idx_product_attributes ON product_catalog USING GIN (attributes);
-- Now these queries use the index:
SELECT * FROM product_catalog WHERE attributes @> '{"brand": "Apple"}';
SELECT * FROM product_catalog WHERE attributes ? 'author';
-- For path-specific queries on frequently accessed keys, use a functional index:
CREATE INDEX IF NOT EXISTS idx_product_brand ON product_catalog
USING BTREE ((attributes ->> 'brand'));
-- Now this query uses the btree index (faster for equality/range):
SELECT * FROM product_catalog WHERE attributes ->> 'brand' = 'Apple';
The GIN index covers @> and ? operators. For equality and range comparisons on a specific extracted field, a BTREE functional index on (attributes ->> 'key') is faster.
Example 1: Complex JSONB filter conditions
Real queries combine multiple JSONB conditions with regular column filters:
-- Find all electronics with at least 16GB RAM AND under 1.5kg AND under £2000
SELECT
name,
base_price,
attributes ->> 'brand' AS brand,
(attributes ->> 'ram_gb')::INT AS ram_gb,
(attributes ->> 'weight_kg')::NUMERIC AS weight_kg
FROM product_catalog
WHERE
category = 'Electronics'
AND (attributes ->> 'ram_gb')::INT >= 16
AND (attributes ->> 'weight_kg')::NUMERIC < 1.5
AND base_price < 2000
ORDER BY (attributes ->> 'ram_gb')::INT DESC, base_price;
-- Find all products where the brand is any of a set of values
SELECT name, category, attributes ->> 'brand' AS brand
FROM product_catalog
WHERE attributes ->> 'brand' = ANY(ARRAY['Apple', 'Lenovo', 'Dell'])
ORDER BY category, name;
-- Containment check for complex sub-documents
-- Find products whose attributes contain both a specific brand AND a specific color
SELECT name
FROM product_catalog
WHERE attributes @> '{"brand": "Apple", "color": "silver"}';
Example 2: Querying arrays inside JSONB
When a JSONB value is an array, you can check if the array contains a specific element using containment:
-- Find clothing products available in size 'L'
-- The array ["S","M","L","XL"] contains the string "L"
SELECT name, attributes -> 'sizes' AS available_sizes
FROM product_catalog
WHERE attributes @> '{"sizes": ["L"]}';
-- This is the key insight: @> checks deep containment
-- {"sizes": ["S","M","L"]} @> {"sizes": ["L"]} is TRUE
-- Get products available in both 'S' AND 'XL'
SELECT name
FROM product_catalog
WHERE attributes @> '{"sizes": ["S", "XL"]}';
-- Expand JSONB array to rows using jsonb_array_elements
SELECT
pc.name,
size.value #>> '{}' AS available_size -- #>> '{}' extracts the text value of a scalar
FROM product_catalog pc,
LATERAL jsonb_array_elements(pc.attributes -> 'sizes') AS size(value)
WHERE pc.category = 'Clothing'
ORDER BY pc.name, available_size;
The LATERAL jsonb_array_elements(...) pattern is the standard way to "explode" a JSONB array into rows — one row per array element, cross-joined with the parent row.
Example 3: JSONPath queries — flexible pattern matching
PostgreSQL 12+ supports the SQL/JSON path language, which provides more expressive querying than the -> operators:
-- jsonb_path_query: returns JSONB matching a path expression
-- Find all products where any attribute value contains "Apple"
SELECT name, jsonb_path_query_first(attributes, '$.brand') AS brand
FROM product_catalog
WHERE jsonb_path_exists(attributes, '$.brand ? (@ == "Apple")');
-- Find electronics where ram_gb is between 16 and 32
SELECT name, attributes ->> 'ram_gb' AS ram
FROM product_catalog
WHERE jsonb_path_exists(attributes, '$.ram_gb ? (@ >= 16 && @ <= 32)');
-- Extract all numeric values from attributes
SELECT name,
jsonb_path_query_array(attributes, '$.* ? (@.type() == "number")') AS numeric_values
FROM product_catalog;
-- Find products with any size starting with 'X'
SELECT name
FROM product_catalog
WHERE jsonb_path_exists(
attributes,
'$.sizes[*] ? (@ starts with "X")'
);
JSONPath syntax: $ = root document, .key = field access, [*] = all array elements, ? (condition) = filter.
Example 4: Performance comparison — indexed vs non-indexed JSONB
Understanding when queries use the GIN index vs fall back to sequential scan:
-- EXPLAIN to see index usage
EXPLAIN (ANALYZE, BUFFERS)
SELECT name FROM product_catalog
WHERE attributes @> '{"brand": "Apple"}';
-- With GIN index: "Bitmap Index Scan on idx_product_attributes"
-- Without index: "Seq Scan on product_catalog"
-- Queries that USE the GIN index:
-- attributes @> '{"key": "value"}' ← containment
-- attributes ? 'key' ← key existence
-- attributes ?& ARRAY['k1','k2'] ← all keys exist
-- attributes ?| ARRAY['k1','k2'] ← any key exists
-- Queries that do NOT use the GIN index (need BTREE functional index):
-- attributes ->> 'key' = 'value' ← equality on extracted text
-- attributes ->> 'key' > '100' ← range on extracted text
-- (attributes ->> 'key')::INT > 100 ← cast + comparison
-- Best practice: use @> for GIN-accelerated queries when checking exact values
-- Use functional BTREE indexes for frequently-filtered extracted fields
-- Create targeted functional indexes for hot query patterns
CREATE INDEX IF NOT EXISTS idx_product_brand_btree
ON product_catalog USING BTREE ((attributes ->> 'brand'));
CREATE INDEX IF NOT EXISTS idx_product_ram
ON product_catalog USING BTREE (((attributes ->> 'ram_gb')::INT))
WHERE category = 'Electronics'; -- partial index: only for electronics rows
The partial index WHERE category = 'Electronics' is both smaller and faster — it only indexes the rows where ram_gb is meaningful, reducing both index size and maintenance overhead.
Key Takeaway
GIN indexes make @> and ? operators fast. For equality and range filters on specific keys, add a functional BTREE index on (column ->> 'key'). Use LATERAL jsonb_array_elements() to explode arrays into rows. JSONPath (PostgreSQL 12+) provides regex-like matching inside documents. Always EXPLAIN your JSONB queries — a sequential scan on a large JSONB table is often catastrophic for performance.