PostgreSQL Native Arrays — Operations, Indexing, and Patterns
PostgreSQL Native Arrays — Operations, Indexing, and Patterns
PostgreSQL native arrays (integer[], text[], numeric[], etc.) are distinct from JSONB arrays. They are typed, efficient, and deeply integrated with the query planner. Unlike JSONB, native arrays can be indexed with GIN for fast containment/overlap queries, and their elements can be used directly in set operations with UNNEST, ANY, and ALL.
When to use native arrays vs JSONB
| Scenario | Use |
|---|---|
| List of integer IDs (e.g. related_product_ids) | integer[] |
| List of strings (tags, categories, permissions) | text[] |
| Variable-structure nested data | jsonb |
| Mixed types in one list | jsonb array |
| Need typed arithmetic on elements | Native array |
| Need GIN-indexed containment searches | Both work, native slightly faster |
Example 1: Array creation and basic operations
-- Create a table with array columns
CREATE TABLE IF NOT EXISTS product_tags (
product_id INT PRIMARY KEY,
name TEXT NOT NULL,
tags TEXT[],
related_ids INT[],
price_tiers NUMERIC[]
);
INSERT INTO product_tags VALUES
(1, 'MacBook Pro', ARRAY['laptop','apple','premium','portable'], ARRAY[2,3,5], ARRAY[1799,1999,2499]),
(2, 'MacBook Air', ARRAY['laptop','apple','lightweight'], ARRAY[1,3], ARRAY[1099,1299]),
(3, 'iPad Pro', ARRAY['tablet','apple','premium'], ARRAY[1,2], ARRAY[799,1099,1299]),
(4, 'ThinkPad X1', ARRAY['laptop','lenovo','premium','business'],ARRAY[5], ARRAY[1399,1599]),
(5, 'Dell XPS 15', ARRAY['laptop','dell','premium'], ARRAY[1,4], ARRAY[1499,1799,1999]);
-- Array literals: ARRAY['a','b'] and '{"a","b"}' are equivalent
-- Access element (1-based in PostgreSQL)
SELECT name, tags[1] AS first_tag FROM product_tags;
-- Slice: tags[2:3] returns elements 2 and 3
SELECT name, tags[2:3] AS middle_tags FROM product_tags;
-- Array length
SELECT name, array_length(tags, 1) AS tag_count FROM product_tags;
-- Append element
SELECT name, tags || ARRAY['new_tag'] AS extended_tags FROM product_tags WHERE product_id = 1;
-- Concatenate arrays
SELECT tags || ARRAY['sale','clearance'] FROM product_tags WHERE product_id = 1;
Example 2: Array membership and containment operators
-- product_tags created in Block 1
-- = ANY(array): is a value in the array?
-- This is the array equivalent of IN (...)
SELECT name FROM product_tags
WHERE 'apple' = ANY(tags);
-- @> contains: does the array contain all specified elements?
SELECT name FROM product_tags
WHERE tags @> ARRAY['laptop', 'premium']; -- must have BOTH
-- <@ is contained by: is this array a subset of tags?
SELECT name FROM product_tags
WHERE ARRAY['apple', 'laptop'] <@ tags; -- tags must contain both apple and laptop
-- && overlap: any element in common?
SELECT name FROM product_tags
WHERE tags && ARRAY['apple', 'lenovo']; -- has apple OR lenovo
-- NOT and negation
SELECT name FROM product_tags
WHERE NOT (tags @> ARRAY['apple']) -- no Apple products
AND 'laptop' = ANY(tags); -- but must be a laptop
-- ALL: every element in the array satisfies condition (less common)
SELECT name FROM product_tags
WHERE 1299 = ALL(price_tiers[1:1]); -- first price tier equals 1299
Example 3: GIN index for arrays
-- product_tags created in Block 1
-- Create GIN index for fast containment/overlap/membership queries
CREATE INDEX IF NOT EXISTS idx_product_tags_gin ON product_tags USING GIN (tags);
CREATE INDEX IF NOT EXISTS idx_related_ids_gin ON product_tags USING GIN (related_ids);
-- These queries now use the GIN index:
-- tags @> ARRAY['laptop']
-- tags && ARRAY['apple','lenovo']
-- 'premium' = ANY(tags) -- rewritten by planner to use @>
-- EXPLAIN to verify index use:
EXPLAIN SELECT name FROM product_tags WHERE tags @> ARRAY['laptop','premium'];
-- "Bitmap Index Scan on idx_product_tags_gin"
-- For equality on a specific element position, use a functional BTREE:
CREATE INDEX IF NOT EXISTS idx_first_tag ON product_tags USING BTREE ((tags[1:1]));
SELECT name FROM product_tags WHERE (tags[1:1])[1] = 'laptop';
Example 4: UNNEST — expanding arrays into rows
UNNEST is to native arrays what jsonb_array_elements is to JSONB arrays:
-- product_tags created in Block 1
-- Expand tags into rows
SELECT p.name, tag
FROM product_tags p, UNNEST(p.tags) AS tag
ORDER BY p.name, tag;
-- UNNEST WITH ORDINALITY for position-aware expansion
SELECT p.name, pos, tag
FROM product_tags p,
UNNEST(p.tags) WITH ORDINALITY AS t(tag, pos)
ORDER BY p.name, pos;
-- Multiple parallel UNNEST (PostgreSQL 9.4+)
-- When UNNESTing multiple arrays of equal length, rows are matched positionally
SELECT p.name, tier_num, price
FROM product_tags p,
UNNEST(price_tiers) WITH ORDINALITY AS t(price, tier_num)
ORDER BY p.name, tier_num;
-- Tag frequency analysis (same pattern as JSONB, but simpler syntax)
SELECT tag, COUNT(*) AS products
FROM product_tags, UNNEST(tags) AS tag
GROUP BY tag
ORDER BY products DESC;
-- Products that share the most tags with product 1
WITH prod1_tags AS (SELECT tags FROM product_tags WHERE product_id = 1)
SELECT
p.name,
(SELECT COUNT(*) FROM UNNEST(p.tags) AS t WHERE t = ANY(prod1_tags.tags))
AS shared_tags,
p.tags
FROM product_tags p, prod1_tags
WHERE p.product_id != 1
ORDER BY shared_tags DESC;
Example 5: Array functions for analytics
-- product_tags created in Block 1
-- array_agg: collect column values into an array (aggregate function)
SELECT
pr.category,
ARRAY_AGG(p.product_id ORDER BY p.product_id) AS product_ids,
ARRAY_AGG(DISTINCT tags[1] ORDER BY tags[1]) AS primary_tags
FROM product_tags p
JOIN products pr ON pr.product_id = p.product_id
GROUP BY pr.category;
-- array_remove: remove a specific value from array
SELECT name, array_remove(tags, 'premium') AS tags_without_premium
FROM product_tags WHERE 'premium' = ANY(tags);
-- array_replace: replace a value
SELECT name, array_replace(tags, 'apple', 'Apple Inc') AS updated_tags
FROM product_tags;
-- array_positions: find all positions of a value
SELECT name, array_positions(tags, 'laptop') AS laptop_positions
FROM product_tags;
-- array_to_string: convert array to delimited string
SELECT name, array_to_string(tags, ', ') AS tags_string
FROM product_tags ORDER BY name;
-- string_to_array: parse delimited string back to array
SELECT string_to_array('a,b,c,d', ',') AS elements;
-- intarray extension (if installed): integer array arithmetic
-- SELECT related_ids & ARRAY[1,2,3] -- intersection
-- SELECT related_ids | ARRAY[6,7,8] -- union
-- SELECT related_ids - 2 -- remove element 2
Key Takeaway
Native PostgreSQL arrays are the right choice for typed collections — IDs, tags, scores, prices. ANY(array) replaces IN (...) for dynamic lists. @> and && with GIN indexes provide fast containment and overlap searches. UNNEST expands into rows for analytics. The array_agg / UNNEST / array_agg cycle — collect, process, recollect — is a powerful pattern for in-place array transformations in SQL.