PostgreSQL Native Arrays — Operations, Indexing, and Patterns

PostgreSQL Native Arrays — Operations, Indexing, and Patterns

PostgreSQL native arrays (integer[], text[], numeric[], etc.) are distinct from JSONB arrays. They are typed, efficient, and deeply integrated with the query planner. Unlike JSONB, native arrays can be indexed with GIN for fast containment/overlap queries, and their elements can be used directly in set operations with UNNEST, ANY, and ALL.

When to use native arrays vs JSONB

Scenario Use
List of integer IDs (e.g. related_product_ids) integer[]
List of strings (tags, categories, permissions) text[]
Variable-structure nested data jsonb
Mixed types in one list jsonb array
Need typed arithmetic on elements Native array
Need GIN-indexed containment searches Both work, native slightly faster

Example 1: Array creation and basic operations

-- Create a table with array columns
CREATE TABLE IF NOT EXISTS product_tags (
    product_id  INT PRIMARY KEY,
    name        TEXT NOT NULL,
    tags        TEXT[],
    related_ids INT[],
    price_tiers NUMERIC[]
);

INSERT INTO product_tags VALUES
  (1, 'MacBook Pro', ARRAY['laptop','apple','premium','portable'], ARRAY[2,3,5], ARRAY[1799,1999,2499]),
  (2, 'MacBook Air', ARRAY['laptop','apple','lightweight'],        ARRAY[1,3],   ARRAY[1099,1299]),
  (3, 'iPad Pro',    ARRAY['tablet','apple','premium'],            ARRAY[1,2],   ARRAY[799,1099,1299]),
  (4, 'ThinkPad X1', ARRAY['laptop','lenovo','premium','business'],ARRAY[5],     ARRAY[1399,1599]),
  (5, 'Dell XPS 15', ARRAY['laptop','dell','premium'],             ARRAY[1,4],   ARRAY[1499,1799,1999]);

-- Array literals: ARRAY['a','b'] and '{"a","b"}' are equivalent
-- Access element (1-based in PostgreSQL)
SELECT name, tags[1] AS first_tag FROM product_tags;

-- Slice: tags[2:3] returns elements 2 and 3
SELECT name, tags[2:3] AS middle_tags FROM product_tags;

-- Array length
SELECT name, array_length(tags, 1) AS tag_count FROM product_tags;

-- Append element
SELECT name, tags || ARRAY['new_tag'] AS extended_tags FROM product_tags WHERE product_id = 1;

-- Concatenate arrays
SELECT tags || ARRAY['sale','clearance'] FROM product_tags WHERE product_id = 1;

Example 2: Array membership and containment operators

-- product_tags created in Block 1

-- = ANY(array): is a value in the array?
-- This is the array equivalent of IN (...)
SELECT name FROM product_tags
WHERE 'apple' = ANY(tags);

-- @> contains: does the array contain all specified elements?
SELECT name FROM product_tags
WHERE tags @> ARRAY['laptop', 'premium'];  -- must have BOTH

-- <@ is contained by: is this array a subset of tags?
SELECT name FROM product_tags
WHERE ARRAY['apple', 'laptop'] <@ tags;  -- tags must contain both apple and laptop

-- && overlap: any element in common?
SELECT name FROM product_tags
WHERE tags && ARRAY['apple', 'lenovo'];  -- has apple OR lenovo

-- NOT and negation
SELECT name FROM product_tags
WHERE NOT (tags @> ARRAY['apple'])  -- no Apple products
  AND 'laptop' = ANY(tags);         -- but must be a laptop

-- ALL: every element in the array satisfies condition (less common)
SELECT name FROM product_tags
WHERE 1299 = ALL(price_tiers[1:1]);  -- first price tier equals 1299

Example 3: GIN index for arrays

-- product_tags created in Block 1

-- Create GIN index for fast containment/overlap/membership queries
CREATE INDEX IF NOT EXISTS idx_product_tags_gin ON product_tags USING GIN (tags);
CREATE INDEX IF NOT EXISTS idx_related_ids_gin  ON product_tags USING GIN (related_ids);

-- These queries now use the GIN index:
-- tags @> ARRAY['laptop']
-- tags && ARRAY['apple','lenovo']
-- 'premium' = ANY(tags)  -- rewritten by planner to use @>

-- EXPLAIN to verify index use:
EXPLAIN SELECT name FROM product_tags WHERE tags @> ARRAY['laptop','premium'];
-- "Bitmap Index Scan on idx_product_tags_gin"

-- For equality on a specific element position, use a functional BTREE:
CREATE INDEX IF NOT EXISTS idx_first_tag ON product_tags USING BTREE ((tags[1:1]));
SELECT name FROM product_tags WHERE (tags[1:1])[1] = 'laptop';

Example 4: UNNEST — expanding arrays into rows

UNNEST is to native arrays what jsonb_array_elements is to JSONB arrays:

-- product_tags created in Block 1

-- Expand tags into rows
SELECT p.name, tag
FROM product_tags p, UNNEST(p.tags) AS tag
ORDER BY p.name, tag;

-- UNNEST WITH ORDINALITY for position-aware expansion
SELECT p.name, pos, tag
FROM product_tags p,
     UNNEST(p.tags) WITH ORDINALITY AS t(tag, pos)
ORDER BY p.name, pos;

-- Multiple parallel UNNEST (PostgreSQL 9.4+)
-- When UNNESTing multiple arrays of equal length, rows are matched positionally
SELECT p.name, tier_num, price
FROM product_tags p,
     UNNEST(price_tiers) WITH ORDINALITY AS t(price, tier_num)
ORDER BY p.name, tier_num;

-- Tag frequency analysis (same pattern as JSONB, but simpler syntax)
SELECT tag, COUNT(*) AS products
FROM product_tags, UNNEST(tags) AS tag
GROUP BY tag
ORDER BY products DESC;

-- Products that share the most tags with product 1
WITH prod1_tags AS (SELECT tags FROM product_tags WHERE product_id = 1)
SELECT
    p.name,
    (SELECT COUNT(*) FROM UNNEST(p.tags) AS t WHERE t = ANY(prod1_tags.tags))
        AS shared_tags,
    p.tags
FROM product_tags p, prod1_tags
WHERE p.product_id != 1
ORDER BY shared_tags DESC;

Example 5: Array functions for analytics

-- product_tags created in Block 1

-- array_agg: collect column values into an array (aggregate function)
SELECT
    pr.category,
    ARRAY_AGG(p.product_id ORDER BY p.product_id) AS product_ids,
    ARRAY_AGG(DISTINCT tags[1] ORDER BY tags[1]) AS primary_tags
FROM product_tags p
JOIN products pr ON pr.product_id = p.product_id
GROUP BY pr.category;

-- array_remove: remove a specific value from array
SELECT name, array_remove(tags, 'premium') AS tags_without_premium
FROM product_tags WHERE 'premium' = ANY(tags);

-- array_replace: replace a value
SELECT name, array_replace(tags, 'apple', 'Apple Inc') AS updated_tags
FROM product_tags;

-- array_positions: find all positions of a value
SELECT name, array_positions(tags, 'laptop') AS laptop_positions
FROM product_tags;

-- array_to_string: convert array to delimited string
SELECT name, array_to_string(tags, ', ') AS tags_string
FROM product_tags ORDER BY name;

-- string_to_array: parse delimited string back to array
SELECT string_to_array('a,b,c,d', ',') AS elements;

-- intarray extension (if installed): integer array arithmetic
-- SELECT related_ids & ARRAY[1,2,3]  -- intersection
-- SELECT related_ids | ARRAY[6,7,8]  -- union
-- SELECT related_ids - 2             -- remove element 2

Key Takeaway

Native PostgreSQL arrays are the right choice for typed collections — IDs, tags, scores, prices. ANY(array) replaces IN (...) for dynamic lists. @> and && with GIN indexes provide fast containment and overlap searches. UNNEST expands into rows for analytics. The array_agg / UNNEST / array_agg cycle — collect, process, recollect — is a powerful pattern for in-place array transformations in SQL.