Index Types and Strategies — B-tree, GIN, GiST, BRIN, and Hash
Index Types and Strategies — B-tree, GIN, GiST, BRIN, and Hash
PostgreSQL has five main index types, each optimised for a different access pattern. Choosing the wrong index type is one of the most common performance mistakes — and choosing the right one can reduce query time from minutes to milliseconds.
The five index types
| Type | Best for | Supports |
|---|---|---|
| B-tree | Equality, range, ordering on scalar values | =, <, >, <=, >=, BETWEEN, ORDER BY |
| GIN | Multi-valued types (arrays, JSONB, tsvector) | @>, ?, &&, @@ |
| GiST | Geometric types, full-text, ranges, nearest-neighbour | Overlap, containment, distance |
| BRIN | Very large tables with physical ordering correlation | Range queries on sequential data |
| Hash | Equality only (rarely useful; B-tree covers it with more) | = only |
Example 1: B-tree index strategies
B-tree is the default and handles the vast majority of indexing needs:
-- Setup: create a small `documents_with_search` table for the
-- search-vector index examples below.
CREATE TABLE IF NOT EXISTS documents_with_search (
id BIGSERIAL PRIMARY KEY,
name TEXT,
body TEXT,
search_vector TSVECTOR GENERATED ALWAYS AS
(to_tsvector('english', COALESCE(name,'') || ' ' || COALESCE(body,''))) STORED
);
-- Basic single-column index
CREATE INDEX IF NOT EXISTS idx_orders_date ON orders (created_at);
-- Speeds up: WHERE created_at > '2025-01-01' ORDER BY created_at
-- Descending index (useful when queries ORDER BY col DESC)
CREATE INDEX IF NOT EXISTS idx_orders_date_desc ON orders (created_at DESC);
-- Composite index — column order matters enormously
CREATE INDEX IF NOT EXISTS idx_orders_status_date ON orders (status, created_at);
-- Good for: WHERE status = 'completed' AND created_at > '2025-01-01'
-- Also usable for: WHERE status = 'completed' (index scan on first column)
-- NOT useful for: WHERE created_at > '2025-01-01' without status filter
-- Covering index — INCLUDE additional columns to avoid table heap access
CREATE INDEX IF NOT EXISTS idx_ec_orders_covering ON ec_orders (customer_id, created_at)
INCLUDE (total_amount, status);
-- For queries like:
-- SELECT total_amount, status FROM ec_orders WHERE customer_id = 5 ORDER BY created_at
-- This is an index-only scan — no heap access at all
-- Partial index — index only a subset of rows
CREATE INDEX IF NOT EXISTS idx_active_orders ON orders (customer_id, created_at)
WHERE status = 'completed';
-- Smaller index, faster for completed-order queries
-- Planner will use it when WHERE status = 'completed' is in the query
-- Expression/functional index
-- ❌ Note: DATE_TRUNC on TIMESTAMPTZ is NOT IMMUTABLE in PostgreSQL —
-- you'd need to cast at TIME ZONE 'UTC' first or use a generated column.
-- CREATE INDEX IF NOT EXISTS idx_order_month ON orders (DATE_TRUNC('month', created_at));
-- Required for: WHERE DATE_TRUNC('month', created_at) = '2025-01-01'
-- Without this, the function prevents index usage on created_at
Example 2: GIN indexes for arrays and JSONB
-- ❌ template (uses articles/events_log/product_tags created in your own setup)
-- GIN for native arrays
CREATE INDEX IF NOT EXISTS idx_tags_gin ON product_tags USING GIN (tags);
-- Supports: WHERE tags @> ARRAY['laptop','premium']
-- Supports: WHERE 'laptop' = ANY(tags) [planner rewrites to @>]
-- Supports: WHERE tags && ARRAY['apple','lenovo']
-- GIN for JSONB
CREATE INDEX IF NOT EXISTS idx_payload_gin ON events_log USING GIN (payload);
-- Supports: WHERE payload @> '{"event_type":"purchase"}'
-- Supports: WHERE payload ? 'session_id'
-- GIN for full-text search
CREATE INDEX IF NOT EXISTS idx_articles_fts ON articles USING GIN (search_vector);
-- Supports: WHERE search_vector @@ to_tsquery('english', 'postgresql')
-- GIN jsonb_path_ops opclass — smaller, faster for @> only (no ? support)
CREATE INDEX IF NOT EXISTS idx_payload_path ON events_log USING GIN (payload jsonb_path_ops);
-- Only supports @> but with smaller index footprint than default GIN
Example 3: BRIN indexes for time-series tables
BRIN (Block Range INdex) stores min/max values per block range rather than individual values. The index is tiny (kilobytes for billion-row tables) but only useful when the table is physically sorted by the indexed column — which is natural for append-only time-series data:
-- BRIN for a large time-series events table
-- Works because rows are inserted in roughly chronological order
CREATE INDEX IF NOT EXISTS idx_events_brin ON events USING BRIN (occurred_at)
WITH (pages_per_range = 128);
-- Each block range covers 128 pages
-- The index stores the min/max occurred_at for each block range
-- BRIN is useful for:
-- WHERE occurred_at > NOW() - INTERVAL '7 days' -- eliminates old block ranges
-- WHERE occurred_at BETWEEN '2025-01-01' AND '2025-02-01'
-- NOT useful for:
-- WHERE customer_id = 5 -- customer_id has no correlation with physical storage order
-- Check correlation to determine BRIN suitability
SELECT attname, correlation
FROM pg_stats
WHERE tablename = 'events'
ORDER BY ABS(correlation) DESC;
-- correlation near 1.0 (or -1.0) = good BRIN candidate
-- correlation near 0.0 = BRIN will be useless
Example 4: Index maintenance and monitoring
-- Find unused indexes (candidates for removal)
SELECT
schemaname,
tablename,
indexname,
idx_scan AS times_used,
pg_size_pretty(pg_relation_size(indexrelid)) AS index_size
FROM pg_stat_user_indexes
WHERE idx_scan = 0
AND indexname NOT LIKE '%_pkey' -- keep primary keys
AND indexname NOT LIKE '%_unique' -- keep unique constraints
ORDER BY pg_relation_size(indexrelid) DESC;
-- Find indexes that are rarely used but large
SELECT
tablename,
indexname,
idx_scan,
pg_size_pretty(pg_relation_size(indexrelid)) AS size,
pg_size_pretty(pg_total_relation_size(tablename::regclass)) AS table_size
FROM pg_stat_user_indexes
ORDER BY pg_relation_size(indexrelid) DESC
LIMIT 20;
-- Check for duplicate indexes (same column set, different names)
SELECT
a.tablename, a.indexname AS idx_a, b.indexname AS idx_b,
a.indexdef
FROM pg_indexes a JOIN pg_indexes b
ON a.tablename = b.tablename
AND a.indexdef = b.indexdef
AND a.indexname < b.indexname
ORDER BY a.tablename;
-- Index bloat — indexes grow over time with updates/deletes
-- REINDEX CONCURRENTLY rebuilds without locking
REINDEX INDEX CONCURRENTLY idx_orders_date;
-- Or VACUUM to reclaim dead tuples (run automatically by autovacuum)
VACUUM ANALYZE orders;
Key Takeaway
B-tree covers most needs (equality, range, ORDER BY). GIN covers multi-valued types (arrays, JSONB, tsvector). BRIN covers append-only time-series tables with minimal overhead. The composite index column order matters — most selective filter first. Covering indexes with INCLUDE eliminate heap access. Partial indexes reduce size and maintenance overhead. Regularly audit with pg_stat_user_indexes to drop unused indexes — every index costs write performance.