Table Partitioning — RANGE, LIST, and HASH

Partitioning splits a large logical table into smaller physical pieces called partitions, each stored separately. The parent table defines the schema; each partition holds a specific subset of the data. Queries against the parent table automatically access only the relevant partitions — a technique called partition pruning.

Why partition?

Query performance: a query with WHERE created_at >= '2025-01-01' only scans the 2025 partition instead of the entire table
Maintenance: DROP TABLE orders_2020 is instantaneous and doesn't affect other partitions; VACUUM and REINDEX can run per-partition
Data lifecycle: automatically archiving or dropping old data without table-level operations
Parallel I/O: queries across multiple partitions can use enable_partitionwise_aggregate for parallel execution per partition

Three partition strategies

RANGE: partition by a range of values (dates, IDs) — most common for time-series
LIST: partition by explicit values (country, status, category)
HASH: distribute rows evenly by hash of a column — useful for load balancing without a natural range/list split

Example 1: RANGE partitioning by date

-- Create the partitioned parent table
CREATE TABLE orders_partitioned (
    order_id        SERIAL,
    customer_id     INT NOT NULL,
    created_at      DATE NOT NULL,
    status          TEXT NOT NULL,
    total_amount    NUMERIC(12,2) NOT NULL,
    discount_amount NUMERIC(12,2) NOT NULL DEFAULT 0,
    promo_code      TEXT
) PARTITION BY RANGE (created_at);

-- Create partitions — one per year
CREATE TABLE orders_2023 PARTITION OF orders_partitioned
    FOR VALUES FROM ('2023-01-01') TO ('2024-01-01');

CREATE TABLE orders_2024 PARTITION OF orders_partitioned
    FOR VALUES FROM ('2024-01-01') TO ('2025-01-01');

CREATE TABLE orders_2025 PARTITION OF orders_partitioned
    FOR VALUES FROM ('2025-01-01') TO ('2026-01-01');

-- DEFAULT partition catches anything not matched by other partitions
CREATE TABLE orders_overflow PARTITION OF orders_partitioned
    DEFAULT;

-- Create indexes on the parent — automatically propagated to all partitions
CREATE INDEX ON orders_partitioned (customer_id);
CREATE INDEX ON orders_partitioned (created_at, status);

-- Insert works exactly like a regular table — rows route automatically
INSERT INTO orders_partitioned (customer_id, created_at, status, total_amount)
VALUES (1, '2024-06-15', 'completed', 250.00);
-- This row automatically goes into orders_2024

-- Verify partition routing
SELECT tableoid::regclass AS partition, customer_id, created_at
FROM orders_partitioned WHERE order_id = (SELECT MAX(order_id) FROM orders_partitioned);

Partition pruning in action

-- Query plans only scan the relevant partition(s)
EXPLAIN SELECT * FROM orders_partitioned
WHERE created_at >= '2025-01-01' AND created_at < '2025-04-01';
-- Plan shows: "Seq Scan on orders_2025"
-- orders_2023 and orders_2024 are pruned entirely

-- Without partition key in WHERE: scans all partitions
EXPLAIN SELECT * FROM orders_partitioned WHERE status = 'completed';
-- Plan shows scans of all partitions (no pruning possible)

Example 2: LIST partitioning by category

-- Partition products by category
CREATE TABLE products_partitioned (
    product_id  SERIAL,
    name        TEXT NOT NULL,
    category    TEXT NOT NULL,
    price       NUMERIC(10,2) NOT NULL,
    attributes  JSONB
) PARTITION BY LIST (category);

CREATE TABLE products_electronics PARTITION OF products_partitioned
    FOR VALUES IN ('Electronics', 'Computers', 'Phones');

CREATE TABLE products_clothing PARTITION OF products_partitioned
    FOR VALUES IN ('Clothing', 'Shoes', 'Accessories');

CREATE TABLE products_books PARTITION OF products_partitioned
    FOR VALUES IN ('Books', 'Media');

CREATE TABLE products_other PARTITION OF products_partitioned
    DEFAULT;

-- Queries with category filter only scan relevant partition
EXPLAIN SELECT * FROM products_partitioned WHERE category = 'Electronics';
-- Only scans products_electronics

Example 3: Sub-partitioning (composite partitioning)

-- Partition by year, then sub-partition by status within each year
DROP TABLE IF EXISTS events_partitioned CASCADE;
CREATE TABLE events_partitioned (
    event_id    BIGSERIAL,
    event_type  TEXT NOT NULL,
    customer_id INT,
    occurred_at TIMESTAMPTZ NOT NULL,
    payload     JSONB
) PARTITION BY RANGE (occurred_at);

CREATE TABLE events_2025 PARTITION OF events_partitioned
    FOR VALUES FROM ('2025-01-01') TO ('2026-01-01')
    PARTITION BY RANGE (occurred_at);

-- Monthly sub-partitions of 2025
CREATE TABLE events_2025_01 PARTITION OF events_2025
    FOR VALUES FROM ('2025-01-01') TO ('2025-02-01');
CREATE TABLE events_2025_02 PARTITION OF events_2025
    FOR VALUES FROM ('2025-02-01') TO ('2025-03-01');
CREATE TABLE events_2025_03 PARTITION OF events_2025
    FOR VALUES FROM ('2025-03-01') TO ('2025-04-01');

-- Query with both year and month in WHERE: only scans one sub-partition
EXPLAIN SELECT * FROM events_partitioned
WHERE occurred_at >= '2025-03-01' AND occurred_at < '2025-04-01'
  AND event_type = 'purchase';
-- Accesses only events_2025_03

-- Drop an old partition (instant — no DELETE needed)
DROP TABLE events_2025_01;  -- instant, removes one month of data

Example 4: Partition management — archiving and maintenance

-- Attach an existing table as a partition (zero-copy archiving pattern)
-- Useful for loading historical data that was stored separately
CREATE TABLE orders_2022 (LIKE orders_partitioned INCLUDING ALL);
-- Load data into orders_2022...
ALTER TABLE orders_partitioned ATTACH PARTITION orders_2022
    FOR VALUES FROM ('2022-01-01') TO ('2023-01-01');

-- Detach a partition (the data stays in the now-standalone table)
ALTER TABLE orders_partitioned DETACH PARTITION orders_2023;
-- orders_2023 is now a regular table with all its data intact
-- You can archive it to cold storage, rename it, or export it

-- Check partition sizes
SELECT
    child.relname AS partition_name,
    pg_size_pretty(pg_relation_size(child.oid)) AS size,
    pg_stat_user_tables.n_live_tup AS live_rows
FROM pg_inherits
JOIN pg_class parent ON parent.oid = pg_inherits.inhparent
JOIN pg_class child  ON child.oid  = pg_inherits.inhrelid
LEFT JOIN pg_stat_user_tables ON pg_stat_user_tables.relname = child.relname
WHERE parent.relname = 'orders_partitioned'
ORDER BY child.relname;

-- Enable partition-wise aggregation for better parallel performance
SET enable_partitionwise_aggregate = ON;
EXPLAIN SELECT DATE_TRUNC('year', created_at), SUM(total_amount)
FROM orders_partitioned GROUP BY 1;
-- With partitionwise_aggregate: each partition aggregates independently in parallel

Key Takeaway

Partition by the column most frequently used in WHERE clauses — typically a date column for time-series data. RANGE partitioning is the most common strategy. Partition pruning eliminates irrelevant partitions at plan time. ATTACH PARTITION and DETACH PARTITION are the zero-downtime data lifecycle operations that make partitioning so powerful for long-lived tables. Always create indexes on the parent — they propagate to all current and future partitions.

← Previous Next →