Full-Text Search Integration with JSONB and Arrays

Full-Text Search Integration with JSONB and Arrays

PostgreSQL has a native full-text search engine built in — no Elasticsearch required for many use cases. It can search JSONB values, array elements, and regular text columns, with ranking, prefix matching, and phrase search. This lesson covers integrating full-text search with JSONB-heavy schemas.

How PostgreSQL full-text search works

  1. tsvector: a pre-processed, normalised representation of a document — words are stemmed, stop words removed, and positions recorded
  2. tsquery: a search expression — words connected by & (AND), | (OR), ! (NOT), and <-> (adjacent/phrase)
  3. @@: the match operator — returns TRUE if a tsvector matches a tsquery
  4. ts_rank: a relevance score based on term frequency and position

Example 1: Full-text search on JSONB text fields

-- Search product names and JSONB descriptions
-- First, create a generated column for fast indexing
ALTER TABLE product_catalog ADD COLUMN search_vector TSVECTOR
    GENERATED ALWAYS AS (
        to_tsvector('english',
            COALESCE(name, '') || ' ' ||
            COALESCE(attributes ->> 'brand', '') || ' ' ||
            COALESCE(attributes ->> 'material', '') || ' ' ||
            COALESCE(category, '')
        )
    ) STORED;

CREATE INDEX idx_product_fts ON product_catalog USING GIN (search_vector);

-- Full-text search query
SELECT
    name,
    category,
    attributes ->> 'brand' AS brand,
    base_price,
    ts_rank(search_vector, query) AS rank
FROM product_catalog,
     to_tsquery('english', 'laptop & apple') AS query
WHERE search_vector @@ query
ORDER BY rank DESC;

-- Phrase search: "denim" followed by "cotton" (adjacent concepts)
SELECT name FROM product_catalog
WHERE search_vector @@ to_tsquery('english', 'cotton | denim');

-- Prefix/wildcard search using plainto_tsquery (simpler syntax)
SELECT name FROM product_catalog
WHERE search_vector @@ plainto_tsquery('english', 'portable laptop');

Example 2: Searching inside JSONB arrays (tags, descriptions)

SELECT 'See block below for the FTS+JSONB array pattern (illustrative).' AS info;
-- Search across JSONB array values (illustrative — adapt to your own
-- table that has a JSONB `tags` array column).
-- The pattern below shows how to flatten the array into a tsvector.
/*
ALTER TABLE articles ADD COLUMN search_vector TSVECTOR
    GENERATED ALWAYS AS (
        to_tsvector('english',
            title || ' ' ||
            COALESCE(
                (SELECT STRING_AGG(val #>> '{}', ' ')
                 FROM jsonb_array_elements(tags) AS val),
                '
            )
        )
    ) STORED;

CREATE INDEX idx_articles_fts ON articles USING GIN (search_vector);

-- Search articles
SELECT
    title,
    published_at,
    ts_headline('english', title,
        to_tsquery('english', 'window & function'),
        'MaxFragments=1, MaxWords=10, MinWords=5'
    ) AS headline  -- highlights matching terms
FROM articles
WHERE search_vector @@ to_tsquery('english', 'window & function')
ORDER BY ts_rank(search_vector, to_tsquery('english', 'window & function')) DESC;
*/

ts_headline is invaluable for search result display — it returns a snippet of the original text with matching terms highlighted (using <b> tags by default, or a custom delimiter).


Example 3: Combining full-text search with JSONB filtering

-- Faceted search: full-text AND structured filters from JSONB
SELECT
    name,
    base_price,
    attributes ->> 'brand' AS brand,
    (attributes ->> 'ram_gb')::INT AS ram_gb,
    ts_rank(search_vector, query) AS relevance
FROM product_catalog,
     to_tsquery('english', 'laptop | portable') AS query
WHERE
    -- Full-text match
    search_vector @@ query
    -- JSONB filter: only high-RAM models
    AND (attributes ->> 'ram_gb')::INT >= 16
    -- Price filter: regular column
    AND base_price <= 2000
    -- Category filter
    AND category = 'Electronics'
ORDER BY relevance DESC, base_price;

This is the faceted search pattern: full-text determines which documents are relevant, JSONB/column filters narrow to specific facets. The GIN index on search_vector handles the text part; the BTREE index on base_price handles the price filter.


Example 4: Building a search function with ranking

-- A reusable search function for products
CREATE OR REPLACE FUNCTION search_products(
    search_query TEXT,
    category_filter TEXT DEFAULT NULL,
    max_price NUMERIC DEFAULT NULL,
    min_ram INT DEFAULT NULL,
    result_limit INT DEFAULT 20
)
RETURNS TABLE (
    product_id  INT,
    name        TEXT,
    category    TEXT,
    brand       TEXT,
    base_price  NUMERIC,
    relevance   FLOAT4
) AS $$
    SELECT
        pc.product_id,
        pc.name,
        pc.category,
        pc.attributes ->> 'brand',
        pc.base_price,
        ts_rank(pc.search_vector, q.query)
    FROM product_catalog pc,
         websearch_to_tsquery('english', search_query) AS q(query)
    WHERE
        pc.search_vector @@ q.query
        AND (category_filter IS NULL OR pc.category = category_filter)
        AND (max_price IS NULL OR pc.base_price <= max_price)
        AND (min_ram IS NULL OR (pc.attributes ->> 'ram_gb')::INT >= min_ram)
    ORDER BY ts_rank(pc.search_vector, q.query) DESC
    LIMIT result_limit;
$$ LANGUAGE SQL STABLE;

-- Usage:
SELECT * FROM search_products('apple laptop', 'Electronics', 2500, 16);
SELECT * FROM search_products('denim cotton', 'Clothing');

Key Takeaway

PostgreSQL's full-text search works seamlessly with JSONB: use generated TSVECTOR columns that concatenate JSONB fields with regular text columns, index with GIN, and query with @@. The combination of full-text matching (@@) with JSONB containment (@>) and regular column filters gives you the faceted search that most applications need without an external search engine. ts_headline provides search result snippets with term highlighting.