Setting up a practice database

Setting up your practice database

This single lesson contains every CREATE TABLE and INSERT statement used anywhere in this course. Run it once on your own PostgreSQL instance and every example, every exercise, and every section will work without you ever having to hunt down setup scripts in individual lessons.

Lifetime access — pick this up whenever you like. You purchased this course on a one-time basis, so you have permanent access. You can sign in from the website, your phone, or your laptop and pick up exactly where you left off. There is no expiry, no recurring charge, and no rush. If you ever have a question, head to the user portal after signing in and send us a message — or use the contact form on absolutelearners.com. We answer every message personally.


How to run the SQL below

You only need a working PostgreSQL database. Three easy options:

Option 1 — Local PostgreSQL If you have Postgres installed, create a new database and connect with psql:

createdb shopmetrics
psql shopmetrics

Then paste the blocks below in order.

Option 2 — neon.tech (free cloud Postgres, no credit card)

  1. Sign up at neon.tech
  2. Create a project, open the SQL editor
  3. Paste the blocks below in order and hit Run

Option 3 — db-fiddle.com Quick experiments without an account. Paste schema + a query and you're off.

Whatever option you pick — type the queries yourself the first time around rather than copying and pasting. Building muscle memory with the syntax is the single biggest difference between people who recognise SQL and people who write SQL.


Block 1 — ShopMetrics e-commerce (used by Foundations & most early lessons)

-- Customers
CREATE TABLE IF NOT EXISTS customers (
    customer_id   SERIAL PRIMARY KEY,
    name          TEXT NOT NULL,
    email         TEXT UNIQUE NOT NULL,
    country       TEXT NOT NULL,
    created_at    TIMESTAMPTZ NOT NULL DEFAULT NOW()
);

-- Products
CREATE TABLE IF NOT EXISTS products (
    product_id    SERIAL PRIMARY KEY,
    name          TEXT NOT NULL,
    category      TEXT NOT NULL,
    price         NUMERIC(10,2) NOT NULL
);

-- Orders
CREATE TABLE IF NOT EXISTS orders (
    order_id      SERIAL PRIMARY KEY,
    customer_id   INT REFERENCES customers,
    created_at    TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    status        TEXT NOT NULL  -- 'completed', 'refunded', 'pending'
);

-- Order line items
CREATE TABLE IF NOT EXISTS order_items (
    item_id       SERIAL PRIMARY KEY,
    order_id      INT REFERENCES orders,
    product_id    INT REFERENCES products,
    quantity      INT NOT NULL,
    unit_price    NUMERIC(10,2) NOT NULL
);

-- Page events (for funnel analysis later)
CREATE TABLE IF NOT EXISTS events (
    event_id      SERIAL PRIMARY KEY,
    customer_id   INT REFERENCES customers,
    event_type    TEXT NOT NULL,  -- 'view', 'add_to_cart', 'purchase'
    page          TEXT,
    occurred_at   TIMESTAMPTZ NOT NULL DEFAULT NOW()
);
-- Customers (50 rows)
INSERT INTO customers (name, email, country, created_at)
SELECT
    'Customer ' || n,
    'user' || n || '@example.com',
    (ARRAY['US','UK','CA','AU','DE'])[1 + (n % 5)],
    NOW() - (random() * INTERVAL '2 years')
FROM generate_series(1, 50) AS n
ON CONFLICT (email) DO NOTHING;

-- Products (20 rows)
INSERT INTO products (name, category, price)
VALUES
    ('SQL Mastery Book',        'Books',     49.99),
    ('Python Crash Course',     'Books',     39.99),
    ('Mechanical Keyboard',     'Hardware', 129.00),
    ('Wireless Mouse',          'Hardware',  59.99),
    ('4K Monitor',              'Hardware', 499.00),
    ('VSCode Pro Theme Pack',   'Software',  12.00),
    ('Linux Command Poster',    'Merch',     19.99),
    ('Dev Sticker Pack',        'Merch',      9.99),
    ('Cloud Storage 1yr',       'Software',  99.00),
    ('Data Engineering Course', 'Courses',  199.00),
    ('React Deep Dive',         'Courses',   89.00),
    ('PostgreSQL internals',    'Books',     59.99),
    ('Standing Desk Mat',       'Hardware',  79.00),
    ('USB-C Hub',               'Hardware',  45.00),
    ('Noise Cancelling Buds',   'Hardware', 179.00),
    ('Git Cheatsheet Poster',   'Merch',     14.99),
    ('Bash Scripting Guide',    'Books',     29.99),
    ('CI/CD Pipeline Course',   'Courses',  149.00),
    ('API Design Handbook',     'Books',     44.99),
    ('Dark Mode Icon Pack',     'Software',   8.00)
ON CONFLICT DO NOTHING;

-- Orders (200 rows spread across 2 years)
INSERT INTO orders (customer_id, created_at, status)
SELECT
    1 + (random() * 49)::INT,
    NOW() - (random() * INTERVAL '2 years'),
    (ARRAY['completed','completed','completed','refunded','pending'])[1 + (random()*4)::INT]
FROM generate_series(1, 200)
ON CONFLICT DO NOTHING;

-- Order items (1-4 items per order)
INSERT INTO order_items (order_id, product_id, quantity, unit_price)
SELECT
    o.order_id,
    1 + (random() * 19)::INT,
    1 + (random() * 3)::INT,
    p.price
FROM orders o
CROSS JOIN LATERAL (
    SELECT price FROM products
    ORDER BY random()
    LIMIT 1 + (random() * 3)::INT
) p
ON CONFLICT DO NOTHING;

-- Events
INSERT INTO events (customer_id, event_type, page, occurred_at)
SELECT
    1 + (random() * 49)::INT,
    (ARRAY['view','view','view','add_to_cart','add_to_cart','purchase'])[1 + (random()*5)::INT],
    '/product/' || (1 + (random() * 19)::INT),
    NOW() - (random() * INTERVAL '2 years')
FROM generate_series(1, 1000)
ON CONFLICT DO NOTHING;

Block 2 — Stock Market (companies, stock_prices, dividends)

Used in: CTEs, Window Functions, Stock Market Analytics, Time Series.

CREATE TABLE IF NOT EXISTS companies (
    company_id    SERIAL PRIMARY KEY,
    ticker        VARCHAR(10) UNIQUE NOT NULL,
    name          TEXT NOT NULL,
    sector        TEXT NOT NULL,
    market_cap_b  NUMERIC(10,2),
    founded_year  INT
);

CREATE TABLE IF NOT EXISTS stock_prices (
    price_id     SERIAL PRIMARY KEY,
    company_id   INT REFERENCES companies,
    price_date   DATE NOT NULL,
    open_price   NUMERIC(10,2) NOT NULL,
    high_price   NUMERIC(10,2) NOT NULL,
    low_price    NUMERIC(10,2) NOT NULL,
    close_price  NUMERIC(10,2) NOT NULL,
    volume       BIGINT NOT NULL,
    UNIQUE(company_id, price_date)
);

CREATE TABLE IF NOT EXISTS dividends (
    dividend_id      SERIAL PRIMARY KEY,
    company_id       INT REFERENCES companies,
    ex_date          DATE NOT NULL,
    amount_per_share NUMERIC(8,4) NOT NULL
);
INSERT INTO companies (ticker, name, sector, market_cap_b, founded_year) VALUES
  ('AAPL',  'Apple Inc.',            'Technology',    2800.00, 1976),
  ('MSFT',  'Microsoft Corp.',       'Technology',    2500.00, 1975),
  ('GOOGL', 'Alphabet Inc.',         'Technology',    1700.00, 1998),
  ('AMZN',  'Amazon.com Inc.',       'Consumer',      1400.00, 1994),
  ('NVDA',  'NVIDIA Corp.',          'Technology',    1200.00, 1993),
  ('TSLA',  'Tesla Inc.',            'Automotive',     650.00, 2003),
  ('META',  'Meta Platforms Inc.',   'Technology',    1100.00, 2004),
  ('NFLX',  'Netflix Inc.',          'Entertainment',  230.00, 1997),
  ('CRM',   'Salesforce Inc.',       'Technology',     220.00, 1999),
  ('SHOP',  'Shopify Inc.',          'Technology',     110.00, 2006)
ON CONFLICT (ticker) DO NOTHING;

INSERT INTO stock_prices (company_id, price_date, open_price, high_price, low_price, close_price, volume)
SELECT
    c.company_id,
    d::DATE AS price_date,
    ROUND((base + random() * 20 - 10)::NUMERIC, 2) AS open_price,
    ROUND((base + random() * 25 - 5)::NUMERIC, 2)  AS high_price,
    ROUND((base + random() * 25 - 25)::NUMERIC, 2) AS low_price,
    ROUND((base + random() * 20 - 10)::NUMERIC, 2) AS close_price,
    (1000000 + (random() * 9000000)::BIGINT)        AS volume
FROM
    generate_series('2024-01-02'::DATE, '2025-12-31'::DATE, '1 day'::INTERVAL) d
    CROSS JOIN (VALUES
        (1, 185.00),(2, 375.00),(3, 140.00),(4, 185.00),(5, 480.00),
        (6, 250.00),(7, 380.00),(8, 620.00),(9, 285.00),(10, 90.00)
    ) AS c(company_id, base)
WHERE EXTRACT(DOW FROM d) NOT IN (0,6)
ON CONFLICT DO NOTHING;

INSERT INTO dividends (company_id, ex_date, amount_per_share) VALUES
  (1, '2024-02-09', 0.24),(1, '2024-05-10', 0.25),(1, '2024-08-12', 0.25),(1, '2024-11-08', 0.25),
  (2, '2024-02-14', 0.75),(2, '2024-05-15', 0.75),(2, '2024-08-14', 0.75),(2, '2024-11-20', 0.75),
  (2, '2025-02-12', 0.75),(2, '2025-05-14', 0.75),
  (1, '2025-02-07', 0.25),(1, '2025-05-09', 0.25)
ON CONFLICT DO NOTHING;

Block 3 — Sports Analytics (teams, players, matches, player_stats)

CREATE TABLE IF NOT EXISTS teams (
    team_id      SERIAL PRIMARY KEY,
    name         TEXT NOT NULL,
    city         TEXT NOT NULL,
    country      TEXT NOT NULL,
    league       TEXT NOT NULL,
    founded_year INT,
    stadium      TEXT,
    capacity     INT
);

CREATE TABLE IF NOT EXISTS players (
    player_id      SERIAL PRIMARY KEY,
    team_id        INT REFERENCES teams,
    name           TEXT NOT NULL,
    position       TEXT NOT NULL,
    nationality    TEXT NOT NULL,
    age            INT,
    market_value_m NUMERIC(8,2)
);

CREATE TABLE IF NOT EXISTS matches (
    match_id      SERIAL PRIMARY KEY,
    home_team_id  INT REFERENCES teams,
    away_team_id  INT REFERENCES teams,
    match_date    DATE NOT NULL,
    season        TEXT NOT NULL,
    competition   TEXT NOT NULL,
    home_goals    INT NOT NULL DEFAULT 0,
    away_goals    INT NOT NULL DEFAULT 0,
    attendance    INT,
    referee       TEXT
);

CREATE TABLE IF NOT EXISTS player_stats (
    stat_id        SERIAL PRIMARY KEY,
    match_id       INT REFERENCES matches,
    player_id      INT REFERENCES players,
    goals          INT NOT NULL DEFAULT 0,
    assists        INT NOT NULL DEFAULT 0,
    minutes_played INT NOT NULL DEFAULT 0,
    shots          INT NOT NULL DEFAULT 0,
    yellow_cards   INT NOT NULL DEFAULT 0,
    red_cards      INT NOT NULL DEFAULT 0,
    rating         NUMERIC(4,2)
);
INSERT INTO teams (name, city, country, league, founded_year, stadium, capacity) VALUES
  ('Manchester City',   'Manchester',    'England', 'Premier League', 1880, 'Etihad Stadium',                  53400),
  ('Arsenal',           'London',        'England', 'Premier League', 1886, 'Emirates Stadium',                60704),
  ('Liverpool',         'Liverpool',     'England', 'Premier League', 1892, 'Anfield',                         61276),
  ('Chelsea',           'London',        'England', 'Premier League', 1905, 'Stamford Bridge',                 40341),
  ('Manchester United', 'Manchester',    'England', 'Premier League', 1878, 'Old Trafford',                    74310),
  ('Tottenham Hotspur', 'London',        'England', 'Premier League', 1882, 'Tottenham Hotspur Stadium',       62850),
  ('Newcastle United',  'Newcastle',     'England', 'Premier League', 1892, 'St James Park',                   52305),
  ('Aston Villa',       'Birmingham',    'England', 'Premier League', 1874, 'Villa Park',                      42785),
  ('West Ham United',   'London',        'England', 'Premier League', 1895, 'London Stadium',                  62500),
  ('Brighton',          'Brighton',      'England', 'Premier League', 1901, 'Amex Stadium',                    31800),
  ('Real Madrid',       'Madrid',        'Spain',   'La Liga',        1902, 'Santiago Bernabeu',               81044),
  ('Barcelona',         'Barcelona',     'Spain',   'La Liga',        1899, 'Spotify Camp Nou',                99354),
  ('Atletico Madrid',   'Madrid',        'Spain',   'La Liga',        1903, 'Civitas Metropolitano',           68456),
  ('Sevilla',           'Seville',       'Spain',   'La Liga',        1890, 'Ramon Sanchez-Pizjuan',           43883),
  ('Real Sociedad',     'San Sebastian', 'Spain',   'La Liga',        1909, 'Reale Arena',                     39500),
  ('Villarreal',        'Villarreal',    'Spain',   'La Liga',        1923, 'Estadio de la Ceramica',          23500),
  ('Athletic Bilbao',   'Bilbao',        'Spain',   'La Liga',        1898, 'San Mames',                       53289),
  ('Valencia',          'Valencia',      'Spain',   'La Liga',        1919, 'Mestalla',                        55000),
  ('Real Betis',        'Seville',       'Spain',   'La Liga',        1907, 'Estadio Benito Villamarin',       60721),
  ('Girona',            'Girona',        'Spain',   'La Liga',        1930, 'Estadi Montilivi',                13450);

INSERT INTO players (team_id, name, position, nationality, age, market_value_m) VALUES
  (1, 'Ederson',        'GK',  'Brazil',    31, 45.0),
  (1, 'Ruben Dias',     'DEF', 'Portugal',  26, 90.0),
  (1, 'Josko Gvardiol', 'DEF', 'Croatia',   22, 80.0),
  (1, 'Rodri',          'MID', 'Spain',     27, 130.0),
  (1, 'Kevin De Bruyne','MID', 'Belgium',   33, 60.0),
  (1, 'Phil Foden',     'MID', 'England',   24, 150.0),
  (1, 'Erling Haaland', 'FWD', 'Norway',    23, 180.0),
  (2, 'David Raya',     'GK',  'Spain',     29, 40.0),
  (2, 'Ben White',      'DEF', 'England',   26, 65.0),
  (2, 'William Saliba', 'DEF', 'France',    23, 100.0),
  (2, 'Declan Rice',    'MID', 'England',   25, 120.0),
  (2, 'Martin Odegaard','MID', 'Norway',    25, 130.0),
  (2, 'Bukayo Saka',    'FWD', 'England',   22, 160.0),
  (2, 'Gabriel Martinelli','FWD','Brazil',  23, 100.0),
  (3, 'Alisson',        'GK',  'Brazil',    31, 60.0),
  (3, 'Trent Alexander-Arnold','DEF','England',25,90.0),
  (3, 'Virgil van Dijk','DEF', 'Netherlands',33,55.0),
  (3, 'Alexis Mac Allister','MID','Argentina',25,80.0),
  (3, 'Dominik Szoboszlai','MID','Hungary', 23, 80.0),
  (3, 'Mohamed Salah',  'FWD', 'Egypt',     32, 60.0),
  (3, 'Darwin Nunez',   'FWD', 'Uruguay',   25, 80.0),
  (11,'Thibaut Courtois','GK', 'Belgium',   32, 45.0),
  (11,'Dani Carvajal',  'DEF', 'Spain',     32, 35.0),
  (11,'Antonio Rudiger','DEF','Germany',    31, 35.0),
  (11,'Luka Modric',    'MID', 'Croatia',   39, 10.0),
  (11,'Jude Bellingham','MID','England',    20, 200.0),
  (11,'Vinicius Jr',    'FWD', 'Brazil',    23, 200.0),
  (11,'Kylian Mbappe',  'FWD', 'France',    25, 180.0),
  (12,'Marc-Andre ter Stegen','GK','Germany',32,30.0),
  (12,'Ronald Araujo',  'DEF', 'Uruguay',   25, 80.0),
  (12,'Pedri',          'MID', 'Spain',     21, 130.0),
  (12,'Gavi',           'MID', 'Spain',     19, 120.0),
  (12,'Ferran Torres',  'FWD', 'Spain',     24, 50.0),
  (12,'Robert Lewandowski','FWD','Poland',  36, 25.0);

INSERT INTO matches (home_team_id, away_team_id, match_date, season, competition, home_goals, away_goals, attendance) VALUES
  (1, 2,  '2023-08-19', '2023-24', 'Premier League', 1, 0, 53200),
  (3, 4,  '2023-08-20', '2023-24', 'Premier League', 2, 1, 60800),
  (5, 6,  '2023-08-21', '2023-24', 'Premier League', 0, 2, 73400),
  (7, 8,  '2023-08-26', '2023-24', 'Premier League', 3, 1, 51900),
  (9, 10, '2023-08-26', '2023-24', 'Premier League', 1, 1, 62000),
  (2, 1,  '2023-10-08', '2023-24', 'Premier League', 0, 1, 60200),
  (4, 3,  '2023-10-22', '2023-24', 'Premier League', 1, 3, 39800),
  (1, 5,  '2023-12-10', '2023-24', 'Premier League', 3, 0, 53100),
  (3, 9,  '2023-12-23', '2023-24', 'Premier League', 4, 0, 60900),
  (2, 7,  '2024-01-14', '2023-24', 'Premier League', 2, 0, 60300),
  (1, 3,  '2024-03-10', '2023-24', 'Premier League', 1, 1, 53400),
  (2, 4,  '2024-04-07', '2023-24', 'Premier League', 5, 0, 60600),
  (11,12, '2023-10-28', '2023-24', 'La Liga',         2, 1, 80500),
  (12,11, '2024-04-21', '2023-24', 'La Liga',         1, 3, 98000),
  (11,13, '2023-09-24', '2023-24', 'La Liga',         3, 1, 79200),
  (13,11, '2024-02-04', '2023-24', 'La Liga',         2, 1, 67800),
  (12,13, '2023-11-05', '2023-24', 'La Liga',         1, 0, 97400),
  (11,14, '2023-08-13', '2023-24', 'La Liga',         2, 0, 78900),
  (11,15, '2024-01-13', '2023-24', 'La Liga',         2, 0, 80100),
  (1, 3,  '2024-09-22', '2024-25', 'Premier League', 2, 2, 53400),
  (2, 5,  '2024-09-01', '2024-25', 'Premier League', 3, 0, 60700),
  (3, 2,  '2024-10-27', '2024-25', 'Premier League', 2, 1, 61000),
  (4, 1,  '2024-11-10', '2024-25', 'Premier League', 0, 3, 40100),
  (1, 6,  '2024-12-15', '2024-25', 'Premier League', 4, 1, 53300),
  (11,12, '2024-10-26', '2024-25', 'La Liga',         0, 4, 81000),
  (12,11, '2025-05-11', '2024-25', 'La Liga',         2, 2, 99000);

INSERT INTO player_stats (match_id, player_id, goals, assists, minutes_played, shots, yellow_cards, rating)
SELECT
    m.match_id,
    p.player_id,
    CASE WHEN p.position = 'FWD' THEN (random() * 1.5)::INT
         WHEN p.position = 'MID' THEN (random() * 0.8)::INT
         ELSE 0 END AS goals,
    CASE WHEN p.position IN ('FWD','MID') THEN (random() * 1.2)::INT ELSE 0 END AS assists,
    CASE WHEN p.position = 'GK' THEN 90
         ELSE 45 + (random() * 50)::INT END AS minutes_played,
    CASE WHEN p.position = 'FWD' THEN (random() * 5)::INT
         WHEN p.position = 'MID' THEN (random() * 3)::INT
         ELSE (random() * 1)::INT END AS shots,
    CASE WHEN random() < 0.1 THEN 1 ELSE 0 END AS yellow_cards,
    ROUND((6.0 + random() * 3.5)::NUMERIC, 1) AS rating
FROM matches m
CROSS JOIN players p
WHERE p.team_id IN (m.home_team_id, m.away_team_id)
  AND random() < 0.8;

Block 4 — Finance & Invoicing (clients, invoices, invoice_items, payments)

CREATE TABLE IF NOT EXISTS clients (
    client_id    SERIAL PRIMARY KEY,
    company_name TEXT NOT NULL,
    industry     TEXT NOT NULL,
    country      TEXT NOT NULL,
    credit_limit NUMERIC(12,2),
    since_date   DATE NOT NULL,
    account_manager TEXT
);

CREATE TABLE IF NOT EXISTS invoices (
    invoice_id  SERIAL PRIMARY KEY,
    client_id   INT REFERENCES clients,
    invoice_no  VARCHAR(20) UNIQUE NOT NULL,
    issue_date  DATE NOT NULL,
    due_date    DATE NOT NULL,
    status      TEXT NOT NULL
);

CREATE TABLE IF NOT EXISTS invoice_items (
    item_id     SERIAL PRIMARY KEY,
    invoice_id  INT REFERENCES invoices,
    description TEXT NOT NULL,
    quantity    NUMERIC(10,2) NOT NULL,
    unit_price  NUMERIC(12,2) NOT NULL,
    tax_rate    NUMERIC(5,4) NOT NULL DEFAULT 0.20
);

CREATE TABLE IF NOT EXISTS payments (
    payment_id SERIAL PRIMARY KEY,
    invoice_id INT REFERENCES invoices,
    payment_date DATE NOT NULL,
    amount     NUMERIC(12,2) NOT NULL,
    method     TEXT NOT NULL
);
INSERT INTO clients (company_name, industry, country, credit_limit, since_date, account_manager) VALUES
  ('Acme Corp',            'Manufacturing',  'US', 500000, '2019-03-15', 'Sarah Chen'),
  ('TechFlow Ltd',         'Software',       'UK', 250000, '2020-07-01', 'James Wilson'),
  ('Nordic Supplies',      'Retail',         'SE', 180000, '2021-01-10', 'Sarah Chen'),
  ('Gulf Dynamics',        'Energy',         'AE', 750000, '2018-11-20', 'Ahmed Hassan'),
  ('Pacific Ventures',     'Finance',        'AU', 300000, '2022-02-14', 'James Wilson'),
  ('Berlin Digital',       'Software',       'DE', 200000, '2020-09-05', 'Laura Schmidt'),
  ('Maple Industries',     'Manufacturing',  'CA', 420000, '2019-06-30', 'Sarah Chen'),
  ('Tokyo Systems',        'Technology',     'JP', 550000, '2017-04-01', 'Ahmed Hassan'),
  ('Sunrise Retail',       'Retail',         'US', 150000, '2023-01-15', 'James Wilson'),
  ('Alpine Consulting',    'Consulting',     'CH', 350000, '2020-12-01', 'Laura Schmidt'),
  ('Iberian Partners',     'Manufacturing',  'ES', 280000, '2021-05-20', 'Laura Schmidt'),
  ('Cairo Solutions',      'Technology',     'EG', 120000, '2022-08-01', 'Ahmed Hassan'),
  ('Mumbai Tech',          'Software',       'IN', 190000, '2021-03-15', 'Ahmed Hassan'),
  ('Rio Commerce',         'Retail',         'BR', 160000, '2022-11-01', 'James Wilson'),
  ('Seoul Digital',        'Technology',     'KR', 340000, '2020-06-15', 'Ahmed Hassan'),
  ('Amsterdam Logistics',  'Logistics',      'NL', 410000, '2019-09-10', 'Laura Schmidt'),
  ('Singapore Finance',    'Finance',        'SG', 620000, '2018-04-20', 'James Wilson'),
  ('Toronto Media',        'Media',          'CA', 230000, '2021-07-05', 'Sarah Chen'),
  ('Sydney Health',        'Healthcare',     'AU', 380000, '2020-02-28', 'James Wilson'),
  ('Paris Fashion',        'Retail',         'FR', 290000, '2021-10-15', 'Laura Schmidt'),
  ('Lagos Energy',         'Energy',         'NG', 450000, '2019-12-01', 'Ahmed Hassan'),
  ('Buenos Aires Tech',    'Software',       'AR', 140000, '2022-04-10', 'James Wilson'),
  ('Warsaw Manufacturing', 'Manufacturing',  'PL', 320000, '2020-08-20', 'Laura Schmidt'),
  ('Nairobi Consulting',   'Consulting',     'KE', 110000, '2023-03-01', 'Ahmed Hassan'),
  ('Vienna Media',         'Media',          'AT', 175000, '2022-01-15', 'Laura Schmidt'),
  ('Dallas Energy',        'Energy',         'US', 680000, '2018-07-10', 'Sarah Chen'),
  ('Montreal Logistics',   'Logistics',      'CA', 290000, '2021-09-25', 'Sarah Chen'),
  ('Milan Fashion',        'Retail',         'IT', 220000, '2020-11-30', 'Laura Schmidt'),
  ('Helsinki Software',    'Software',       'FI', 195000, '2022-06-01', 'Laura Schmidt'),
  ('Cape Town Mining',     'Mining',         'ZA', 530000, '2019-02-14', 'Ahmed Hassan');

INSERT INTO invoices (client_id, invoice_no, issue_date, due_date, status)
SELECT
    c.client_id,
    'INV-' || LPAD(ROW_NUMBER() OVER (ORDER BY c.client_id, d)::TEXT, 5, '0'),
    d::DATE AS issue_date,
    (d + INTERVAL '30 days')::DATE AS due_date,
    CASE
        WHEN d > CURRENT_DATE - INTERVAL '30 days' THEN 'sent'
        WHEN random() < 0.75 THEN 'paid'
        WHEN random() < 0.5  THEN 'overdue'
        ELSE 'sent'
    END AS status
FROM clients c
CROSS JOIN generate_series(
    c.since_date,
    LEAST(c.since_date + INTERVAL '3 years', CURRENT_DATE),
    INTERVAL '45 days'
) d
ON CONFLICT (invoice_no) DO NOTHING;

INSERT INTO invoice_items (invoice_id, description, quantity, unit_price, tax_rate)
SELECT
    i.invoice_id,
    (ARRAY['Consulting Services','Software License','Support & Maintenance',
           'Implementation','Training','Data Migration','API Integration',
           'Project Management','Security Audit','Cloud Hosting'])[1 + (random()*9)::INT],
    ROUND((1 + random() * 9)::NUMERIC, 0) AS quantity,
    ROUND((500 + random() * 4500)::NUMERIC, 2) AS unit_price,
    CASE WHEN random() < 0.3 THEN 0.0 ELSE 0.20 END AS tax_rate
FROM invoices i
CROSS JOIN generate_series(1, 2 + (random() * 2)::INT) AS n;

INSERT INTO payments (invoice_id, payment_date, amount, method)
SELECT
    i.invoice_id,
    i.due_date - ((random() * 10)::INT || ' days')::INTERVAL,
    ROUND((SELECT SUM(quantity * unit_price) FROM invoice_items ii WHERE ii.invoice_id = i.invoice_id), 2),
    (ARRAY['bank_transfer','credit_card','check','bank_transfer','bank_transfer'])[1 + (random()*4)::INT]
FROM invoices i
WHERE i.status = 'paid';

-- Convenience column: cached invoice total computed from line items.
-- Many lessons reference `invoices.total_amount` directly; we keep it
-- in sync via a one-time UPDATE here.
ALTER TABLE invoices ADD COLUMN IF NOT EXISTS total_amount NUMERIC(12,2);
UPDATE invoices i
SET total_amount = (
    SELECT COALESCE(SUM(ii.quantity * ii.unit_price), 0)
    FROM invoice_items ii
    WHERE ii.invoice_id = i.invoice_id
)
WHERE total_amount IS NULL;

Block 5 — Extended e-commerce (ec_* tables)

Used in: CTEs, Real-World Patterns, A/B testing, funnel and cohort lessons.

CREATE TABLE IF NOT EXISTS ec_customers (
    customer_id         SERIAL PRIMARY KEY,
    name                TEXT NOT NULL,
    email               TEXT UNIQUE NOT NULL,
    country             TEXT NOT NULL,
    city                TEXT,
    acquisition_channel TEXT NOT NULL DEFAULT 'organic',
    created_at          TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    first_order_date    DATE
);

CREATE TABLE IF NOT EXISTS ec_products (
    product_id   SERIAL PRIMARY KEY,
    name         TEXT NOT NULL,
    category     TEXT NOT NULL,
    subcategory  TEXT,
    price        NUMERIC(10,2) NOT NULL,
    cost         NUMERIC(10,2) NOT NULL DEFAULT 0
);

CREATE TABLE IF NOT EXISTS ec_orders (
    order_id        SERIAL PRIMARY KEY,
    customer_id     INT REFERENCES ec_customers,
    created_at      TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    status          TEXT NOT NULL,
    total_amount    NUMERIC(12,2) NOT NULL DEFAULT 0,
    discount_amount NUMERIC(12,2) NOT NULL DEFAULT 0,
    promo_code      TEXT
);

CREATE TABLE IF NOT EXISTS ec_order_items (
    item_id     SERIAL PRIMARY KEY,
    order_id    INT REFERENCES ec_orders,
    product_id  INT REFERENCES ec_products,
    quantity    INT NOT NULL,
    unit_price  NUMERIC(10,2) NOT NULL
);

CREATE TABLE IF NOT EXISTS ec_events (
    event_id    SERIAL PRIMARY KEY,
    customer_id INT REFERENCES ec_customers,
    session_id  TEXT,
    event_type  TEXT NOT NULL,
    page        TEXT,
    created_at  TIMESTAMPTZ NOT NULL DEFAULT NOW()
);

CREATE TABLE IF NOT EXISTS ec_ab_assignments (
    customer_id  INT REFERENCES ec_customers,
    experiment   TEXT NOT NULL,
    variant      TEXT NOT NULL,
    assigned_at  TIMESTAMPTZ NOT NULL,
    converted    BOOLEAN NOT NULL DEFAULT FALSE,
    revenue      NUMERIC(10,2),
    PRIMARY KEY (customer_id, experiment)
);
INSERT INTO ec_customers (name, email, country, city, acquisition_channel, created_at)
SELECT
    'Customer ' || n,
    'user' || n || '@example.com',
    (ARRAY['US','UK','CA','AU','DE','FR','JP','BR','IN','NL'])[1 + (n % 10)],
    (ARRAY['New York','London','Berlin','Paris','Toronto','Sydney','Tokyo','Sao Paulo','Mumbai','Amsterdam'])[1 + (n % 10)],
    (ARRAY['organic','paid_search','social','email','referral'])[1 + (n % 5)],
    NOW() - ((random() * 1095)::INT || ' days')::INTERVAL
FROM generate_series(1, 500) n
ON CONFLICT DO NOTHING;

INSERT INTO ec_products (name, category, subcategory, price, cost)
SELECT
    cat || ' ' || subcat || ' ' || n,
    cat,
    subcat,
    ROUND((15 + random() * 485)::NUMERIC, 2),
    ROUND((5  + random() * 120)::NUMERIC, 2)
FROM (VALUES
    ('Electronics','Laptops'),   ('Electronics','Phones'),
    ('Electronics','Tablets'),   ('Clothing','Tops'),
    ('Clothing','Bottoms'),      ('Clothing','Shoes'),
    ('Home','Kitchen'),          ('Home','Bedding'),
    ('Books','Fiction'),         ('Books','Non-Fiction'),
    ('Sports','Equipment'),      ('Sports','Apparel')
) AS cats(cat, subcat)
CROSS JOIN generate_series(1, 9) n
ON CONFLICT DO NOTHING;

INSERT INTO ec_orders (customer_id, created_at, status, total_amount, discount_amount, promo_code)
SELECT
    c.customer_id,
    c.created_at + ((random() * 700 + 1)::INT || ' days')::INTERVAL,
    (ARRAY['completed','completed','completed','refunded','pending','cancelled'])[1 + (random()*5)::INT],
    0,
    CASE WHEN random() < 0.2 THEN ROUND((random() * 30)::NUMERIC, 2) ELSE 0 END,
    CASE WHEN random() < 0.2 THEN 'SAVE' || (10 + (random()*20)::INT)::TEXT ELSE NULL END
FROM ec_customers c
CROSS JOIN generate_series(1, 1 + (random()*7)::INT) g
WHERE (c.created_at + ((random() * 700 + 1)::INT || ' days')::INTERVAL) <= NOW();

INSERT INTO ec_order_items (order_id, product_id, quantity, unit_price)
SELECT
    o.order_id,
    1 + (random() * 107)::INT,
    1 + (random() * 3)::INT,
    ROUND((10 + random() * 290)::NUMERIC, 2)
FROM ec_orders o
CROSS JOIN generate_series(1, 2 + (random() * 2)::INT) n;

UPDATE ec_orders o
SET total_amount = GREATEST(
    0,
    (SELECT ROUND(SUM(quantity * unit_price), 2) FROM ec_order_items oi WHERE oi.order_id = o.order_id)
    - o.discount_amount
);

UPDATE ec_customers c
SET first_order_date = (
    SELECT MIN(created_at::DATE) FROM ec_orders o WHERE o.customer_id = c.customer_id
);

INSERT INTO ec_events (customer_id, session_id, event_type, page, created_at)
SELECT
    o.customer_id,
    'sess_' || o.customer_id || '_' || o.order_id,
    evt_type,
    CASE evt_type
        WHEN 'page_view'    THEN (ARRAY['/home','/ec_products','/about','/blog'])[1+(random()*3)::INT]
        WHEN 'view'         THEN '/product/' || (1+(random()*107)::INT)::TEXT
        WHEN 'add_to_cart'  THEN '/cart'
        WHEN 'checkout'     THEN '/checkout'
        WHEN 'purchase'     THEN '/confirmation'
    END,
    o.created_at - ((5 - step) || ' minutes')::INTERVAL
FROM ec_orders o
CROSS JOIN (VALUES
    (1,'page_view'),(2,'view'),(3,'add_to_cart'),(4,'checkout'),(5,'purchase')
) AS e(step, evt_type)
WHERE o.status = 'completed';

INSERT INTO ec_ab_assignments (customer_id, experiment, variant, assigned_at, converted, revenue)
SELECT
    customer_id,
    'checkout_redesign',
    CASE WHEN random() < 0.5 THEN 'control' ELSE 'treatment' END,
    created_at + '30 days'::INTERVAL,
    random() < CASE WHEN random() < 0.5 THEN 0.12 ELSE 0.16 END,
    CASE WHEN random() < 0.14 THEN ROUND((20 + random() * 280)::NUMERIC, 2) ELSE NULL END
FROM ec_customers
WHERE random() < 0.7
ON CONFLICT DO NOTHING;

Block 6 — Full Text Search articles (fts_articles)

Used in: every Full Text Search lesson.

CREATE TABLE IF NOT EXISTS fts_articles (
    id          SERIAL PRIMARY KEY,
    title       TEXT NOT NULL,
    body        TEXT NOT NULL,
    author      TEXT,
    category    TEXT,
    published   DATE,
    search_vec  TSVECTOR
);

INSERT INTO fts_articles (title, body, author, category, published)
SELECT * FROM (VALUES
  ('PostgreSQL Performance Tuning',
   'Learn how to optimize PostgreSQL queries using indexes, EXPLAIN ANALYZE, and vacuum strategies for high-throughput systems.',
   'Alice Chen', 'Database', '2024-01-15'::DATE),
  ('Understanding Database Indexes',
   'A deep dive into B-tree, hash, GIN, and GiST index types, when to use each, and how they affect query planning.',
   'Bob Smith', 'Database', '2024-02-01'::DATE),
  ('Python and PostgreSQL Integration',
   'Connect Python applications to PostgreSQL using psycopg2 and SQLAlchemy, with async support via asyncpg.',
   'Carol White', 'Programming', '2024-02-20'::DATE),
  ('Advanced SQL Window Functions',
   'Master ROW_NUMBER, RANK, DENSE_RANK, LAG, LEAD, and complex window frames for analytical queries.',
   'Alice Chen', 'Database', '2024-03-05'::DATE),
  ('Building REST APIs with FastAPI',
   'Create high-performance Python REST APIs using FastAPI, Pydantic models, and async database connections.',
   'David Lee', 'Programming', '2024-03-18'::DATE),
  ('Time Series Data in PostgreSQL',
   'Store and query time series metrics efficiently using partitioning, window functions, and the date_trunc function.',
   'Carol White', 'Database', '2024-04-02'::DATE),
  ('PostgreSQL Security Best Practices',
   'Secure your PostgreSQL installation with row-level security, encryption at rest, SSL connections, and role management.',
   'Bob Smith', 'Security', '2024-04-15'::DATE),
  ('Machine Learning with SQL',
   'Use PostgreSQL extensions for machine learning: linear regression, k-means clustering, and statistical functions.',
   'Alice Chen', 'Data Science', '2024-05-01'::DATE)
) v(title, body, author, category, published)
WHERE NOT EXISTS (SELECT 1 FROM fts_articles LIMIT 1);

UPDATE fts_articles
SET search_vec = setweight(to_tsvector('english', title), 'A')
              || setweight(to_tsvector('english', body),  'B')
WHERE search_vec IS NULL;

CREATE INDEX IF NOT EXISTS idx_fts_articles_vec ON fts_articles USING GIN(search_vec);

Block 7 — Product Catalog with JSONB (product_catalog)

Used in: every JSON / JSONB lesson.

CREATE TABLE IF NOT EXISTS product_catalog (
    product_id  SERIAL PRIMARY KEY,
    name        TEXT NOT NULL,
    category    TEXT NOT NULL,
    base_price  NUMERIC(10,2),
    attributes  JSONB
);

INSERT INTO product_catalog (name, category, base_price, attributes)
SELECT * FROM (VALUES
  ('MacBook Pro 14',   'Electronics', 1999.00::NUMERIC,
   '{"brand":"Apple","ram_gb":16,"storage_gb":512,"color":"silver","weight_kg":1.6}'::JSONB),
  ('MacBook Pro 16',   'Electronics', 2499.00::NUMERIC,
   '{"brand":"Apple","ram_gb":32,"storage_gb":1024,"color":"space grey","weight_kg":2.1}'::JSONB),
  ('ThinkPad X1',      'Electronics', 1599.00::NUMERIC,
   '{"brand":"Lenovo","ram_gb":16,"storage_gb":512,"color":"black","weight_kg":1.4}'::JSONB),
  ('501 Jeans',        'Clothing',     89.00::NUMERIC,
   '{"brand":"Levis","sizes":["28","30","32","34","36"],"material":"denim","color":"blue"}'::JSONB),
  ('Classic Tee',      'Clothing',     24.00::NUMERIC,
   '{"brand":"Uniqlo","sizes":["XS","S","M","L","XL"],"material":"cotton","color":"white"}'::JSONB),
  ('Refactoring 2nd',  'Books',        49.00::NUMERIC,
   '{"author":"Martin Fowler","isbn":"978-0-13-468599-1","pages":448,"publisher":"Addison-Wesley"}'::JSONB),
  ('Designing Data-Intensive Applications', 'Books', 55.00::NUMERIC,
   '{"author":"Martin Kleppmann","isbn":"978-1-449-37332-0","pages":616,"publisher":"OReilly"}'::JSONB)
) v(name, category, base_price, attributes)
WHERE NOT EXISTS (SELECT 1 FROM product_catalog LIMIT 1);

CREATE INDEX IF NOT EXISTS idx_product_catalog_attrs ON product_catalog USING GIN(attributes);

Block 8 — Time Series tables (ts_*)

Used in: every Time Series Analytics lesson.

CREATE TABLE IF NOT EXISTS ts_server_metrics (
    id           BIGSERIAL PRIMARY KEY,
    host         TEXT NOT NULL,
    region       TEXT,
    recorded_at  TIMESTAMPTZ NOT NULL,
    cpu_pct      NUMERIC(5,2),
    mem_pct      NUMERIC(5,2),
    disk_io_mbps NUMERIC(8,2),
    net_mbps     NUMERIC(8,2),
    error_count  INT DEFAULT 0
);

CREATE INDEX IF NOT EXISTS idx_ts_server_metrics_host_time
    ON ts_server_metrics (host, recorded_at);

CREATE TABLE IF NOT EXISTS ts_stock_prices (
    id          BIGSERIAL PRIMARY KEY,
    ticker      TEXT NOT NULL,
    price_date  DATE NOT NULL,
    open_cents  INT,
    high_cents  INT,
    low_cents   INT,
    close_cents INT,
    volume      BIGINT,
    UNIQUE (ticker, price_date)
);

CREATE TABLE IF NOT EXISTS ts_web_events (
    id          BIGSERIAL PRIMARY KEY,
    session_id  TEXT NOT NULL,
    user_id     BIGINT,
    event_type  TEXT,
    page        TEXT,
    created_at TIMESTAMPTZ NOT NULL
);

CREATE INDEX IF NOT EXISTS idx_ts_web_events_session_time
    ON ts_web_events (session_id, created_at);

CREATE TABLE IF NOT EXISTS ts_energy_readings (
    id          BIGSERIAL PRIMARY KEY,
    meter_id    TEXT NOT NULL,
    building    TEXT,
    read_at     TIMESTAMPTZ NOT NULL,
    kwh         NUMERIC(10,4),
    UNIQUE (meter_id, read_at)
);
INSERT INTO ts_server_metrics
    (host, region, recorded_at, cpu_pct, mem_pct, disk_io_mbps, net_mbps, error_count)
SELECT
    'srv-' || LPAD(h::TEXT, 2, '0'),
    (ARRAY['us-east','us-west','eu-west'])[(h % 3) + 1],
    NOW() - INTERVAL '7 days' + (m || ' minutes')::INTERVAL,
    LEAST(99, GREATEST(1,
        20 + 30 * SIN(m::float * PI() / 720)
           + (h * 7 + m * 13) % 20
           + (RANDOM() * 10)::INT
    )),
    LEAST(99, GREATEST(10,
        40 + 20 * SIN(m::float * PI() / 720 + 1)
           + (h * 5 + m * 7) % 15
    )),
    ROUND((10 + (m * h * 3) % 90)::numeric, 2),
    ROUND((5  + (m * h * 7) % 50)::numeric, 2),
    CASE WHEN (m * h) % 200 = 0 THEN (RANDOM() * 5)::INT ELSE 0 END
FROM generate_series(1, 6) h,
     generate_series(0, 10079) m
ON CONFLICT DO NOTHING;

INSERT INTO ts_stock_prices
    (ticker, price_date, open_cents, high_cents, low_cents, close_cents, volume)
SELECT
    ticker,
    (NOW() - INTERVAL '2 years')::DATE + d,
    base + (d * 7 + ticker_idx * 13) % 500,
    base + (d * 7 + ticker_idx * 13) % 500 + (d * 3 + ticker_idx) % 150,
    GREATEST(100, base + (d * 7 + ticker_idx * 13) % 500 - (d * 5 + ticker_idx) % 200),
    base + (d * 11 + ticker_idx * 17) % 600,
    1000000 + (d * ticker_idx * 37) % 9000000
FROM generate_series(0, 730) d,
     (VALUES ('AAPL', 15000, 1), ('MSFT', 30000, 2), ('GOOG', 12000, 3),
             ('AMZN', 18000, 4), ('NVDA', 8000, 5)) AS s(ticker, base, ticker_idx)
WHERE EXTRACT(DOW FROM (NOW() - INTERVAL '2 years')::DATE + d) NOT IN (0, 6)
ON CONFLICT DO NOTHING;

INSERT INTO ts_web_events (session_id, user_id, event_type, page, created_at)
SELECT
    'sess-' || LPAD(s::TEXT, 6, '0'),
    (s * 7) % 2000 + 1,
    (ARRAY['page_view','page_view','page_view','click','click',
           'add_to_cart','purchase','bounce'])[(e % 8) + 1],
    (ARRAY['/home','/products','/product/detail','/cart',
           '/checkout','/blog','/pricing','/about'])[(e % 8) + 1],
    NOW() - INTERVAL '30 days'
        + ((s * 41) % (30 * 1440) || ' minutes')::INTERVAL
        + (e * 2 || ' minutes')::INTERVAL
FROM generate_series(1, 500) s,
     generate_series(0, 9) e
ON CONFLICT DO NOTHING;

INSERT INTO ts_energy_readings (meter_id, building, read_at, kwh)
SELECT
    'MTR-' || LPAD(m::TEXT, 3, '0'),
    (ARRAY['HQ','Warehouse','DataCenter','Office-A',
           'Office-B','Retail-1','Retail-2','Parking'])[(m % 8) + 1],
    NOW() - INTERVAL '30 days' + (i * 15 || ' minutes')::INTERVAL,
    ROUND((
        5 + 10 * SIN(i::float * PI() / 48)
          + (m * 2 + i * 3) % 8
          + (RANDOM() * 2)::INT
    )::numeric, 4)
FROM generate_series(1, 8) m,
     generate_series(0, 2879) i
ON CONFLICT DO NOTHING;

Block 9 — Healthcare (hc_*)

Used in: every Healthcare & Hospital Analytics lesson.

CREATE TABLE IF NOT EXISTS hc_wards (
    id       SERIAL PRIMARY KEY,
    name     TEXT NOT NULL,
    type     TEXT NOT NULL,
    capacity INT  NOT NULL,
    floor    INT
);

CREATE TABLE IF NOT EXISTS hc_staff (
    id        SERIAL PRIMARY KEY,
    name      TEXT NOT NULL,
    role      TEXT NOT NULL,
    specialty TEXT,
    ward_id   INT  REFERENCES hc_wards(id),
    hire_date DATE NOT NULL
);

CREATE TABLE IF NOT EXISTS hc_patients (
    id             SERIAL PRIMARY KEY,
    mrn            TEXT UNIQUE NOT NULL,
    name           TEXT NOT NULL,
    dob            DATE NOT NULL,
    gender         TEXT NOT NULL,
    blood_type     TEXT,
    insurance_type TEXT
);

CREATE TABLE IF NOT EXISTS hc_admissions (
    id                    SERIAL PRIMARY KEY,
    patient_id            INT  REFERENCES hc_patients(id),
    ward_id               INT  REFERENCES hc_wards(id),
    attending_id          INT  REFERENCES hc_staff(id),
    admission_type        TEXT NOT NULL,
    admitted_at           TIMESTAMPTZ NOT NULL,
    discharged_at         TIMESTAMPTZ,
    discharge_disposition TEXT
);

CREATE TABLE IF NOT EXISTS hc_diagnoses (
    id           SERIAL PRIMARY KEY,
    admission_id INT  REFERENCES hc_admissions(id),
    icd_code     TEXT NOT NULL,
    description  TEXT NOT NULL,
    is_primary   BOOLEAN NOT NULL DEFAULT false,
    diagnosed_at TIMESTAMPTZ NOT NULL
);

CREATE TABLE IF NOT EXISTS hc_procedures (
    id           SERIAL PRIMARY KEY,
    admission_id INT  REFERENCES hc_admissions(id),
    proc_code    TEXT NOT NULL,
    description  TEXT NOT NULL,
    performed_by INT  REFERENCES hc_staff(id),
    performed_at TIMESTAMPTZ NOT NULL,
    cost_cents   INT  NOT NULL
);

CREATE TABLE IF NOT EXISTS hc_lab_results (
    id            SERIAL PRIMARY KEY,
    patient_id    INT  REFERENCES hc_patients(id),
    admission_id  INT  REFERENCES hc_admissions(id),
    test_name     TEXT NOT NULL,
    value_numeric NUMERIC,
    unit          TEXT,
    ref_low       NUMERIC,
    ref_high      NUMERIC,
    is_abnormal   BOOLEAN GENERATED ALWAYS AS (
        value_numeric IS NOT NULL AND ref_low IS NOT NULL AND ref_high IS NOT NULL
        AND (value_numeric < ref_low OR value_numeric > ref_high)
    ) STORED,
    collected_at  TIMESTAMPTZ NOT NULL
);

CREATE TABLE IF NOT EXISTS hc_medications (
    id           SERIAL PRIMARY KEY,
    admission_id INT  REFERENCES hc_admissions(id),
    drug_name    TEXT NOT NULL,
    dosage_mg    NUMERIC,
    frequency    TEXT NOT NULL,
    route        TEXT,
    start_date   DATE NOT NULL,
    end_date     DATE
);
INSERT INTO hc_wards (name, type, capacity, floor) VALUES
    ('ICU',       'ICU',       12, 3),
    ('Emergency', 'Emergency', 30, 1),
    ('Surgery',   'Surgery',   20, 2),
    ('General-A', 'General',   25, 4),
    ('General-B', 'General',   25, 4),
    ('Pediatric', 'Pediatric', 15, 5)
ON CONFLICT DO NOTHING;

INSERT INTO hc_staff (name, role, specialty, ward_id, hire_date)
SELECT
    'Staff ' || g,
    CASE g % 3 WHEN 0 THEN 'Doctor' WHEN 1 THEN 'Nurse' ELSE 'Technician' END,
    CASE g % 5 WHEN 0 THEN 'Cardiology' WHEN 1 THEN 'Emergency Medicine'
               WHEN 2 THEN 'Surgery' WHEN 3 THEN 'Internal Medicine' ELSE 'Pediatrics' END,
    (g % 6) + 1,
    '2015-01-01'::DATE + (g * 90) * INTERVAL '1 day'
FROM generate_series(1, 20) g
ON CONFLICT DO NOTHING;

INSERT INTO hc_patients (mrn, name, dob, gender, blood_type, insurance_type)
SELECT
    'MRN-' || LPAD(g::TEXT, 6, '0'),
    'Patient ' || g,
    '1950-01-01'::DATE + ((g * 127) % 25000) * INTERVAL '1 day',
    CASE g % 2 WHEN 0 THEN 'M' ELSE 'F' END,
    (ARRAY['A+','A-','B+','B-','AB+','AB-','O+','O-'])[((g * 3) % 8) + 1],
    (ARRAY['Private','Medicare','Medicaid','Uninsured'])[((g * 7) % 4) + 1]
FROM generate_series(1, 500) g
ON CONFLICT DO NOTHING;

INSERT INTO hc_admissions
    (patient_id, ward_id, attending_id, admission_type,
     admitted_at, discharged_at, discharge_disposition)
SELECT
    (g % 500) + 1,
    (g % 6) + 1,
    (g % 20) + 1,
    (ARRAY['Emergency','Elective','Transfer'])[((g * 11) % 3) + 1],
    NOW() - ((g * 3) % 1095 + 1) * INTERVAL '1 day'
        + ((g * 7) % 24) * INTERVAL '1 hour',
    CASE WHEN g % 10 = 0 THEN NULL
         ELSE NOW() - ((g * 3) % 1095 + 1) * INTERVAL '1 day'
              + ((g * 7) % 24) * INTERVAL '1 hour'
              + ((g % 14) + 1) * INTERVAL '1 day'
    END,
    CASE WHEN g % 10 = 0 THEN NULL
         ELSE (ARRAY['Home','SNF','AMA','Expired','Home Health'])[((g * 13) % 5) + 1]
    END
FROM generate_series(1, 700) g
ON CONFLICT DO NOTHING;

INSERT INTO hc_diagnoses
    (admission_id, icd_code, description, is_primary, diagnosed_at)
SELECT
    (g % 700) + 1,
    (ARRAY['I21','J18','K57','N18','E11','I50','J44','M79','R07','Z51'])[(g % 10) + 1],
    (ARRAY['Acute MI','Pneumonia','Diverticulitis','Chronic Kidney Disease',
           'Type 2 Diabetes','Heart Failure','COPD','Musculoskeletal Pain',
           'Chest Pain','Chemotherapy'])[((g * 3) % 10) + 1],
    g % 3 != 0,
    NOW() - ((g * 2) % 1095 + 1) * INTERVAL '1 day'
FROM generate_series(1, 1000) g
ON CONFLICT DO NOTHING;

INSERT INTO hc_procedures
    (admission_id, proc_code, description, performed_by, performed_at, cost_cents)
SELECT
    (g % 700) + 1,
    'P' || LPAD(((g * 7) % 50 + 1)::TEXT, 3, '0'),
    (ARRAY['ECG','CT Scan','MRI','Blood Culture','Echocardiogram',
           'Colonoscopy','Bronchoscopy','Lumbar Puncture','Intubation','Dialysis'])[(g % 10) + 1],
    (g % 20) + 1,
    NOW() - ((g * 2) % 1095 + 1) * INTERVAL '1 day',
    ((g * 137) % 45000 + 5000)
FROM generate_series(1, 500) g
ON CONFLICT DO NOTHING;

INSERT INTO hc_lab_results
    (patient_id, admission_id, test_name, value_numeric, unit, ref_low, ref_high, collected_at)
SELECT
    (g % 500) + 1,
    (g % 700) + 1,
    (ARRAY['Hemoglobin','WBC','Creatinine','Sodium','Glucose',
           'Troponin','INR','Potassium'])[(g % 8) + 1],
    CASE (g % 8)
        WHEN 0 THEN 10 + ((g * 31) % 80) / 10.0
        WHEN 1 THEN 4  + ((g * 17) % 100) / 10.0
        WHEN 2 THEN 0.5 + ((g * 7) % 30) / 10.0
        WHEN 3 THEN 130 + ((g * 11) % 30)
        WHEN 4 THEN 70  + ((g * 23) % 200)
        WHEN 5 THEN ((g * 3) % 50) / 10.0
        WHEN 6 THEN 0.8 + ((g * 13) % 30) / 10.0
        ELSE        3.0 + ((g * 19) % 30) / 10.0
    END,
    (ARRAY['g/dL','K/uL','mg/dL','mEq/L','mg/dL','ng/mL','ratio','mEq/L'])[(g % 8) + 1],
    (ARRAY[12, 4.5, 0.6, 136,  70, 0,    0.8, 3.5])[(g % 8) + 1],
    (ARRAY[17, 11,  1.2, 145, 100, 0.04, 1.2, 5.0])[(g % 8) + 1],
    NOW() - ((g * 2) % 1095 + 1) * INTERVAL '1 day'
FROM generate_series(1, 1500) g
ON CONFLICT DO NOTHING;

INSERT INTO hc_medications
    (admission_id, drug_name, dosage_mg, frequency, route, start_date, end_date)
SELECT
    (g % 700) + 1,
    (ARRAY['Metformin','Lisinopril','Atorvastatin','Amlodipine','Metoprolol',
           'Furosemide','Warfarin','Insulin Glargine','Albuterol','Pantoprazole'])[(g % 10) + 1],
    CASE (g % 5) WHEN 0 THEN 500 WHEN 1 THEN 10 WHEN 2 THEN 40 WHEN 3 THEN 5 ELSE 25 END,
    (ARRAY['QD','BID','TID','PRN','QID'])[((g * 3) % 5) + 1],
    (ARRAY['PO','IV','IM','SQ'])[((g * 7) % 4) + 1],
    (NOW() - ((g * 2) % 1095 + 1) * INTERVAL '1 day')::DATE,
    CASE WHEN g % 5 = 0 THEN NULL
         ELSE (NOW() - ((g * 2) % 1095 + 1) * INTERVAL '1 day'
               + ((g % 14) + 1) * INTERVAL '1 day')::DATE
    END
FROM generate_series(1, 800) g
ON CONFLICT DO NOTHING;

Block 10 — SaaS Analytics (saas_*)

CREATE TABLE IF NOT EXISTS saas_plans (
    id               SERIAL PRIMARY KEY,
    name             TEXT NOT NULL,
    price_cents      INT  NOT NULL,
    billing_interval TEXT NOT NULL,
    max_seats        INT  NOT NULL
);

CREATE TABLE IF NOT EXISTS saas_accounts (
    id                   SERIAL PRIMARY KEY,
    name                 TEXT NOT NULL,
    industry             TEXT NOT NULL,
    country              TEXT NOT NULL,
    acquisition_channel  TEXT NOT NULL,
    plan_id              INT  REFERENCES saas_plans(id),
    status               TEXT NOT NULL,
    seats                INT  NOT NULL DEFAULT 1,
    mrr_cents            INT  NOT NULL DEFAULT 0,
    trial_started_at     TIMESTAMPTZ,
    converted_at         TIMESTAMPTZ,
    churned_at           TIMESTAMPTZ,
    created_at           TIMESTAMPTZ NOT NULL DEFAULT NOW()
);

CREATE TABLE IF NOT EXISTS saas_mrr_events (
    id          SERIAL PRIMARY KEY,
    account_id  INT  REFERENCES saas_accounts(id),
    event_type  TEXT NOT NULL,
    amount_cents INT NOT NULL,
    occurred_at  DATE NOT NULL,
    reason       TEXT
);

CREATE TABLE IF NOT EXISTS saas_feature_usage (
    id           SERIAL PRIMARY KEY,
    account_id   INT  REFERENCES saas_accounts(id),
    feature_name TEXT NOT NULL,
    usage_count  INT  NOT NULL DEFAULT 0,
    period_date  DATE NOT NULL
);

CREATE TABLE IF NOT EXISTS saas_sessions (
    id          BIGSERIAL PRIMARY KEY,
    account_id  INT  REFERENCES saas_accounts(id),
    user_email  TEXT NOT NULL,
    started_at  TIMESTAMPTZ NOT NULL,
    ended_at    TIMESTAMPTZ,
    page_count  INT NOT NULL DEFAULT 1
);

CREATE TABLE IF NOT EXISTS saas_payments (
    id             SERIAL PRIMARY KEY,
    account_id     INT  REFERENCES saas_accounts(id),
    amount_cents   INT  NOT NULL,
    status         TEXT NOT NULL,
    failure_reason TEXT,
    attempted_at   TIMESTAMPTZ NOT NULL,
    succeeded_at   TIMESTAMPTZ
);
INSERT INTO saas_plans (name, price_cents, billing_interval, max_seats) VALUES
    ('Starter',    4900,  'monthly', 3),
    ('Growth',    14900,  'monthly', 10),
    ('Business',  39900,  'monthly', 50),
    ('Enterprise',99900,  'monthly', 999)
ON CONFLICT DO NOTHING;

INSERT INTO saas_accounts
    (name, industry, country, acquisition_channel, plan_id, status, seats, mrr_cents,
     trial_started_at, converted_at, churned_at, created_at)
SELECT
    'Account ' || gs,
    (ARRAY['SaaS','FinTech','HealthTech','eCommerce','EdTech','MarTech','HR Tech','LegalTech'])[gs % 8 + 1],
    (ARRAY['US','UK','CA','AU','DE','FR','NL','SG'])[gs % 8 + 1],
    (ARRAY['Organic','Paid Search','Referral','Outbound','Partner','Content','Social','Event'])[gs % 8 + 1],
    (gs % 4) + 1,
    CASE gs % 10
        WHEN 9 THEN 'trial'
        WHEN 7 THEN 'churned'
        WHEN 8 THEN 'churned'
        ELSE 'active'
    END,
    (gs % 10) + 1,
    CASE (gs % 4) + 1
        WHEN 1 THEN 4900 WHEN 2 THEN 14900 WHEN 3 THEN 39900 ELSE 99900
    END,
    TIMESTAMPTZ '2022-01-01' + (gs * INTERVAL '22 hours'),
    CASE WHEN gs % 10 NOT IN (9) THEN
        TIMESTAMPTZ '2022-01-01' + (gs * INTERVAL '22 hours') + ((gs % 14 + 7) * INTERVAL '1 day')
    END,
    CASE WHEN gs % 10 IN (7,8) THEN
        TIMESTAMPTZ '2022-01-01' + (gs * INTERVAL '22 hours') + ((gs % 200 + 60) * INTERVAL '1 day')
    END,
    TIMESTAMPTZ '2022-01-01' + (gs * INTERVAL '22 hours')
FROM generate_series(1, 400) gs
ON CONFLICT DO NOTHING;

INSERT INTO saas_mrr_events (account_id, event_type, amount_cents, occurred_at, reason)
SELECT
    (gs % 400) + 1,
    (ARRAY['new','expansion','contraction','churn','reactivation'])[gs % 5 + 1],
    CASE (gs % 5) + 1
        WHEN 1 THEN ((gs % 4) + 1) * 4900
        WHEN 2 THEN  (gs % 3 + 1) * 5000
        WHEN 3 THEN -(gs % 3 + 1) * 5000
        WHEN 4 THEN -((gs % 4) + 1) * 4900
        ELSE         (gs % 4 + 1)  * 4900
    END,
    DATE '2022-01-01' + (gs * 2 % 730),
    (ARRAY['Upgrade','Seats added','Seats removed','Price objection','Product fit',
           'Reactivated after pause','New signup','Downgrade','Budget cut','Returning customer'])[gs % 10 + 1]
FROM generate_series(1, 600) gs
ON CONFLICT DO NOTHING;

INSERT INTO saas_feature_usage (account_id, feature_name, usage_count, period_date)
SELECT
    (gs % 400) + 1,
    (ARRAY['Dashboard','Reports','API','Exports','Integrations','Alerts','Collaboration','Mobile'])[gs % 8 + 1],
    (gs % 50) * (CASE WHEN gs % 7 = 0 THEN 0 ELSE 1 END),
    DATE_TRUNC('month', DATE '2022-01-01' + (gs % 730))::DATE
FROM generate_series(1, 2000) gs
ON CONFLICT DO NOTHING;

INSERT INTO saas_sessions (account_id, user_email, started_at, ended_at, page_count)
SELECT
    (gs % 400) + 1,
    'user' || (gs % 5 + 1) || '@account' || ((gs % 400) + 1) || '.com',
    TIMESTAMPTZ '2022-01-01' + (gs * INTERVAL '6 hours'),
    TIMESTAMPTZ '2022-01-01' + (gs * INTERVAL '6 hours') + ((gs % 60 + 1) * INTERVAL '1 minute'),
    (gs % 20) + 1
FROM generate_series(1, 3000) gs
ON CONFLICT DO NOTHING;

INSERT INTO saas_payments (account_id, amount_cents, status, failure_reason, attempted_at, succeeded_at)
SELECT
    (gs % 400) + 1,
    CASE (gs % 4) + 1
        WHEN 1 THEN 4900 WHEN 2 THEN 14900 WHEN 3 THEN 39900 ELSE 99900
    END,
    CASE WHEN gs % 6 = 0 THEN 'failed' WHEN gs % 20 = 0 THEN 'pending' ELSE 'succeeded' END,
    CASE WHEN gs % 6 = 0 THEN
        (ARRAY['Card declined','Insufficient funds','Expired card','Do not honor','Invalid account'])[gs % 5 + 1]
    END,
    TIMESTAMPTZ '2022-01-01' + (gs * INTERVAL '10 hours'),
    CASE WHEN gs % 6 != 0 AND gs % 20 != 0 THEN
        TIMESTAMPTZ '2022-01-01' + (gs * INTERVAL '10 hours') + INTERVAL '30 seconds'
    END
FROM generate_series(1, 800) gs
ON CONFLICT DO NOTHING;

Block 11 — Logistics & Supply Chain (lg_*)

CREATE TABLE IF NOT EXISTS lg_warehouses (
    id          SERIAL PRIMARY KEY,
    code        TEXT UNIQUE NOT NULL,
    name        TEXT NOT NULL,
    city        TEXT NOT NULL,
    country     TEXT NOT NULL DEFAULT 'US',
    lat         NUMERIC(9,6) NOT NULL DEFAULT 0,
    lng         NUMERIC(9,6) NOT NULL DEFAULT 0,
    capacity_m3 INT
);

CREATE TABLE IF NOT EXISTS lg_suppliers (
    id               SERIAL PRIMARY KEY,
    name             TEXT NOT NULL,
    country          TEXT NOT NULL,
    lead_time_days   INT NOT NULL DEFAULT 14,
    reliability_score NUMERIC(3,2) NOT NULL DEFAULT 0.90
);

CREATE TABLE IF NOT EXISTS lg_skus (
    id            SERIAL PRIMARY KEY,
    sku           TEXT UNIQUE NOT NULL,
    name          TEXT NOT NULL,
    category      TEXT NOT NULL,
    unit_cost     NUMERIC(10,2) NOT NULL,
    weight_kg     NUMERIC(6,3) NOT NULL,
    supplier_id   INT REFERENCES lg_suppliers(id)
);

CREATE TABLE IF NOT EXISTS lg_inventory (
    id            SERIAL PRIMARY KEY,
    warehouse_id  INT REFERENCES lg_warehouses(id),
    sku_id        INT REFERENCES lg_skus(id),
    quantity      INT NOT NULL DEFAULT 0,
    reorder_point INT NOT NULL DEFAULT 50,
    reorder_qty   INT NOT NULL DEFAULT 200,
    UNIQUE (warehouse_id, sku_id)
);

CREATE TABLE IF NOT EXISTS lg_purchase_orders (
    id            SERIAL PRIMARY KEY,
    supplier_id   INT REFERENCES lg_suppliers(id),
    warehouse_id  INT REFERENCES lg_warehouses(id),
    sku_id        INT REFERENCES lg_skus(id),
    quantity      INT NOT NULL,
    unit_cost     NUMERIC(10,2) NOT NULL,
    status        TEXT NOT NULL DEFAULT 'open',
    ordered_at    TIMESTAMPTZ NOT NULL,
    expected_at   DATE NOT NULL,
    received_at   TIMESTAMPTZ
);

CREATE TABLE IF NOT EXISTS lg_shipments (
    id            BIGSERIAL PRIMARY KEY,
    order_ref     TEXT,
    warehouse_id  INT REFERENCES lg_warehouses(id),
    carrier       TEXT,
    service_level TEXT,
    origin_city   TEXT,
    dest_city     TEXT,
    dest_state    TEXT,
    weight_kg     NUMERIC(8,2),
    shipped_at    TIMESTAMPTZ,
    promised_at   TIMESTAMPTZ,
    delivered_at  TIMESTAMPTZ,
    freight_cents INT
);

CREATE TABLE IF NOT EXISTS lg_delivery_events (
    id          BIGSERIAL PRIMARY KEY,
    shipment_id BIGINT REFERENCES lg_shipments(id),
    status      TEXT,
    location    TEXT,
    occurred_at TIMESTAMPTZ
);
INSERT INTO lg_warehouses (code, name, city, country, lat, lng, capacity_m3)
SELECT * FROM (VALUES
    ('WH-NE', 'Northeast Hub',   'Newark',       'US',  40.7357,  -74.1724, 50000),
    ('WH-SE', 'Southeast Hub',   'Atlanta',      'US',  33.7490,  -84.3880, 45000),
    ('WH-MW', 'Midwest Hub',     'Chicago',      'US',  41.8781,  -87.6298, 55000),
    ('WH-SW', 'Southwest Hub',   'Dallas',       'US',  32.7767,  -96.7970, 40000),
    ('WH-WC', 'West Coast Hub',  'Los Angeles',  'US',  34.0522, -118.2437, 60000),
    ('WH-NW', 'Northwest Hub',   'Seattle',      'US',  47.6062, -122.3321, 35000)
) AS v(code, name, city, country, lat, lng, cap)
ON CONFLICT DO NOTHING;

INSERT INTO lg_suppliers (name, country, lead_time_days, reliability_score)
SELECT * FROM (VALUES
    ('Apex Manufacturing',    'US',  7,  0.97),
    ('Global Parts Co',       'CN', 21,  0.88),
    ('Precision Components',  'DE', 14,  0.95),
    ('FastSource Inc',        'US',  5,  0.92),
    ('Pacific Rim Supplies',  'TW', 18,  0.90),
    ('Euro Logistics BV',     'NL', 16,  0.93),
    ('Southland Goods',       'MX', 10,  0.85),
    ('Atlas Industrial',      'US',  8,  0.96)
) AS v(name, country, lead_time_days, reliability_score)
ON CONFLICT DO NOTHING;

INSERT INTO lg_skus (sku, name, category, unit_cost, weight_kg)
SELECT
    'SKU-' || LPAD(g::TEXT, 4, '0'),
    (ARRAY['Widget','Gadget','Component','Assembly','Module','Device','Unit','Part'])[((g-1) % 8) + 1]
        || ' ' || g,
    (ARRAY['Electronics','Hardware','Packaging','Chemical','Mechanical'])[(( g-1) % 5) + 1],
    ROUND(((500 + (g * 137) % 9500) / 100.0)::numeric, 2),
    ROUND((0.1 + (g * 0.23) % 49.9)::numeric, 2)
FROM generate_series(1, 40) g
ON CONFLICT DO NOTHING;

INSERT INTO lg_inventory (warehouse_id, sku_id, quantity, reorder_point, reorder_qty)
SELECT
    (((s-1)) % 6) + 1,
    s,
    ABS(500 + ((1 * s * 7) % 1500) - ((1 * 13) % 200)),
    50 + (s * 17) % 450,
    200 + (s * 53) % 800
FROM generate_series(1, 40) s
ON CONFLICT DO NOTHING;

INSERT INTO lg_purchase_orders
    (supplier_id, warehouse_id, sku_id, quantity, unit_cost,
     ordered_at, expected_at, received_at, status)
SELECT
    (g % 8) + 1,
    (g % 6) + 1,
    (g % 40) + 1,
    100 + (g * 37) % 900,
    ROUND(((500 + (g * 131) % 9000) / 100.0)::numeric, 2),
    NOW() - (((g * 7) % 365) || ' days')::INTERVAL,
    (NOW() - (((g * 7) % 365) || ' days')::INTERVAL + (7 + (g % 21) || ' days')::INTERVAL)::DATE,
    CASE WHEN g % 10 < 8
         THEN NOW() - (((g * 7) % 365) || ' days')::INTERVAL + ((10 + (g % 25)) || ' days')::INTERVAL
         ELSE NULL END,
    CASE WHEN g % 10 < 8 THEN 'received'
         WHEN g % 10 = 8 THEN 'cancelled'
         ELSE 'open' END
FROM generate_series(1, 500) g
ON CONFLICT DO NOTHING;

INSERT INTO lg_shipments
    (order_ref, warehouse_id, carrier, service_level, origin_city, dest_city,
     dest_state, weight_kg, shipped_at, promised_at, delivered_at, freight_cents)
SELECT
    'ORD-' || LPAD(g::TEXT, 6, '0'),
    (g % 6) + 1,
    (ARRAY['FedEx','UPS','USPS','DHL','OnTrac'])[(g % 5) + 1],
    (ARRAY['standard','express','overnight'])[(g % 3) + 1],
    (ARRAY['Newark','Atlanta','Chicago','Dallas','Los Angeles','Seattle'])[(g % 6) + 1],
    (ARRAY['Boston','Miami','Detroit','Houston','Phoenix','Portland',
           'Denver','Nashville','Charlotte','Minneapolis'])[(g % 10) + 1],
    (ARRAY['MA','FL','MI','TX','AZ','OR','CO','TN','NC','MN'])[(g % 10) + 1],
    ROUND((1.0 + (g * 0.37) % 99)::numeric, 2),
    NOW() - (((g * 3) % 365) || ' days')::INTERVAL,
    NOW() - (((g * 3) % 365) || ' days')::INTERVAL
        + ((CASE (g % 3) WHEN 0 THEN 5 WHEN 1 THEN 2 ELSE 1 END) || ' days')::INTERVAL,
    CASE WHEN g % 12 < 10
         THEN NOW() - (((g * 3) % 365) || ' days')::INTERVAL
              + ((CASE (g % 3) WHEN 0 THEN 4 + (g % 3) WHEN 1 THEN 2 + (g % 2) ELSE 1 END) || ' days')::INTERVAL
         ELSE NULL END,
    300 + (g * 47) % 4700
FROM generate_series(1, 2000) g
ON CONFLICT DO NOTHING;

INSERT INTO lg_delivery_events (shipment_id, status, location, occurred_at)
SELECT
    s.id,
    ev.status,
    ev.location,
    s.shipped_at + (ev.offset_hours || ' hours')::INTERVAL
FROM lg_shipments s
CROSS JOIN LATERAL (VALUES
    ('picked',            'Origin Facility',  0),
    ('in_transit',        'Transit Hub',     12),
    ('out_for_delivery',  'Dest City',       36),
    ('delivered',         'Customer',        48)
) AS ev(status, location, offset_hours)
WHERE s.delivered_at IS NOT NULL
  AND s.id % 3 = 0
ON CONFLICT DO NOTHING;

Block 12 — HR & Workforce Analytics (hr_*)

CREATE TABLE IF NOT EXISTS hr_departments (
    id          SERIAL PRIMARY KEY,
    name        TEXT NOT NULL,
    parent_id   INT REFERENCES hr_departments(id),
    cost_center TEXT,
    location    TEXT
);

CREATE TABLE IF NOT EXISTS hr_job_levels (
    id          SERIAL PRIMARY KEY,
    level_code  TEXT UNIQUE NOT NULL,
    title       TEXT,
    level_type  TEXT,
    min_salary  INT,
    max_salary  INT
);

CREATE TABLE IF NOT EXISTS hr_employees (
    id              BIGSERIAL PRIMARY KEY,
    emp_id          TEXT UNIQUE NOT NULL,
    name            TEXT,
    dept_id         INT REFERENCES hr_departments(id),
    manager_id      BIGINT REFERENCES hr_employees(id),
    job_level_id    INT REFERENCES hr_job_levels(id),
    gender          TEXT,
    ethnicity       TEXT,
    hire_date       DATE,
    termination_date DATE,
    termination_reason TEXT,
    location        TEXT,
    is_active       BOOLEAN GENERATED ALWAYS AS (termination_date IS NULL) STORED
);

CREATE TABLE IF NOT EXISTS hr_salaries (
    id          BIGSERIAL PRIMARY KEY,
    employee_id BIGINT REFERENCES hr_employees(id),
    salary_usd  INT NOT NULL,
    effective_date DATE NOT NULL,
    change_reason  TEXT
);

CREATE TABLE IF NOT EXISTS hr_performance_reviews (
    id              BIGSERIAL PRIMARY KEY,
    employee_id     BIGINT REFERENCES hr_employees(id),
    review_year     INT,
    rating          INT CHECK (rating BETWEEN 1 AND 5),
    reviewer_id     BIGINT REFERENCES hr_employees(id),
    review_date     DATE,
    UNIQUE (employee_id, review_year)
);

CREATE TABLE IF NOT EXISTS hr_headcount_snapshots (
    id              BIGSERIAL PRIMARY KEY,
    dept_id         INT REFERENCES hr_departments(id),
    snapshot_month  DATE,
    headcount       INT,
    UNIQUE (dept_id, snapshot_month)
);
INSERT INTO hr_departments (name, parent_id, cost_center, location)
VALUES
    ('Engineering',     NULL, 'CC-100', 'San Francisco'),
    ('Product',         NULL, 'CC-200', 'San Francisco'),
    ('Sales',           NULL, 'CC-300', 'New York'),
    ('Marketing',       NULL, 'CC-400', 'New York'),
    ('HR',              NULL, 'CC-500', 'Austin'),
    ('Finance',         NULL, 'CC-600', 'Austin'),
    ('Backend',            1, 'CC-101', 'San Francisco'),
    ('Frontend',           1, 'CC-102', 'San Francisco'),
    ('Data & ML',          1, 'CC-103', 'Remote'),
    ('DevOps',             1, 'CC-104', 'Remote'),
    ('Enterprise Sales',   3, 'CC-301', 'New York'),
    ('SMB Sales',          3, 'CC-302', 'Chicago')
ON CONFLICT DO NOTHING;

INSERT INTO hr_job_levels (level_code, title, level_type, min_salary, max_salary)
VALUES
    ('IC1', 'Junior Engineer',      'ic',        60000,  90000),
    ('IC2', 'Mid Engineer',         'ic',        90000, 130000),
    ('IC3', 'Senior Engineer',      'ic',       130000, 175000),
    ('IC4', 'Staff Engineer',       'ic',       175000, 230000),
    ('IC5', 'Principal Engineer',   'ic',       230000, 300000),
    ('M1',  'Engineering Manager',  'manager',  160000, 210000),
    ('M2',  'Senior Manager',       'manager',  200000, 260000),
    ('VP',  'Vice President',       'executive',250000, 350000),
    ('C',   'C-Suite',              'executive',350000, 600000)
ON CONFLICT DO NOTHING;

INSERT INTO hr_employees
    (emp_id, name, dept_id, manager_id, job_level_id,
     gender, ethnicity, hire_date, termination_date, termination_reason, location)
SELECT
    'EMP-' || LPAD(g::TEXT, 5, '0'),
    (ARRAY['Alex','Jordan','Morgan','Taylor','Casey','Riley','Quinn','Avery',
           'Jamie','Drew','Blake','Cameron','Skyler','Reese','Sage'])[(g % 15) + 1]
        || ' ' ||
    (ARRAY['Smith','Johnson','Lee','Garcia','Brown','Davis','Wilson','Chen',
           'Martinez','Anderson','Taylor','Thomas','Moore','Jackson','White'])[(g % 15) + 1],
    (g % 12) + 1,
    CASE WHEN g > 20 THEN (((g - 1) % 20) + 1) ELSE NULL END,
    (g % 9) + 1,
    (ARRAY['M','F','NB'])[(g % 3) + 1],
    (ARRAY['White','Hispanic','Asian','Black','Multiracial','Other'])[(g % 6) + 1],
    CURRENT_DATE - ((30 + (g * 7) % 2190) || ' days')::INTERVAL,
    CASE WHEN g % 8 = 0
         THEN CURRENT_DATE - ((1 + (g * 3) % 180) || ' days')::INTERVAL
         ELSE NULL END,
    CASE WHEN g % 8 = 0
         THEN (ARRAY['voluntary','involuntary','layoff','retirement'])[(g % 4) + 1]
         ELSE NULL END,
    (ARRAY['San Francisco','New York','Austin','Remote','Chicago','Seattle'])[(g % 6) + 1]
FROM generate_series(1, 600) g
ON CONFLICT DO NOTHING;

INSERT INTO hr_salaries (employee_id, salary_usd, effective_date, change_reason)
SELECT
    e.id,
    jl.min_salary + ((e.id * 1237) % (jl.max_salary - jl.min_salary)),
    e.hire_date,
    'hire'
FROM hr_employees e
JOIN hr_job_levels jl ON jl.id = e.job_level_id
ON CONFLICT DO NOTHING;

INSERT INTO hr_salaries (employee_id, salary_usd, effective_date, change_reason)
SELECT
    e.id,
    ROUND(s.salary_usd * (1.03 + (e.id % 5) * 0.01)),
    (DATE_TRUNC('year', e.hire_date) + INTERVAL '1 year')::DATE,
    'merit'
FROM hr_employees e
JOIN hr_salaries s ON s.employee_id = e.id AND s.change_reason = 'hire'
WHERE e.hire_date < CURRENT_DATE - INTERVAL '1 year'
ON CONFLICT DO NOTHING;

INSERT INTO hr_performance_reviews
    (employee_id, review_year, rating, reviewer_id, review_date)
SELECT
    e.id,
    yr.review_year,
    LEAST(5, GREATEST(1, 3 + ((e.id * yr.review_year * 7) % 5) - 2)),
    COALESCE(e.manager_id, e.id),
    TO_DATE(yr.review_year || '-12-15', 'YYYY-MM-DD')
FROM hr_employees e
CROSS JOIN (VALUES (2022),(2023),(2024)) AS yr(review_year)
WHERE e.hire_date < TO_DATE(yr.review_year || '-12-01', 'YYYY-MM-DD')
ON CONFLICT DO NOTHING;

INSERT INTO hr_headcount_snapshots (dept_id, snapshot_month, headcount)
SELECT
    d.id,
    gs.month,
    50 + ((d.id * EXTRACT(MONTH FROM gs.month)::INT * 3) % 80)
FROM hr_departments d
CROSS JOIN generate_series(
    DATE_TRUNC('month', NOW() - INTERVAL '24 months'),
    DATE_TRUNC('month', NOW()),
    INTERVAL '1 month'
) AS gs(month)
ON CONFLICT DO NOTHING;

Block 13 — Twitter / Social Graph (tw_*)

CREATE TABLE IF NOT EXISTS tw_users (
    id              BIGSERIAL PRIMARY KEY,
    username        TEXT UNIQUE NOT NULL,
    display_name    TEXT,
    bio             TEXT,
    location        TEXT,
    verified        BOOL DEFAULT FALSE,
    follower_count  INT DEFAULT 0,
    following_count INT DEFAULT 0,
    tweet_count     INT DEFAULT 0,
    created_at      TIMESTAMPTZ DEFAULT NOW(),
    is_bot          BOOL DEFAULT FALSE
);

CREATE TABLE IF NOT EXISTS tw_tweets (
    id              BIGSERIAL PRIMARY KEY,
    user_id         BIGINT REFERENCES tw_users(id),
    content         TEXT NOT NULL,
    lang            TEXT DEFAULT 'en',
    source          TEXT,
    reply_to_id     BIGINT REFERENCES tw_tweets(id),
    retweet_of_id   BIGINT REFERENCES tw_tweets(id),
    like_count      INT DEFAULT 0,
    retweet_count   INT DEFAULT 0,
    reply_count     INT DEFAULT 0,
    quote_count     INT DEFAULT 0,
    impression_count INT DEFAULT 0,
    created_at      TIMESTAMPTZ NOT NULL
);

CREATE INDEX IF NOT EXISTS idx_tw_tweets_user_time
    ON tw_tweets(user_id, created_at);
CREATE INDEX IF NOT EXISTS idx_tw_tweets_time
    ON tw_tweets(created_at);

CREATE TABLE IF NOT EXISTS tw_follows (
    follower_id BIGINT REFERENCES tw_users(id),
    followee_id BIGINT REFERENCES tw_users(id),
    followed_at TIMESTAMPTZ DEFAULT NOW(),
    PRIMARY KEY (follower_id, followee_id)
);

CREATE TABLE IF NOT EXISTS tw_likes (
    user_id    BIGINT REFERENCES tw_users(id),
    tweet_id   BIGINT REFERENCES tw_tweets(id),
    liked_at   TIMESTAMPTZ DEFAULT NOW(),
    PRIMARY KEY (user_id, tweet_id)
);

CREATE TABLE IF NOT EXISTS tw_hashtags (
    id          BIGSERIAL PRIMARY KEY,
    tag         TEXT UNIQUE NOT NULL
);

CREATE TABLE IF NOT EXISTS tw_tweet_hashtags (
    tweet_id   BIGINT REFERENCES tw_tweets(id),
    hashtag_id BIGINT REFERENCES tw_hashtags(id),
    PRIMARY KEY (tweet_id, hashtag_id)
);

CREATE TABLE IF NOT EXISTS tw_mentions (
    tweet_id        BIGINT REFERENCES tw_tweets(id),
    mentioned_user_id BIGINT REFERENCES tw_users(id),
    PRIMARY KEY (tweet_id, mentioned_user_id)
);
INSERT INTO tw_users
    (username, display_name, location, verified, follower_count,
     following_count, tweet_count, created_at, is_bot)
SELECT
    'user_' || LPAD(g::TEXT, 5, '0'),
    (ARRAY['Alex','Jordan','Morgan','Taylor','Casey','Riley','Sam','Drew',
           'Blake','Cameron','Skyler','Reese','Avery','Quinn','Jamie'])[(g%15)+1]
        || ' ' ||
    (ARRAY['Smith','Lee','Garcia','Brown','Davis','Wilson','Chen','Martinez',
           'Anderson','Thomas','Moore','Jackson','White','Harris','Clark'])[(g%15)+1],
    (ARRAY['New York','London','Tokyo','San Francisco','LA','Chicago',
           'Toronto','Berlin','Sydney','Singapore','Mumbai','Paris'])[(g%12)+1],
    g % 50 = 0,
    (g * 137) % 100000,
    (g * 73)  % 5000,
    (g * 41)  % 10000,
    NOW() - (((g * 11) % 1460) || ' days')::INTERVAL,
    g % 100 = 0
FROM generate_series(1, 1000) g
ON CONFLICT DO NOTHING;

INSERT INTO tw_hashtags (tag)
SELECT unnest(ARRAY[
    'ai','machinelearning','python','postgresql','datascience',
    'webdev','javascript','startup','tech','crypto',
    'openai','llm','cloudcomputing','devops','cybersecurity',
    'ux','productmanagement','sql','analytics','fintech'
]) ON CONFLICT DO NOTHING;

INSERT INTO tw_tweets
    (user_id, content, lang, source, reply_to_id, retweet_of_id,
     like_count, retweet_count, reply_count, quote_count,
     impression_count, created_at)
SELECT
    (g % 1000) + 1,
    (ARRAY[
        'Just shipped a new feature using #sql and #python — loving the combo!',
        'Hot take: #machinelearning is just statistics with better marketing.',
        'Thread on why #postgresql outperforms MySQL for analytics',
        'Finished reading about #ai trends. Mind blown. #tech',
        'Nobody talks about #sql enough. It is the most underrated skill in #datascience.',
        'My #startup journey: month 6. Revenue: $0. Lessons: priceless. #founder',
        'Built a realtime dashboard with #postgresql and #webdev stack. Sharing soon.',
        '#crypto is down again. Time to #hodl or fold? #fintech',
        'New blog post on #devops pipelines — link in bio. #cloudcomputing',
        'Question for #ux folks: how do you handle dark patterns ethically?'
    ])[(g % 10) + 1],
    'en',
    (ARRAY['Web','iPhone','Android','API'])[(g % 4) + 1],
    CASE WHEN g % 7 = 0 AND g > 100 THEN (g % 100) + 1 ELSE NULL END,
    CASE WHEN g % 5 = 0 AND g > 50  THEN (g % 50)  + 1 ELSE NULL END,
    CASE WHEN g % 100 = 0 THEN (g * 97) % 50000
         WHEN g % 20 = 0  THEN (g * 53) % 5000
         ELSE (g * 13) % 500 END,
    CASE WHEN g % 100 = 0 THEN (g * 71) % 20000
         WHEN g % 20 = 0  THEN (g * 37) % 2000
         ELSE (g * 7) % 200 END,
    (g * 11) % 300,
    (g * 5)  % 100,
    CASE WHEN g % 100 = 0 THEN (g * 113) % 1000000
         ELSE (g * 23) % 50000 END,
    NOW() - INTERVAL '90 days' + ((g * 13 % 129600) || ' minutes')::INTERVAL
FROM generate_series(1, 10000) g
ON CONFLICT DO NOTHING;

INSERT INTO tw_tweet_hashtags (tweet_id, hashtag_id)
SELECT t.id, ((t.id * 7) % 20) + 1
FROM tw_tweets t
ON CONFLICT DO NOTHING;

INSERT INTO tw_tweet_hashtags (tweet_id, hashtag_id)
SELECT t.id, ((t.id * 13) % 20) + 1
FROM tw_tweets t
WHERE t.id % 3 = 0
  AND ((t.id * 13) % 20) + 1 != ((t.id * 7) % 20) + 1
ON CONFLICT DO NOTHING;

INSERT INTO tw_follows (follower_id, followee_id, followed_at)
SELECT
    (g % 1000) + 1,
    ((g * 37 + f * 13) % 1000) + 1,
    NOW() - (((g * 7 + f * 3) % 365) || ' days')::INTERVAL
FROM generate_series(1, 1000) g,
     generate_series(1, 20) f
WHERE (g % 1000) + 1 != ((g * 37 + f * 13) % 1000) + 1
ON CONFLICT DO NOTHING;

INSERT INTO tw_likes (user_id, tweet_id, liked_at)
SELECT
    (g % 1000) + 1,
    (g * 17 % 10000) + 1,
    NOW() - ((g % 90) || ' days')::INTERVAL
FROM generate_series(1, 20000) g
ON CONFLICT DO NOTHING;

INSERT INTO tw_mentions (tweet_id, mentioned_user_id)
SELECT
    t.id,
    ((t.id * 31) % 1000) + 1
FROM tw_tweets t
WHERE t.id % 4 = 0
ON CONFLICT DO NOTHING;

Verify everything loaded

Once the blocks above are run, this query should return a row per table with non-zero counts:

SELECT 'customers' AS tbl, COUNT(*) FROM customers
UNION ALL SELECT 'orders', COUNT(*) FROM orders
UNION ALL SELECT 'companies', COUNT(*) FROM companies
UNION ALL SELECT 'stock_prices', COUNT(*) FROM stock_prices
UNION ALL SELECT 'teams', COUNT(*) FROM teams
UNION ALL SELECT 'matches', COUNT(*) FROM matches
UNION ALL SELECT 'clients', COUNT(*) FROM clients
UNION ALL SELECT 'invoices', COUNT(*) FROM invoices
UNION ALL SELECT 'ec_customers', COUNT(*) FROM ec_customers
UNION ALL SELECT 'ec_orders', COUNT(*) FROM ec_orders
UNION ALL SELECT 'fts_articles', COUNT(*) FROM fts_articles
UNION ALL SELECT 'product_catalog', COUNT(*) FROM product_catalog
UNION ALL SELECT 'ts_server_metrics', COUNT(*) FROM ts_server_metrics
UNION ALL SELECT 'hc_patients', COUNT(*) FROM hc_patients
UNION ALL SELECT 'saas_accounts', COUNT(*) FROM saas_accounts
UNION ALL SELECT 'lg_warehouses', COUNT(*) FROM lg_warehouses
UNION ALL SELECT 'hr_employees', COUNT(*) FROM hr_employees
UNION ALL SELECT 'tw_users', COUNT(*) FROM tw_users;

If every row shows a non-zero count, you're set — every lesson in the course is now ready to run against your database.


Need help?

You have lifetime access to this course on every device — laptop, phone, tablet — through the website. If anything goes wrong with setup, or you have a question about any lesson:

  • Sign in to your account at absolutelearners.com and send us a message from the user portal — we read and reply to every one.
  • Or use the contact form on the website if you have not signed in yet.

We are happy to help you get unstuck. Welcome aboard, and have fun working through the course!