Setting up a practice database
Setting up your practice database
This single lesson contains every CREATE TABLE and INSERT statement used anywhere in this course. Run it once on your own PostgreSQL instance and every example, every exercise, and every section will work without you ever having to hunt down setup scripts in individual lessons.
Lifetime access — pick this up whenever you like. You purchased this course on a one-time basis, so you have permanent access. You can sign in from the website, your phone, or your laptop and pick up exactly where you left off. There is no expiry, no recurring charge, and no rush. If you ever have a question, head to the user portal after signing in and send us a message — or use the contact form on absolutelearners.com. We answer every message personally.
How to run the SQL below
You only need a working PostgreSQL database. Three easy options:
Option 1 — Local PostgreSQL
If you have Postgres installed, create a new database and connect with psql:
createdb shopmetrics
psql shopmetrics
Then paste the blocks below in order.
Option 2 — neon.tech (free cloud Postgres, no credit card)
- Sign up at neon.tech
- Create a project, open the SQL editor
- Paste the blocks below in order and hit Run
Option 3 — db-fiddle.com Quick experiments without an account. Paste schema + a query and you're off.
Whatever option you pick — type the queries yourself the first time around rather than copying and pasting. Building muscle memory with the syntax is the single biggest difference between people who recognise SQL and people who write SQL.
Block 1 — ShopMetrics e-commerce (used by Foundations & most early lessons)
-- Customers
CREATE TABLE IF NOT EXISTS customers (
customer_id SERIAL PRIMARY KEY,
name TEXT NOT NULL,
email TEXT UNIQUE NOT NULL,
country TEXT NOT NULL,
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);
-- Products
CREATE TABLE IF NOT EXISTS products (
product_id SERIAL PRIMARY KEY,
name TEXT NOT NULL,
category TEXT NOT NULL,
price NUMERIC(10,2) NOT NULL
);
-- Orders
CREATE TABLE IF NOT EXISTS orders (
order_id SERIAL PRIMARY KEY,
customer_id INT REFERENCES customers,
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
status TEXT NOT NULL -- 'completed', 'refunded', 'pending'
);
-- Order line items
CREATE TABLE IF NOT EXISTS order_items (
item_id SERIAL PRIMARY KEY,
order_id INT REFERENCES orders,
product_id INT REFERENCES products,
quantity INT NOT NULL,
unit_price NUMERIC(10,2) NOT NULL
);
-- Page events (for funnel analysis later)
CREATE TABLE IF NOT EXISTS events (
event_id SERIAL PRIMARY KEY,
customer_id INT REFERENCES customers,
event_type TEXT NOT NULL, -- 'view', 'add_to_cart', 'purchase'
page TEXT,
occurred_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);
-- Customers (50 rows)
INSERT INTO customers (name, email, country, created_at)
SELECT
'Customer ' || n,
'user' || n || '@example.com',
(ARRAY['US','UK','CA','AU','DE'])[1 + (n % 5)],
NOW() - (random() * INTERVAL '2 years')
FROM generate_series(1, 50) AS n
ON CONFLICT (email) DO NOTHING;
-- Products (20 rows)
INSERT INTO products (name, category, price)
VALUES
('SQL Mastery Book', 'Books', 49.99),
('Python Crash Course', 'Books', 39.99),
('Mechanical Keyboard', 'Hardware', 129.00),
('Wireless Mouse', 'Hardware', 59.99),
('4K Monitor', 'Hardware', 499.00),
('VSCode Pro Theme Pack', 'Software', 12.00),
('Linux Command Poster', 'Merch', 19.99),
('Dev Sticker Pack', 'Merch', 9.99),
('Cloud Storage 1yr', 'Software', 99.00),
('Data Engineering Course', 'Courses', 199.00),
('React Deep Dive', 'Courses', 89.00),
('PostgreSQL internals', 'Books', 59.99),
('Standing Desk Mat', 'Hardware', 79.00),
('USB-C Hub', 'Hardware', 45.00),
('Noise Cancelling Buds', 'Hardware', 179.00),
('Git Cheatsheet Poster', 'Merch', 14.99),
('Bash Scripting Guide', 'Books', 29.99),
('CI/CD Pipeline Course', 'Courses', 149.00),
('API Design Handbook', 'Books', 44.99),
('Dark Mode Icon Pack', 'Software', 8.00)
ON CONFLICT DO NOTHING;
-- Orders (200 rows spread across 2 years)
INSERT INTO orders (customer_id, created_at, status)
SELECT
1 + (random() * 49)::INT,
NOW() - (random() * INTERVAL '2 years'),
(ARRAY['completed','completed','completed','refunded','pending'])[1 + (random()*4)::INT]
FROM generate_series(1, 200)
ON CONFLICT DO NOTHING;
-- Order items (1-4 items per order)
INSERT INTO order_items (order_id, product_id, quantity, unit_price)
SELECT
o.order_id,
1 + (random() * 19)::INT,
1 + (random() * 3)::INT,
p.price
FROM orders o
CROSS JOIN LATERAL (
SELECT price FROM products
ORDER BY random()
LIMIT 1 + (random() * 3)::INT
) p
ON CONFLICT DO NOTHING;
-- Events
INSERT INTO events (customer_id, event_type, page, occurred_at)
SELECT
1 + (random() * 49)::INT,
(ARRAY['view','view','view','add_to_cart','add_to_cart','purchase'])[1 + (random()*5)::INT],
'/product/' || (1 + (random() * 19)::INT),
NOW() - (random() * INTERVAL '2 years')
FROM generate_series(1, 1000)
ON CONFLICT DO NOTHING;
Block 2 — Stock Market (companies, stock_prices, dividends)
Used in: CTEs, Window Functions, Stock Market Analytics, Time Series.
CREATE TABLE IF NOT EXISTS companies (
company_id SERIAL PRIMARY KEY,
ticker VARCHAR(10) UNIQUE NOT NULL,
name TEXT NOT NULL,
sector TEXT NOT NULL,
market_cap_b NUMERIC(10,2),
founded_year INT
);
CREATE TABLE IF NOT EXISTS stock_prices (
price_id SERIAL PRIMARY KEY,
company_id INT REFERENCES companies,
price_date DATE NOT NULL,
open_price NUMERIC(10,2) NOT NULL,
high_price NUMERIC(10,2) NOT NULL,
low_price NUMERIC(10,2) NOT NULL,
close_price NUMERIC(10,2) NOT NULL,
volume BIGINT NOT NULL,
UNIQUE(company_id, price_date)
);
CREATE TABLE IF NOT EXISTS dividends (
dividend_id SERIAL PRIMARY KEY,
company_id INT REFERENCES companies,
ex_date DATE NOT NULL,
amount_per_share NUMERIC(8,4) NOT NULL
);
INSERT INTO companies (ticker, name, sector, market_cap_b, founded_year) VALUES
('AAPL', 'Apple Inc.', 'Technology', 2800.00, 1976),
('MSFT', 'Microsoft Corp.', 'Technology', 2500.00, 1975),
('GOOGL', 'Alphabet Inc.', 'Technology', 1700.00, 1998),
('AMZN', 'Amazon.com Inc.', 'Consumer', 1400.00, 1994),
('NVDA', 'NVIDIA Corp.', 'Technology', 1200.00, 1993),
('TSLA', 'Tesla Inc.', 'Automotive', 650.00, 2003),
('META', 'Meta Platforms Inc.', 'Technology', 1100.00, 2004),
('NFLX', 'Netflix Inc.', 'Entertainment', 230.00, 1997),
('CRM', 'Salesforce Inc.', 'Technology', 220.00, 1999),
('SHOP', 'Shopify Inc.', 'Technology', 110.00, 2006)
ON CONFLICT (ticker) DO NOTHING;
INSERT INTO stock_prices (company_id, price_date, open_price, high_price, low_price, close_price, volume)
SELECT
c.company_id,
d::DATE AS price_date,
ROUND((base + random() * 20 - 10)::NUMERIC, 2) AS open_price,
ROUND((base + random() * 25 - 5)::NUMERIC, 2) AS high_price,
ROUND((base + random() * 25 - 25)::NUMERIC, 2) AS low_price,
ROUND((base + random() * 20 - 10)::NUMERIC, 2) AS close_price,
(1000000 + (random() * 9000000)::BIGINT) AS volume
FROM
generate_series('2024-01-02'::DATE, '2025-12-31'::DATE, '1 day'::INTERVAL) d
CROSS JOIN (VALUES
(1, 185.00),(2, 375.00),(3, 140.00),(4, 185.00),(5, 480.00),
(6, 250.00),(7, 380.00),(8, 620.00),(9, 285.00),(10, 90.00)
) AS c(company_id, base)
WHERE EXTRACT(DOW FROM d) NOT IN (0,6)
ON CONFLICT DO NOTHING;
INSERT INTO dividends (company_id, ex_date, amount_per_share) VALUES
(1, '2024-02-09', 0.24),(1, '2024-05-10', 0.25),(1, '2024-08-12', 0.25),(1, '2024-11-08', 0.25),
(2, '2024-02-14', 0.75),(2, '2024-05-15', 0.75),(2, '2024-08-14', 0.75),(2, '2024-11-20', 0.75),
(2, '2025-02-12', 0.75),(2, '2025-05-14', 0.75),
(1, '2025-02-07', 0.25),(1, '2025-05-09', 0.25)
ON CONFLICT DO NOTHING;
Block 3 — Sports Analytics (teams, players, matches, player_stats)
CREATE TABLE IF NOT EXISTS teams (
team_id SERIAL PRIMARY KEY,
name TEXT NOT NULL,
city TEXT NOT NULL,
country TEXT NOT NULL,
league TEXT NOT NULL,
founded_year INT,
stadium TEXT,
capacity INT
);
CREATE TABLE IF NOT EXISTS players (
player_id SERIAL PRIMARY KEY,
team_id INT REFERENCES teams,
name TEXT NOT NULL,
position TEXT NOT NULL,
nationality TEXT NOT NULL,
age INT,
market_value_m NUMERIC(8,2)
);
CREATE TABLE IF NOT EXISTS matches (
match_id SERIAL PRIMARY KEY,
home_team_id INT REFERENCES teams,
away_team_id INT REFERENCES teams,
match_date DATE NOT NULL,
season TEXT NOT NULL,
competition TEXT NOT NULL,
home_goals INT NOT NULL DEFAULT 0,
away_goals INT NOT NULL DEFAULT 0,
attendance INT,
referee TEXT
);
CREATE TABLE IF NOT EXISTS player_stats (
stat_id SERIAL PRIMARY KEY,
match_id INT REFERENCES matches,
player_id INT REFERENCES players,
goals INT NOT NULL DEFAULT 0,
assists INT NOT NULL DEFAULT 0,
minutes_played INT NOT NULL DEFAULT 0,
shots INT NOT NULL DEFAULT 0,
yellow_cards INT NOT NULL DEFAULT 0,
red_cards INT NOT NULL DEFAULT 0,
rating NUMERIC(4,2)
);
INSERT INTO teams (name, city, country, league, founded_year, stadium, capacity) VALUES
('Manchester City', 'Manchester', 'England', 'Premier League', 1880, 'Etihad Stadium', 53400),
('Arsenal', 'London', 'England', 'Premier League', 1886, 'Emirates Stadium', 60704),
('Liverpool', 'Liverpool', 'England', 'Premier League', 1892, 'Anfield', 61276),
('Chelsea', 'London', 'England', 'Premier League', 1905, 'Stamford Bridge', 40341),
('Manchester United', 'Manchester', 'England', 'Premier League', 1878, 'Old Trafford', 74310),
('Tottenham Hotspur', 'London', 'England', 'Premier League', 1882, 'Tottenham Hotspur Stadium', 62850),
('Newcastle United', 'Newcastle', 'England', 'Premier League', 1892, 'St James Park', 52305),
('Aston Villa', 'Birmingham', 'England', 'Premier League', 1874, 'Villa Park', 42785),
('West Ham United', 'London', 'England', 'Premier League', 1895, 'London Stadium', 62500),
('Brighton', 'Brighton', 'England', 'Premier League', 1901, 'Amex Stadium', 31800),
('Real Madrid', 'Madrid', 'Spain', 'La Liga', 1902, 'Santiago Bernabeu', 81044),
('Barcelona', 'Barcelona', 'Spain', 'La Liga', 1899, 'Spotify Camp Nou', 99354),
('Atletico Madrid', 'Madrid', 'Spain', 'La Liga', 1903, 'Civitas Metropolitano', 68456),
('Sevilla', 'Seville', 'Spain', 'La Liga', 1890, 'Ramon Sanchez-Pizjuan', 43883),
('Real Sociedad', 'San Sebastian', 'Spain', 'La Liga', 1909, 'Reale Arena', 39500),
('Villarreal', 'Villarreal', 'Spain', 'La Liga', 1923, 'Estadio de la Ceramica', 23500),
('Athletic Bilbao', 'Bilbao', 'Spain', 'La Liga', 1898, 'San Mames', 53289),
('Valencia', 'Valencia', 'Spain', 'La Liga', 1919, 'Mestalla', 55000),
('Real Betis', 'Seville', 'Spain', 'La Liga', 1907, 'Estadio Benito Villamarin', 60721),
('Girona', 'Girona', 'Spain', 'La Liga', 1930, 'Estadi Montilivi', 13450);
INSERT INTO players (team_id, name, position, nationality, age, market_value_m) VALUES
(1, 'Ederson', 'GK', 'Brazil', 31, 45.0),
(1, 'Ruben Dias', 'DEF', 'Portugal', 26, 90.0),
(1, 'Josko Gvardiol', 'DEF', 'Croatia', 22, 80.0),
(1, 'Rodri', 'MID', 'Spain', 27, 130.0),
(1, 'Kevin De Bruyne','MID', 'Belgium', 33, 60.0),
(1, 'Phil Foden', 'MID', 'England', 24, 150.0),
(1, 'Erling Haaland', 'FWD', 'Norway', 23, 180.0),
(2, 'David Raya', 'GK', 'Spain', 29, 40.0),
(2, 'Ben White', 'DEF', 'England', 26, 65.0),
(2, 'William Saliba', 'DEF', 'France', 23, 100.0),
(2, 'Declan Rice', 'MID', 'England', 25, 120.0),
(2, 'Martin Odegaard','MID', 'Norway', 25, 130.0),
(2, 'Bukayo Saka', 'FWD', 'England', 22, 160.0),
(2, 'Gabriel Martinelli','FWD','Brazil', 23, 100.0),
(3, 'Alisson', 'GK', 'Brazil', 31, 60.0),
(3, 'Trent Alexander-Arnold','DEF','England',25,90.0),
(3, 'Virgil van Dijk','DEF', 'Netherlands',33,55.0),
(3, 'Alexis Mac Allister','MID','Argentina',25,80.0),
(3, 'Dominik Szoboszlai','MID','Hungary', 23, 80.0),
(3, 'Mohamed Salah', 'FWD', 'Egypt', 32, 60.0),
(3, 'Darwin Nunez', 'FWD', 'Uruguay', 25, 80.0),
(11,'Thibaut Courtois','GK', 'Belgium', 32, 45.0),
(11,'Dani Carvajal', 'DEF', 'Spain', 32, 35.0),
(11,'Antonio Rudiger','DEF','Germany', 31, 35.0),
(11,'Luka Modric', 'MID', 'Croatia', 39, 10.0),
(11,'Jude Bellingham','MID','England', 20, 200.0),
(11,'Vinicius Jr', 'FWD', 'Brazil', 23, 200.0),
(11,'Kylian Mbappe', 'FWD', 'France', 25, 180.0),
(12,'Marc-Andre ter Stegen','GK','Germany',32,30.0),
(12,'Ronald Araujo', 'DEF', 'Uruguay', 25, 80.0),
(12,'Pedri', 'MID', 'Spain', 21, 130.0),
(12,'Gavi', 'MID', 'Spain', 19, 120.0),
(12,'Ferran Torres', 'FWD', 'Spain', 24, 50.0),
(12,'Robert Lewandowski','FWD','Poland', 36, 25.0);
INSERT INTO matches (home_team_id, away_team_id, match_date, season, competition, home_goals, away_goals, attendance) VALUES
(1, 2, '2023-08-19', '2023-24', 'Premier League', 1, 0, 53200),
(3, 4, '2023-08-20', '2023-24', 'Premier League', 2, 1, 60800),
(5, 6, '2023-08-21', '2023-24', 'Premier League', 0, 2, 73400),
(7, 8, '2023-08-26', '2023-24', 'Premier League', 3, 1, 51900),
(9, 10, '2023-08-26', '2023-24', 'Premier League', 1, 1, 62000),
(2, 1, '2023-10-08', '2023-24', 'Premier League', 0, 1, 60200),
(4, 3, '2023-10-22', '2023-24', 'Premier League', 1, 3, 39800),
(1, 5, '2023-12-10', '2023-24', 'Premier League', 3, 0, 53100),
(3, 9, '2023-12-23', '2023-24', 'Premier League', 4, 0, 60900),
(2, 7, '2024-01-14', '2023-24', 'Premier League', 2, 0, 60300),
(1, 3, '2024-03-10', '2023-24', 'Premier League', 1, 1, 53400),
(2, 4, '2024-04-07', '2023-24', 'Premier League', 5, 0, 60600),
(11,12, '2023-10-28', '2023-24', 'La Liga', 2, 1, 80500),
(12,11, '2024-04-21', '2023-24', 'La Liga', 1, 3, 98000),
(11,13, '2023-09-24', '2023-24', 'La Liga', 3, 1, 79200),
(13,11, '2024-02-04', '2023-24', 'La Liga', 2, 1, 67800),
(12,13, '2023-11-05', '2023-24', 'La Liga', 1, 0, 97400),
(11,14, '2023-08-13', '2023-24', 'La Liga', 2, 0, 78900),
(11,15, '2024-01-13', '2023-24', 'La Liga', 2, 0, 80100),
(1, 3, '2024-09-22', '2024-25', 'Premier League', 2, 2, 53400),
(2, 5, '2024-09-01', '2024-25', 'Premier League', 3, 0, 60700),
(3, 2, '2024-10-27', '2024-25', 'Premier League', 2, 1, 61000),
(4, 1, '2024-11-10', '2024-25', 'Premier League', 0, 3, 40100),
(1, 6, '2024-12-15', '2024-25', 'Premier League', 4, 1, 53300),
(11,12, '2024-10-26', '2024-25', 'La Liga', 0, 4, 81000),
(12,11, '2025-05-11', '2024-25', 'La Liga', 2, 2, 99000);
INSERT INTO player_stats (match_id, player_id, goals, assists, minutes_played, shots, yellow_cards, rating)
SELECT
m.match_id,
p.player_id,
CASE WHEN p.position = 'FWD' THEN (random() * 1.5)::INT
WHEN p.position = 'MID' THEN (random() * 0.8)::INT
ELSE 0 END AS goals,
CASE WHEN p.position IN ('FWD','MID') THEN (random() * 1.2)::INT ELSE 0 END AS assists,
CASE WHEN p.position = 'GK' THEN 90
ELSE 45 + (random() * 50)::INT END AS minutes_played,
CASE WHEN p.position = 'FWD' THEN (random() * 5)::INT
WHEN p.position = 'MID' THEN (random() * 3)::INT
ELSE (random() * 1)::INT END AS shots,
CASE WHEN random() < 0.1 THEN 1 ELSE 0 END AS yellow_cards,
ROUND((6.0 + random() * 3.5)::NUMERIC, 1) AS rating
FROM matches m
CROSS JOIN players p
WHERE p.team_id IN (m.home_team_id, m.away_team_id)
AND random() < 0.8;
Block 4 — Finance & Invoicing (clients, invoices, invoice_items, payments)
CREATE TABLE IF NOT EXISTS clients (
client_id SERIAL PRIMARY KEY,
company_name TEXT NOT NULL,
industry TEXT NOT NULL,
country TEXT NOT NULL,
credit_limit NUMERIC(12,2),
since_date DATE NOT NULL,
account_manager TEXT
);
CREATE TABLE IF NOT EXISTS invoices (
invoice_id SERIAL PRIMARY KEY,
client_id INT REFERENCES clients,
invoice_no VARCHAR(20) UNIQUE NOT NULL,
issue_date DATE NOT NULL,
due_date DATE NOT NULL,
status TEXT NOT NULL
);
CREATE TABLE IF NOT EXISTS invoice_items (
item_id SERIAL PRIMARY KEY,
invoice_id INT REFERENCES invoices,
description TEXT NOT NULL,
quantity NUMERIC(10,2) NOT NULL,
unit_price NUMERIC(12,2) NOT NULL,
tax_rate NUMERIC(5,4) NOT NULL DEFAULT 0.20
);
CREATE TABLE IF NOT EXISTS payments (
payment_id SERIAL PRIMARY KEY,
invoice_id INT REFERENCES invoices,
payment_date DATE NOT NULL,
amount NUMERIC(12,2) NOT NULL,
method TEXT NOT NULL
);
INSERT INTO clients (company_name, industry, country, credit_limit, since_date, account_manager) VALUES
('Acme Corp', 'Manufacturing', 'US', 500000, '2019-03-15', 'Sarah Chen'),
('TechFlow Ltd', 'Software', 'UK', 250000, '2020-07-01', 'James Wilson'),
('Nordic Supplies', 'Retail', 'SE', 180000, '2021-01-10', 'Sarah Chen'),
('Gulf Dynamics', 'Energy', 'AE', 750000, '2018-11-20', 'Ahmed Hassan'),
('Pacific Ventures', 'Finance', 'AU', 300000, '2022-02-14', 'James Wilson'),
('Berlin Digital', 'Software', 'DE', 200000, '2020-09-05', 'Laura Schmidt'),
('Maple Industries', 'Manufacturing', 'CA', 420000, '2019-06-30', 'Sarah Chen'),
('Tokyo Systems', 'Technology', 'JP', 550000, '2017-04-01', 'Ahmed Hassan'),
('Sunrise Retail', 'Retail', 'US', 150000, '2023-01-15', 'James Wilson'),
('Alpine Consulting', 'Consulting', 'CH', 350000, '2020-12-01', 'Laura Schmidt'),
('Iberian Partners', 'Manufacturing', 'ES', 280000, '2021-05-20', 'Laura Schmidt'),
('Cairo Solutions', 'Technology', 'EG', 120000, '2022-08-01', 'Ahmed Hassan'),
('Mumbai Tech', 'Software', 'IN', 190000, '2021-03-15', 'Ahmed Hassan'),
('Rio Commerce', 'Retail', 'BR', 160000, '2022-11-01', 'James Wilson'),
('Seoul Digital', 'Technology', 'KR', 340000, '2020-06-15', 'Ahmed Hassan'),
('Amsterdam Logistics', 'Logistics', 'NL', 410000, '2019-09-10', 'Laura Schmidt'),
('Singapore Finance', 'Finance', 'SG', 620000, '2018-04-20', 'James Wilson'),
('Toronto Media', 'Media', 'CA', 230000, '2021-07-05', 'Sarah Chen'),
('Sydney Health', 'Healthcare', 'AU', 380000, '2020-02-28', 'James Wilson'),
('Paris Fashion', 'Retail', 'FR', 290000, '2021-10-15', 'Laura Schmidt'),
('Lagos Energy', 'Energy', 'NG', 450000, '2019-12-01', 'Ahmed Hassan'),
('Buenos Aires Tech', 'Software', 'AR', 140000, '2022-04-10', 'James Wilson'),
('Warsaw Manufacturing', 'Manufacturing', 'PL', 320000, '2020-08-20', 'Laura Schmidt'),
('Nairobi Consulting', 'Consulting', 'KE', 110000, '2023-03-01', 'Ahmed Hassan'),
('Vienna Media', 'Media', 'AT', 175000, '2022-01-15', 'Laura Schmidt'),
('Dallas Energy', 'Energy', 'US', 680000, '2018-07-10', 'Sarah Chen'),
('Montreal Logistics', 'Logistics', 'CA', 290000, '2021-09-25', 'Sarah Chen'),
('Milan Fashion', 'Retail', 'IT', 220000, '2020-11-30', 'Laura Schmidt'),
('Helsinki Software', 'Software', 'FI', 195000, '2022-06-01', 'Laura Schmidt'),
('Cape Town Mining', 'Mining', 'ZA', 530000, '2019-02-14', 'Ahmed Hassan');
INSERT INTO invoices (client_id, invoice_no, issue_date, due_date, status)
SELECT
c.client_id,
'INV-' || LPAD(ROW_NUMBER() OVER (ORDER BY c.client_id, d)::TEXT, 5, '0'),
d::DATE AS issue_date,
(d + INTERVAL '30 days')::DATE AS due_date,
CASE
WHEN d > CURRENT_DATE - INTERVAL '30 days' THEN 'sent'
WHEN random() < 0.75 THEN 'paid'
WHEN random() < 0.5 THEN 'overdue'
ELSE 'sent'
END AS status
FROM clients c
CROSS JOIN generate_series(
c.since_date,
LEAST(c.since_date + INTERVAL '3 years', CURRENT_DATE),
INTERVAL '45 days'
) d
ON CONFLICT (invoice_no) DO NOTHING;
INSERT INTO invoice_items (invoice_id, description, quantity, unit_price, tax_rate)
SELECT
i.invoice_id,
(ARRAY['Consulting Services','Software License','Support & Maintenance',
'Implementation','Training','Data Migration','API Integration',
'Project Management','Security Audit','Cloud Hosting'])[1 + (random()*9)::INT],
ROUND((1 + random() * 9)::NUMERIC, 0) AS quantity,
ROUND((500 + random() * 4500)::NUMERIC, 2) AS unit_price,
CASE WHEN random() < 0.3 THEN 0.0 ELSE 0.20 END AS tax_rate
FROM invoices i
CROSS JOIN generate_series(1, 2 + (random() * 2)::INT) AS n;
INSERT INTO payments (invoice_id, payment_date, amount, method)
SELECT
i.invoice_id,
i.due_date - ((random() * 10)::INT || ' days')::INTERVAL,
ROUND((SELECT SUM(quantity * unit_price) FROM invoice_items ii WHERE ii.invoice_id = i.invoice_id), 2),
(ARRAY['bank_transfer','credit_card','check','bank_transfer','bank_transfer'])[1 + (random()*4)::INT]
FROM invoices i
WHERE i.status = 'paid';
-- Convenience column: cached invoice total computed from line items.
-- Many lessons reference `invoices.total_amount` directly; we keep it
-- in sync via a one-time UPDATE here.
ALTER TABLE invoices ADD COLUMN IF NOT EXISTS total_amount NUMERIC(12,2);
UPDATE invoices i
SET total_amount = (
SELECT COALESCE(SUM(ii.quantity * ii.unit_price), 0)
FROM invoice_items ii
WHERE ii.invoice_id = i.invoice_id
)
WHERE total_amount IS NULL;
Block 5 — Extended e-commerce (ec_* tables)
Used in: CTEs, Real-World Patterns, A/B testing, funnel and cohort lessons.
CREATE TABLE IF NOT EXISTS ec_customers (
customer_id SERIAL PRIMARY KEY,
name TEXT NOT NULL,
email TEXT UNIQUE NOT NULL,
country TEXT NOT NULL,
city TEXT,
acquisition_channel TEXT NOT NULL DEFAULT 'organic',
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
first_order_date DATE
);
CREATE TABLE IF NOT EXISTS ec_products (
product_id SERIAL PRIMARY KEY,
name TEXT NOT NULL,
category TEXT NOT NULL,
subcategory TEXT,
price NUMERIC(10,2) NOT NULL,
cost NUMERIC(10,2) NOT NULL DEFAULT 0
);
CREATE TABLE IF NOT EXISTS ec_orders (
order_id SERIAL PRIMARY KEY,
customer_id INT REFERENCES ec_customers,
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
status TEXT NOT NULL,
total_amount NUMERIC(12,2) NOT NULL DEFAULT 0,
discount_amount NUMERIC(12,2) NOT NULL DEFAULT 0,
promo_code TEXT
);
CREATE TABLE IF NOT EXISTS ec_order_items (
item_id SERIAL PRIMARY KEY,
order_id INT REFERENCES ec_orders,
product_id INT REFERENCES ec_products,
quantity INT NOT NULL,
unit_price NUMERIC(10,2) NOT NULL
);
CREATE TABLE IF NOT EXISTS ec_events (
event_id SERIAL PRIMARY KEY,
customer_id INT REFERENCES ec_customers,
session_id TEXT,
event_type TEXT NOT NULL,
page TEXT,
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);
CREATE TABLE IF NOT EXISTS ec_ab_assignments (
customer_id INT REFERENCES ec_customers,
experiment TEXT NOT NULL,
variant TEXT NOT NULL,
assigned_at TIMESTAMPTZ NOT NULL,
converted BOOLEAN NOT NULL DEFAULT FALSE,
revenue NUMERIC(10,2),
PRIMARY KEY (customer_id, experiment)
);
INSERT INTO ec_customers (name, email, country, city, acquisition_channel, created_at)
SELECT
'Customer ' || n,
'user' || n || '@example.com',
(ARRAY['US','UK','CA','AU','DE','FR','JP','BR','IN','NL'])[1 + (n % 10)],
(ARRAY['New York','London','Berlin','Paris','Toronto','Sydney','Tokyo','Sao Paulo','Mumbai','Amsterdam'])[1 + (n % 10)],
(ARRAY['organic','paid_search','social','email','referral'])[1 + (n % 5)],
NOW() - ((random() * 1095)::INT || ' days')::INTERVAL
FROM generate_series(1, 500) n
ON CONFLICT DO NOTHING;
INSERT INTO ec_products (name, category, subcategory, price, cost)
SELECT
cat || ' ' || subcat || ' ' || n,
cat,
subcat,
ROUND((15 + random() * 485)::NUMERIC, 2),
ROUND((5 + random() * 120)::NUMERIC, 2)
FROM (VALUES
('Electronics','Laptops'), ('Electronics','Phones'),
('Electronics','Tablets'), ('Clothing','Tops'),
('Clothing','Bottoms'), ('Clothing','Shoes'),
('Home','Kitchen'), ('Home','Bedding'),
('Books','Fiction'), ('Books','Non-Fiction'),
('Sports','Equipment'), ('Sports','Apparel')
) AS cats(cat, subcat)
CROSS JOIN generate_series(1, 9) n
ON CONFLICT DO NOTHING;
INSERT INTO ec_orders (customer_id, created_at, status, total_amount, discount_amount, promo_code)
SELECT
c.customer_id,
c.created_at + ((random() * 700 + 1)::INT || ' days')::INTERVAL,
(ARRAY['completed','completed','completed','refunded','pending','cancelled'])[1 + (random()*5)::INT],
0,
CASE WHEN random() < 0.2 THEN ROUND((random() * 30)::NUMERIC, 2) ELSE 0 END,
CASE WHEN random() < 0.2 THEN 'SAVE' || (10 + (random()*20)::INT)::TEXT ELSE NULL END
FROM ec_customers c
CROSS JOIN generate_series(1, 1 + (random()*7)::INT) g
WHERE (c.created_at + ((random() * 700 + 1)::INT || ' days')::INTERVAL) <= NOW();
INSERT INTO ec_order_items (order_id, product_id, quantity, unit_price)
SELECT
o.order_id,
1 + (random() * 107)::INT,
1 + (random() * 3)::INT,
ROUND((10 + random() * 290)::NUMERIC, 2)
FROM ec_orders o
CROSS JOIN generate_series(1, 2 + (random() * 2)::INT) n;
UPDATE ec_orders o
SET total_amount = GREATEST(
0,
(SELECT ROUND(SUM(quantity * unit_price), 2) FROM ec_order_items oi WHERE oi.order_id = o.order_id)
- o.discount_amount
);
UPDATE ec_customers c
SET first_order_date = (
SELECT MIN(created_at::DATE) FROM ec_orders o WHERE o.customer_id = c.customer_id
);
INSERT INTO ec_events (customer_id, session_id, event_type, page, created_at)
SELECT
o.customer_id,
'sess_' || o.customer_id || '_' || o.order_id,
evt_type,
CASE evt_type
WHEN 'page_view' THEN (ARRAY['/home','/ec_products','/about','/blog'])[1+(random()*3)::INT]
WHEN 'view' THEN '/product/' || (1+(random()*107)::INT)::TEXT
WHEN 'add_to_cart' THEN '/cart'
WHEN 'checkout' THEN '/checkout'
WHEN 'purchase' THEN '/confirmation'
END,
o.created_at - ((5 - step) || ' minutes')::INTERVAL
FROM ec_orders o
CROSS JOIN (VALUES
(1,'page_view'),(2,'view'),(3,'add_to_cart'),(4,'checkout'),(5,'purchase')
) AS e(step, evt_type)
WHERE o.status = 'completed';
INSERT INTO ec_ab_assignments (customer_id, experiment, variant, assigned_at, converted, revenue)
SELECT
customer_id,
'checkout_redesign',
CASE WHEN random() < 0.5 THEN 'control' ELSE 'treatment' END,
created_at + '30 days'::INTERVAL,
random() < CASE WHEN random() < 0.5 THEN 0.12 ELSE 0.16 END,
CASE WHEN random() < 0.14 THEN ROUND((20 + random() * 280)::NUMERIC, 2) ELSE NULL END
FROM ec_customers
WHERE random() < 0.7
ON CONFLICT DO NOTHING;
Block 6 — Full Text Search articles (fts_articles)
Used in: every Full Text Search lesson.
CREATE TABLE IF NOT EXISTS fts_articles (
id SERIAL PRIMARY KEY,
title TEXT NOT NULL,
body TEXT NOT NULL,
author TEXT,
category TEXT,
published DATE,
search_vec TSVECTOR
);
INSERT INTO fts_articles (title, body, author, category, published)
SELECT * FROM (VALUES
('PostgreSQL Performance Tuning',
'Learn how to optimize PostgreSQL queries using indexes, EXPLAIN ANALYZE, and vacuum strategies for high-throughput systems.',
'Alice Chen', 'Database', '2024-01-15'::DATE),
('Understanding Database Indexes',
'A deep dive into B-tree, hash, GIN, and GiST index types, when to use each, and how they affect query planning.',
'Bob Smith', 'Database', '2024-02-01'::DATE),
('Python and PostgreSQL Integration',
'Connect Python applications to PostgreSQL using psycopg2 and SQLAlchemy, with async support via asyncpg.',
'Carol White', 'Programming', '2024-02-20'::DATE),
('Advanced SQL Window Functions',
'Master ROW_NUMBER, RANK, DENSE_RANK, LAG, LEAD, and complex window frames for analytical queries.',
'Alice Chen', 'Database', '2024-03-05'::DATE),
('Building REST APIs with FastAPI',
'Create high-performance Python REST APIs using FastAPI, Pydantic models, and async database connections.',
'David Lee', 'Programming', '2024-03-18'::DATE),
('Time Series Data in PostgreSQL',
'Store and query time series metrics efficiently using partitioning, window functions, and the date_trunc function.',
'Carol White', 'Database', '2024-04-02'::DATE),
('PostgreSQL Security Best Practices',
'Secure your PostgreSQL installation with row-level security, encryption at rest, SSL connections, and role management.',
'Bob Smith', 'Security', '2024-04-15'::DATE),
('Machine Learning with SQL',
'Use PostgreSQL extensions for machine learning: linear regression, k-means clustering, and statistical functions.',
'Alice Chen', 'Data Science', '2024-05-01'::DATE)
) v(title, body, author, category, published)
WHERE NOT EXISTS (SELECT 1 FROM fts_articles LIMIT 1);
UPDATE fts_articles
SET search_vec = setweight(to_tsvector('english', title), 'A')
|| setweight(to_tsvector('english', body), 'B')
WHERE search_vec IS NULL;
CREATE INDEX IF NOT EXISTS idx_fts_articles_vec ON fts_articles USING GIN(search_vec);
Block 7 — Product Catalog with JSONB (product_catalog)
Used in: every JSON / JSONB lesson.
CREATE TABLE IF NOT EXISTS product_catalog (
product_id SERIAL PRIMARY KEY,
name TEXT NOT NULL,
category TEXT NOT NULL,
base_price NUMERIC(10,2),
attributes JSONB
);
INSERT INTO product_catalog (name, category, base_price, attributes)
SELECT * FROM (VALUES
('MacBook Pro 14', 'Electronics', 1999.00::NUMERIC,
'{"brand":"Apple","ram_gb":16,"storage_gb":512,"color":"silver","weight_kg":1.6}'::JSONB),
('MacBook Pro 16', 'Electronics', 2499.00::NUMERIC,
'{"brand":"Apple","ram_gb":32,"storage_gb":1024,"color":"space grey","weight_kg":2.1}'::JSONB),
('ThinkPad X1', 'Electronics', 1599.00::NUMERIC,
'{"brand":"Lenovo","ram_gb":16,"storage_gb":512,"color":"black","weight_kg":1.4}'::JSONB),
('501 Jeans', 'Clothing', 89.00::NUMERIC,
'{"brand":"Levis","sizes":["28","30","32","34","36"],"material":"denim","color":"blue"}'::JSONB),
('Classic Tee', 'Clothing', 24.00::NUMERIC,
'{"brand":"Uniqlo","sizes":["XS","S","M","L","XL"],"material":"cotton","color":"white"}'::JSONB),
('Refactoring 2nd', 'Books', 49.00::NUMERIC,
'{"author":"Martin Fowler","isbn":"978-0-13-468599-1","pages":448,"publisher":"Addison-Wesley"}'::JSONB),
('Designing Data-Intensive Applications', 'Books', 55.00::NUMERIC,
'{"author":"Martin Kleppmann","isbn":"978-1-449-37332-0","pages":616,"publisher":"OReilly"}'::JSONB)
) v(name, category, base_price, attributes)
WHERE NOT EXISTS (SELECT 1 FROM product_catalog LIMIT 1);
CREATE INDEX IF NOT EXISTS idx_product_catalog_attrs ON product_catalog USING GIN(attributes);
Block 8 — Time Series tables (ts_*)
Used in: every Time Series Analytics lesson.
CREATE TABLE IF NOT EXISTS ts_server_metrics (
id BIGSERIAL PRIMARY KEY,
host TEXT NOT NULL,
region TEXT,
recorded_at TIMESTAMPTZ NOT NULL,
cpu_pct NUMERIC(5,2),
mem_pct NUMERIC(5,2),
disk_io_mbps NUMERIC(8,2),
net_mbps NUMERIC(8,2),
error_count INT DEFAULT 0
);
CREATE INDEX IF NOT EXISTS idx_ts_server_metrics_host_time
ON ts_server_metrics (host, recorded_at);
CREATE TABLE IF NOT EXISTS ts_stock_prices (
id BIGSERIAL PRIMARY KEY,
ticker TEXT NOT NULL,
price_date DATE NOT NULL,
open_cents INT,
high_cents INT,
low_cents INT,
close_cents INT,
volume BIGINT,
UNIQUE (ticker, price_date)
);
CREATE TABLE IF NOT EXISTS ts_web_events (
id BIGSERIAL PRIMARY KEY,
session_id TEXT NOT NULL,
user_id BIGINT,
event_type TEXT,
page TEXT,
created_at TIMESTAMPTZ NOT NULL
);
CREATE INDEX IF NOT EXISTS idx_ts_web_events_session_time
ON ts_web_events (session_id, created_at);
CREATE TABLE IF NOT EXISTS ts_energy_readings (
id BIGSERIAL PRIMARY KEY,
meter_id TEXT NOT NULL,
building TEXT,
read_at TIMESTAMPTZ NOT NULL,
kwh NUMERIC(10,4),
UNIQUE (meter_id, read_at)
);
INSERT INTO ts_server_metrics
(host, region, recorded_at, cpu_pct, mem_pct, disk_io_mbps, net_mbps, error_count)
SELECT
'srv-' || LPAD(h::TEXT, 2, '0'),
(ARRAY['us-east','us-west','eu-west'])[(h % 3) + 1],
NOW() - INTERVAL '7 days' + (m || ' minutes')::INTERVAL,
LEAST(99, GREATEST(1,
20 + 30 * SIN(m::float * PI() / 720)
+ (h * 7 + m * 13) % 20
+ (RANDOM() * 10)::INT
)),
LEAST(99, GREATEST(10,
40 + 20 * SIN(m::float * PI() / 720 + 1)
+ (h * 5 + m * 7) % 15
)),
ROUND((10 + (m * h * 3) % 90)::numeric, 2),
ROUND((5 + (m * h * 7) % 50)::numeric, 2),
CASE WHEN (m * h) % 200 = 0 THEN (RANDOM() * 5)::INT ELSE 0 END
FROM generate_series(1, 6) h,
generate_series(0, 10079) m
ON CONFLICT DO NOTHING;
INSERT INTO ts_stock_prices
(ticker, price_date, open_cents, high_cents, low_cents, close_cents, volume)
SELECT
ticker,
(NOW() - INTERVAL '2 years')::DATE + d,
base + (d * 7 + ticker_idx * 13) % 500,
base + (d * 7 + ticker_idx * 13) % 500 + (d * 3 + ticker_idx) % 150,
GREATEST(100, base + (d * 7 + ticker_idx * 13) % 500 - (d * 5 + ticker_idx) % 200),
base + (d * 11 + ticker_idx * 17) % 600,
1000000 + (d * ticker_idx * 37) % 9000000
FROM generate_series(0, 730) d,
(VALUES ('AAPL', 15000, 1), ('MSFT', 30000, 2), ('GOOG', 12000, 3),
('AMZN', 18000, 4), ('NVDA', 8000, 5)) AS s(ticker, base, ticker_idx)
WHERE EXTRACT(DOW FROM (NOW() - INTERVAL '2 years')::DATE + d) NOT IN (0, 6)
ON CONFLICT DO NOTHING;
INSERT INTO ts_web_events (session_id, user_id, event_type, page, created_at)
SELECT
'sess-' || LPAD(s::TEXT, 6, '0'),
(s * 7) % 2000 + 1,
(ARRAY['page_view','page_view','page_view','click','click',
'add_to_cart','purchase','bounce'])[(e % 8) + 1],
(ARRAY['/home','/products','/product/detail','/cart',
'/checkout','/blog','/pricing','/about'])[(e % 8) + 1],
NOW() - INTERVAL '30 days'
+ ((s * 41) % (30 * 1440) || ' minutes')::INTERVAL
+ (e * 2 || ' minutes')::INTERVAL
FROM generate_series(1, 500) s,
generate_series(0, 9) e
ON CONFLICT DO NOTHING;
INSERT INTO ts_energy_readings (meter_id, building, read_at, kwh)
SELECT
'MTR-' || LPAD(m::TEXT, 3, '0'),
(ARRAY['HQ','Warehouse','DataCenter','Office-A',
'Office-B','Retail-1','Retail-2','Parking'])[(m % 8) + 1],
NOW() - INTERVAL '30 days' + (i * 15 || ' minutes')::INTERVAL,
ROUND((
5 + 10 * SIN(i::float * PI() / 48)
+ (m * 2 + i * 3) % 8
+ (RANDOM() * 2)::INT
)::numeric, 4)
FROM generate_series(1, 8) m,
generate_series(0, 2879) i
ON CONFLICT DO NOTHING;
Block 9 — Healthcare (hc_*)
Used in: every Healthcare & Hospital Analytics lesson.
CREATE TABLE IF NOT EXISTS hc_wards (
id SERIAL PRIMARY KEY,
name TEXT NOT NULL,
type TEXT NOT NULL,
capacity INT NOT NULL,
floor INT
);
CREATE TABLE IF NOT EXISTS hc_staff (
id SERIAL PRIMARY KEY,
name TEXT NOT NULL,
role TEXT NOT NULL,
specialty TEXT,
ward_id INT REFERENCES hc_wards(id),
hire_date DATE NOT NULL
);
CREATE TABLE IF NOT EXISTS hc_patients (
id SERIAL PRIMARY KEY,
mrn TEXT UNIQUE NOT NULL,
name TEXT NOT NULL,
dob DATE NOT NULL,
gender TEXT NOT NULL,
blood_type TEXT,
insurance_type TEXT
);
CREATE TABLE IF NOT EXISTS hc_admissions (
id SERIAL PRIMARY KEY,
patient_id INT REFERENCES hc_patients(id),
ward_id INT REFERENCES hc_wards(id),
attending_id INT REFERENCES hc_staff(id),
admission_type TEXT NOT NULL,
admitted_at TIMESTAMPTZ NOT NULL,
discharged_at TIMESTAMPTZ,
discharge_disposition TEXT
);
CREATE TABLE IF NOT EXISTS hc_diagnoses (
id SERIAL PRIMARY KEY,
admission_id INT REFERENCES hc_admissions(id),
icd_code TEXT NOT NULL,
description TEXT NOT NULL,
is_primary BOOLEAN NOT NULL DEFAULT false,
diagnosed_at TIMESTAMPTZ NOT NULL
);
CREATE TABLE IF NOT EXISTS hc_procedures (
id SERIAL PRIMARY KEY,
admission_id INT REFERENCES hc_admissions(id),
proc_code TEXT NOT NULL,
description TEXT NOT NULL,
performed_by INT REFERENCES hc_staff(id),
performed_at TIMESTAMPTZ NOT NULL,
cost_cents INT NOT NULL
);
CREATE TABLE IF NOT EXISTS hc_lab_results (
id SERIAL PRIMARY KEY,
patient_id INT REFERENCES hc_patients(id),
admission_id INT REFERENCES hc_admissions(id),
test_name TEXT NOT NULL,
value_numeric NUMERIC,
unit TEXT,
ref_low NUMERIC,
ref_high NUMERIC,
is_abnormal BOOLEAN GENERATED ALWAYS AS (
value_numeric IS NOT NULL AND ref_low IS NOT NULL AND ref_high IS NOT NULL
AND (value_numeric < ref_low OR value_numeric > ref_high)
) STORED,
collected_at TIMESTAMPTZ NOT NULL
);
CREATE TABLE IF NOT EXISTS hc_medications (
id SERIAL PRIMARY KEY,
admission_id INT REFERENCES hc_admissions(id),
drug_name TEXT NOT NULL,
dosage_mg NUMERIC,
frequency TEXT NOT NULL,
route TEXT,
start_date DATE NOT NULL,
end_date DATE
);
INSERT INTO hc_wards (name, type, capacity, floor) VALUES
('ICU', 'ICU', 12, 3),
('Emergency', 'Emergency', 30, 1),
('Surgery', 'Surgery', 20, 2),
('General-A', 'General', 25, 4),
('General-B', 'General', 25, 4),
('Pediatric', 'Pediatric', 15, 5)
ON CONFLICT DO NOTHING;
INSERT INTO hc_staff (name, role, specialty, ward_id, hire_date)
SELECT
'Staff ' || g,
CASE g % 3 WHEN 0 THEN 'Doctor' WHEN 1 THEN 'Nurse' ELSE 'Technician' END,
CASE g % 5 WHEN 0 THEN 'Cardiology' WHEN 1 THEN 'Emergency Medicine'
WHEN 2 THEN 'Surgery' WHEN 3 THEN 'Internal Medicine' ELSE 'Pediatrics' END,
(g % 6) + 1,
'2015-01-01'::DATE + (g * 90) * INTERVAL '1 day'
FROM generate_series(1, 20) g
ON CONFLICT DO NOTHING;
INSERT INTO hc_patients (mrn, name, dob, gender, blood_type, insurance_type)
SELECT
'MRN-' || LPAD(g::TEXT, 6, '0'),
'Patient ' || g,
'1950-01-01'::DATE + ((g * 127) % 25000) * INTERVAL '1 day',
CASE g % 2 WHEN 0 THEN 'M' ELSE 'F' END,
(ARRAY['A+','A-','B+','B-','AB+','AB-','O+','O-'])[((g * 3) % 8) + 1],
(ARRAY['Private','Medicare','Medicaid','Uninsured'])[((g * 7) % 4) + 1]
FROM generate_series(1, 500) g
ON CONFLICT DO NOTHING;
INSERT INTO hc_admissions
(patient_id, ward_id, attending_id, admission_type,
admitted_at, discharged_at, discharge_disposition)
SELECT
(g % 500) + 1,
(g % 6) + 1,
(g % 20) + 1,
(ARRAY['Emergency','Elective','Transfer'])[((g * 11) % 3) + 1],
NOW() - ((g * 3) % 1095 + 1) * INTERVAL '1 day'
+ ((g * 7) % 24) * INTERVAL '1 hour',
CASE WHEN g % 10 = 0 THEN NULL
ELSE NOW() - ((g * 3) % 1095 + 1) * INTERVAL '1 day'
+ ((g * 7) % 24) * INTERVAL '1 hour'
+ ((g % 14) + 1) * INTERVAL '1 day'
END,
CASE WHEN g % 10 = 0 THEN NULL
ELSE (ARRAY['Home','SNF','AMA','Expired','Home Health'])[((g * 13) % 5) + 1]
END
FROM generate_series(1, 700) g
ON CONFLICT DO NOTHING;
INSERT INTO hc_diagnoses
(admission_id, icd_code, description, is_primary, diagnosed_at)
SELECT
(g % 700) + 1,
(ARRAY['I21','J18','K57','N18','E11','I50','J44','M79','R07','Z51'])[(g % 10) + 1],
(ARRAY['Acute MI','Pneumonia','Diverticulitis','Chronic Kidney Disease',
'Type 2 Diabetes','Heart Failure','COPD','Musculoskeletal Pain',
'Chest Pain','Chemotherapy'])[((g * 3) % 10) + 1],
g % 3 != 0,
NOW() - ((g * 2) % 1095 + 1) * INTERVAL '1 day'
FROM generate_series(1, 1000) g
ON CONFLICT DO NOTHING;
INSERT INTO hc_procedures
(admission_id, proc_code, description, performed_by, performed_at, cost_cents)
SELECT
(g % 700) + 1,
'P' || LPAD(((g * 7) % 50 + 1)::TEXT, 3, '0'),
(ARRAY['ECG','CT Scan','MRI','Blood Culture','Echocardiogram',
'Colonoscopy','Bronchoscopy','Lumbar Puncture','Intubation','Dialysis'])[(g % 10) + 1],
(g % 20) + 1,
NOW() - ((g * 2) % 1095 + 1) * INTERVAL '1 day',
((g * 137) % 45000 + 5000)
FROM generate_series(1, 500) g
ON CONFLICT DO NOTHING;
INSERT INTO hc_lab_results
(patient_id, admission_id, test_name, value_numeric, unit, ref_low, ref_high, collected_at)
SELECT
(g % 500) + 1,
(g % 700) + 1,
(ARRAY['Hemoglobin','WBC','Creatinine','Sodium','Glucose',
'Troponin','INR','Potassium'])[(g % 8) + 1],
CASE (g % 8)
WHEN 0 THEN 10 + ((g * 31) % 80) / 10.0
WHEN 1 THEN 4 + ((g * 17) % 100) / 10.0
WHEN 2 THEN 0.5 + ((g * 7) % 30) / 10.0
WHEN 3 THEN 130 + ((g * 11) % 30)
WHEN 4 THEN 70 + ((g * 23) % 200)
WHEN 5 THEN ((g * 3) % 50) / 10.0
WHEN 6 THEN 0.8 + ((g * 13) % 30) / 10.0
ELSE 3.0 + ((g * 19) % 30) / 10.0
END,
(ARRAY['g/dL','K/uL','mg/dL','mEq/L','mg/dL','ng/mL','ratio','mEq/L'])[(g % 8) + 1],
(ARRAY[12, 4.5, 0.6, 136, 70, 0, 0.8, 3.5])[(g % 8) + 1],
(ARRAY[17, 11, 1.2, 145, 100, 0.04, 1.2, 5.0])[(g % 8) + 1],
NOW() - ((g * 2) % 1095 + 1) * INTERVAL '1 day'
FROM generate_series(1, 1500) g
ON CONFLICT DO NOTHING;
INSERT INTO hc_medications
(admission_id, drug_name, dosage_mg, frequency, route, start_date, end_date)
SELECT
(g % 700) + 1,
(ARRAY['Metformin','Lisinopril','Atorvastatin','Amlodipine','Metoprolol',
'Furosemide','Warfarin','Insulin Glargine','Albuterol','Pantoprazole'])[(g % 10) + 1],
CASE (g % 5) WHEN 0 THEN 500 WHEN 1 THEN 10 WHEN 2 THEN 40 WHEN 3 THEN 5 ELSE 25 END,
(ARRAY['QD','BID','TID','PRN','QID'])[((g * 3) % 5) + 1],
(ARRAY['PO','IV','IM','SQ'])[((g * 7) % 4) + 1],
(NOW() - ((g * 2) % 1095 + 1) * INTERVAL '1 day')::DATE,
CASE WHEN g % 5 = 0 THEN NULL
ELSE (NOW() - ((g * 2) % 1095 + 1) * INTERVAL '1 day'
+ ((g % 14) + 1) * INTERVAL '1 day')::DATE
END
FROM generate_series(1, 800) g
ON CONFLICT DO NOTHING;
Block 10 — SaaS Analytics (saas_*)
CREATE TABLE IF NOT EXISTS saas_plans (
id SERIAL PRIMARY KEY,
name TEXT NOT NULL,
price_cents INT NOT NULL,
billing_interval TEXT NOT NULL,
max_seats INT NOT NULL
);
CREATE TABLE IF NOT EXISTS saas_accounts (
id SERIAL PRIMARY KEY,
name TEXT NOT NULL,
industry TEXT NOT NULL,
country TEXT NOT NULL,
acquisition_channel TEXT NOT NULL,
plan_id INT REFERENCES saas_plans(id),
status TEXT NOT NULL,
seats INT NOT NULL DEFAULT 1,
mrr_cents INT NOT NULL DEFAULT 0,
trial_started_at TIMESTAMPTZ,
converted_at TIMESTAMPTZ,
churned_at TIMESTAMPTZ,
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);
CREATE TABLE IF NOT EXISTS saas_mrr_events (
id SERIAL PRIMARY KEY,
account_id INT REFERENCES saas_accounts(id),
event_type TEXT NOT NULL,
amount_cents INT NOT NULL,
occurred_at DATE NOT NULL,
reason TEXT
);
CREATE TABLE IF NOT EXISTS saas_feature_usage (
id SERIAL PRIMARY KEY,
account_id INT REFERENCES saas_accounts(id),
feature_name TEXT NOT NULL,
usage_count INT NOT NULL DEFAULT 0,
period_date DATE NOT NULL
);
CREATE TABLE IF NOT EXISTS saas_sessions (
id BIGSERIAL PRIMARY KEY,
account_id INT REFERENCES saas_accounts(id),
user_email TEXT NOT NULL,
started_at TIMESTAMPTZ NOT NULL,
ended_at TIMESTAMPTZ,
page_count INT NOT NULL DEFAULT 1
);
CREATE TABLE IF NOT EXISTS saas_payments (
id SERIAL PRIMARY KEY,
account_id INT REFERENCES saas_accounts(id),
amount_cents INT NOT NULL,
status TEXT NOT NULL,
failure_reason TEXT,
attempted_at TIMESTAMPTZ NOT NULL,
succeeded_at TIMESTAMPTZ
);
INSERT INTO saas_plans (name, price_cents, billing_interval, max_seats) VALUES
('Starter', 4900, 'monthly', 3),
('Growth', 14900, 'monthly', 10),
('Business', 39900, 'monthly', 50),
('Enterprise',99900, 'monthly', 999)
ON CONFLICT DO NOTHING;
INSERT INTO saas_accounts
(name, industry, country, acquisition_channel, plan_id, status, seats, mrr_cents,
trial_started_at, converted_at, churned_at, created_at)
SELECT
'Account ' || gs,
(ARRAY['SaaS','FinTech','HealthTech','eCommerce','EdTech','MarTech','HR Tech','LegalTech'])[gs % 8 + 1],
(ARRAY['US','UK','CA','AU','DE','FR','NL','SG'])[gs % 8 + 1],
(ARRAY['Organic','Paid Search','Referral','Outbound','Partner','Content','Social','Event'])[gs % 8 + 1],
(gs % 4) + 1,
CASE gs % 10
WHEN 9 THEN 'trial'
WHEN 7 THEN 'churned'
WHEN 8 THEN 'churned'
ELSE 'active'
END,
(gs % 10) + 1,
CASE (gs % 4) + 1
WHEN 1 THEN 4900 WHEN 2 THEN 14900 WHEN 3 THEN 39900 ELSE 99900
END,
TIMESTAMPTZ '2022-01-01' + (gs * INTERVAL '22 hours'),
CASE WHEN gs % 10 NOT IN (9) THEN
TIMESTAMPTZ '2022-01-01' + (gs * INTERVAL '22 hours') + ((gs % 14 + 7) * INTERVAL '1 day')
END,
CASE WHEN gs % 10 IN (7,8) THEN
TIMESTAMPTZ '2022-01-01' + (gs * INTERVAL '22 hours') + ((gs % 200 + 60) * INTERVAL '1 day')
END,
TIMESTAMPTZ '2022-01-01' + (gs * INTERVAL '22 hours')
FROM generate_series(1, 400) gs
ON CONFLICT DO NOTHING;
INSERT INTO saas_mrr_events (account_id, event_type, amount_cents, occurred_at, reason)
SELECT
(gs % 400) + 1,
(ARRAY['new','expansion','contraction','churn','reactivation'])[gs % 5 + 1],
CASE (gs % 5) + 1
WHEN 1 THEN ((gs % 4) + 1) * 4900
WHEN 2 THEN (gs % 3 + 1) * 5000
WHEN 3 THEN -(gs % 3 + 1) * 5000
WHEN 4 THEN -((gs % 4) + 1) * 4900
ELSE (gs % 4 + 1) * 4900
END,
DATE '2022-01-01' + (gs * 2 % 730),
(ARRAY['Upgrade','Seats added','Seats removed','Price objection','Product fit',
'Reactivated after pause','New signup','Downgrade','Budget cut','Returning customer'])[gs % 10 + 1]
FROM generate_series(1, 600) gs
ON CONFLICT DO NOTHING;
INSERT INTO saas_feature_usage (account_id, feature_name, usage_count, period_date)
SELECT
(gs % 400) + 1,
(ARRAY['Dashboard','Reports','API','Exports','Integrations','Alerts','Collaboration','Mobile'])[gs % 8 + 1],
(gs % 50) * (CASE WHEN gs % 7 = 0 THEN 0 ELSE 1 END),
DATE_TRUNC('month', DATE '2022-01-01' + (gs % 730))::DATE
FROM generate_series(1, 2000) gs
ON CONFLICT DO NOTHING;
INSERT INTO saas_sessions (account_id, user_email, started_at, ended_at, page_count)
SELECT
(gs % 400) + 1,
'user' || (gs % 5 + 1) || '@account' || ((gs % 400) + 1) || '.com',
TIMESTAMPTZ '2022-01-01' + (gs * INTERVAL '6 hours'),
TIMESTAMPTZ '2022-01-01' + (gs * INTERVAL '6 hours') + ((gs % 60 + 1) * INTERVAL '1 minute'),
(gs % 20) + 1
FROM generate_series(1, 3000) gs
ON CONFLICT DO NOTHING;
INSERT INTO saas_payments (account_id, amount_cents, status, failure_reason, attempted_at, succeeded_at)
SELECT
(gs % 400) + 1,
CASE (gs % 4) + 1
WHEN 1 THEN 4900 WHEN 2 THEN 14900 WHEN 3 THEN 39900 ELSE 99900
END,
CASE WHEN gs % 6 = 0 THEN 'failed' WHEN gs % 20 = 0 THEN 'pending' ELSE 'succeeded' END,
CASE WHEN gs % 6 = 0 THEN
(ARRAY['Card declined','Insufficient funds','Expired card','Do not honor','Invalid account'])[gs % 5 + 1]
END,
TIMESTAMPTZ '2022-01-01' + (gs * INTERVAL '10 hours'),
CASE WHEN gs % 6 != 0 AND gs % 20 != 0 THEN
TIMESTAMPTZ '2022-01-01' + (gs * INTERVAL '10 hours') + INTERVAL '30 seconds'
END
FROM generate_series(1, 800) gs
ON CONFLICT DO NOTHING;
Block 11 — Logistics & Supply Chain (lg_*)
CREATE TABLE IF NOT EXISTS lg_warehouses (
id SERIAL PRIMARY KEY,
code TEXT UNIQUE NOT NULL,
name TEXT NOT NULL,
city TEXT NOT NULL,
country TEXT NOT NULL DEFAULT 'US',
lat NUMERIC(9,6) NOT NULL DEFAULT 0,
lng NUMERIC(9,6) NOT NULL DEFAULT 0,
capacity_m3 INT
);
CREATE TABLE IF NOT EXISTS lg_suppliers (
id SERIAL PRIMARY KEY,
name TEXT NOT NULL,
country TEXT NOT NULL,
lead_time_days INT NOT NULL DEFAULT 14,
reliability_score NUMERIC(3,2) NOT NULL DEFAULT 0.90
);
CREATE TABLE IF NOT EXISTS lg_skus (
id SERIAL PRIMARY KEY,
sku TEXT UNIQUE NOT NULL,
name TEXT NOT NULL,
category TEXT NOT NULL,
unit_cost NUMERIC(10,2) NOT NULL,
weight_kg NUMERIC(6,3) NOT NULL,
supplier_id INT REFERENCES lg_suppliers(id)
);
CREATE TABLE IF NOT EXISTS lg_inventory (
id SERIAL PRIMARY KEY,
warehouse_id INT REFERENCES lg_warehouses(id),
sku_id INT REFERENCES lg_skus(id),
quantity INT NOT NULL DEFAULT 0,
reorder_point INT NOT NULL DEFAULT 50,
reorder_qty INT NOT NULL DEFAULT 200,
UNIQUE (warehouse_id, sku_id)
);
CREATE TABLE IF NOT EXISTS lg_purchase_orders (
id SERIAL PRIMARY KEY,
supplier_id INT REFERENCES lg_suppliers(id),
warehouse_id INT REFERENCES lg_warehouses(id),
sku_id INT REFERENCES lg_skus(id),
quantity INT NOT NULL,
unit_cost NUMERIC(10,2) NOT NULL,
status TEXT NOT NULL DEFAULT 'open',
ordered_at TIMESTAMPTZ NOT NULL,
expected_at DATE NOT NULL,
received_at TIMESTAMPTZ
);
CREATE TABLE IF NOT EXISTS lg_shipments (
id BIGSERIAL PRIMARY KEY,
order_ref TEXT,
warehouse_id INT REFERENCES lg_warehouses(id),
carrier TEXT,
service_level TEXT,
origin_city TEXT,
dest_city TEXT,
dest_state TEXT,
weight_kg NUMERIC(8,2),
shipped_at TIMESTAMPTZ,
promised_at TIMESTAMPTZ,
delivered_at TIMESTAMPTZ,
freight_cents INT
);
CREATE TABLE IF NOT EXISTS lg_delivery_events (
id BIGSERIAL PRIMARY KEY,
shipment_id BIGINT REFERENCES lg_shipments(id),
status TEXT,
location TEXT,
occurred_at TIMESTAMPTZ
);
INSERT INTO lg_warehouses (code, name, city, country, lat, lng, capacity_m3)
SELECT * FROM (VALUES
('WH-NE', 'Northeast Hub', 'Newark', 'US', 40.7357, -74.1724, 50000),
('WH-SE', 'Southeast Hub', 'Atlanta', 'US', 33.7490, -84.3880, 45000),
('WH-MW', 'Midwest Hub', 'Chicago', 'US', 41.8781, -87.6298, 55000),
('WH-SW', 'Southwest Hub', 'Dallas', 'US', 32.7767, -96.7970, 40000),
('WH-WC', 'West Coast Hub', 'Los Angeles', 'US', 34.0522, -118.2437, 60000),
('WH-NW', 'Northwest Hub', 'Seattle', 'US', 47.6062, -122.3321, 35000)
) AS v(code, name, city, country, lat, lng, cap)
ON CONFLICT DO NOTHING;
INSERT INTO lg_suppliers (name, country, lead_time_days, reliability_score)
SELECT * FROM (VALUES
('Apex Manufacturing', 'US', 7, 0.97),
('Global Parts Co', 'CN', 21, 0.88),
('Precision Components', 'DE', 14, 0.95),
('FastSource Inc', 'US', 5, 0.92),
('Pacific Rim Supplies', 'TW', 18, 0.90),
('Euro Logistics BV', 'NL', 16, 0.93),
('Southland Goods', 'MX', 10, 0.85),
('Atlas Industrial', 'US', 8, 0.96)
) AS v(name, country, lead_time_days, reliability_score)
ON CONFLICT DO NOTHING;
INSERT INTO lg_skus (sku, name, category, unit_cost, weight_kg)
SELECT
'SKU-' || LPAD(g::TEXT, 4, '0'),
(ARRAY['Widget','Gadget','Component','Assembly','Module','Device','Unit','Part'])[((g-1) % 8) + 1]
|| ' ' || g,
(ARRAY['Electronics','Hardware','Packaging','Chemical','Mechanical'])[(( g-1) % 5) + 1],
ROUND(((500 + (g * 137) % 9500) / 100.0)::numeric, 2),
ROUND((0.1 + (g * 0.23) % 49.9)::numeric, 2)
FROM generate_series(1, 40) g
ON CONFLICT DO NOTHING;
INSERT INTO lg_inventory (warehouse_id, sku_id, quantity, reorder_point, reorder_qty)
SELECT
(((s-1)) % 6) + 1,
s,
ABS(500 + ((1 * s * 7) % 1500) - ((1 * 13) % 200)),
50 + (s * 17) % 450,
200 + (s * 53) % 800
FROM generate_series(1, 40) s
ON CONFLICT DO NOTHING;
INSERT INTO lg_purchase_orders
(supplier_id, warehouse_id, sku_id, quantity, unit_cost,
ordered_at, expected_at, received_at, status)
SELECT
(g % 8) + 1,
(g % 6) + 1,
(g % 40) + 1,
100 + (g * 37) % 900,
ROUND(((500 + (g * 131) % 9000) / 100.0)::numeric, 2),
NOW() - (((g * 7) % 365) || ' days')::INTERVAL,
(NOW() - (((g * 7) % 365) || ' days')::INTERVAL + (7 + (g % 21) || ' days')::INTERVAL)::DATE,
CASE WHEN g % 10 < 8
THEN NOW() - (((g * 7) % 365) || ' days')::INTERVAL + ((10 + (g % 25)) || ' days')::INTERVAL
ELSE NULL END,
CASE WHEN g % 10 < 8 THEN 'received'
WHEN g % 10 = 8 THEN 'cancelled'
ELSE 'open' END
FROM generate_series(1, 500) g
ON CONFLICT DO NOTHING;
INSERT INTO lg_shipments
(order_ref, warehouse_id, carrier, service_level, origin_city, dest_city,
dest_state, weight_kg, shipped_at, promised_at, delivered_at, freight_cents)
SELECT
'ORD-' || LPAD(g::TEXT, 6, '0'),
(g % 6) + 1,
(ARRAY['FedEx','UPS','USPS','DHL','OnTrac'])[(g % 5) + 1],
(ARRAY['standard','express','overnight'])[(g % 3) + 1],
(ARRAY['Newark','Atlanta','Chicago','Dallas','Los Angeles','Seattle'])[(g % 6) + 1],
(ARRAY['Boston','Miami','Detroit','Houston','Phoenix','Portland',
'Denver','Nashville','Charlotte','Minneapolis'])[(g % 10) + 1],
(ARRAY['MA','FL','MI','TX','AZ','OR','CO','TN','NC','MN'])[(g % 10) + 1],
ROUND((1.0 + (g * 0.37) % 99)::numeric, 2),
NOW() - (((g * 3) % 365) || ' days')::INTERVAL,
NOW() - (((g * 3) % 365) || ' days')::INTERVAL
+ ((CASE (g % 3) WHEN 0 THEN 5 WHEN 1 THEN 2 ELSE 1 END) || ' days')::INTERVAL,
CASE WHEN g % 12 < 10
THEN NOW() - (((g * 3) % 365) || ' days')::INTERVAL
+ ((CASE (g % 3) WHEN 0 THEN 4 + (g % 3) WHEN 1 THEN 2 + (g % 2) ELSE 1 END) || ' days')::INTERVAL
ELSE NULL END,
300 + (g * 47) % 4700
FROM generate_series(1, 2000) g
ON CONFLICT DO NOTHING;
INSERT INTO lg_delivery_events (shipment_id, status, location, occurred_at)
SELECT
s.id,
ev.status,
ev.location,
s.shipped_at + (ev.offset_hours || ' hours')::INTERVAL
FROM lg_shipments s
CROSS JOIN LATERAL (VALUES
('picked', 'Origin Facility', 0),
('in_transit', 'Transit Hub', 12),
('out_for_delivery', 'Dest City', 36),
('delivered', 'Customer', 48)
) AS ev(status, location, offset_hours)
WHERE s.delivered_at IS NOT NULL
AND s.id % 3 = 0
ON CONFLICT DO NOTHING;
Block 12 — HR & Workforce Analytics (hr_*)
CREATE TABLE IF NOT EXISTS hr_departments (
id SERIAL PRIMARY KEY,
name TEXT NOT NULL,
parent_id INT REFERENCES hr_departments(id),
cost_center TEXT,
location TEXT
);
CREATE TABLE IF NOT EXISTS hr_job_levels (
id SERIAL PRIMARY KEY,
level_code TEXT UNIQUE NOT NULL,
title TEXT,
level_type TEXT,
min_salary INT,
max_salary INT
);
CREATE TABLE IF NOT EXISTS hr_employees (
id BIGSERIAL PRIMARY KEY,
emp_id TEXT UNIQUE NOT NULL,
name TEXT,
dept_id INT REFERENCES hr_departments(id),
manager_id BIGINT REFERENCES hr_employees(id),
job_level_id INT REFERENCES hr_job_levels(id),
gender TEXT,
ethnicity TEXT,
hire_date DATE,
termination_date DATE,
termination_reason TEXT,
location TEXT,
is_active BOOLEAN GENERATED ALWAYS AS (termination_date IS NULL) STORED
);
CREATE TABLE IF NOT EXISTS hr_salaries (
id BIGSERIAL PRIMARY KEY,
employee_id BIGINT REFERENCES hr_employees(id),
salary_usd INT NOT NULL,
effective_date DATE NOT NULL,
change_reason TEXT
);
CREATE TABLE IF NOT EXISTS hr_performance_reviews (
id BIGSERIAL PRIMARY KEY,
employee_id BIGINT REFERENCES hr_employees(id),
review_year INT,
rating INT CHECK (rating BETWEEN 1 AND 5),
reviewer_id BIGINT REFERENCES hr_employees(id),
review_date DATE,
UNIQUE (employee_id, review_year)
);
CREATE TABLE IF NOT EXISTS hr_headcount_snapshots (
id BIGSERIAL PRIMARY KEY,
dept_id INT REFERENCES hr_departments(id),
snapshot_month DATE,
headcount INT,
UNIQUE (dept_id, snapshot_month)
);
INSERT INTO hr_departments (name, parent_id, cost_center, location)
VALUES
('Engineering', NULL, 'CC-100', 'San Francisco'),
('Product', NULL, 'CC-200', 'San Francisco'),
('Sales', NULL, 'CC-300', 'New York'),
('Marketing', NULL, 'CC-400', 'New York'),
('HR', NULL, 'CC-500', 'Austin'),
('Finance', NULL, 'CC-600', 'Austin'),
('Backend', 1, 'CC-101', 'San Francisco'),
('Frontend', 1, 'CC-102', 'San Francisco'),
('Data & ML', 1, 'CC-103', 'Remote'),
('DevOps', 1, 'CC-104', 'Remote'),
('Enterprise Sales', 3, 'CC-301', 'New York'),
('SMB Sales', 3, 'CC-302', 'Chicago')
ON CONFLICT DO NOTHING;
INSERT INTO hr_job_levels (level_code, title, level_type, min_salary, max_salary)
VALUES
('IC1', 'Junior Engineer', 'ic', 60000, 90000),
('IC2', 'Mid Engineer', 'ic', 90000, 130000),
('IC3', 'Senior Engineer', 'ic', 130000, 175000),
('IC4', 'Staff Engineer', 'ic', 175000, 230000),
('IC5', 'Principal Engineer', 'ic', 230000, 300000),
('M1', 'Engineering Manager', 'manager', 160000, 210000),
('M2', 'Senior Manager', 'manager', 200000, 260000),
('VP', 'Vice President', 'executive',250000, 350000),
('C', 'C-Suite', 'executive',350000, 600000)
ON CONFLICT DO NOTHING;
INSERT INTO hr_employees
(emp_id, name, dept_id, manager_id, job_level_id,
gender, ethnicity, hire_date, termination_date, termination_reason, location)
SELECT
'EMP-' || LPAD(g::TEXT, 5, '0'),
(ARRAY['Alex','Jordan','Morgan','Taylor','Casey','Riley','Quinn','Avery',
'Jamie','Drew','Blake','Cameron','Skyler','Reese','Sage'])[(g % 15) + 1]
|| ' ' ||
(ARRAY['Smith','Johnson','Lee','Garcia','Brown','Davis','Wilson','Chen',
'Martinez','Anderson','Taylor','Thomas','Moore','Jackson','White'])[(g % 15) + 1],
(g % 12) + 1,
CASE WHEN g > 20 THEN (((g - 1) % 20) + 1) ELSE NULL END,
(g % 9) + 1,
(ARRAY['M','F','NB'])[(g % 3) + 1],
(ARRAY['White','Hispanic','Asian','Black','Multiracial','Other'])[(g % 6) + 1],
CURRENT_DATE - ((30 + (g * 7) % 2190) || ' days')::INTERVAL,
CASE WHEN g % 8 = 0
THEN CURRENT_DATE - ((1 + (g * 3) % 180) || ' days')::INTERVAL
ELSE NULL END,
CASE WHEN g % 8 = 0
THEN (ARRAY['voluntary','involuntary','layoff','retirement'])[(g % 4) + 1]
ELSE NULL END,
(ARRAY['San Francisco','New York','Austin','Remote','Chicago','Seattle'])[(g % 6) + 1]
FROM generate_series(1, 600) g
ON CONFLICT DO NOTHING;
INSERT INTO hr_salaries (employee_id, salary_usd, effective_date, change_reason)
SELECT
e.id,
jl.min_salary + ((e.id * 1237) % (jl.max_salary - jl.min_salary)),
e.hire_date,
'hire'
FROM hr_employees e
JOIN hr_job_levels jl ON jl.id = e.job_level_id
ON CONFLICT DO NOTHING;
INSERT INTO hr_salaries (employee_id, salary_usd, effective_date, change_reason)
SELECT
e.id,
ROUND(s.salary_usd * (1.03 + (e.id % 5) * 0.01)),
(DATE_TRUNC('year', e.hire_date) + INTERVAL '1 year')::DATE,
'merit'
FROM hr_employees e
JOIN hr_salaries s ON s.employee_id = e.id AND s.change_reason = 'hire'
WHERE e.hire_date < CURRENT_DATE - INTERVAL '1 year'
ON CONFLICT DO NOTHING;
INSERT INTO hr_performance_reviews
(employee_id, review_year, rating, reviewer_id, review_date)
SELECT
e.id,
yr.review_year,
LEAST(5, GREATEST(1, 3 + ((e.id * yr.review_year * 7) % 5) - 2)),
COALESCE(e.manager_id, e.id),
TO_DATE(yr.review_year || '-12-15', 'YYYY-MM-DD')
FROM hr_employees e
CROSS JOIN (VALUES (2022),(2023),(2024)) AS yr(review_year)
WHERE e.hire_date < TO_DATE(yr.review_year || '-12-01', 'YYYY-MM-DD')
ON CONFLICT DO NOTHING;
INSERT INTO hr_headcount_snapshots (dept_id, snapshot_month, headcount)
SELECT
d.id,
gs.month,
50 + ((d.id * EXTRACT(MONTH FROM gs.month)::INT * 3) % 80)
FROM hr_departments d
CROSS JOIN generate_series(
DATE_TRUNC('month', NOW() - INTERVAL '24 months'),
DATE_TRUNC('month', NOW()),
INTERVAL '1 month'
) AS gs(month)
ON CONFLICT DO NOTHING;
Block 13 — Twitter / Social Graph (tw_*)
CREATE TABLE IF NOT EXISTS tw_users (
id BIGSERIAL PRIMARY KEY,
username TEXT UNIQUE NOT NULL,
display_name TEXT,
bio TEXT,
location TEXT,
verified BOOL DEFAULT FALSE,
follower_count INT DEFAULT 0,
following_count INT DEFAULT 0,
tweet_count INT DEFAULT 0,
created_at TIMESTAMPTZ DEFAULT NOW(),
is_bot BOOL DEFAULT FALSE
);
CREATE TABLE IF NOT EXISTS tw_tweets (
id BIGSERIAL PRIMARY KEY,
user_id BIGINT REFERENCES tw_users(id),
content TEXT NOT NULL,
lang TEXT DEFAULT 'en',
source TEXT,
reply_to_id BIGINT REFERENCES tw_tweets(id),
retweet_of_id BIGINT REFERENCES tw_tweets(id),
like_count INT DEFAULT 0,
retweet_count INT DEFAULT 0,
reply_count INT DEFAULT 0,
quote_count INT DEFAULT 0,
impression_count INT DEFAULT 0,
created_at TIMESTAMPTZ NOT NULL
);
CREATE INDEX IF NOT EXISTS idx_tw_tweets_user_time
ON tw_tweets(user_id, created_at);
CREATE INDEX IF NOT EXISTS idx_tw_tweets_time
ON tw_tweets(created_at);
CREATE TABLE IF NOT EXISTS tw_follows (
follower_id BIGINT REFERENCES tw_users(id),
followee_id BIGINT REFERENCES tw_users(id),
followed_at TIMESTAMPTZ DEFAULT NOW(),
PRIMARY KEY (follower_id, followee_id)
);
CREATE TABLE IF NOT EXISTS tw_likes (
user_id BIGINT REFERENCES tw_users(id),
tweet_id BIGINT REFERENCES tw_tweets(id),
liked_at TIMESTAMPTZ DEFAULT NOW(),
PRIMARY KEY (user_id, tweet_id)
);
CREATE TABLE IF NOT EXISTS tw_hashtags (
id BIGSERIAL PRIMARY KEY,
tag TEXT UNIQUE NOT NULL
);
CREATE TABLE IF NOT EXISTS tw_tweet_hashtags (
tweet_id BIGINT REFERENCES tw_tweets(id),
hashtag_id BIGINT REFERENCES tw_hashtags(id),
PRIMARY KEY (tweet_id, hashtag_id)
);
CREATE TABLE IF NOT EXISTS tw_mentions (
tweet_id BIGINT REFERENCES tw_tweets(id),
mentioned_user_id BIGINT REFERENCES tw_users(id),
PRIMARY KEY (tweet_id, mentioned_user_id)
);
INSERT INTO tw_users
(username, display_name, location, verified, follower_count,
following_count, tweet_count, created_at, is_bot)
SELECT
'user_' || LPAD(g::TEXT, 5, '0'),
(ARRAY['Alex','Jordan','Morgan','Taylor','Casey','Riley','Sam','Drew',
'Blake','Cameron','Skyler','Reese','Avery','Quinn','Jamie'])[(g%15)+1]
|| ' ' ||
(ARRAY['Smith','Lee','Garcia','Brown','Davis','Wilson','Chen','Martinez',
'Anderson','Thomas','Moore','Jackson','White','Harris','Clark'])[(g%15)+1],
(ARRAY['New York','London','Tokyo','San Francisco','LA','Chicago',
'Toronto','Berlin','Sydney','Singapore','Mumbai','Paris'])[(g%12)+1],
g % 50 = 0,
(g * 137) % 100000,
(g * 73) % 5000,
(g * 41) % 10000,
NOW() - (((g * 11) % 1460) || ' days')::INTERVAL,
g % 100 = 0
FROM generate_series(1, 1000) g
ON CONFLICT DO NOTHING;
INSERT INTO tw_hashtags (tag)
SELECT unnest(ARRAY[
'ai','machinelearning','python','postgresql','datascience',
'webdev','javascript','startup','tech','crypto',
'openai','llm','cloudcomputing','devops','cybersecurity',
'ux','productmanagement','sql','analytics','fintech'
]) ON CONFLICT DO NOTHING;
INSERT INTO tw_tweets
(user_id, content, lang, source, reply_to_id, retweet_of_id,
like_count, retweet_count, reply_count, quote_count,
impression_count, created_at)
SELECT
(g % 1000) + 1,
(ARRAY[
'Just shipped a new feature using #sql and #python — loving the combo!',
'Hot take: #machinelearning is just statistics with better marketing.',
'Thread on why #postgresql outperforms MySQL for analytics',
'Finished reading about #ai trends. Mind blown. #tech',
'Nobody talks about #sql enough. It is the most underrated skill in #datascience.',
'My #startup journey: month 6. Revenue: $0. Lessons: priceless. #founder',
'Built a realtime dashboard with #postgresql and #webdev stack. Sharing soon.',
'#crypto is down again. Time to #hodl or fold? #fintech',
'New blog post on #devops pipelines — link in bio. #cloudcomputing',
'Question for #ux folks: how do you handle dark patterns ethically?'
])[(g % 10) + 1],
'en',
(ARRAY['Web','iPhone','Android','API'])[(g % 4) + 1],
CASE WHEN g % 7 = 0 AND g > 100 THEN (g % 100) + 1 ELSE NULL END,
CASE WHEN g % 5 = 0 AND g > 50 THEN (g % 50) + 1 ELSE NULL END,
CASE WHEN g % 100 = 0 THEN (g * 97) % 50000
WHEN g % 20 = 0 THEN (g * 53) % 5000
ELSE (g * 13) % 500 END,
CASE WHEN g % 100 = 0 THEN (g * 71) % 20000
WHEN g % 20 = 0 THEN (g * 37) % 2000
ELSE (g * 7) % 200 END,
(g * 11) % 300,
(g * 5) % 100,
CASE WHEN g % 100 = 0 THEN (g * 113) % 1000000
ELSE (g * 23) % 50000 END,
NOW() - INTERVAL '90 days' + ((g * 13 % 129600) || ' minutes')::INTERVAL
FROM generate_series(1, 10000) g
ON CONFLICT DO NOTHING;
INSERT INTO tw_tweet_hashtags (tweet_id, hashtag_id)
SELECT t.id, ((t.id * 7) % 20) + 1
FROM tw_tweets t
ON CONFLICT DO NOTHING;
INSERT INTO tw_tweet_hashtags (tweet_id, hashtag_id)
SELECT t.id, ((t.id * 13) % 20) + 1
FROM tw_tweets t
WHERE t.id % 3 = 0
AND ((t.id * 13) % 20) + 1 != ((t.id * 7) % 20) + 1
ON CONFLICT DO NOTHING;
INSERT INTO tw_follows (follower_id, followee_id, followed_at)
SELECT
(g % 1000) + 1,
((g * 37 + f * 13) % 1000) + 1,
NOW() - (((g * 7 + f * 3) % 365) || ' days')::INTERVAL
FROM generate_series(1, 1000) g,
generate_series(1, 20) f
WHERE (g % 1000) + 1 != ((g * 37 + f * 13) % 1000) + 1
ON CONFLICT DO NOTHING;
INSERT INTO tw_likes (user_id, tweet_id, liked_at)
SELECT
(g % 1000) + 1,
(g * 17 % 10000) + 1,
NOW() - ((g % 90) || ' days')::INTERVAL
FROM generate_series(1, 20000) g
ON CONFLICT DO NOTHING;
INSERT INTO tw_mentions (tweet_id, mentioned_user_id)
SELECT
t.id,
((t.id * 31) % 1000) + 1
FROM tw_tweets t
WHERE t.id % 4 = 0
ON CONFLICT DO NOTHING;
Verify everything loaded
Once the blocks above are run, this query should return a row per table with non-zero counts:
SELECT 'customers' AS tbl, COUNT(*) FROM customers
UNION ALL SELECT 'orders', COUNT(*) FROM orders
UNION ALL SELECT 'companies', COUNT(*) FROM companies
UNION ALL SELECT 'stock_prices', COUNT(*) FROM stock_prices
UNION ALL SELECT 'teams', COUNT(*) FROM teams
UNION ALL SELECT 'matches', COUNT(*) FROM matches
UNION ALL SELECT 'clients', COUNT(*) FROM clients
UNION ALL SELECT 'invoices', COUNT(*) FROM invoices
UNION ALL SELECT 'ec_customers', COUNT(*) FROM ec_customers
UNION ALL SELECT 'ec_orders', COUNT(*) FROM ec_orders
UNION ALL SELECT 'fts_articles', COUNT(*) FROM fts_articles
UNION ALL SELECT 'product_catalog', COUNT(*) FROM product_catalog
UNION ALL SELECT 'ts_server_metrics', COUNT(*) FROM ts_server_metrics
UNION ALL SELECT 'hc_patients', COUNT(*) FROM hc_patients
UNION ALL SELECT 'saas_accounts', COUNT(*) FROM saas_accounts
UNION ALL SELECT 'lg_warehouses', COUNT(*) FROM lg_warehouses
UNION ALL SELECT 'hr_employees', COUNT(*) FROM hr_employees
UNION ALL SELECT 'tw_users', COUNT(*) FROM tw_users;
If every row shows a non-zero count, you're set — every lesson in the course is now ready to run against your database.
Need help?
You have lifetime access to this course on every device — laptop, phone, tablet — through the website. If anything goes wrong with setup, or you have a question about any lesson:
- Sign in to your account at absolutelearners.com and send us a message from the user portal — we read and reply to every one.
- Or use the contact form on the website if you have not signed in yet.
We are happy to help you get unstuck. Welcome aboard, and have fun working through the course!