Introduction to Stock Market Data
Introduction to Stock Market Data
Financial market data is one of the most data-rich domains available, and SQL is the tool used by quantitative analysts, traders, and financial data engineers to query, transform, and analyse it. Whether you work at a hedge fund, a bank, a fintech startup, or simply manage personal investments, understanding how to query market data efficiently is a high-value skill.
This section teaches financial market SQL using a realistic stock market dataset. Every technique here — moving averages, volatility calculation, technical indicators, portfolio correlation — mirrors what professional quants and financial analysts actually compute.
What Stock Market Data Looks Like
Stock market data arrives in a specific format called OHLCV: Open, High, Low, Close, Volume. Each row represents one trading day for one stock. This is the fundamental unit of financial time-series data.
date | ticker | open | high | low | close | volume
------------|--------|--------|--------|--------|--------|----------
2024-01-02 | AAPL | 185.50 | 187.20 | 184.10 | 186.80 | 42,300,000
2024-01-03 | AAPL | 186.80 | 188.50 | 185.00 | 184.20 | 38,100,000
2024-01-04 | AAPL | 184.20 | 185.80 | 182.10 | 183.50 | 45,200,000
Understanding each field:
- Open: The price at market open (9:30 AM ET for US markets)
- High: The highest price traded during the day
- Low: The lowest price traded during the day
- Close: The final price at market close — the most commonly referenced price
- Volume: Number of shares traded — high volume validates price moves; low volume is suspect
The Dataset Schema
-- COMPANIES: Stocks we're tracking
-- id (UUID), ticker (TEXT), name (TEXT), sector (TEXT), exchange (TEXT)
SELECT * FROM companies LIMIT 5;
sector is important for comparative analysis — you should compare tech stocks to other tech stocks, not to energy companies. Common sectors: Technology, Healthcare, Financials, Energy, Consumer Discretionary.
-- STOCK_PRICES: Daily OHLCV data
-- company_id (UUID → companies.id), price_date (DATE),
-- open_price (NUMERIC), high_price (NUMERIC), low_price (NUMERIC),
-- close_price (NUMERIC), volume (BIGINT), close_price (NUMERIC)
SELECT * FROM stock_prices ORDER BY price_date DESC LIMIT 10;
close_price is the close price adjusted for dividends and stock splits. For return calculations, always use close_price rather than close_price — unadjusted prices create artificial jumps on ex-dividend dates.
-- DIVIDENDS: Cash dividends paid to shareholders
-- company_id (UUID → companies.id), ex_date (DATE),
-- payment_date (DATE), amount_per_share (NUMERIC)
SELECT * FROM dividends LIMIT 5;
ex_date is the cutoff date — you must own the stock before this date to receive the dividend. The close_price in stock_prices accounts for these payments.