Earnings and Volume Pattern Analysis
Earnings and Volume Pattern Analysis
Price tells you what happened. Volume tells you who was behind it.
A stock can move up on thin volume — a handful of retail traders chasing momentum — or on massive volume, which typically signals institutional investors: pension funds, hedge funds, and mutual funds moving hundreds of millions of dollars. These two scenarios look identical on a price chart but are completely different from a signal quality perspective. High-volume moves are far more likely to be sustained than low-volume moves.
This lesson covers the SQL techniques to quantify volume, detect anomalies, and understand the relationship between price moves and the volume behind them. These are the queries that power stock scanners, pre-market alerts, and earnings event analysis.
Volume Concepts
Before writing SQL, it's important to understand what we're measuring:
| Metric | Formula | What it tells you |
|---|---|---|
| Relative volume (RVOL) | today_volume / avg_20d_volume |
How unusual is today's activity compared to the norm? RVOL of 3.0 means 3× normal volume |
| Up-volume ratio | up_day_volume / total_volume |
Is buying pressure concentrated? >60% up-volume = accumulation signal |
| VWAP | SUM(price × volume) / SUM(volume) |
The "true" average price weighted by how much traded at each level — used by institutions as their execution benchmark |
| Volume Z-score | (volume - mean) / stddev |
Statistical measure of how many standard deviations above normal today's volume is; Z > 2 is unusual, Z > 3 is rare |
| On-Balance Volume (OBV) | Running sum adding on up-days, subtracting on down-days | Whether money is flowing into or out of a stock over time — divergence from price is a leading indicator |
Example 1: Relative Volume (RVOL) Scanner
This is the most important volume query in practice. It scans every stock and computes today's volume as a multiple of the 20-day average. Traders use this to find stocks "in play" — with unusual activity that warrants further investigation.
The query uses two CTEs: avg_volume computes the rolling 20-day average per company, and latest retrieves the most recent trading day's data using DISTINCT ON. These are joined to produce a ranked list sorted by relative volume descending — the highest RVOL stocks appear first.
The CASE statement translates the ratio into human-readable signals: EXTREME (3×+), HIGH (2×+), ABOVE AVERAGE (1.5×+), or NORMAL. These thresholds are commonly used in retail trading platforms.
WITH avg_volume AS (
SELECT company_id,
AVG(volume) AS avg_20d_vol
FROM stock_prices
WHERE price_date >= CURRENT_DATE - 20
GROUP BY company_id
),
latest AS (
SELECT DISTINCT ON (company_id)
company_id, price_date, volume, close_price,
close_price - LAG(close_price) OVER (PARTITION BY company_id ORDER BY price_date) AS price_change
FROM stock_prices ORDER BY company_id, price_date DESC
)
SELECT
c.ticker, c.sector,
l.price_date,
ROUND(l.close_price, 2) AS price,
ROUND(l.price_change, 2) AS price_change,
l.volume,
ROUND(av.avg_20d_vol) AS avg_20d_volume,
ROUND(l.volume / NULLIF(av.avg_20d_vol, 0), 2) AS relative_volume,
CASE
WHEN l.volume > av.avg_20d_vol * 3 THEN 'EXTREME VOLUME'
WHEN l.volume > av.avg_20d_vol * 2 THEN 'HIGH VOLUME'
WHEN l.volume > av.avg_20d_vol * 1.5 THEN 'ABOVE AVERAGE'
ELSE 'NORMAL'
END AS volume_signal
FROM latest l
JOIN avg_volume av ON av.company_id = l.company_id
JOIN companies c ON c.company_id = l.company_id
ORDER BY relative_volume DESC;
What This Returns
| Column | Explanation |
|---|---|
relative_volume |
The key metric — 1.0 means exactly average, 3.0 means 3× normal |
price_change |
Today's price movement in dollars — combine with RVOL to classify the event |
volume_signal |
Human-readable tier — useful for alerting systems |
Interpretation: A stock with RVOL 4.0 and a +5% price change on earnings day is very different from a stock with RVOL 4.0 on a day with no news. The query gives you the volume context; you add the fundamental context.
Example 2: VWAP (Volume-Weighted Average Price)
VWAP is the benchmark that institutional traders are measured against. A fund manager who buys a stock above VWAP "paid too much" relative to the day's average; below VWAP means they got a discount. VWAP is also used as a support/resistance level in intraday trading.
With daily OHLCV data (no intraday ticks), we approximate VWAP using the typical price: (high + low + close) / 3. This is the standard approximation when minute-by-minute data is unavailable.
The query computes a cumulative VWAP from the beginning of the dataset to the current date, which is more useful for detecting long-term price dislocation than a single-day VWAP.
-- Intraday VWAP using daily OHLCV (approximation using typical price)
SELECT
c.ticker,
sp.price_date,
ROUND(SUM((sp.high_price + sp.low_price + sp.close_price) / 3 * sp.volume)
OVER (PARTITION BY sp.company_id ORDER BY sp.price_date)
/ NULLIF(SUM(sp.volume) OVER (PARTITION BY sp.company_id ORDER BY sp.price_date), 0),
2) AS cumulative_vwap,
ROUND(sp.close_price, 2) AS close_price,
sp.volume
FROM stock_prices sp
JOIN companies c ON c.company_id = sp.company_id
WHERE sp.company_id = 1 -- single company example
ORDER BY sp.price_date;
What This Returns
| Column | Explanation |
|---|---|
cumulative_vwap |
Volume-weighted average price from day 1 to the current row's date |
close_price |
Actual closing price |
| Implicit: difference | When close > VWAP, price is above the weighted average — potential overvaluation signal |
The two window function calls (SUM(...) OVER (PARTITION BY ... ORDER BY ...)) both use cumulative running sums. Dividing the running sum of price×volume by the running sum of volume gives the weighted average at every point in time.
Example 3: Unusual Volume Anomaly Detection
Rather than using arbitrary multiples (2×, 3×), this query applies proper statistics: it calculates each stock's mean volume and standard deviation, then computes a Z-score for every trading day. A Z-score of 3+ means the volume was more than 3 standard deviations above the mean — something that should happen less than 0.3% of the time under a normal distribution.
The query then classifies each anomaly as BULLISH SURGE (high volume + price up), BEARISH DUMP (high volume + price down), or HIGH VOLUME FLAT (unusual volume but price didn't move — often signals indecision or a quiet news event). These are exactly the patterns earnings traders and event-driven funds scan for.
WITH vol_stats AS (
SELECT company_id,
AVG(volume) AS mean_vol,
STDDEV(volume) AS std_vol
FROM stock_prices
GROUP BY company_id
),
anomalies AS (
SELECT sp.company_id, sp.price_date, sp.volume, sp.close_price,
ROUND((sp.volume - vs.mean_vol)::NUMERIC / NULLIF(vs.std_vol, 0), 2) AS volume_z_score,
100.0 * (sp.close_price - LAG(sp.close_price) OVER (PARTITION BY sp.company_id ORDER BY sp.price_date))
/ NULLIF(LAG(sp.close_price) OVER (PARTITION BY sp.company_id ORDER BY sp.price_date), 0) AS price_change_pct
FROM stock_prices sp
JOIN vol_stats vs ON vs.company_id = sp.company_id
)
SELECT c.ticker, a.price_date,
a.volume, ROUND(a.volume_z_score, 2) AS volume_z,
ROUND(a.price_change_pct::NUMERIC, 2) AS price_chg_pct,
CASE
WHEN a.volume_z_score > 3 AND a.price_change_pct > 2 THEN 'BULLISH SURGE'
WHEN a.volume_z_score > 3 AND a.price_change_pct < -2 THEN 'BEARISH DUMP'
WHEN a.volume_z_score > 3 THEN 'HIGH VOLUME FLAT'
ELSE 'NORMAL'
END AS event_type
FROM anomalies a
JOIN companies c ON c.company_id = a.company_id
WHERE a.volume_z_score > 3
ORDER BY a.volume_z_score DESC;
What This Returns
| Column | Explanation |
|---|---|
volume_z |
Z-score: 3 = rare, 4 = very rare, 5+ = extremely significant event |
price_chg_pct |
Price move on the same day — combined with Z-score gives directional context |
event_type |
Classified signal — run this daily and store the results to build an event history |
Real-world application: Earnings announcements almost always appear in this output on their report date. By correlating dates with an earnings calendar, you can measure the "earnings effect" — how much stock prices typically move on high-volume earnings days for each company.
Example 4: Up/Down Volume Ratio (Accumulation vs Distribution)
This is a market microstructure concept. The theory: if most of the volume in a stock over the past 20 days came on days when the price went up (up-volume dominates), institutional buyers are quietly accumulating shares. If most volume came on down days (down-volume dominates), institutions are distributing (selling) their positions.
The query labels each trading day as an "up" or "down" day based on whether the close was higher or lower than the prior day. It then computes a 20-day rolling sum of up-volume and down-volume separately, and classifies the current balance as ACCUMULATION, DISTRIBUTION, or NEUTRAL.
WITH daily_direction AS (
SELECT company_id, price_date, volume,
CASE WHEN close_price >= LAG(close_price) OVER (PARTITION BY company_id ORDER BY price_date)
THEN volume ELSE 0 END AS up_volume,
CASE WHEN close_price < LAG(close_price) OVER (PARTITION BY company_id ORDER BY price_date)
THEN volume ELSE 0 END AS down_volume
FROM stock_prices
),
rolling_ratio AS (
SELECT company_id, price_date,
SUM(up_volume) OVER (PARTITION BY company_id ORDER BY price_date ROWS 19 PRECEDING) AS sum_up_vol,
SUM(down_volume) OVER (PARTITION BY company_id ORDER BY price_date ROWS 19 PRECEDING) AS sum_down_vol
FROM daily_direction
)
SELECT DISTINCT ON (rr.company_id)
c.ticker, c.sector,
ROUND(sum_up_vol::NUMERIC / NULLIF(sum_up_vol + sum_down_vol, 0) * 100, 1) AS up_volume_pct,
CASE
WHEN sum_up_vol > sum_down_vol * 1.5 THEN 'ACCUMULATION'
WHEN sum_down_vol > sum_up_vol * 1.5 THEN 'DISTRIBUTION'
ELSE 'NEUTRAL'
END AS volume_signal
FROM rolling_ratio rr
JOIN companies c ON c.company_id = rr.company_id
ORDER BY rr.company_id, rr.price_date DESC;
What This Returns
| Column | Explanation |
|---|---|
up_volume_pct |
Percentage of rolling 20-day volume that occurred on up-days; >60% = accumulation, <40% = distribution |
volume_signal |
ACCUMULATION (smart money buying), DISTRIBUTION (smart money selling), or NEUTRAL |
The ROWS 19 PRECEDING frame specification means "the current row plus the 19 rows before it" — a 20-row rolling window. This is the correct frame for computing rolling statistics when you want to include the current row.