Season Comparison and Historical Records

Season Comparison and Historical Records

Season-over-season analysis answers questions that single-season snapshots cannot: Is this team improving or declining? Has the league become more or less competitive? Which players broke long-standing records this year? Comparing across time requires joining datasets by year, using LAG() to access prior-period values, and building aggregations at the season level.

This lesson builds a complete season comparison toolkit — team evolution over multiple seasons, all-time records, and long-term trend analysis.

Why Season Comparison Is Harder Than It Looks

The challenge with cross-season analysis is that you're grouping by a time dimension (season) on top of all your usual aggregations. The query structure must:

  1. Compute each team's stats for each season separately
  2. Then rank within each season
  3. Then use LAG() to compare to the previous season

Doing all three in one query requires multiple CTE layers, each building on the previous one.

Season-by-Season Team Evolution

This query builds a full historical record for every team across every season. The CTE computes season-level stats (points, goals for, goals against), then the final SELECT adds LAG() to compare with the previous season and RANK() to show final position.

The key technique is PARTITION BY team_id ORDER BY m.season in the LAG window — this ensures we're comparing each team to its own previous season, not to some other team.

WITH team_by_season AS (
    SELECT team_id, t.season AS season,
        SUM(CASE WHEN (home_team_id = team_id AND home_goals > away_goals) OR
                      (away_team_id = team_id AND away_goals > home_goals) THEN 3
                 WHEN home_goals = away_goals THEN 1 ELSE 0 END) AS pts,
        SUM(CASE WHEN home_team_id = team_id THEN home_goals ELSE away_goals END) AS gf,
        SUM(CASE WHEN home_team_id = team_id THEN away_goals ELSE home_goals END) AS ga,
        COUNT(*) AS played
    FROM (SELECT home_team_id AS team_id, season FROM matches WHERE competition='league'
          UNION ALL SELECT away_team_id, season FROM matches WHERE competition='league') t
    JOIN matches m ON (m.home_team_id = t.team_id OR m.away_team_id = t.team_id) AND m.season = t.season
    WHERE m.competition = 'league' GROUP BY t.team_id, t.season
)
SELECT t.name, tbs.season, tbs.pts, tbs.gf, tbs.ga, tbs.gf - tbs.ga AS gd,
    LAG(tbs.pts) OVER (PARTITION BY tbs.team_id ORDER BY tbs.season) AS prev_season_pts,
    tbs.pts - LAG(tbs.pts) OVER (PARTITION BY tbs.team_id ORDER BY tbs.season) AS pts_change,
    RANK() OVER (PARTITION BY tbs.season ORDER BY tbs.pts DESC) AS final_position
FROM team_by_season tbs
JOIN teams t ON t.team_id = tbs.team_id
ORDER BY t.name, tbs.season;

What This Returns

Each row is one team's performance in one season. The most informative columns:

Column What it tells you
pts_change Points improvement/decline vs previous season. Positive = improved.
prev_season_pts NULL for a team's first season in the dataset
final_position Where they finished in the league that season

A team with consistently improving pts_change over 3 seasons is trending upward. A team dropping 10+ points season-on-season is in trouble. LAG() is NULL for the first season because there's no prior year to compare to — NULLIF or COALESCE can handle this if needed.

Purchase this course to unlock the full lesson.

Sign up