Network Flow Analysis — Which Warehouses Ship the Most?

Network analysis treats warehouses as nodes and shipments as directed edges. Knowing the direction of flow, which warehouses act as hubs rather than spokes, and which lanes are imbalanced informs network design decisions. The queries below use PostgreSQL syntax.


1. Shipment Volume by Warehouse (Inbound vs Outbound)

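-- Assumes lg_shipments has origin_warehouse_id and destination_warehouse_id
-- columns (names assumed; adjust to your schema).
SELECT
    w.code,
    w.city,
    w.country,
    -- DISTINCT is needed: the two joins pair every outbound row with every
    -- inbound row, so a plain COUNT would inflate both totals.
    COUNT(DISTINCT s_out.id) AS outbound_shipments,
    COUNT(DISTINCT s_in.id)  AS inbound_shipments,
    COUNT(DISTINCT s_out.id) - COUNT(DISTINCT s_in.id) AS net_outbound
FROM lg_warehouses w
LEFT JOIN lg_shipments s_out ON s_out.origin_warehouse_id     = w.id
LEFT JOIN lg_shipments s_in  ON s_in.destination_warehouse_id = w.id
GROUP BY w.id, w.code, w.city, w.country
ORDER BY outbound_shipments DESC;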

What This Returns (sample rows; country column omitted)

code  | city        | outbound_shipments | inbound_shipments | net_outbound
------+-------------+--------------------+-------------------+-------------
US-NY | New York    |                 75 |                68 |            7
DE-FR | Frankfurt   |                 75 |                73 |            2
...
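
One thing worth checking in any query that joins lg_shipments twice, once per direction, is row fan-out: each outbound row is paired with every inbound row for the same warehouse. The sketch below shows the effect on made-up data. The toy_warehouses and toy_shipments tables and the origin_id / destination_id column names are hypothetical and exist only for this illustration.

```sql
-- Toy data (hypothetical): warehouse 1 has 2 outbound and 3 inbound shipments.
WITH toy_warehouses (id) AS (
    VALUES (1)
),
toy_shipments (id, origin_id, destination_id) AS (
    VALUES (101, 1, 2), (102, 1, 3),              -- outbound from warehouse 1
           (103, 2, 1), (104, 3, 1), (105, 2, 1)  -- inbound to warehouse 1
)
SELECT
    COUNT(s_out.id)          AS naive_outbound,     -- counts joined rows: 2 x 3 = 6
    COUNT(DISTINCT s_out.id) AS distinct_outbound,  -- counts shipments: 2
    COUNT(s_in.id)           AS naive_inbound,      -- counts joined rows: 6
    COUNT(DISTINCT s_in.id)  AS distinct_inbound    -- counts shipments: 3
FROM toy_warehouses w
LEFT JOIN toy_shipments s_out ON s_out.origin_id     = w.id
LEFT JOIN toy_shipments s_in  ON s_in.destination_id = w.id;
```

COUNT(DISTINCT ...) counts each shipment once regardless of how many rows the join produces. Aggregating each direction in its own subquery before joining is another way to avoid the fan-out.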

2. Busiest Lanes (Origin → Destination Pairs)

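-- Assumes origin_warehouse_id / destination_warehouse_id columns on
-- lg_shipments (names assumed; adjust to your schema).
SELECT
    w_o.code || ' → ' || w_d.code     AS lane,
    COUNT(*)                           AS shipments,
    SUM(s.weight_kg)                   AS total_weight_kg,
    -- Dividing by 100.0 avoids integer truncation if freight_cents is an integer column
    ROUND((SUM(s.freight_cents) / 100.0)::NUMERIC, 2) AS total_cost_usd,
    ROUND((AVG(s.freight_cents) / 100.0)::NUMERIC, 2) AS avg_cost_usd,
    -- On-time delivery rate, computed over delivered shipments only
    ROUND((100.0 * COUNT(*) FILTER (WHERE s.delivered_at IS NOT NULL
        AND s.delivered_at <= s.promised_at)
        / NULLIF(COUNT(*) FILTER (WHERE s.delivered_at IS NOT NULL), 0))::NUMERIC, 1) AS otd_pct
FROM lg_shipments s
JOIN lg_warehouses w_o ON w_o.id = s.origin_warehouse_id
JOIN lg_warehouses w_d ON w_d.id = s.destination_warehouse_id
GROUP BY lane
ORDER BY shipments DESC;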

3. Imbalanced Lanes (One-Way Traffic)

Lanes with large imbalances indicate where empty container repositioning is needed:

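-- Assumes origin_warehouse_id / destination_warehouse_id columns on
-- lg_shipments (names assumed; adjust to your schema).
WITH lane_counts AS (
    SELECT
        LEAST(origin_warehouse_id, destination_warehouse_id)    AS node_a,
        GREATEST(origin_warehouse_id, destination_warehouse_id) AS node_b,
        -- forward = lower id to higher id; backward = the reverse direction
        SUM(CASE WHEN origin_warehouse_id < destination_warehouse_id THEN 1 ELSE 0 END) AS forward,
        SUM(CASE WHEN origin_warehouse_id > destination_warehouse_id THEN 1 ELSE 0 END) AS backward
    FROM lg_shipments
    GROUP BY node_a, node_b
)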
SELECT
    w_a.code || ' ↔ ' || w_b.code AS pair,
    forward,
    backward,
    ABS(forward - backward) AS imbalance
FROM lane_counts lc
JOIN lg_warehouses w_a ON w_a.id = lc.node_a
JOIN lg_warehouses w_b ON w_b.id = lc.node_b
ORDER BY imbalance DESC;
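
To see the pair normalisation on concrete rows, here is a small self-contained sketch. The dataset in the VALUES list is made up, and toy_shipments, origin_id and destination_id are placeholder names for illustration, not the real lg_shipments schema.

```sql
-- Toy data (hypothetical): directed shipments between warehouses 1, 2 and 3.
WITH toy_shipments (origin_id, destination_id) AS (
    VALUES (1, 2), (1, 2), (1, 2), (2, 1),  -- three shipments 1→2, one shipment 2→1
           (2, 3), (3, 2)                   -- one shipment each way between 2 and 3
)
SELECT
    LEAST(origin_id, destination_id)    AS node_a,
    GREATEST(origin_id, destination_id) AS node_b,
    COUNT(*) FILTER (WHERE origin_id < destination_id) AS forward,
    COUNT(*) FILTER (WHERE origin_id > destination_id) AS backward
FROM toy_shipments
GROUP BY 1, 2
ORDER BY 1, 2;
-- node_a | node_b | forward | backward
--      1 |      2 |       3 |        1
--      2 |      3 |       1 |        1
```

Both directions of a lane land in the same group because LEAST and GREATEST always put the smaller ID first. The pair (1, 2) has an imbalance of 2, while (2, 3) is balanced.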

4. Hub Identification — Warehouses with Highest Betweenness

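Warehouses that carry substantial traffic in both directions act as hubs. The hub_score below is the smaller of a warehouse's outbound and inbound counts. It is a rough proxy for betweenness, not true betweenness centrality, which would require shortest-path computation over the whole network: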

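-- Assumes origin_warehouse_id / destination_warehouse_id columns on
-- lg_shipments (names assumed; adjust to your schema).
WITH flows AS (
    SELECT origin_warehouse_id AS warehouse_id, 1 AS flow_type       -- 1 = outbound
    FROM lg_shipments
    UNION ALL
    SELECT destination_warehouse_id AS warehouse_id, 2 AS flow_type  -- 2 = inbound
    FROM lg_shipments
)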
SELECT
    w.code,
    w.city,
    COUNT(*) FILTER (WHERE flow_type = 1) AS outbound,
    COUNT(*) FILTER (WHERE flow_type = 2) AS inbound,
    COUNT(*)                               AS total_flows,
    LEAST(
        COUNT(*) FILTER (WHERE flow_type = 1),
        COUNT(*) FILTER (WHERE flow_type = 2)
    )                                      AS hub_score
FROM flows f
JOIN lg_warehouses w ON w.id = f.warehouse_id
GROUP BY w.code, w.city
ORDER BY hub_score DESC;
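
A short sketch of why the score takes the smaller of the two counts instead of the total. The two warehouses and their flows are invented for illustration, and toy_flows is a hypothetical stand-in for the flows CTE above.

```sql
-- Toy data (hypothetical): flow_type 1 = outbound, 2 = inbound.
WITH toy_flows (warehouse, flow_type) AS (
    VALUES ('A', 1), ('A', 1), ('A', 1), ('A', 1),  -- A: 4 outbound, 0 inbound
           ('B', 1), ('B', 1), ('B', 2), ('B', 2)   -- B: 2 outbound, 2 inbound
)
SELECT
    warehouse,
    COUNT(*) AS total_flows,
    LEAST(
        COUNT(*) FILTER (WHERE flow_type = 1),
        COUNT(*) FILTER (WHERE flow_type = 2)
    )        AS hub_score
FROM toy_flows
GROUP BY warehouse
ORDER BY hub_score DESC;
-- warehouse | total_flows | hub_score
-- B         |           4 |         2
-- A         |           4 |         0
```

Both warehouses handle four flows in total, but only B passes traffic through in both directions. A is a pure source, so its score is zero.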

5. Cross-Country vs Domestic Shipment Mix

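-- Assumes origin_warehouse_id / destination_warehouse_id columns on
-- lg_shipments (names assumed; adjust to your schema).
SELECT
    CASE WHEN w_o.country = w_d.country THEN 'Domestic' ELSE 'International' END AS shipment_type,
    COUNT(*)                            AS shipments,
    -- Dividing by 100.0 avoids integer truncation if freight_cents is an integer column
    ROUND((SUM(s.freight_cents) / 100.0)::NUMERIC, 2) AS total_cost_usd,
    ROUND((AVG(s.freight_cents) / 100.0)::NUMERIC, 2) AS avg_cost_usd,
    -- DATE_PART('day', interval) keeps whole days only; partial days are truncated
    ROUND(AVG(DATE_PART('day', s.delivered_at - s.shipped_at))::NUMERIC, 1) AS avg_transit_days
FROM lg_shipments s
JOIN lg_warehouses w_o ON w_o.id = s.origin_warehouse_id
JOIN lg_warehouses w_d ON w_d.id = s.destination_warehouse_id
WHERE s.delivered_at IS NOT NULL AND s.shipped_at IS NOT NULL
GROUP BY 1;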

Key Takeaway

Applying LEAST and GREATEST to a shipment's origin and destination warehouse IDs normalises directed edges into undirected pairs. This is essential for imbalance analysis, where (A→B) and (B→A) should be counted together. The technique generalises to any bidirectional relationship in SQL: social connections, messaging pairs, or trade flows.
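
As one illustration of that generalisation, the same normalisation can group messages by conversation pair regardless of who sent them. This is a minimal sketch on made-up data. The toy_messages table and the sender_id / recipient_id columns are hypothetical.

```sql
-- Toy data (hypothetical): messages between users 10, 20 and 30.
WITH toy_messages (sender_id, recipient_id) AS (
    VALUES (10, 20), (20, 10), (20, 10), (10, 30)
)
SELECT
    LEAST(sender_id, recipient_id)    AS user_a,
    GREATEST(sender_id, recipient_id) AS user_b,
    COUNT(*)                          AS messages_exchanged
FROM toy_messages
GROUP BY 1, 2
ORDER BY messages_exchanged DESC;
-- user_a | user_b | messages_exchanged
--     10 |     20 |                  3
--     10 |     30 |                  1
```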