Reconciliation and Audit Queries

Reconciliation and Audit Queries

Financial reconciliation is the systematic process of verifying that every record in your database is consistent, complete, and accurate. In accounting terms, you are checking that the books "balance" — that invoices match payments, that references are intact, and that no transactions have been entered with errors.

SQL is exceptionally well suited to reconciliation work because it can scan millions of records in seconds and report exactly which rows fail each check. The alternative — manual spot-checks in a spreadsheet — is slow, error-prone, and cannot scale. Once you build a reconciliation suite in SQL, you can run it nightly as part of a scheduled job, or on demand before month-end close, and it will always check 100% of the data.

This lesson covers four categories of reconciliation query that together constitute a complete financial audit suite:

  1. Balance reconciliation — do payments match invoice amounts?
  2. Referential integrity — are there orphaned records that reference non-existent data?
  3. Period close checklist — summarising a full accounting period in a single query
  4. Statistical anomaly detection — flagging transactions that are statistically unusual

Why Reconciliation Fails Without SQL

Most reconciliation problems fall into one of these categories:

  • Timing differences: A payment is processed in one period but the invoice was issued in a prior period
  • Partial payments: A client pays £500 against a £600 invoice — the invoice shows a £100 balance
  • Orphaned records: A payment references an invoice_id that no longer exists (deleted invoice, data migration error)
  • Duplicate entries: The same payment is entered twice with slightly different amounts or dates
  • Logic errors: A payment date that precedes the invoice issue date — physically impossible

Example 1: Payment-Invoice Reconciliation

This is the core reconciliation query. For every invoice in the system, it computes the total invoiced amount (from line items) and the total amount paid (from payments), then calculates the balance and assigns a status.

The design uses LEFT JOIN from the invoices table to both invoice_totals and payment_totals. This means invoices with no line items (total = NULL) and invoices with no payments (paid = NULL) are both included — these are the anomalies worth investigating. Using an INNER JOIN would silently exclude them.

The CASE statement turns the numeric balance into a human-readable status. The ABS(... - ...) < 0.01 check handles floating-point rounding: a payment of £499.999999 for a £500.00 invoice should be considered balanced, not partial.

WITH invoice_totals AS (
    SELECT invoice_id, ROUND(SUM(quantity * unit_price), 2) AS invoice_total
    FROM invoice_items GROUP BY invoice_id
),
payment_totals AS (
    SELECT invoice_id, ROUND(SUM(amount), 2) AS total_paid
    FROM payments GROUP BY invoice_id
),
reconciliation AS (
    SELECT i.invoice_id, cl.company_name, i.issue_date, i.due_date,
        COALESCE(it.invoice_total, 0) AS invoiced,
        COALESCE(pt.total_paid, 0) AS paid,
        COALESCE(it.invoice_total, 0) - COALESCE(pt.total_paid, 0) AS balance,
        CASE
            WHEN it.invoice_total IS NULL THEN 'NO LINE ITEMS'
            WHEN pt.total_paid IS NULL THEN 'UNPAID'
            WHEN ABS(it.invoice_total - pt.total_paid) < 0.01 THEN 'BALANCED'
            WHEN pt.total_paid > it.invoice_total THEN 'OVERPAID'
            ELSE 'PARTIAL'
        END AS status
    FROM invoices i
    JOIN clients cl ON cl.client_id = i.client_id
    LEFT JOIN invoice_totals it ON it.invoice_id = i.invoice_id
    LEFT JOIN payment_totals pt ON pt.invoice_id = i.invoice_id
)
SELECT status, COUNT(*) AS count, ROUND(SUM(balance), 2) AS total_balance
FROM reconciliation GROUP BY status ORDER BY count DESC;

What This Returns

Column Explanation
status BALANCED (invoice fully paid), UNPAID (no payments at all), PARTIAL (some payments but not full), OVERPAID (paid more than invoiced), NO LINE ITEMS (invoice has no items — data quality issue)
count Number of invoices in each status
total_balance Sum of outstanding amounts per status — for UNPAID/PARTIAL this is the total AR exposure

Investigation priority: Address OVERPAID and NO LINE ITEMS first — these indicate data errors. PARTIAL and UNPAID are normal business states that require follow-up with clients, not necessarily database fixes.


Example 2: Orphaned Records Detection

An orphaned record is one that references a foreign key that no longer exists — for example, a payment that references an invoice_id that has been deleted, or an invoice that references a client_id that doesn't appear in the clients table. Orphaned records can cause calculation errors (they're excluded from totals silently), reporting inconsistencies, and application crashes.

This query uses UNION ALL to run five different integrity checks in a single pass, returning one row per check. Each check counts how many records fail a referential integrity rule. The status column immediately shows PASS or FAIL — making this suitable as an automated health check that triggers an alert if any row shows FAIL with count > 0.

WITH checks AS (
    -- Payments referencing non-existent invoices
    SELECT 'orphan_payments' AS issue, COUNT(*) AS count
    FROM payments WHERE invoice_id NOT IN (SELECT invoice_id FROM invoices)

    UNION ALL

    -- Invoice items with no matching invoice
    SELECT 'orphan_invoice_items', COUNT(*)
    FROM invoice_items WHERE invoice_id NOT IN (SELECT invoice_id FROM invoices)

    UNION ALL

    -- Invoices with no client
    SELECT 'invoices_missing_client', COUNT(*)
    FROM invoices WHERE client_id NOT IN (SELECT client_id FROM clients)

    UNION ALL

    -- Invoices with no line items
    SELECT 'invoices_no_line_items', COUNT(*)
    FROM invoices i WHERE NOT EXISTS (
        SELECT 1 FROM invoice_items ii WHERE ii.invoice_id = i.invoice_id
    )

    UNION ALL

    -- Duplicate payments (same invoice, same date, same amount)
    SELECT 'duplicate_payments', COUNT(*) FROM (
        SELECT invoice_id, payment_date, amount, COUNT(*) AS cnt
        FROM payments GROUP BY invoice_id, payment_date, amount HAVING COUNT(*) > 1
    ) dups
)
SELECT issue, count, CASE WHEN count = 0 THEN 'PASS' ELSE 'FAIL' END AS status
FROM checks ORDER BY count DESC;

What This Returns

Row What it checks
orphan_payments Payments with no matching invoice — usually indicates a deleted invoice or migration error
orphan_invoice_items Line items with no parent invoice — orphaned detail rows
invoices_missing_client Invoices assigned to a non-existent client — FK violation
invoices_no_line_items Invoices with no items at all — technically valid but financially incomplete
duplicate_payments Same amount paid for the same invoice on the same date — likely a double-entry

Run this query before every month-end close. A clean database should return all PASS. Any FAIL row needs investigation before period financials are finalised.


Example 3: Month-End Close Checklist

At the end of every accounting period, the finance team performs a "close" — finalising all entries and producing period-end financials. This query automates the first step of that process: producing a three-line summary of the prior month's activity that can be checked against the general ledger system.

The query calculates how much was invoiced in the month, how much was received in payments, and what the open AR balance is. These three numbers should tie to the accounting system (Xero, QuickBooks, Sage, etc.) — if they don't, there's a discrepancy to investigate.

WITH period_start AS (SELECT DATE_TRUNC('month', CURRENT_DATE - INTERVAL '1 month')::DATE AS dt),
period_end AS (SELECT (DATE_TRUNC('month', CURRENT_DATE) - INTERVAL '1 day')::DATE AS dt),

invoices_issued AS (
    SELECT COUNT(*) AS cnt, SUM(ii.quantity * ii.unit_price) AS total
    FROM invoices i JOIN invoice_items ii ON ii.invoice_id = i.invoice_id
    WHERE i.issue_date BETWEEN (SELECT dt FROM period_start) AND (SELECT dt FROM period_end)
),
payments_received AS (
    SELECT COUNT(*) AS cnt, SUM(amount) AS total
    FROM payments WHERE payment_date BETWEEN (SELECT dt FROM period_start) AND (SELECT dt FROM period_end)
),
ar_open AS (
    SELECT COUNT(*) AS cnt, SUM(ii.quantity * ii.unit_price) - COALESCE(SUM(p.amount), 0) AS total
    FROM invoices i JOIN invoice_items ii ON ii.invoice_id = i.invoice_id
    LEFT JOIN payments p ON p.invoice_id = i.invoice_id
    WHERE i.issue_date <= (SELECT dt FROM period_end)
      AND ii.quantity * ii.unit_price > COALESCE(p.amount, 0)
)
SELECT
    'Invoices Issued' AS line_item, ii.cnt::TEXT AS count, ROUND(ii.total, 2)::TEXT AS amount FROM invoices_issued ii
UNION ALL SELECT 'Payments Received', pr.cnt::TEXT, ROUND(pr.total, 2)::TEXT FROM payments_received pr
UNION ALL SELECT 'Open AR Balance', ao.cnt::TEXT, ROUND(ao.total, 2)::TEXT FROM ar_open ao;

What This Returns

A three-line close checklist:

Line Item Count Amount
Invoices Issued N invoices £X
Payments Received N payments £Y
Open AR Balance N invoices £Z

The fundamental accounting relationship: Invoices Issued - Payments Received = Opening AR change. If this doesn't hold within a small rounding tolerance, there is a period mismatch that needs investigating.

The CTEs period_start and period_end are parameterised date range definitions — modify them to close any historical period, not just last month.


Example 4: Anomaly Audit — Unusual Transactions

Statistical anomaly detection applies Z-score analysis (from statistics) to financial transactions. Rather than checking rule violations (an orphan record, a balance mismatch), this query flags transactions that are plausible but unusual — payment amounts far above normal, future-dated payments, or line items with £0 price.

The Z-score formula is (value - mean) / standard_deviation. A Z-score above 3 means the value is more than 3 standard deviations from the mean — under a normal distribution, this should happen less than 0.3% of the time. In practice, some legitimate large clients produce payments with Z > 3 every time they pay. The query gives you a list to review — it doesn't automatically classify them as errors.

WITH payment_stats AS (
    SELECT AVG(amount) AS mean, STDDEV(amount) AS std FROM payments
),
invoice_stats AS (
    SELECT AVG(total) AS mean, STDDEV(total) AS std
    FROM (SELECT SUM(quantity * unit_price) AS total FROM invoice_items GROUP BY invoice_id) t
)
-- Unusually large payments
SELECT 'Large Payment' AS anomaly_type, p.payment_id::TEXT AS record_id,
    cl.company_name, p.amount::TEXT AS value, p.payment_date::TEXT AS date
FROM payments p JOIN invoices i ON i.invoice_id = p.invoice_id
JOIN clients cl ON cl.client_id = i.client_id, payment_stats ps
WHERE p.amount > ps.mean + 3 * ps.std

UNION ALL

-- Future-dated payments
SELECT 'Future Payment', p.payment_id::TEXT, cl.company_name,
    p.amount::TEXT, p.payment_date::TEXT
FROM payments p JOIN invoices i ON i.invoice_id = p.invoice_id
JOIN clients cl ON cl.client_id = i.client_id
WHERE p.payment_date > CURRENT_DATE

UNION ALL

-- Invoice items with £0 price
SELECT 'Zero Price Item', ii.item_id::TEXT, cl.company_name,
    ii.description, i.issue_date::TEXT
FROM invoice_items ii JOIN invoices i ON i.invoice_id = ii.invoice_id
JOIN clients cl ON cl.client_id = i.client_id
WHERE ii.unit_price = 0
ORDER BY anomaly_type;

What This Returns

Anomaly Type What it means Common cause
Large Payment Payment > mean + 3 standard deviations Large enterprise client, or data entry error (extra zero)
Future Payment Payment date is in the future Pre-dated entry, timezone issue, or outright error
Zero Price Item Line item with £0 unit price Free service, promotional item, or data entry omission

Notice that each anomaly check uses UNION ALL to combine results into a single output table. This is the standard pattern for building a "checklist query" — one query, one result set, all checks in one place.

Purchase this course to unlock the full lesson.

Sign up