System Design: Payroll Processing System

Requirements

Functional Requirements:

Process recurring payroll (weekly, biweekly, semi-monthly, monthly) for employees with accurate tax withholding
Calculate gross-to-net pay including federal, state, and local tax deductions, benefits premiums, retirement contributions, and garnishments
Multi-jurisdiction support: employees in different states/countries have different tax rules
Direct deposit via ACH/NACHA file generation and submission to banking partners
Generate pay stubs, W-2s, 1099s, and other tax documents
Off-cycle payroll runs for bonuses, corrections, and termination payouts

Non-Functional Requirements:

Process payroll for 10M employees across 50K companies within a 4-hour processing window
Financial accuracy: calculations must be correct to the cent with full audit trail
99.99% availability during payroll processing windows (missed payroll is a critical business failure)
Strong consistency: no double payments, no missed payments
Compliance with SOC 2 Type II, NACHA Operating Rules (for ACH payment data), PCI DSS (only if card data is handled), and IRS regulations

Scale Estimation

10M employees across 50K companies. Payroll runs: 50K companies × 2 runs/month average = 100K payroll runs/month ≈ 3,300/day. Each run processes an average of 200 employees (10M ÷ 50K), so roughly 670K employee calculations/day on average, but load is concentrated in 2-3 peak days per month (10M calculations in 3 days = 3.3M/day ≈ 38 calculations/sec if spread evenly over 24 hours). Each calculation involves 50-100 tax rule evaluations. ACH file generation: 10M payment records per pay cycle, or about 20M per month at 2 runs/month. Tax document generation: 10M W-2s in January. Storage: 10M employees × 24 pay stubs/year × 100KB = 24TB/year of pay documents.
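
The estimate can be recomputed from its stated inputs. The short sketch below does only that; the constant names are illustrative, and the 30-day month and decimal terabytes are simplifying assumptions.

```python
# Back-of-envelope figures; every input is an assumption taken from the estimate above.
EMPLOYEES = 10_000_000
COMPANIES = 50_000
RUNS_PER_COMPANY_PER_MONTH = 2
PEAK_DAYS = 3
STUBS_PER_YEAR = 24
STUB_SIZE_KB = 100

runs_per_month = COMPANIES * RUNS_PER_COMPANY_PER_MONTH      # 100,000
runs_per_day = runs_per_month / 30                           # ~3,333 (30-day month assumed)
employees_per_run = EMPLOYEES // COMPANIES                   # 200
avg_calcs_per_day = runs_per_day * employees_per_run         # ~666,667
peak_calcs_per_day = EMPLOYEES / PEAK_DAYS                   # ~3.33M
peak_calcs_per_sec = peak_calcs_per_day / 86_400             # ~38.6 over a full day
storage_tb_per_year = EMPLOYEES * STUBS_PER_YEAR * STUB_SIZE_KB / 1e9  # KB -> TB

if __name__ == "__main__":
    print(f"runs/day: {runs_per_day:,.0f}")
    print(f"average calculations/day: {avg_calcs_per_day:,.0f}")
    print(f"peak calculations/sec: {peak_calcs_per_sec:.1f}")
    print(f"storage TB/year: {storage_tb_per_year:.0f}")
```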

High-Level Architecture

The system uses a batch-processing architecture with strict consistency guarantees. The Payroll Engine is the core: when an HR administrator initiates a payroll run, the engine fetches employee data (hours worked, salary, tax elections, benefit deductions) from the HR/time-tracking integrations, applies the calculation pipeline, and produces a payroll register (detailed breakdown for every employee). The register is presented for review; upon approval, the Payment Service generates ACH files and submits them to banking partners.

The Calculation Pipeline runs in a strict sequence: (1) Gross Pay computation (regular hours × rate + overtime + bonuses + commissions), (2) Pre-tax deductions (401k, HSA, commuter benefits), (3) Tax calculations (federal income tax using IRS Publication 15-T withholding tables, state taxes using state-specific rules, FICA (Social Security and Medicare), local taxes), (4) Post-tax deductions (Roth 401k, insurance premiums, garnishments in priority order per federal regulations), (5) Net Pay = Gross − pre-tax deductions − taxes − post-tax deductions. Each step produces an immutable calculation record with inputs, outputs, and the tax rule version used.
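
The sequence can be sketched as follows. This is a minimal illustration, not the actual engine: the function and class names are hypothetical, overtime and bonuses are omitted, and a single flat tax rate stands in for the real tax calculations. Amounts use Decimal so results stay exact to the cent.

```python
from dataclasses import dataclass
from decimal import Decimal, ROUND_HALF_UP
from typing import Optional

CENT = Decimal("0.01")


def money(value) -> Decimal:
    """Round to whole cents; values go through str() to avoid float error."""
    return Decimal(str(value)).quantize(CENT, rounding=ROUND_HALF_UP)


@dataclass(frozen=True)  # frozen: fields cannot be reassigned after creation
class StepRecord:
    step: str
    inputs: dict
    output: Decimal
    rule_version: Optional[str] = None


def gross_to_net(hours, rate, pre_tax, flat_tax_rate, post_tax,
                 rule_version="example-v1"):
    """Run the five steps in order and return (net_pay, step_records)."""
    records = []

    # (1) Gross pay: regular hours x rate only in this sketch.
    gross = money(Decimal(str(hours)) * Decimal(str(rate)))
    records.append(StepRecord("gross_pay", {"hours": hours, "rate": rate}, gross))

    # (2) Pre-tax deductions reduce taxable wages.
    pre_tax = money(pre_tax)
    records.append(StepRecord("pre_tax_deductions", {"gross": gross}, pre_tax))

    # (3) Taxes. Placeholder: one flat rate stands in for the tax engine.
    taxable = gross - pre_tax
    tax = money(taxable * Decimal(str(flat_tax_rate)))
    records.append(StepRecord("taxes", {"taxable": taxable}, tax, rule_version))

    # (4) Post-tax deductions.
    post_tax = money(post_tax)
    records.append(StepRecord("post_tax_deductions", {"after_tax": taxable - tax}, post_tax))

    # (5) Net pay.
    net = gross - pre_tax - tax - post_tax
    records.append(StepRecord("net_pay", {"gross": gross}, net))
    return net, records
```

For example, gross_to_net(80, "25.00", "200.00", "0.10", "50.00") yields a net pay of 1570.00 and five step records, one per pipeline step.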

The Tax Engine is a rules engine that encapsulates tax legislation as versioned, testable rules. Each jurisdiction (federal, 50 states, 1,000+ localities) has a rule module with effective dates. When tax laws change (annually for most jurisdictions), new rule versions are deployed without modifying existing ones — historical payroll calculations always reference the rule version that was active on the pay date. The rules engine uses a DSL (domain-specific language) that tax analysts can modify without engineering involvement.
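
One way to picture effective-dated rule versions is a per-jurisdiction list sorted by start date, with lookup by pay date. The sketch below is an assumption about how such a registry might look; RuleRegistry, RuleVersion, and the jurisdiction key "US-FED" are hypothetical names, and the DSL itself is not modeled.

```python
from bisect import bisect_right
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class RuleVersion:
    version: str
    effective_from: date


class RuleRegistry:
    """Keeps every published version; nothing is ever overwritten."""

    def __init__(self):
        self._by_jurisdiction = {}

    def publish(self, jurisdiction: str, rule: RuleVersion) -> None:
        versions = self._by_jurisdiction.setdefault(jurisdiction, [])
        versions.append(rule)
        versions.sort(key=lambda r: r.effective_from)

    def active_on(self, jurisdiction: str, pay_date: date) -> RuleVersion:
        """Return the latest version whose effective date is on or before pay_date."""
        versions = self._by_jurisdiction.get(jurisdiction, [])
        idx = bisect_right([r.effective_from for r in versions], pay_date)
        if idx == 0:
            raise LookupError(f"no rule for {jurisdiction} on {pay_date}")
        return versions[idx - 1]


registry = RuleRegistry()
registry.publish("US-FED", RuleVersion("2024-v1", date(2024, 1, 1)))
registry.publish("US-FED", RuleVersion("2025-v1", date(2025, 1, 1)))
```

With this data, a pay date of 2024-12-31 resolves to 2024-v1 and a pay date of 2025-01-01 resolves to 2025-v1, so re-running an old payroll picks up the rules that applied at the time.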

Core Components

Tax Calculation Engine

The tax engine implements a chain-of-responsibility pattern where each tax jurisdiction's calculator is applied in sequence. Federal income tax uses the IRS withholding tables (Publication 15-T) with inputs: filing status, pay frequency, gross taxable wages, and W-4 elections (allowances on pre-2020 forms; dependent credits, other income, and deductions on the 2020-and-later form) plus any additional withholding. State tax calculators handle the diversity of state tax systems: progressive brackets (California), flat rate (Illinois), no state income tax (Texas, Florida). Reciprocity agreements (e.g., a New Jersey resident working in Pennsylvania) are handled by the jurisdiction resolver, which determines the correct tax profile based on work location and residence. Each calculation is deterministic and reproducible: given the same inputs and rule version, the output is identical.
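
Two of the ideas above, progressive brackets and reciprocity, can be sketched in a few lines. The bracket values are made up for illustration and are not real tax rates. The resolver is a deliberate simplification: it assumes withholding defaults to the work state unless a reciprocity agreement applies, which real state rules do not always follow.

```python
from decimal import Decimal, ROUND_HALF_UP

# Illustrative brackets only: (lower bound, marginal rate). NOT real tax rates.
EXAMPLE_BRACKETS = [
    (Decimal("0"), Decimal("0.10")),
    (Decimal("1000"), Decimal("0.20")),
    (Decimal("3000"), Decimal("0.30")),
]


def progressive_tax(taxable: Decimal, brackets) -> Decimal:
    """Tax each slice of income at the marginal rate of its bracket."""
    tax = Decimal("0")
    for i, (lower, rate) in enumerate(brackets):
        if taxable <= lower:
            break
        upper = brackets[i + 1][0] if i + 1 < len(brackets) else None
        top = taxable if upper is None else min(taxable, upper)
        tax += (top - lower) * rate
    return tax.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


# (residence state, work state) pairs covered by a reciprocity agreement.
# Only the pair named in the text is listed; a real table would be larger.
RECIPROCITY = {("NJ", "PA")}


def withholding_state(residence: str, work: str) -> str:
    """Simplified resolver: pick the state whose income tax is withheld."""
    if residence == work:
        return work
    if (residence, work) in RECIPROCITY:
        return residence  # reciprocity: withhold for the home state only
    return work           # assumption: default to the work state
```

Because both functions depend only on their arguments, repeated calls with the same inputs return the same result, which is the determinism property the engine relies on.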

Payment Processing (ACH)

The Payment Service generates NACHA-formatted ACH files for direct deposit. Each file contains a file header, batch headers, entry detail records (employee bank routing + account numbers, amounts), batch control totals, and a file control record. The service performs pre-validation: bank account verification via micro-deposit confirmation, duplicate payment detection (same employee + same amount + same pay date within 48 hours), and balance verification (total payroll amount vs company funding account balance). Files are encrypted (PGP) and transmitted to the banking partner via SFTP. Returns typically arrive within 2 business days of settlement; the system reconciles settled payments and flags any returned deposits (for example, an invalid or closed account) for re-processing.
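
The pre-validation and control totals can be sketched as below. Entry, control_totals, and find_duplicates are hypothetical names and the routing numbers are made up. The entry hash follows the usual NACHA convention (sum of the first 8 digits of each receiving routing number, keeping the rightmost 10 digits), which should be checked against the current NACHA Operating Rules. The duplicate check matches on pay date only and does not model the 48-hour window.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    employee_id: str
    routing_number: str   # 9 digits: 8-digit bank identifier + check digit
    amount_cents: int     # integer cents avoid rounding issues
    pay_date: str         # ISO date, e.g. "2025-01-15"


def control_totals(entries):
    """Compute the totals a batch control record would carry."""
    entry_hash = sum(int(e.routing_number[:8]) for e in entries) % 10**10
    return {
        "entry_count": len(entries),
        "entry_hash": entry_hash,
        "total_credit_cents": sum(e.amount_cents for e in entries),
    }


def find_duplicates(entries):
    """Return entries repeating an earlier (employee, amount, pay date)."""
    seen = set()
    duplicates = []
    for e in entries:
        key = (e.employee_id, e.amount_cents, e.pay_date)
        if key in seen:
            duplicates.append(e)
        else:
            seen.add(key)
    return duplicates


# Made-up sample data for illustration.
e1 = Entry("E1", "123456780", 100_000, "2025-01-15")
e2 = Entry("E2", "987654320", 250_050, "2025-01-15")
e3 = Entry("E1", "123456780", 100_000, "2025-01-15")  # repeats e1
```

Running find_duplicates([e1, e2, e3]) flags e3, and control_totals([e1, e2]) gives the figures that the bank can compare against the entries it receives.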

Audit & Compliance System

Every payroll calculation produces an immutable audit record stored in an append-only ledger (for example, PostgreSQL tables with UPDATE and DELETE blocked through revoked privileges or triggers, plus write-ahead log archival). The audit record includes: employee_id, pay_period, each calculation step's inputs and outputs, tax rule versions applied, approver_id, and timestamps. Year-end tax document generation (W-2, 1099) aggregates these records across all pay periods, reconciling total wages, taxes withheld, and deductions. Quarterly 941 filings (employer's quarterly tax return) are auto-generated from the same data. A reconciliation engine runs after each payroll run, verifying that the sum of all net pay + taxes + deductions equals total gross pay across all employees.
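
The reconciliation check reduces to one identity per employee: gross pay = net pay + taxes + deductions. A minimal sketch follows; the dictionary layout loosely mirrors the payroll item fields but is an assumption, as is the function name.

```python
from decimal import Decimal


def reconcile(items):
    """Return employee_ids whose gross != net + taxes + deductions."""
    mismatches = []
    for item in items:
        accounted = (
            item["net_pay"]
            + sum(item["taxes"].values(), Decimal("0"))
            + sum(item["deductions"].values(), Decimal("0"))
        )
        if accounted != item["gross_pay"]:
            mismatches.append(item["employee_id"])
    return mismatches


# Made-up sample data: the second item is off by one cent.
sample_items = [
    {
        "employee_id": "E1",
        "gross_pay": Decimal("5000.00"),
        "net_pay": Decimal("3850.00"),
        "taxes": {"federal": Decimal("750.00"), "state": Decimal("150.00")},
        "deductions": {"401k": Decimal("250.00")},
    },
    {
        "employee_id": "E2",
        "gross_pay": Decimal("4000.00"),
        "net_pay": Decimal("3199.99"),
        "taxes": {"federal": Decimal("600.00")},
        "deductions": {"hsa": Decimal("200.00")},
    },
]
```

Here reconcile(sample_items) returns ["E2"]. Comparing exact Decimal values, rather than floats with a tolerance, is what lets a one-cent discrepancy surface.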

Database Design

The primary database is PostgreSQL with strict ACID guarantees. Core tables: employees (employee_id, company_id, name, ssn_encrypted, pay_rate, pay_frequency, filing_status, state_code, tax_elections JSONB, benefits JSONB, bank_account_encrypted), payroll_runs (run_id, company_id, pay_period_start, pay_period_end, status ENUM(draft, calculating, review, approved, processing, completed, failed), initiated_by, approved_by, total_gross, total_net, total_taxes), payroll_items (item_id, run_id, employee_id, gross_pay, deductions JSONB, taxes JSONB, net_pay, calculation_audit JSONB, created_at).
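
The status column on payroll_runs implies a state machine. The status values below come from the schema, but the allowed transitions are an assumption made for illustration; the source does not specify them.

```python
# Allowed status transitions for a payroll run (assumed, not from the schema).
ALLOWED_TRANSITIONS = {
    "draft": {"calculating"},
    "calculating": {"review", "failed"},
    "review": {"approved", "draft"},     # reviewer can send a run back
    "approved": {"processing"},
    "processing": {"completed", "failed"},
    "completed": set(),                  # terminal
    "failed": set(),                     # terminal; corrections use a new run
}


def transition(current: str, target: str) -> str:
    """Return the new status, or raise if the move is not allowed."""
    if current not in ALLOWED_TRANSITIONS:
        raise ValueError(f"unknown status: {current}")
    if target not in ALLOWED_TRANSITIONS[current]:
        raise ValueError(f"illegal transition: {current} -> {target}")
    return target
```

Guarding transitions this way means a run cannot jump from draft straight to approved, which supports the review-before-payment requirement.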

The calculation_audit JSONB in payroll_items contains the complete calculation trace: [{step: "gross_pay", inputs: {...}, output: 5000.00}, {step: "federal_tax", inputs: {...}, rule_version: "2025-v1", output: 750.00}, ...]. This enables any historical payroll to be fully explained and reproduced. Sensitive data (SSN, bank accounts) is encrypted using AWS KMS with envelope encryption. A separate analytics database (read replica with delayed replication) serves reporting queries without impacting payroll processing.
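
When the trace is read back in application code, amounts should not pass through binary floats. A small sketch, assuming the trace arrives as JSON text shaped like the example above (the elided inputs are left out here): Python's json module can parse decimal literals directly into Decimal.

```python
import json
from decimal import Decimal

# Sample trace shaped like the calculation_audit example; inputs omitted.
TRACE_JSON = """
[
  {"step": "gross_pay", "output": 5000.00},
  {"step": "federal_tax", "rule_version": "2025-v1", "output": 750.00}
]
"""


def load_trace(raw: str):
    """Parse the trace, keeping decimal literals exact instead of floats."""
    return json.loads(raw, parse_float=Decimal)


def step_output(trace, step: str) -> Decimal:
    """Return the recorded output of the named step."""
    for record in trace:
        if record["step"] == step:
            return record["output"]
    raise KeyError(step)


trace = load_trace(TRACE_JSON)
```

With this loader, step_output(trace, "federal_tax") is the exact value 750.00 as a Decimal, so a replayed calculation can be compared to the stored one with plain equality.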

API Design

POST /api/v1/payroll/runs — Initiate a payroll run; body contains company_id, pay_period, pay_date; returns run_id
GET /api/v1/payroll/runs/{run_id}/preview — Fetch the calculated payroll register for review before approval
POST /api/v1/payroll/runs/{run_id}/approve — Approve a payroll run for payment processing; triggers ACH generation
GET /api/v1/employees/{employee_id}/pay-stubs?year=2025 — Fetch pay stubs for an employee for a given year

Scaling & Bottlenecks

Payroll processing is concentrated in narrow windows: 60% of payroll runs occur on the 1st and 15th of each month. During these peak days, the system processes 3.3M employee calculations in 4 hours = 229 calculations/sec. The calculation pipeline is embarrassingly parallel across employees (no inter-employee dependencies), enabling horizontal scaling with worker pools. Each worker processes one employee's full gross-to-net calculation in ~100ms, or about 10 calculations/sec. A fleet of 50 workers therefore provides about 500 calculations/sec, roughly twice the peak rate. The bottleneck then becomes the tax engine's rule evaluation: complex multi-jurisdiction employees (living in one state, working in another, with local taxes) require 100+ rule evaluations.
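
The sizing arithmetic and the fan-out across employees can be sketched as below. Both function names are hypothetical. The thread pool is only a stand-in: CPU-bound calculations in CPython would more likely run in separate processes or on separate worker machines.

```python
import math
from concurrent.futures import ThreadPoolExecutor


def workers_needed(calculations: int, window_seconds: int,
                   seconds_per_calc: float) -> int:
    """Minimum workers so all calculations finish within the window."""
    rate = calculations / window_seconds       # required calculations/sec
    return math.ceil(rate * seconds_per_calc)  # each worker does 1/latency per sec


def run_parallel(employee_ids, calculate, max_workers=8):
    """Apply calculate() to each employee independently and collect results."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return dict(zip(employee_ids, pool.map(calculate, employee_ids)))


if __name__ == "__main__":
    four_hours = 4 * 3600
    print(workers_needed(3_300_000, four_hours, 0.1))   # peak-day load
    print(workers_needed(10_000_000, four_hours, 0.1))  # all employees in one window
```

At 100ms per calculation, the peak-day load needs at least 23 workers, so 50 leaves headroom. Processing all 10M employees inside a single 4-hour window, as the non-functional requirement states, would need at least 70.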

ACH file generation and submission is time-critical: files must be submitted to the banking partner by its cutoff (5 PM ET in this design) for next-day settlement. The Payment Service generates files in parallel per banking partner and submits them via SFTP; the ACH network then routes each entry to the employee's bank, so files do not need to be split by receiving bank. Submission rate limits at the banking partner (assumed here to be about 10 file submissions per minute) are the external bottleneck; consolidating entries into a few large multi-batch NACHA files per partner avoids this issue.

Key Trade-offs

Versioned tax rules DSL vs hardcoded calculations: The DSL enables tax analysts to update rules without engineering deployments, but adds complexity in DSL design and testing — essential given that tax laws change annually across 1,000+ jurisdictions
Immutable calculation records vs updateable records: Immutability provides audit-grade traceability but means corrections require a new adjustment payroll run rather than modifying existing records — this aligns with accounting best practices (reversals, not edits)
Batch processing vs real-time payroll: Batch processing in a defined window enables review-before-payment (critical for payroll accuracy), but means employees cannot see real-time pay calculations — preview APIs address this for estimation purposes
Single-database strong consistency vs distributed architecture: Payroll's accuracy requirements (correct to the cent, no double payments) favor a single PostgreSQL instance with synchronous replication over a distributed database — the scale (229 calculations/sec peak) is well within a single instance's capacity
