System Design: Tax Filing Platform

Requirements

Functional Requirements:

Taxpayers submit annual returns with income, deductions, and supporting documents
Real-time validation of form completeness and arithmetic before submission
Integration with employer wage reporting APIs to pre-populate income fields
Calculate refund or amount owed and initiate payment/disbursement
Issue confirmation receipts with a legally binding submission timestamp
Support amended returns, extensions, and installment payment plans

Non-Functional Requirements:

Handle 10M submissions during the final week before the filing deadline
99.95% availability; filing system downtime during deadline week is a legal issue
All financial data encrypted at rest with AES-256; in transit with TLS 1.3
Immutable audit trail for every state change in a submission record
Accessibility: WCAG 2.1 AA compliance across all interfaces

Scale Estimation

For a country with 150M taxpayers filing annually, assume ~60% file in the final 4 weeks: 90M returns over 28 days, or ~3.2M per day on average. A 3x spike in the final week gives ~9.6M per day, or ~111 returns/second when averaged over a full day; intra-day peaks will be higher. Document storage: 5 attachments × 1MB average per return × 90M returns = 450TB per season. Validation API calls to employer wage services: ~500M calls/season.
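The arithmetic behind these figures can be reproduced in a few lines. This is only a back-of-envelope sketch: every input is one of the assumptions stated above, and the variable names are illustrative. Carrying the unrounded daily figure through gives roughly 112 returns/second; the ~111 above comes from rounding to 9.6M per day first.

```python
# Back-of-envelope check of the scale estimate; all inputs are the
# assumptions stated in the text, not measured values.
TAXPAYERS = 150_000_000
LATE_FILER_PERCENT = 60          # share filing in the final 4 weeks
WINDOW_DAYS = 28
FINAL_WEEK_MULTIPLIER = 3        # spike over the 4-week daily average
ATTACHMENTS_PER_RETURN = 5
ATTACHMENT_MB = 1
SECONDS_PER_DAY = 86_400

late_returns = TAXPAYERS * LATE_FILER_PERCENT // 100        # 90,000,000
avg_per_day = late_returns / WINDOW_DAYS                    # ~3.2M
peak_per_day = avg_per_day * FINAL_WEEK_MULTIPLIER          # ~9.6M
peak_per_second = peak_per_day / SECONDS_PER_DAY            # ~111.6

# 1 TB = 1,000,000 MB (decimal units)
storage_tb = late_returns * ATTACHMENTS_PER_RETURN * ATTACHMENT_MB / 1_000_000

print(f"late returns:      {late_returns:,}")
print(f"average per day:   {avg_per_day:,.0f}")
print(f"final-week per day:{peak_per_day:,.0f}")
print(f"per second:        {peak_per_second:.1f}")
print(f"attachment storage:{storage_tb:.0f} TB")
```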

High-Level Architecture

The platform is structured around a Submission Pipeline: an API Gateway receives submissions, a Validation Service checks business rules synchronously, and a Processing Queue handles the asynchronous computation (tax calculation, fraud scoring, payment initiation). This decoupling lets the API tier respond quickly with an acknowledgment while heavy computation runs in the background.

Third-party integrations (employer wage APIs, bank verification, payment processors) are wrapped in an Integration Hub with circuit breakers and retry queues. When an upstream employer API is slow during peak season, submissions queue up rather than timing out. Pre-population of returns is done in a separate pre-filing window: the system calls employer APIs in batch during off-peak hours and stores prefill data in a staging area for taxpayers to review.
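To make the circuit-breaker idea concrete, here is a minimal sketch of what the Integration Hub might wrap around each upstream call. The class name, thresholds, and state labels are assumptions chosen for illustration, not details from the design above; a production system would more likely use an established resilience library.

```python
import time
from typing import Any, Callable, Optional


class CircuitOpenError(RuntimeError):
    """Raised when a call is rejected because the breaker is open."""


class CircuitBreaker:
    """Hypothetical breaker: opens after N consecutive failures, then
    allows a trial call once the reset timeout has elapsed."""

    def __init__(self, failure_threshold: int = 3, reset_timeout_s: float = 30.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._failure_threshold = failure_threshold
        self._reset_timeout_s = reset_timeout_s
        self._clock = clock              # injectable so tests can control time
        self._failures = 0
        self._opened_at: Optional[float] = None

    @property
    def state(self) -> str:
        if self._opened_at is None:
            return "closed"
        if self._clock() - self._opened_at >= self._reset_timeout_s:
            return "half_open"
        return "open"

    def call(self, fn: Callable[..., Any], *args: Any, **kwargs: Any) -> Any:
        if self.state == "open":
            # Fail fast; the caller can park the request on a retry queue.
            raise CircuitOpenError("upstream temporarily disabled")
        try:
            result = fn(*args, **kwargs)
        except Exception:
            self._failures += 1
            # A failed trial call in half-open state re-opens immediately.
            if self._opened_at is not None or self._failures >= self._failure_threshold:
                self._opened_at = self._clock()
            raise
        # Any success closes the breaker and clears the failure count.
        self._failures = 0
        self._opened_at = None
        return result
```

A caller that catches CircuitOpenError can place the request on the retry queue instead of waiting on a slow employer API, which matches the queue-rather-than-time-out behavior described above.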

A Fraud Detection Service consumes every submission event from Kafka and runs rule-based and ML-based checks asynchronously. Flagged returns are routed to a Review Queue for auditor attention rather than blocking the filing confirmation. Audit investigators access submissions through a separate internal portal with its own access control tier.

Core Components

Submission & Validation Service

Handles the synchronous leg of the filing flow. Validates schema conformance, required fields, arithmetic consistency (income - deductions = taxable income), and cross-field business rules. Returns structured validation errors in a single response, not one error at a time. Once validation passes, assigns a unique submission ID, writes to the database with status PENDING, and enqueues a processing message. The response to the taxpayer includes the submission ID and a timestamped receipt hash.
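A minimal sketch of the "all errors in one response" behavior might look like the following. The field names, error codes, and the ValidationIssue type are hypothetical; only the required-field check and the income - deductions = taxable income rule come from the description above.

```python
from dataclasses import dataclass
from decimal import Decimal, InvalidOperation

# Hypothetical field names for illustration only.
REQUIRED_FIELDS = ("taxpayer_id", "tax_year", "income", "deductions", "taxable_income")
AMOUNT_FIELDS = ("income", "deductions", "taxable_income")


@dataclass(frozen=True)
class ValidationIssue:
    field: str
    code: str
    message: str


def validate_return(payload: dict) -> list:
    """Collect every problem found instead of stopping at the first one."""
    issues = []

    for name in REQUIRED_FIELDS:
        if payload.get(name) in (None, ""):
            issues.append(ValidationIssue(name, "required", f"{name} is required"))

    amounts = {}
    for name in AMOUNT_FIELDS:
        raw = payload.get(name)
        if raw in (None, ""):
            continue  # already reported as missing
        try:
            value = Decimal(str(raw))
        except InvalidOperation:
            value = None
        if value is None or not value.is_finite():
            issues.append(ValidationIssue(name, "not_a_number", f"{name} must be a number"))
            continue
        if value < 0:
            issues.append(ValidationIssue(name, "negative", f"{name} must not be negative"))
        amounts[name] = value

    # Arithmetic consistency is only checked when all three amounts parsed.
    if len(amounts) == len(AMOUNT_FIELDS):
        expected = amounts["income"] - amounts["deductions"]
        if amounts["taxable_income"] != expected:
            issues.append(ValidationIssue(
                "taxable_income", "arithmetic_mismatch",
                f"expected {expected}, got {amounts['taxable_income']}"))
    return issues
```

An empty list means the return can be assigned a submission ID and enqueued; a non-empty list is sent back to the taxpayer in a single response.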

Tax Calculation Engine

A stateless computation service consuming from the processing queue. Applies the tax year's rate tables, credit rules, and deduction limits to compute the liability or refund. Rate tables are loaded from a versioned configuration store (not hardcoded) so annual tax law changes are deployable without code changes. Results are written back to the submission record with status CALCULATED. For complex returns (business income, foreign assets), the engine routes to a specialized calculation worker pool.
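As an illustration of rate tables living in versioned configuration rather than in code, the sketch below computes a liability from a marginal bracket table keyed by tax year and version. The brackets and rates are made-up numbers, the in-memory dictionary stands in for the configuration store, and credits and deduction limits are left out.

```python
from decimal import Decimal, ROUND_HALF_UP

# Stand-in for the versioned configuration store. Each entry is
# (upper bound of bracket, marginal rate); None means "no upper bound".
# The figures are invented for illustration and match no real tax code.
RATE_TABLES = {
    (2024, "v1"): [
        (Decimal("10000"), Decimal("0.10")),
        (Decimal("40000"), Decimal("0.20")),
        (None, Decimal("0.30")),
    ],
}


def compute_liability(taxable_income, tax_year: int, version: str) -> Decimal:
    """Apply marginal rates bracket by bracket.

    taxable_income should be a str, int, or Decimal (not a float) so that
    no binary rounding error enters the calculation.
    """
    brackets = RATE_TABLES[(tax_year, version)]
    income = max(Decimal(taxable_income), Decimal(0))
    tax = Decimal(0)
    lower = Decimal(0)
    for upper, rate in brackets:
        if income <= lower:
            break
        top = income if upper is None else min(income, upper)
        tax += (top - lower) * rate   # only the slice inside this bracket
        if upper is None:
            break
        lower = upper
    return tax.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


print(compute_liability("50000", 2024, "v1"))
```

Because the function reads its brackets from the table, publishing a (2025, "v1") entry changes the calculation for the new tax year without touching the code, which is the property the design relies on.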

Payment & Disbursement Service

For refunds: initiates ACH transfers via a banking API, polling for confirmation and updating status through PAYMENT_INITIATED → PAYMENT_CONFIRMED. For amounts owed: generates payment instructions with a unique payment reference, monitors the payment processing system for receipt confirmation, and issues a final settled receipt. All payment state transitions are logged to the audit trail. Failed payments trigger a retry workflow with exponential backoff and taxpayer notification.
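The retry workflow could be sketched roughly as follows. The function names, default delays, and the on_give_up notification hook are assumptions for illustration; jitter, which is usually added in practice to avoid synchronized retries, is omitted to keep the delays easy to follow.

```python
import time
from typing import Any, Callable, Optional


def backoff_delay(attempt: int, base_s: float = 1.0, factor: float = 2.0,
                  cap_s: float = 300.0) -> float:
    """Delay before the retry that follows failed attempt number `attempt` (0-based)."""
    return min(cap_s, base_s * factor ** attempt)


def retry_with_backoff(operation: Callable[[], Any], max_attempts: int = 5,
                       sleep: Callable[[float], None] = time.sleep,
                       on_give_up: Optional[Callable[[Exception], None]] = None) -> Any:
    """Run `operation`, retrying with exponentially growing delays.

    `on_give_up` is a hypothetical hook, e.g. to notify the taxpayer once
    all attempts are exhausted.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    last_error: Optional[Exception] = None
    for attempt in range(max_attempts):
        try:
            return operation()
        except Exception as exc:
            last_error = exc
            if attempt < max_attempts - 1:
                sleep(backoff_delay(attempt))
    assert last_error is not None
    if on_give_up is not None:
        on_give_up(last_error)
    raise last_error
```

With the defaults, the waits between attempts are 1, 2, 4, and 8 seconds, capped at 300 seconds for longer schedules.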

Database Design

Submission records are stored in PostgreSQL with state-machine fields: submission_id, taxpayer_id, tax_year, status ENUM, submitted_at, calculated_at, refund_amount, tax_owed. A companion append-only submission_history table records every status transition with its timestamp and actor, forming the immutable audit trail. Row-level security policies restrict access so that agents can read only the returns assigned to their region.
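The relationship between the status field and the append-only history can be sketched in memory as below. The status names are the ones used in this document; the set of allowed transitions and the class and field names are assumptions, and a real implementation would enforce the same rules in PostgreSQL rather than in application memory.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional

# Assumed transition graph built from the statuses named in the text.
ALLOWED_TRANSITIONS = {
    "PENDING": {"CALCULATED"},
    "CALCULATED": {"PAYMENT_INITIATED"},
    "PAYMENT_INITIATED": {"PAYMENT_CONFIRMED"},
    "PAYMENT_CONFIRMED": set(),
}


@dataclass(frozen=True)
class HistoryEntry:
    submission_id: str
    from_status: Optional[str]
    to_status: str
    actor: str
    at: datetime


class SubmissionLog:
    """In-memory stand-in for the append-only submission_history table."""

    def __init__(self) -> None:
        self._entries = []   # entries are only ever appended, never edited

    def current_status(self, submission_id: str) -> Optional[str]:
        for entry in reversed(self._entries):
            if entry.submission_id == submission_id:
                return entry.to_status
        return None

    def transition(self, submission_id: str, to_status: str, actor: str) -> HistoryEntry:
        current = self.current_status(submission_id)
        if current is None:
            if to_status != "PENDING":
                raise ValueError("a new submission must start as PENDING")
        elif to_status not in ALLOWED_TRANSITIONS[current]:
            raise ValueError(f"illegal transition {current} -> {to_status}")
        entry = HistoryEntry(submission_id, current, to_status, actor,
                             datetime.now(timezone.utc))
        self._entries.append(entry)
        return entry

    def history(self, submission_id: str) -> tuple:
        return tuple(e for e in self._entries if e.submission_id == submission_id)
```

Deriving the current status from the latest history entry is one design option; the schema above instead stores status on the submission row as well, which keeps status reads cheap at the cost of writing both tables in one transaction.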

Attachments (W-2s, 1099s, receipts) are stored in S3 using server-side encryption (SSE-KMS) with customer-managed keys. S3 Object Lock in compliance mode prevents deletion or overwrite for a configurable retention period (7 years for tax records). A relational index maps each submission_id to its S3 object keys. PII fields (SSN, bank account) are tokenized by a format-preserving encryption service before storage.

API Design

POST /api/v1/returns — submits a new tax return; synchronous validation, async processing; returns {submission_id, receipt_hash, status: "PENDING"}.

GET /api/v1/returns/{submissionId} — returns current status, calculated amounts, and any review flags.

POST /api/v1/returns/{submissionId}/amend — initiates an amended return, locking the original from further amendment until resolved.

GET /api/v1/prefill/{taxYear} — returns pre-populated income data from employer integrations for the authenticated taxpayer.
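The receipt_hash returned by POST /api/v1/returns is not specified further in this design. One plausible construction, shown here purely as an assumption, is a SHA-256 digest over a canonical JSON encoding of the submission ID, the UTC submission timestamp, and the submitted payload.

```python
import hashlib
import json
from datetime import datetime, timezone


def receipt_hash(submission_id: str, submitted_at: datetime, payload: dict) -> str:
    """Hypothetical receipt hash: SHA-256 over a canonical JSON document.

    submitted_at must be timezone-aware; it is normalized to UTC so the
    same instant always produces the same hash.
    """
    if submitted_at.tzinfo is None:
        raise ValueError("submitted_at must be timezone-aware")
    canonical = json.dumps(
        {
            "submission_id": submission_id,
            "submitted_at": submitted_at.astimezone(timezone.utc).isoformat(),
            "payload": payload,
        },
        sort_keys=True,              # key order must not affect the hash
        separators=(",", ":"),       # no insignificant whitespace
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


ts = datetime(2025, 4, 15, 23, 59, 58, tzinfo=timezone.utc)
print(receipt_hash("S-1", ts, {"income": "50000", "deductions": "12000"}))
```

Sorting keys and fixing the separators means the taxpayer, or an auditor, can recompute the hash later from the stored record and compare it with the receipt.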

Scaling & Bottlenecks

The filing deadline creates one of the most predictable traffic spikes in government systems. Auto-scaling policies for the submission tier are pre-configured to scale to 10x normal capacity starting 3 days before the deadline. Database connection pooling (PgBouncer) is critical: submission service instances should connect through the pooler rather than hold direct connections to PostgreSQL, so that scaling out the service tier does not exhaust the database's connection limit. Read replicas handle status polling traffic (most users check status repeatedly), keeping it off the primary write path.

The validation service is CPU-bound during complex return validation. A separate high-CPU compute tier handles these, while simple returns (W-2 only) are routed to a lightweight validation pool. The integration hub's employer API calls are the main latency source; pre-batch-fetching during off-peak hours and caching the results dramatically reduces real-time dependency on third-party APIs during the deadline rush.
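The routing decision between the two validation pools can be as simple as a predicate over the income types present on the return. The pool names and income-type labels below are hypothetical; only the "W-2 only goes to the lightweight pool" rule comes from the text.

```python
# Hypothetical labels; the text only distinguishes W-2-only returns
# from everything else.
LIGHTWEIGHT_POOL = "lightweight"
HIGH_CPU_POOL = "high_cpu"
SIMPLE_INCOME_TYPES = frozenset({"w2"})


def choose_validation_pool(income_types: set) -> str:
    """Route W-2-only returns to the lightweight pool, the rest to high-CPU."""
    if income_types and income_types <= SIMPLE_INCOME_TYPES:
        return LIGHTWEIGHT_POOL
    # Unknown or empty input is treated as complex, the safer default.
    return HIGH_CPU_POOL


print(choose_validation_pool({"w2"}))
print(choose_validation_pool({"w2", "business"}))
```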

Key Trade-offs

Synchronous vs. asynchronous processing: Fully synchronous processing gives taxpayers immediate results but blocks during computation; the hybrid approach (sync validation, async calculation) gives fast acknowledgment with a polling model for results.
Pre-population accuracy vs. staleness: Prefilling from employer data reduces errors, but data fetched days earlier may miss late-amended W-2s; a refresh check at submission time detects such discrepancies.
Audit trail granularity: Logging every field change provides maximum auditability but at significant storage cost; logging status transitions and full snapshots at key milestones is a practical middle ground.
Monolith vs. microservices: A monolith simplifies the transaction model for submission + calculation + payment; microservices allow independent scaling of the calculation tier during peak but introduce distributed transaction complexity.
