System Design: Government Benefits Eligibility System
Design a government benefits eligibility determination system supporting millions of applicants across multiple programs. This design covers the rules engine, multi-agency data federation, auditability, and accessible citizen-facing interfaces.
Requirements
Functional Requirements:
- Citizens apply for benefits (unemployment, food assistance, housing, healthcare subsidies) through a unified portal
- Rules engine evaluates eligibility across multiple programs simultaneously from a single application
- Integrate with authoritative data sources: tax records, wage data, DMV, social security, healthcare registries
- Issue determinations with detailed explanations of eligibility or denial reasons
- Support appeals workflow: citizen challenges a determination; case worker reviews and can override
- Track benefit issuance, renewal deadlines, and life event changes affecting eligibility
Non-Functional Requirements:
- Eligibility determination returned within 5 seconds for 80% of cases (data lookup dependent)
- 99.9% availability; many applicants are in crisis situations
- Audit trail: every determination must be reproducible from stored inputs and rule version
- WCAG 2.1 AA accessibility; support for 20+ languages
- Strict data minimization — only collect and retain data required for benefit programs
Scale Estimation
For a national system: 300M citizens, ~15% engaged with at least one benefit program = 45M active cases. Annual new applications: 10M/year = ~27k/day. Real-time eligibility checks (renewals, life event changes): 500k/day = ~6/second on average. Rules updates when legislation changes: ~50 major updates/year, each requiring re-evaluation of affected cases. Integration API calls to external agencies: ~3-5 calls per determination = up to ~135k API calls/day for new applications alone; renewal and life-event checks that miss the cache add to this total.
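The arithmetic behind these estimates can be checked with a short script. This is a sketch using only the figures stated above. The last value is an assumption rather than a figure from the estimate: it shows the ceiling if every real-time check also triggered the maximum number of external calls with no cache hits.

```python
# Back-of-envelope check of the scale estimate. All inputs come from the text above.
CITIZENS = 300_000_000
ENGAGED_PERCENT = 15
NEW_APPLICATIONS_PER_YEAR = 10_000_000
REALTIME_CHECKS_PER_DAY = 500_000
MAX_CALLS_PER_DETERMINATION = 5

active_cases = CITIZENS * ENGAGED_PERCENT // 100
new_apps_per_day = NEW_APPLICATIONS_PER_YEAR / 365
checks_per_second = REALTIME_CHECKS_PER_DAY / 86_400
api_calls_new_apps = new_apps_per_day * MAX_CALLS_PER_DETERMINATION

# Assumption (not in the estimate): every real-time check misses the cache
# and also makes the maximum number of external calls.
api_calls_worst_case = (new_apps_per_day + REALTIME_CHECKS_PER_DAY) * MAX_CALLS_PER_DETERMINATION

print(f"active cases:            {active_cases:,}")
print(f"new applications/day:    {new_apps_per_day:,.0f}")
print(f"real-time checks/second: {checks_per_second:.1f}")
print(f"API calls/day (new):     {api_calls_new_apps:,.0f}")
print(f"API calls/day (ceiling): {api_calls_worst_case:,.0f}")
```

The average rates are low. Capacity planning would more likely be driven by peaks, such as re-evaluations after a rules update, than by these daily averages.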
High-Level Architecture
The system is organized around an Application Intake Service, a Data Aggregation Service, a Rules Engine, and a Case Management Service. The Application Intake Service provides the citizen-facing portal, handles form submissions with progressive disclosure (only shows relevant sections based on prior answers), and validates completeness. Submitted applications are queued for processing.
The Data Aggregation Service orchestrates calls to authoritative external sources: IRS wage data, Social Security Administration records, state DMV, healthcare registries. It fans out calls in parallel, normalizes responses into a canonical applicant data model, caches results with short TTLs (data freshness is critical for accuracy), and handles partial data gracefully — a determination can proceed with available data while noting which lookups failed or timed out.
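The parallel fan-out with partial-data handling described above could look roughly like the following. This is a minimal sketch, not the service's actual implementation. The `Source` and `AggregatedData` types, the stub fetchers, and the timeout values are all hypothetical.

```python
import asyncio
from dataclasses import dataclass, field
from typing import Any, Awaitable, Callable, Dict, List


@dataclass
class Source:
    fetch: Callable[[], Awaitable[Any]]  # hypothetical async client call
    timeout_s: float                     # per-source timeout


@dataclass
class AggregatedData:
    values: Dict[str, Any] = field(default_factory=dict)
    failed_sources: List[str] = field(default_factory=list)


async def _fetch_one(name: str, source: Source):
    """Return (name, value, ok); a timeout or error becomes ok=False."""
    try:
        value = await asyncio.wait_for(source.fetch(), timeout=source.timeout_s)
        return name, value, True
    except Exception:  # includes asyncio.TimeoutError
        return name, None, False


async def aggregate(sources: Dict[str, Source]) -> AggregatedData:
    # Fan out to every source at once; the total wait is bounded by the longest timeout.
    results = await asyncio.gather(*(_fetch_one(n, s) for n, s in sources.items()))
    data = AggregatedData()
    for name, value, ok in results:
        if ok:
            data.values[name] = value
        else:
            data.failed_sources.append(name)  # noted, not fatal
    return data


async def _fast_wages():
    await asyncio.sleep(0.01)
    return {"annual_wages": 18000}


async def _slow_ssa():
    await asyncio.sleep(1.0)  # simulates an upstream stall
    return {"ssn_verified": True}


result = asyncio.run(aggregate({
    "wages": Source(_fast_wages, timeout_s=0.5),
    "ssa": Source(_slow_ssa, timeout_s=0.05),
}))
print(result.values, result.failed_sources)
```

Here the stalled source is recorded in `failed_sources` instead of failing the whole request, so downstream evaluation can proceed and note the gap.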
The Rules Engine receives the normalized applicant data model and evaluates all applicable benefit programs in a single pass. Rules are authored in a domain-specific language by policy analysts (not engineers), version-controlled, and deployed without code changes. Each evaluation records the rule version used, inputs, outputs, and intermediate evaluations — enabling any past determination to be exactly reproduced for audit or appeal purposes.
Core Components
Application Intake & Progressive Disclosure Engine
A React-based form engine driven by a JSON schema that defines conditional field visibility based on prior answers. The schema is authored by policy analysts and versioned alongside benefit rules. Partial applications auto-save to a draft store every 30 seconds. Completed applications are submitted to a durable queue. Multilingual support is implemented via i18n message catalogs with human-translated strings for all 20+ supported languages — machine translation is never used for legal benefit determinations.
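Conditional field visibility can be illustrated with a small evaluator. The schema shape below (a `visible_when` condition with `field` and `equals` keys) and the field names are assumptions made for this sketch. It is written in Python for brevity, although the same logic would run in the front-end form engine.

```python
from typing import Any, Dict, List

# Hypothetical schema: a field without "visible_when" is always shown.
FORM_SCHEMA: List[Dict[str, Any]] = [
    {"name": "employment_status"},
    {"name": "last_employer",
     "visible_when": {"field": "employment_status", "equals": "unemployed"}},
    {"name": "has_children"},
    {"name": "number_of_children",
     "visible_when": {"field": "has_children", "equals": True}},
]


def visible_fields(schema: List[Dict[str, Any]], answers: Dict[str, Any]) -> List[str]:
    """Return the names of the fields to show, given the answers so far."""
    shown = []
    for spec in schema:
        condition = spec.get("visible_when")
        if condition is None or answers.get(condition["field"]) == condition["equals"]:
            shown.append(spec["name"])
    return shown


print(visible_fields(FORM_SCHEMA, {}))
print(visible_fields(FORM_SCHEMA, {"employment_status": "unemployed", "has_children": False}))
```

Keeping the conditions as data rather than code is what lets the schema be versioned alongside the benefit rules.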
Rules Engine
A forward-chaining rules engine (similar to Drools or a custom implementation) evaluating eligibility criteria: income thresholds, household size, residency requirements, asset limits, work history. Rules are stored as versioned JSON documents in a rules repository. Each determination run loads the rules version active at submission time. The engine produces a decision record: {program_id, eligible: bool, reasons: [...], rule_version, evaluated_at, input_hash}. This record is stored immutably — it cannot be modified even if rules change later.
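A decision record of this shape might be produced as follows. This is a sketch under stated assumptions: the rule document, the income threshold, and the single-rule `evaluate` function are invented for illustration and stand in for a real forward-chaining engine. In-process immutability is approximated with a frozen dataclass, whereas real storage would need to enforce it at the database level.

```python
import hashlib
import json
from dataclasses import FrozenInstanceError, dataclass
from datetime import datetime, timezone
from typing import Tuple


@dataclass(frozen=True)  # attributes cannot be reassigned after creation
class DecisionRecord:
    program_id: str
    eligible: bool
    reasons: Tuple[str, ...]
    rule_version: str
    evaluated_at: str
    input_hash: str


def hash_inputs(applicant: dict) -> str:
    """Hash a canonical JSON form so key order does not change the digest."""
    canonical = json.dumps(applicant, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Hypothetical rule document; the threshold is invented for illustration.
RULES = {
    "rule_version": "2024.1",
    "program_id": "food_assistance",
    "income_limit_per_member": 1500,
}


def evaluate(applicant: dict, rules: dict) -> DecisionRecord:
    limit = rules["income_limit_per_member"] * applicant["household_size"]
    eligible = applicant["monthly_income"] <= limit
    comparison = "<=" if eligible else ">"
    reason = f"monthly_income {applicant['monthly_income']} {comparison} limit {limit}"
    return DecisionRecord(
        program_id=rules["program_id"],
        eligible=eligible,
        reasons=(reason,),
        rule_version=rules["rule_version"],
        evaluated_at=datetime.now(timezone.utc).isoformat(),
        input_hash=hash_inputs(applicant),
    )


record = evaluate({"monthly_income": 2400, "household_size": 2}, RULES)
print(record)
```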
Appeals & Case Management Service
Manages the lifecycle of a benefit case from application through active enrollment, renewal, and closure. Handles appeals by creating a case review record linked to the original determination, assigning to a case worker queue, and presenting the case worker with both the original determination record and the ability to run a fresh evaluation with corrected data. Case workers can issue manual overrides with a mandatory written justification, which is appended to the case audit trail. SLA tracking ensures appeals are resolved within the statutory response window.
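The mandatory-justification rule for overrides can be enforced at the point where the audit entry is written. The sketch below uses an in-memory list as a stand-in for the case audit trail. The `AuditEntry` fields and the `record_override` function are hypothetical names.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import List


@dataclass(frozen=True)
class AuditEntry:
    case_id: str
    actor: str
    action: str
    detail: str
    at: str


def record_override(trail: List[AuditEntry], case_id: str, case_worker_id: str,
                    new_outcome: str, justification: str) -> AuditEntry:
    """Append an override entry; refuse overrides without a written justification."""
    if not justification.strip():
        raise ValueError("a written justification is required for a manual override")
    entry = AuditEntry(
        case_id=case_id,
        actor=case_worker_id,
        action=f"override:{new_outcome}",
        detail=justification.strip(),
        at=datetime.now(timezone.utc).isoformat(),
    )
    trail.append(entry)  # entries are only ever appended, never edited
    return entry


audit_trail: List[AuditEntry] = []
record_override(audit_trail, "case-42", "cw-7", "eligible",
                "Pay stubs show income below the limit; wage data was stale.")
print(len(audit_trail), audit_trail[0].action)
```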
Database Design
PostgreSQL stores application and case records with JSONB columns for flexible benefit-program-specific attributes. Core tables: applications (application_id, citizen_id, submitted_at, status, program_codes[]), determinations (determination_id, application_id, program_id, eligible, reasons JSONB, rule_version, input_snapshot JSONB, created_at), cases (case_id, determination_id, status, enrolled_at, expires_at, case_worker_id). The input_snapshot column stores the full normalized applicant data at determination time — essential for audit reproducibility.
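Storing `input_snapshot` together with `rule_version` is what makes an audit replay possible. The sketch below assumes a registry of evaluators keyed by rule version and uses a plain dict in place of a `determinations` row. The two rule versions and their thresholds are invented.

```python
import json
from typing import Callable, Dict


# Hypothetical evaluators for two rule versions with different income limits.
def evaluate_2024_1(snapshot: dict) -> bool:
    return snapshot["monthly_income"] <= 3000


def evaluate_2024_2(snapshot: dict) -> bool:
    return snapshot["monthly_income"] <= 3200


EVALUATORS: Dict[str, Callable[[dict], bool]] = {
    "2024.1": evaluate_2024_1,
    "2024.2": evaluate_2024_2,
}

# Stand-in for a row of the determinations table.
determination_row = {
    "determination_id": "det-001",
    "program_id": "food_assistance",
    "eligible": False,
    "rule_version": "2024.1",
    "input_snapshot": json.dumps({"monthly_income": 3100}),
}


def replay_matches(row: dict, evaluators: Dict[str, Callable[[dict], bool]]) -> bool:
    """Re-run the stored inputs against the stored rule version and compare outcomes."""
    snapshot = json.loads(row["input_snapshot"])
    return evaluators[row["rule_version"]](snapshot) == row["eligible"]


print(replay_matches(determination_row, EVALUATORS))
```

Replaying against the latest rules instead of the stored version would give a different answer for this row, which is why the version is pinned on the record.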
External agency data cached by the Data Aggregation Service is stored in Redis with source-specific TTLs (wage data: 24 hours; healthcare registry: 1 hour). A PostgreSQL data_fetch_log table records every external API call with the source, timestamp, and a hash of the returned data. If a citizen disputes data accuracy, this log identifies exactly what was received and when.
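The interaction between the TTL cache and the fetch log can be sketched with in-memory stand-ins for Redis and the data_fetch_log table. The `SourceCache` class, its method names, and the injectable clock are assumptions for illustration. The TTL values are the ones given above.

```python
import hashlib
import json
import time
from typing import Any, Callable, Dict, List, Tuple

TTL_SECONDS = {"wage_data": 24 * 3600, "healthcare_registry": 3600}


class SourceCache:
    """In-memory stand-in for the Redis cache plus the data_fetch_log table."""

    def __init__(self, clock: Callable[[], float] = time.time):
        self._clock = clock
        self._entries: Dict[Tuple[str, str], Tuple[float, Any]] = {}
        self.fetch_log: List[dict] = []

    def get(self, source: str, citizen_id: str, fetch: Callable[[], Any]) -> Any:
        now = self._clock()
        hit = self._entries.get((source, citizen_id))
        if hit is not None and hit[0] > now:
            return hit[1]  # still fresh: no external call, no log row
        payload = fetch()
        digest = hashlib.sha256(
            json.dumps(payload, sort_keys=True).encode("utf-8")).hexdigest()
        # Log what was received and when, as a hash rather than the raw data.
        self.fetch_log.append(
            {"source": source, "fetched_at": now, "response_hash": digest})
        self._entries[(source, citizen_id)] = (now + TTL_SECONDS[source], payload)
        return payload


# Usage with a fake clock so that expiry can be shown without waiting.
fake_now = [0.0]
external_calls = []


def fetch_registry():
    external_calls.append("healthcare_registry")
    return {"enrolled": True}


cache = SourceCache(clock=lambda: fake_now[0])
cache.get("healthcare_registry", "citizen-1", fetch_registry)  # miss, fetches
cache.get("healthcare_registry", "citizen-1", fetch_registry)  # hit
fake_now[0] = 3601.0                                           # past the 1-hour TTL
cache.get("healthcare_registry", "citizen-1", fetch_registry)  # expired, fetches again
print(len(external_calls), len(cache.fetch_log))
```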
API Design
POST /api/v1/applications — submits a completed benefit application; returns {application_id, estimated_determination_time}.
GET /api/v1/applications/{applicationId}/determination — returns the eligibility determination once complete, including per-program results and reason codes.
POST /api/v1/cases/{caseId}/life-event — citizen reports a life event (job loss, new child, address change) triggering a re-determination.
POST /api/v1/cases/{caseId}/appeal — initiates an appeal for a denied or reduced benefit, with optional supporting documentation upload.
Scaling & Bottlenecks
The primary bottleneck is external agency API latency. IRS and SSA APIs can have response times of 3-8 seconds during peak filing season. The Data Aggregation Service handles this with aggressive parallelism (fan-out to all sources simultaneously), per-source timeouts, and a degraded-mode determination that flags which data sources were unavailable and sets a follow-up task to re-evaluate once data becomes available — preventing applicants from being denied due to temporary upstream failures.
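The degraded-mode behaviour can be expressed as a small decision function. The status names and the choice to grant a provisional result are assumptions for this sketch, since actual policy would be set per program. The point shown is that a final denial is never issued while a data source is unavailable.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Outcome:
    status: str
    unavailable_sources: List[str] = field(default_factory=list)
    follow_up_tasks: List[str] = field(default_factory=list)


def finalize(eligible_on_available_data: bool, failed_sources: List[str]) -> Outcome:
    """Turn a rules result into an outcome, deferring denial if data is missing."""
    if not failed_sources:
        return Outcome("eligible" if eligible_on_available_data else "ineligible")
    # Degraded mode: flag the gaps and schedule a re-evaluation per missing source.
    status = "provisionally_eligible" if eligible_on_available_data else "pending_data"
    tasks = [f"re-evaluate when {source} is available" for source in failed_sources]
    return Outcome(status, list(failed_sources), tasks)


print(finalize(False, ["ssa"]))
print(finalize(False, []))
```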
Rules engine performance degrades when evaluating complex multi-program eligibility with hundreds of rules. Compiled rule execution (pre-compiling rules to JVM bytecode or native code at deploy time) can reduce per-evaluation latency from hundreds of milliseconds to under 10ms. Rules re-compilation is triggered by the rules repository CI/CD pipeline, and the result is deployed as a versioned artifact with zero-downtime rolling deployment.
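The compile-once idea can be shown in miniature. This sketch does not generate JVM bytecode. It uses Python closures as an analogy, resolving each rule's operator and operands once at load time so that evaluation does no per-call parsing. The rule format (`field`, `op`, `value`) is an assumption.

```python
import operator
from typing import Any, Callable, Dict, List

OPERATORS = {
    "<": operator.lt, "<=": operator.le, "==": operator.eq,
    ">=": operator.ge, ">": operator.gt,
}

Applicant = Dict[str, Any]


def compile_rule(rule: Dict[str, Any]) -> Callable[[Applicant], bool]:
    """Resolve the operator and operands once; return a ready-to-call predicate."""
    compare = OPERATORS[rule["op"]]  # an unknown operator fails here, at load time
    field_name, value = rule["field"], rule["value"]

    def predicate(applicant: Applicant) -> bool:
        return compare(applicant[field_name], value)

    return predicate


def compile_ruleset(rules: List[Dict[str, Any]]) -> Callable[[Applicant], bool]:
    predicates = [compile_rule(rule) for rule in rules]  # done once per deploy

    def evaluate(applicant: Applicant) -> bool:
        return all(predicate(applicant) for predicate in predicates)

    return evaluate


# Hypothetical ruleset; the thresholds are invented for illustration.
is_eligible = compile_ruleset([
    {"field": "monthly_income", "op": "<=", "value": 3000},
    {"field": "household_size", "op": ">=", "value": 1},
])
print(is_eligible({"monthly_income": 2400, "household_size": 2}))
```

A side benefit of compiling at deploy time is that malformed rules are rejected in the pipeline rather than during a citizen's evaluation.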
Key Trade-offs
- Real-time vs. deferred determination: Real-time determinations satisfy citizens expecting instant results but are blocked by external API latency; deferred async determination (with notification on completion) decouples the citizen experience from upstream dependencies.
- Rules DSL vs. code: A domain-specific language allows policy analysts to author rules directly, reducing the engineering bottleneck for legislative changes, but requires investment in tooling and validation infrastructure for the DSL itself.
- Data minimization vs. comprehensive eligibility: Collecting only required fields protects privacy but may miss eligibility for programs the citizen didn't know to apply for; a broader intake, with data minimization applied after evaluation (discarding fields that no evaluated program required), balances coverage and privacy.
- Automated vs. manual determinations: Automating all determinations maximizes throughput but fails for edge cases; a confidence-scored hybrid approach routes high-confidence cases to automatic determination and borderline cases to case worker review.
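The hybrid routing in the last trade-off can be sketched with a simple proxy for confidence. The assumption here, which is not a method described above, is that confidence is low when the applicant's income sits within a narrow band around the limit. The `route` function and the 5% band are hypothetical.

```python
def route(monthly_income: float, income_limit: float, review_band: float = 0.05) -> str:
    """Send borderline cases to a case worker; decide clear-cut cases automatically."""
    margin = abs(income_limit - monthly_income) / income_limit
    return "case_worker_review" if margin < review_band else "automatic"


print(route(1000, 3000))  # far below the limit
print(route(2950, 3000))  # just under the limit
print(route(3100, 3000))  # just over the limit
```

A production score would likely combine several signals, such as missing data sources or conflicting records, but the routing decision would have the same shape.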