System Design: Web Application Firewall (WAF)

Design a high-throughput Web Application Firewall that inspects HTTP/HTTPS traffic to detect and block OWASP Top 10 attacks including SQL injection, XSS, and SSRF. Covers rule engines, ML-based anomaly detection, TLS termination, and bypass prevention.

14 min read · Updated Jan 15, 2025
Tags: system-design, waf, web-security, owasp, intrusion-detection

Requirements

Functional Requirements:

  • Inspect all inbound HTTP/HTTPS requests for OWASP Top 10 attack patterns
  • Block or challenge requests matching SQL injection, XSS, CSRF, path traversal, SSRF, and command injection signatures
  • Apply positive security model: allow-list known-good request patterns for critical endpoints
  • Support rate limiting per IP, per endpoint, and per API key
  • Log all blocked requests with full request details for forensic investigation
  • Allow security teams to create and deploy custom rules without downtime

Non-Functional Requirements:

  • WAF inspection adds less than 2ms to request latency for pass-through traffic
  • False positive rate under 0.1% (blocking legitimate requests)
  • Process 200,000 HTTP requests per second without dropping requests
  • Rule updates take effect within 30 seconds of deployment
  • 99.999% availability; if the WAF itself fails, it must fail open (allow traffic) so that a WAF fault does not become a site outage

Scale Estimation

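200,000 RPS * average request size 4 KB = 800 MB/s inbound traffic. Each request must be inspected within 2ms. Rule evaluation: 1,000 rules * 0.001ms per rule = 1ms per request for the rules engine, which leaves 1ms of the budget for TLS, parsing, and scoring. At 200,000 RPS that is 200,000 RPS * 1,000 rules = 200 million rule evaluations per second (200,000 per ms) across the WAF cluster, or roughly 200 CPU cores busy with rule evaluation alone. Horizontal scaling: each WAF instance handles 20,000 RPS, so 10 instances cover 200,000 RPS with no spare capacity; 2x headroom requires 20 instances.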
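The estimate can be checked with a short script. This is only a back-of-the-envelope sketch: the variable names are illustrative, and the per-rule cost and per-instance capacity are the figures assumed in this section, not measurements.

```python
import math

RPS = 200_000            # target requests per second
AVG_REQUEST_KB = 4       # average request size
RULES = 1_000            # rules evaluated per request
RULE_COST_MS = 0.001     # assumed cost of one rule evaluation
INSTANCE_RPS = 20_000    # assumed capacity of one WAF instance
HEADROOM = 2             # desired capacity multiple

inbound_mb_per_s = RPS * AVG_REQUEST_KB / 1_000      # KB -> MB (decimal units)
rule_ms_per_request = RULES * RULE_COST_MS           # CPU time per request
evals_per_ms = RPS * RULES / 1_000                   # evaluations per second -> per ms
cpu_cores = RPS * rule_ms_per_request / 1_000        # CPU-seconds needed per wall-clock second
min_instances = math.ceil(RPS / INSTANCE_RPS)        # no spare capacity
instances_with_headroom = min_instances * HEADROOM

print(f"inbound: {inbound_mb_per_s:.0f} MB/s")
print(f"rule time per request: {rule_ms_per_request:.1f} ms")
print(f"rule evaluations per ms: {evals_per_ms:,.0f}")
print(f"cores for rule evaluation: {cpu_cores:.0f}")
print(f"instances: {min_instances} minimum, {instances_with_headroom} with headroom")
```

Under these assumptions each 20,000 RPS instance needs about 20 cores for rule evaluation, which is worth checking against the instance size actually chosen.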

High-Level Architecture

The WAF operates as a transparent reverse proxy in front of the origin servers. TLS termination happens at the WAF layer (using the origin's certificate, installed on the WAF), enabling full HTTP body inspection before forwarding to the origin. The WAF processes each request through a pipeline: TLS termination → HTTP parsing → Rule evaluation → Scoring → Decision (block/allow/challenge) → Forward or block response.

The rule engine uses a multi-phase inspection approach (inspired by ModSecurity's phase model): Phase 1 inspects request headers (User-Agent, Content-Type, X-Forwarded-For), Phase 2 inspects the request URL and query string, Phase 3 inspects the request body (POST data, JSON, XML), Phase 4 inspects response headers, and Phase 5 inspects the response body. Each phase applies its own rule set. A matching rule either blocks the request immediately (for critical patterns) or adds its score to the request's running anomaly score, and the request is blocked once that total crosses a configured threshold.
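The phase and scoring logic can be sketched as follows. This is a minimal illustration covering only the three request phases; the `Rule` fields, the sample patterns, and `BLOCK_THRESHOLD` are assumptions made for the example and are far simpler than real CRS rules.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    rule_id: str
    phase: int               # 1 = headers, 2 = URL/query string, 3 = body
    pattern: re.Pattern
    score: int
    critical: bool = False   # critical rules block without waiting for the threshold


# Illustrative rules only.
RULES = [
    Rule("ua-scanner", 1, re.compile(r"sqlmap|nikto", re.I), 3),
    Rule("path-traversal", 2, re.compile(r"\.\./"), 4),
    Rule("sqli-union", 3, re.compile(r"union\s+select", re.I), 5, critical=True),
    Rule("xss-script", 3, re.compile(r"<script\b", re.I), 4),
]
BLOCK_THRESHOLD = 5  # hypothetical anomaly threshold


def evaluate(headers: str, url: str, body: str) -> tuple[str, int, list[str]]:
    """Run the request phases in order; return (decision, score, matched rule ids)."""
    targets = {1: headers, 2: url, 3: body}
    total, hits = 0, []
    for phase in sorted(targets):
        for rule in (r for r in RULES if r.phase == phase):
            if rule.pattern.search(targets[phase]):
                hits.append(rule.rule_id)
                total += rule.score
                if rule.critical:
                    return "block", total, hits
        # Check the accumulated score at the end of each phase.
        if total >= BLOCK_THRESHOLD:
            return "block", total, hits
    return "allow", total, hits
```

A scanner User-Agent alone scores 3 and passes, but the same request with a traversal sequence in the URL reaches 7 and is blocked at the end of Phase 2, before the body is ever parsed.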

An ML-based anomaly detection layer runs alongside the rule engine. A trained isolation forest model scores each request on 50 features, including URL entropy, parameter count, body size, header counts, HTTP verb distribution, and timing features. Requests with anomaly scores above a threshold are flagged for additional inspection or sent to a human review queue. The model is retrained weekly on recent traffic to adapt to evolving attack patterns. Because an isolation forest is unsupervised, the labeled traffic (blocked attacks + legitimate samples) is used to calibrate the score threshold and validate each new model rather than as training targets.
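As an illustration of the feature side of this layer, the sketch below derives a handful of the per-request features named above using only the standard library. The function names and the exact feature definitions are assumptions; the model itself is not shown.

```python
import math
from collections import Counter
from urllib.parse import parse_qsl, urlsplit


def shannon_entropy(text: str) -> float:
    """Shannon entropy in bits per character; 0.0 for empty input."""
    if not text:
        return 0.0
    n = len(text)
    return -sum((c / n) * math.log2(c / n) for c in Counter(text).values())


def extract_features(method: str, url: str, headers: dict[str, str], body: bytes) -> dict[str, float]:
    """Build a small numeric feature vector for one request (hypothetical feature set)."""
    parts = urlsplit(url)
    params = parse_qsl(parts.query, keep_blank_values=True)
    return {
        "url_entropy": shannon_entropy(parts.path + "?" + parts.query),
        "param_count": float(len(params)),
        "body_size": float(len(body)),
        "header_count": float(len(headers)),
        "is_post": 1.0 if method.upper() == "POST" else 0.0,
    }
```

Encoded or obfuscated payloads tend to raise `url_entropy`, which is why it is a common input for this kind of model; a vector like this would be passed to whatever isolation forest implementation the deployment uses.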

Core Components

Rule Engine

Regex-based rules match attack signatures in URL, query parameters, headers, and body. Core rule sets: OWASP Core Rule Set (CRS) 3.x with 900+ rules. Performance optimization: rules are compiled with Hyperscan (Intel's high-performance regex library), which builds hybrid DFA/NFA automata for multi-pattern matching. The design targets up to 10 billion characters/second of matching throughput, though the achievable figure depends on the pattern set and hardware. Rules are organized into groups with short-circuit evaluation: if a critical rule matches (for example, a SQL injection UNION SELECT signature), inspection stops and the request is blocked immediately without evaluating the remaining rules.

Positive Security Model

For critical endpoints (login, payment, API key endpoints), a positive security model defines the exact allowed request schema: allowed HTTP methods, required headers, parameter names and types, body schema (JSON Schema validation), and value constraints (e.g., amount must be a positive number < 1,000,000). Any deviation triggers a block. The positive model is maintained as JSON Schema configurations deployed via the rule management API. Schema validation uses a compiled JSON Schema validator (ajv for JavaScript, the jsonschema crate for Rust; serde_json only parses JSON and does not validate schemas) running in <0.5ms.
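The allow-list idea can be sketched with a small hand-rolled validator. This is a stand-in for a real JSON Schema validator, written only to show the "reject anything not explicitly allowed" behavior; `PAYMENT_SCHEMA`, its keys, and the field names are hypothetical.

```python
# Hypothetical allow-list for a payment endpoint (not a real JSON Schema document).
PAYMENT_SCHEMA = {
    "methods": {"POST"},
    "required_headers": {"content-type", "authorization"},
    "fields": {
        "amount": {"type": (int, float), "min_exclusive": 0, "max_exclusive": 1_000_000},
        "currency": {"type": str, "allowed": {"USD", "EUR"}},
    },
}


def violations(method: str, headers: dict, body: dict, schema: dict = PAYMENT_SCHEMA) -> list[str]:
    """Return every deviation from the allow-list; an empty list means the request conforms."""
    problems = []
    if method not in schema["methods"]:
        problems.append(f"method {method} not allowed")
    present = {h.lower() for h in headers}
    for header in sorted(schema["required_headers"] - present):
        problems.append(f"missing header {header}")
    fields = schema["fields"]
    # Unknown fields are rejected: this is what makes the model "positive".
    for name in sorted(set(body) - set(fields)):
        problems.append(f"unexpected field {name}")
    for name, spec in fields.items():
        if name not in body:
            problems.append(f"missing field {name}")
            continue
        value = body[name]
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(value, bool) or not isinstance(value, spec["type"]):
            problems.append(f"{name} has wrong type")
            continue
        if "min_exclusive" in spec and not value > spec["min_exclusive"]:
            problems.append(f"{name} too small")
        if "max_exclusive" in spec and not value < spec["max_exclusive"]:
            problems.append(f"{name} too large")
        if "allowed" in spec and value not in spec["allowed"]:
            problems.append(f"{name} not in allowed set")
    return problems
```

In this sketch any non-empty result would translate into a block decision, matching the "any deviation triggers a block" rule above.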

TLS Inspection & Certificate Management

The WAF terminates TLS using the origin's wildcard or SAN certificate, provisioned via ACME (Let's Encrypt) or uploaded by the customer. TLS 1.3 is enforced; earlier versions, including the deprecated TLS 1.0/1.1, are rejected. HTTP/2 and HTTP/3 (QUIC) are supported, with HTTP/3 requiring additional UDP processing. For mutual TLS (mTLS) endpoints (B2B APIs), the WAF validates client certificates against a trusted CA bundle. Client certificate revocation is checked via OCSP, with a fallback to CRL download when the OCSP responder is unavailable. OCSP stapling applies in the other direction: the WAF staples the OCSP response for its own server certificate.
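A minimal sketch of the listener configuration, using Python's standard `ssl` module as a stand-in for whatever TLS stack the WAF actually uses. The function name and file path parameters are hypothetical, and revocation checking and HTTP/3 are not covered here.

```python
import ssl
from typing import Optional


def build_server_context(cert_file: Optional[str] = None,
                         key_file: Optional[str] = None,
                         client_ca_file: Optional[str] = None) -> ssl.SSLContext:
    """Build a TLS 1.3-only server context; enable mTLS when a client CA bundle is given."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    ctx.minimum_version = ssl.TLSVersion.TLSv1_3      # rejects TLS 1.2 and earlier
    ctx.set_alpn_protocols(["h2", "http/1.1"])        # advertise HTTP/2 and HTTP/1.1
    if cert_file:
        # The origin's certificate and key, installed on the WAF.
        ctx.load_cert_chain(certfile=cert_file, keyfile=key_file)
    if client_ca_file:
        # mTLS: require a client certificate that chains to the trusted CA bundle.
        ctx.load_verify_locations(cafile=client_ca_file)
        ctx.verify_mode = ssl.CERT_REQUIRED
    return ctx
```

Calling `build_server_context()` with no arguments yields a context that can be inspected but not served from, since no certificate is loaded; a real deployment would pass the certificate paths it provisions.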

Database Design

  • Rule definitions (PostgreSQL): rules (rule_id, rule_set, phase INT, pattern TEXT, action ENUM(BLOCK, ALLOW, SCORE, CHALLENGE), score INT, tags TEXT[], version, is_active)
  • Block log (ClickHouse for high-write throughput), partitioned by day: (ts TIMESTAMP, request_id UUID, client_ip INET, method VARCHAR, host VARCHAR, path VARCHAR, triggered_rule_ids TEXT[], anomaly_score FLOAT, action ENUM, response_code INT)
  • Redis: rate limiting counters and IP reputation cache
  • S3: raw blocked request bodies (sensitive, access-controlled, 90-day retention)
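The rate limiting counters can be illustrated with a fixed-window counter. The sketch below keeps counters in an in-memory dict purely as a stand-in for Redis (where the same idea is typically an increment on a per-window key with an expiry); the class name, key format, and injectable clock are assumptions for the example.

```python
import time
from typing import Callable


class FixedWindowLimiter:
    """Allow at most `limit` requests per key in each `window_s`-second window."""

    def __init__(self, limit: int, window_s: int, clock: Callable[[], float] = time.time):
        self.limit = limit
        self.window_s = window_s
        self.clock = clock
        self.counters: dict[tuple[str, int], int] = {}

    def allow(self, key: str) -> bool:
        window = int(self.clock() // self.window_s)
        # Drop counters from earlier windows (Redis would expire them via a TTL).
        # Rebuilding the dict on every call is for clarity, not efficiency.
        self.counters = {k: v for k, v in self.counters.items() if k[1] >= window}
        bucket = (key, window)
        self.counters[bucket] = self.counters.get(bucket, 0) + 1
        return self.counters[bucket] <= self.limit
```

The key can encode any of the dimensions in the requirements, for example `ip:203.0.113.7`, `endpoint:/login`, or `apikey:<id>`. Fixed windows permit short bursts at window boundaries; a sliding window or token bucket smooths that at the cost of more state.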

API Design

  • POST /rules: Deploy a new WAF rule with pattern, phase, action, and score; takes effect within 30 seconds.
  • GET /events?from={ts}&to={ts}&ip={ip}: Query block events with filtering by time range, IP, and triggered rule.
  • POST /allowlist/ips: Add an IP or CIDR to the bypass allowlist (e.g., for internal monitoring tools).
  • GET /stats/live: Real-time WAF statistics: requests/second, block rate, top triggered rules, top attacking IPs.
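As a usage illustration, a rule deployment call might be built like this. The host name and the JSON field names are assumptions (the fields mirror the rule attributes listed for the endpoint), authentication is omitted because it is not specified here, and the request is constructed but not sent.

```python
import json
import urllib.request

BASE_URL = "https://waf-admin.example.internal"  # hypothetical management host


def build_rule_request(pattern: str, phase: int, action: str, score: int) -> urllib.request.Request:
    """Build (but do not send) a POST /rules request with a JSON body."""
    payload = {"pattern": pattern, "phase": phase, "action": action, "score": score}
    return urllib.request.Request(
        url=f"{BASE_URL}/rules",
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={"Content-Type": "application/json"},
    )


req = build_rule_request(r"union\s+select", phase=3, action="BLOCK", score=5)
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```

Sending it would be a matter of passing `req` to `urllib.request.urlopen`, after adding whatever credentials the management API requires.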

Scaling & Bottlenecks

Request body inspection (parsing JSON/XML, evaluating rules against parsed structure) is the most CPU-intensive operation. Request bodies over 1 MB are truncated to 1 MB before inspection (configurable per endpoint) to prevent DoS via large body attacks consuming WAF CPU. Binary content types (image/jpeg, application/octet-stream) skip body inspection (not amenable to text-based rule matching) except for content-type verification.
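The body handling policy above can be sketched as a small pre-inspection step. The constant names are illustrative, matching every `image/` subtype by prefix is an assumption beyond the two types named above, and content-type verification itself is not implemented here.

```python
from typing import Optional

MAX_INSPECT_BYTES = 1024 * 1024                          # 1 MB default, configurable per endpoint
BINARY_PREFIXES = ("image/", "application/octet-stream")


def body_for_inspection(content_type: str, body: bytes,
                        limit: int = MAX_INSPECT_BYTES) -> Optional[bytes]:
    """Return the bytes the rule engine should scan, or None to skip body inspection."""
    # Strip parameters such as "; charset=utf-8" before comparing.
    media_type = content_type.split(";", 1)[0].strip().lower()
    if media_type.startswith(BINARY_PREFIXES):
        return None
    return body[:limit]
```

Note that truncation means bytes beyond the limit are never scanned, so endpoints that accept large bodies may need a higher limit or an outright size cap rather than relying on the first megabyte.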

Hyperscan multi-pattern matching processes all 900 CRS regexes simultaneously in a single pass over the request body, achieving O(N) complexity in body size N rather than the O(NM) cost of evaluating M rules sequentially. For non-Hyperscan deployments, rule prioritization (critical SQL injection/XSS rules first, lower-priority rules only if the blocking score threshold hasn't already been reached) reduces the average number of rules evaluated per request from 900 to ~200.
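The prioritization fallback can be sketched with the standard `re` module: rules are sorted by priority, and evaluation stops as soon as the accumulated score reaches the blocking threshold. The rule tuples, patterns, and `THRESHOLD` are assumptions for illustration; this is the sequential path, not Hyperscan's single-pass matching.

```python
import re

# (priority, rule_id, pattern, score); a lower priority number runs first.
RULES = [
    (2, "generic-keyword", re.compile(r"select|drop", re.I), 2),
    (1, "sqli-union", re.compile(r"union\s+select", re.I), 5),
    (1, "xss-script", re.compile(r"<script\b", re.I), 5),
    (3, "long-param", re.compile(r"[^&=]{200,}"), 1),
]
THRESHOLD = 5  # hypothetical blocking score


def evaluate_prioritized(text: str, rules=RULES, threshold: int = THRESHOLD) -> tuple[int, int]:
    """Return (score, number of rules actually evaluated)."""
    score = evaluated = 0
    for _, _rule_id, pattern, rule_score in sorted(rules, key=lambda r: r[0]):
        evaluated += 1
        if pattern.search(text):
            score += rule_score
            if score >= threshold:
                break  # the decision is already "block"; skip lower-priority rules
    return score, evaluated
```

Malicious requests exit early, while clean requests still pay for every rule, so the average saving depends on the traffic mix and on how many low-priority rules can be skipped by other means.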

Key Trade-offs

  • Signature-based vs. ML-based detection: Signature rules are precise for known attack patterns and have low false positives; ML detects novel attacks and zero-days but has higher false positive rates requiring careful threshold tuning.
  • Fail-open vs. fail-closed on WAF failure: Fail-open (allow traffic if WAF crashes) prevents outages but exposes the origin during WAF failure; fail-closed blocks all traffic until WAF recovers, avoiding exposure but causing downtime.
  • Inline inspection vs. out-of-band analysis: Inline WAF blocks attacks in real time but adds latency; out-of-band analysis (mirror traffic, detect after the fact) has zero latency impact but cannot block attacks.
  • Managed rule sets vs. custom rules: OWASP CRS managed rules provide broad coverage with low tuning effort but may conflict with legitimate application behavior; custom rules are precisely tuned to the application but require ongoing maintenance.
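The fail-open versus fail-closed choice can be made concrete with a small wrapper around the inspection call. This sketch covers only in-process failures (an exception or an invalid verdict from the inspector); a crashed WAF process would need a bypass at the load balancer instead. The `decide` function and its parameters are hypothetical.

```python
from typing import Callable

VALID_VERDICTS = {"allow", "block", "challenge"}


def decide(request: dict, inspect: Callable[[dict], str], fail_open: bool = True) -> str:
    """Return the inspector's verdict, or the configured fallback if inspection fails."""
    fallback = "allow" if fail_open else "block"
    try:
        verdict = inspect(request)
    except Exception:
        # Inspection itself failed: apply the fail-open or fail-closed policy.
        return fallback
    return verdict if verdict in VALID_VERDICTS else fallback


def broken_inspector(_request: dict) -> str:
    raise RuntimeError("rule engine unavailable")


print(decide({}, broken_inspector))                   # fail-open policy
print(decide({}, broken_inspector, fail_open=False))  # fail-closed policy
```

A per-endpoint setting is one way to combine the two: fail open for general traffic to protect availability, and fail closed for the critical endpoints covered by the positive security model.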
