System Design: Price Alert System (Travel)
Design a travel price alert system that monitors fares and sends timely notifications when prices drop below user-defined targets — covering efficient polling, fanout, and alert deduplication.
Requirements
Functional Requirements:
- Users set price alerts for specific flight or hotel searches (origin, destination, dates, target price)
- System monitors prices and notifies users when fare drops at or below target
- Alerts sent via email, push notification, and/or SMS based on user preference
- Users can view alert history, pause alerts, and set expiry dates
- Price drop magnitude shown in notification ("Price dropped $120 — now $280")
- Support both exact-date and flexible-date alerts (e.g., any Tuesday in March)
Non-Functional Requirements:
- Price drops detected and notification sent within 15 minutes of fare change
- Support 50 million active alerts across 20 million users
- Notification delivery: email within 2 minutes of detection, push within 30 seconds
- Alert evaluation must not overload the fare data source (GDS API)
- 99.9% uptime; missed price alerts erode user trust
Scale Estimation
50 million active alerts; fare data refreshed every 15 minutes for top routes. Alerts are grouped by (origin, destination, date_range, cabin_class) — many users watch the same route, so there are only ~5 million distinct route-date combinations (far fewer than the 50 million alerts). Each combination needs one fare lookup per refresh cycle: 5 million lookups per 15 minutes ≈ 5,556 fare lookups/second. With GDS rate limits around 1,000 QPS, fare data must be shared aggressively across alerts for the same route. Notifications fired per cycle: assuming 0.5% of alerts trigger, that is 250,000 notifications per 15 minutes ≈ 278/second.
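The arithmetic above can be sanity-checked in a few lines; the constants are the document's own estimates, not measured figures:

```python
# Back-of-envelope check of the scale estimates (constants from the text).
ACTIVE_ALERTS = 50_000_000
DISTINCT_ROUTE_KEYS = 5_000_000     # deduplicated (origin, destination, dates, cabin) tuples
REFRESH_CYCLE_S = 15 * 60           # 15-minute refresh cycle
TRIGGER_RATE = 0.005                # 0.5% of alerts trigger per cycle

fare_lookups_per_s = DISTINCT_ROUTE_KEYS / REFRESH_CYCLE_S
notifications_per_cycle = int(ACTIVE_ALERTS * TRIGGER_RATE)
notifications_per_s = notifications_per_cycle / REFRESH_CYCLE_S

print(f"{fare_lookups_per_s:.0f} fare lookups/s")          # ≈ 5,556
print(f"{notifications_per_cycle:,} notifications/cycle")  # 250,000
print(f"{notifications_per_s:.0f} notifications/s")        # ≈ 278
```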
High-Level Architecture
The key insight: multiple users watching the same route share the same fare lookup. The system groups alerts by canonical route-date key, fetches fare data once per key per refresh cycle, and evaluates all alerts for that key against the fetched fare in memory.
The architecture has three layers: Fare Data Collection (fetching current fares for all watched route-date keys), Alert Evaluation (comparing current fares against alert targets), and Notification Dispatch (delivering alerts to users).
Fare Collection runs as a scheduled job every 15 minutes: it builds a set of unique route-date keys from active alerts, batches these into fare API requests, and stores results in a Fare Cache (Redis, keyed by canonical route key, TTL = 16 minutes). Alert Evaluation reads the Fare Cache and active alerts from PostgreSQL, identifies triggered alerts, and publishes them to a Notification Queue (Kafka). Notification Workers consume from Kafka and dispatch via email (SendGrid), push (Firebase), or SMS (Twilio).
Core Components
Route Deduplication & Fare Collection
The Route Aggregator runs every 15 minutes, querying PostgreSQL for the set of distinct (origin, destination, date_range, cabin_class) tuples from active alerts. It batches these into GDS API requests (Amadeus Flight Offers Search supports batches of 10 route-date combinations per call). Results are written to Redis with a fare_fetched_at timestamp. For top-1,000 routes (covering ~80% of alert volume), fare collection runs every 5 minutes for lower latency. Redis stores: fare:{route_key} → {price, airline, last_updated}.
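A minimal sketch of the canonicalization and batching step — the `Alert` shape, key format, and batch size of 10 are illustrative assumptions, not the production schema:

```python
from dataclasses import dataclass
from typing import Iterator

GDS_BATCH_SIZE = 10  # assumed: ~10 route-date combinations per GDS API call


@dataclass(frozen=True)
class Alert:
    origin: str
    destination: str
    date_start: str  # ISO date, e.g. "2025-03-01"
    date_end: str
    cabin: str       # "ECONOMY", "BUSINESS", ...


def route_key(a: Alert) -> str:
    """Canonical key — alerts sharing it share one fare lookup per cycle."""
    return f"{a.origin.upper()}:{a.destination.upper()}:{a.date_start}:{a.date_end}:{a.cabin.upper()}"


def batched_keys(alerts: list[Alert]) -> Iterator[list[str]]:
    """Deduplicate alerts to distinct route keys, then batch for the fare API."""
    keys = sorted({route_key(a) for a in alerts})
    for i in range(0, len(keys), GDS_BATCH_SIZE):
        yield keys[i : i + GDS_BATCH_SIZE]
```

With this shape, 25 distinct route keys become three API calls (10 + 10 + 5), regardless of how many alerts map onto them.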
Alert Evaluation Engine
The Evaluation Engine runs immediately after each fare collection cycle. It reads all active alerts from PostgreSQL (indexed by route_key for fast lookup), compares each alert's target_price against the latest fare from Redis, and identifies triggered alerts (current_fare ≤ target_price AND alert not triggered in last 24 hours). Deduplication: an alert should not fire multiple times on the same day for small price fluctuations. A triggered_today flag in Redis (TTL = 24 hours keyed by alert_id) prevents repeat notifications. Triggered alerts are published to the Notification topic in Kafka.
Notification Dispatch Service
Kafka consumers read triggered alert events and fan out to three channels: Email (SendGrid API with rich HTML template showing price history graph), Push notification (Firebase Cloud Messaging with deep link to booking page), SMS (Twilio, reserved for users who opted in and set a high-value threshold). Delivery status is tracked per channel; failed deliveries retry with exponential backoff up to 3 times. Notification history stored in Cassandra (user_id, alert_id, sent_at, channel, price_at_trigger).
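The retry policy a worker might apply per channel can be sketched as below — a hypothetical helper, not the SendGrid/Firebase/Twilio client code; the base delay of 2 seconds is an assumption, and `sleep` is injectable so the schedule is testable:

```python
from typing import Callable


def dispatch_with_retry(send: Callable[[object], None], event: object,
                        retries: int = 3, base_s: float = 2.0,
                        sleep: Callable[[float], None] = lambda s: None) -> bool:
    """Initial attempt plus up to `retries` retries with exponential backoff.

    `send` raising any exception is treated as a transient delivery failure;
    delays double each retry (base_s, 2*base_s, 4*base_s, ...).
    """
    for attempt in range(retries + 1):
        try:
            send(event)
            return True
        except Exception:
            if attempt < retries:
                sleep(base_s * (2 ** attempt))
    return False  # exhausted; caller records the failed delivery status
```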
Database Design
Alerts in PostgreSQL: (alert_id, user_id, origin, destination, date_range_start, date_range_end, cabin_class, target_price, status ACTIVE|PAUSED|EXPIRED, created_at, expires_at, last_triggered_at). Indexed by route_key (derived column = hash of origin+destination+dates+cabin) for batch evaluation queries. Fare cache in Redis: fare:{route_key} → {current_price, airline, expires_at}. Deduplication flags in Redis: triggered:{alert_id}:{YYYY-MM-DD} → 1, TTL 25 hours. Notification history in Cassandra (partition by user_id, cluster by sent_at DESC) for user-facing alert history view.
API Design
- POST /v1/alerts — Creates a price alert: accepts origin, destination, dates, cabin_class, target_price, notification_channels; returns alert_id and current fare for reference
- GET /v1/alerts — Returns all active alerts for the authenticated user with current fare vs. target for each
- PATCH /v1/alerts/{alert_id} — Partially updates target price, notification preferences, or expiry; unspecified fields are left unchanged
- DELETE /v1/alerts/{alert_id} — Deletes an alert; soft-delete with 30-day retention for audit
Scaling & Bottlenecks
GDS API rate limits are the primary bottleneck. With 5 million distinct route-date keys and a 15-minute refresh cycle, the required QPS (5,556) far exceeds typical GDS quotas. Solutions: (1) tiered refresh — only the top-1,000 routes (by alert count) stay on the fast 5–15-minute cycle, while long-tail routes refresh every 2–6 hours; (2) fare partnerships — Google Flights and Kayak maintain direct data-feed agreements with airlines, bypassing GDS per-query pricing; (3) web scraping as a complement for low-cost carriers that don't participate in GDSes.
Alert evaluation at 50 million records is a batch-processing challenge. A PostgreSQL GROUP BY on route_key with a partial index on status = ACTIVE retrieves the ~5 million distinct route keys in under 30 seconds. Evaluation fan-out (evaluating all alerts for a route key in memory) runs as a Kafka Streams job that reads (route_key, current_fare) pairs and joins them against an in-memory alert table (loaded from PostgreSQL at job start, refreshed every 5 minutes).
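The join-and-fan-out step can be sketched without the streaming framework — a plain in-memory join under assumed shapes (alert rows as `(alert_id, route_key, target_price)` tuples), standing in for the Kafka Streams job:

```python
from collections import defaultdict


def evaluate_cycle(fares: dict[str, float],
                   alerts: list[tuple[str, str, float]]) -> list[str]:
    """Join one fare per route key against every alert watching that key.

    `fares` maps route_key -> current fare for this cycle; `alerts` rows are
    (alert_id, route_key, target_price). Returns ids of triggered alerts.
    """
    by_route: dict[str, list[tuple[str, float]]] = defaultdict(list)
    for alert_id, key, target in alerts:
        by_route[key].append((alert_id, target))

    triggered = []
    for key, fare in fares.items():                      # one fare lookup per key...
        for alert_id, target in by_route.get(key, ()):   # ...fans out to all its alerts
            if fare <= target:
                triggered.append(alert_id)
    return triggered
```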
Key Trade-offs
- Polling frequency vs. API cost — more frequent polling (every 5 minutes) catches more price drops but multiplies GDS costs; tiered polling by alert popularity optimizes this trade-off
- Exact date vs. flexible date alerts — flexible alerts ("cheapest day in March") require evaluating 30× more fare combinations; they are more useful for leisure travelers but proportionally more expensive to serve
- Notification fatigue — sending alerts every time a fare drops by $1 trains users to ignore them; a minimum drop threshold ($20 or 5%) and 24-hour deduplication prevent fatigue
- Alert expiry policy — perpetual alerts accumulate but most flights are booked within 12 weeks of travel; a default 6-month expiry with opt-in renewal reduces dead alert volume while respecting users who book far in advance
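The notification-fatigue floor can be made concrete; this reads the "$20 or 5%" threshold as "whichever is larger", which is one plausible interpretation, and the constants are the document's examples:

```python
MIN_DROP_ABS = 20.0  # dollars
MIN_DROP_PCT = 0.05  # 5% of the previous fare


def significant_drop(previous: float, current: float) -> bool:
    """Notify only when the drop clears the larger of the two floors."""
    drop = previous - current
    return drop >= max(MIN_DROP_ABS, MIN_DROP_PCT * previous)
```

On a $1,000 fare the effective floor is $50 (5% dominates); on a $300 fare it is $20 (the absolute floor dominates).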