System Design: Dynamic Pricing for Travel

Requirements

Functional Requirements:

Compute optimal price for each room/flight/seat based on current demand and supply
Update prices in near-real-time as bookings are made and cancellations occur
Price rules: price floors and ceilings set by property managers
Competitive pricing: monitor competitor prices and adjust accordingly
Seasonal and event-based pricing adjustments (holidays, local events)
A/B testing framework for pricing strategy experiments

Non-Functional Requirements:

Price computation completes within 500ms of triggering event (booking, cancellation)
Competitive price data refreshed every 30 minutes
Support 2 million properties × 90-day booking window = 180 million price slots
Price model retraining pipeline runs nightly and completes within 2 hours
99.9% availability; pricing errors directly impact revenue

Scale Estimation

Price slots: 2 million properties × 90 days = 180 million.
Repricing triggers: 2 million bookings/day + 500,000 cancellations/day = 2.5 million events/day ≈ 29 events/second.
Price calculations: each event may reprice up to 90 future dates for the affected property, so the upper bound is 2.5 million × 90 = 225 million calculations/day ≈ 2,604/second.
Competitor price monitoring: 2 million properties × 10 OTA competitor prices each = 20 million data points refreshed every 30 minutes ≈ 11,111 competitor price fetches/second.
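The estimates above can be reproduced with a few lines of arithmetic. This is only a back-of-the-envelope check; the variable names are illustrative and the inputs are the figures quoted in this section.

```python
SECONDS_PER_DAY = 24 * 60 * 60

properties = 2_000_000
booking_window_days = 90
price_slots = properties * booking_window_days

bookings_per_day = 2_000_000
cancellations_per_day = 500_000
events_per_day = bookings_per_day + cancellations_per_day
events_per_second = events_per_day / SECONDS_PER_DAY

# Upper bound: every event reprices all 90 future dates of its property.
calcs_per_day = events_per_day * booking_window_days
calcs_per_second = calcs_per_day / SECONDS_PER_DAY

competitors_per_property = 10
refresh_interval_seconds = 30 * 60
competitor_points = properties * competitors_per_property
fetches_per_second = competitor_points / refresh_interval_seconds

print(f"price slots:        {price_slots:,}")
print(f"events/second:      {events_per_second:.1f}")
print(f"calculations/sec:   {calcs_per_second:,.0f}")
print(f"competitor fetches: {fetches_per_second:,.0f}/second")
```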

High-Level Architecture

Dynamic pricing operates on two loops: a slow loop (demand forecasting and base price optimization, runs nightly) and a fast loop (real-time price adjustments on booking events, runs in milliseconds).

The Slow Loop: a nightly batch pipeline reads historical booking data, current occupancy, and competitive pricing into a Revenue Management Model (RMM — a constrained optimization model from airline revenue management). The model outputs a base price recommendation per property per future date, respecting floor/ceiling constraints set by property managers. These recommendations are written to the Price Store.

The Fast Loop: when a booking or cancellation occurs, the Pricing Engine reads the current occupancy for the property and updates the price multiplier for the remaining inventory. If occupancy crosses a threshold (e.g., 70%, 85%, 95%), the price steps up according to a pre-computed multiplier curve. This is a lookup into the Price Store followed by a threshold evaluation, typically sub-millisecond for an individual property and well inside the 500 ms budget.

Core Components

Demand Forecasting Model

The demand forecast predicts: for property P on date D, how many bookings will occur in the next N days at price X? Input features: day of week, days until arrival, lead time distribution, local events (scraped from Eventbrite/Ticketmaster), school holidays calendar, historical booking velocity, current search volume (from the search platform's clickstream). Model: a gradient boosted regression (LightGBM) trained on 3 years of booking data per property cluster. Output: demand_curve(price) — the expected number of bookings at each price point. The RMM uses this curve to find the revenue-maximizing price.
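As a rough illustration of how a demand curve can be turned into a price, the sketch below searches a grid of candidate prices for the one with the highest expected revenue. It is not the RMM itself: the function name, the price grid, the inventory cap, and the toy linear demand curve are all assumptions made for the example.

```python
from typing import Callable, Sequence


def revenue_maximizing_price(
    demand_curve: Callable[[float], float],
    candidate_prices: Sequence[float],
    rooms_available: int,
    floor_price: float,
    ceiling_price: float,
) -> float:
    """Return the candidate price in [floor, ceiling] with the highest expected revenue."""
    allowed = [p for p in candidate_prices if floor_price <= p <= ceiling_price]
    if not allowed:
        raise ValueError("no candidate price satisfies the floor/ceiling rules")

    def expected_revenue(price: float) -> float:
        # Expected bookings cannot be negative or exceed remaining inventory.
        expected_bookings = min(max(demand_curve(price), 0.0), rooms_available)
        return price * expected_bookings

    return max(allowed, key=expected_revenue)


def toy_demand(price: float) -> float:
    # Illustrative linear curve only: 20 bookings at price 0, none at 200.
    return max(0.0, 20.0 - 0.1 * price)


best = revenue_maximizing_price(
    toy_demand,
    candidate_prices=range(50, 201, 10),
    rooms_available=12,
    floor_price=60,
    ceiling_price=180,
)
print(best)
```

With this toy curve the inventory cap binds below a price of 80, so raising the price costs nothing until demand drops under 12 rooms; revenue then peaks at 100. A production optimizer would work from the forecast model's output rather than a closed-form curve.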

Real-Time Pricing Engine

The Pricing Engine is a stateless service; all active price slots live in an in-memory store (Redis). On a booking or cancellation event, it reads (property_id, date) → base_price, current_price, occupancy_pct from Redis, applies the occupancy multiplier curve (a pre-loaded lookup table: [0–50%: 1.0×, 50–70%: 1.1×, 70–85%: 1.25×, 85–95%: 1.5×, 95–100%: 2.0×]) to the base price, clamps the result to the property's floor and ceiling, and writes the updated current_price back to Redis. Applying the multiplier to base_price rather than to the previous current_price keeps repeated events from compounding the increase. The price update is published to Kafka and consumed by the search index updater (to reflect new prices in search results) and the analytics pipeline.
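A minimal sketch of the multiplier lookup, using the band values from the table above. Two points are assumptions rather than part of the design: bands are treated as half-open, so exactly 50% falls in the 1.1× band, and the multiplier is applied to base_price so that successive events do not compound. The function names are hypothetical.

```python
from bisect import bisect_right

# Lower bound (percent occupancy) of each band and its multiplier.
# Assumption: bands are half-open [lower, upper), so a boundary value
# belongs to the higher band.
BAND_LOWER_BOUNDS = [0, 50, 70, 85, 95]
BAND_MULTIPLIERS = [1.0, 1.1, 1.25, 1.5, 2.0]


def occupancy_multiplier(occupancy_pct: float) -> float:
    """Look up the multiplier for an occupancy percentage in [0, 100]."""
    if not 0 <= occupancy_pct <= 100:
        raise ValueError("occupancy_pct must be between 0 and 100")
    index = bisect_right(BAND_LOWER_BOUNDS, occupancy_pct) - 1
    return BAND_MULTIPLIERS[index]


def reprice(
    base_price: float,
    occupancy_pct: float,
    floor_price: float,
    ceiling_price: float,
) -> float:
    """Apply the occupancy multiplier to the base price, then clamp to the rules."""
    raw = base_price * occupancy_multiplier(occupancy_pct)
    return round(min(max(raw, floor_price), ceiling_price), 2)


print(reprice(100.0, 72.0, floor_price=80.0, ceiling_price=140.0))  # 70-85% band
print(reprice(100.0, 86.0, floor_price=80.0, ceiling_price=140.0))  # capped by ceiling
```

Binary search is more than this five-entry table needs, but it keeps the lookup unchanged if the curve is later given finer bands.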

Competitive Intelligence Service

Monitoring 20 million competitor price data points refreshed every 30 minutes requires a crawler fleet. The Competitor Price Scraper runs headless Chrome instances (using Playwright) against OTA websites, parsing price data for each property. Rate limiting and IP rotation avoid bot detection. Scraped prices are written to the Competitor Price Store (Cassandra: partition by property_id, cluster by scraped_at DESC). A Competitive Pricing Rule Engine checks: if own_price > competitor_avg_price × threshold (e.g., 1.15), consider a price adjustment recommendation (subject to floor/ceiling constraints).
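The rule check can be sketched as follows. The comparison against competitor_avg_price × threshold comes from the description above; what the engine recommends once the rule fires is not specified, so pulling the price back to the threshold boundary is a hypothetical policy chosen for illustration, as is the function name.

```python
from statistics import mean
from typing import Optional, Sequence


def competitive_adjustment(
    own_price: float,
    competitor_prices: Sequence[float],
    floor_price: float,
    ceiling_price: float,
    threshold: float = 1.15,
) -> Optional[float]:
    """Return a recommended price, or None when no adjustment is suggested."""
    if not competitor_prices:
        return None  # no competitor data available for this property

    limit = mean(competitor_prices) * threshold
    if own_price <= limit:
        return None  # own price is within the tolerated premium

    # Hypothetical policy: move down to the threshold boundary,
    # never violating the property manager's floor or ceiling.
    return round(min(max(limit, floor_price), ceiling_price), 2)


print(competitive_adjustment(130.0, [100.0, 100.0, 100.0], 90.0, 200.0))
print(competitive_adjustment(110.0, [100.0, 100.0], 90.0, 200.0))
```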

Database Design

Price Store (Redis): price:{property_id}:{date} → {base_price, current_price, occupancy_pct, last_updated}. 180 million keys × ~100 bytes = 18 GB, which fits in a Redis Cluster with 4 shards.
Price history (Cassandra): (property_id, date, timestamp, price, trigger_event), partitioned by property_id for fast range queries.
Demand forecast outputs (PostgreSQL): (property_id, forecast_date, predicted_demand, recommended_price, model_version).
Competitor prices (Cassandra): (property_id, competitor_id, date, price, scraped_at).
Experiment assignments (Redis): user_id → experiment_variant, with a 1-hour TTL for a consistent user experience within a session.
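To make the Price Store layout concrete, here is a small sketch of the key format and record, plus the memory estimate. The PriceSlot class, the helper names, and the choice of ISO 8601 dates in the key are assumptions; the field names and the ~100 bytes per slot figure come from the text above.

```python
from dataclasses import asdict, dataclass
from datetime import date


@dataclass
class PriceSlot:
    base_price: float
    current_price: float
    occupancy_pct: float
    last_updated: str  # assumed to be an ISO 8601 UTC timestamp


def price_key(property_id: str, stay_date: date) -> str:
    """Build the price:{property_id}:{date} key (ISO date format assumed)."""
    return f"price:{property_id}:{stay_date.isoformat()}"


def estimated_store_bytes(slots: int, bytes_per_slot: int = 100) -> int:
    """Rough dataset size, ignoring per-key overhead in the store."""
    return slots * bytes_per_slot


slot = PriceSlot(120.0, 150.0, 72.5, "2025-07-01T09:30:00Z")
print(price_key("p123", date(2025, 7, 4)), asdict(slot))
print(estimated_store_bytes(180_000_000) / 1e9, "GB")
```

The dictionary from asdict(slot) maps naturally onto a Redis hash, one field per attribute, though a packed string value would use less memory per key.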

API Design

GET /v1/pricing/{property_id}?dates={}: Returns current prices for the specified dates; served from the Redis Price Store with typically sub-millisecond lookups
POST /v1/pricing/events: Internal endpoint; receives booking/cancellation events from the Reservation Service; triggers a real-time price update
PUT /v1/pricing/{property_id}/rules: Property manager sets floor price, ceiling price, and blackout dates (no dynamic pricing); stored in PostgreSQL and enforced by the Pricing Engine
GET /v1/pricing/{property_id}/recommendations: Returns AI price recommendations with a demand forecast explanation for the property manager dashboard
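The rules accepted by the PUT endpoint could be validated and enforced along these lines. This is a sketch under assumptions: the PricingRules class and enforce_rules function are hypothetical names, and falling back to the base price on blackout dates is one possible reading of "no dynamic pricing".

```python
from dataclasses import dataclass
from datetime import date
from typing import FrozenSet


@dataclass(frozen=True)
class PricingRules:
    floor_price: float
    ceiling_price: float
    blackout_dates: FrozenSet[date] = frozenset()

    def __post_init__(self) -> None:
        # Reject rule sets that could never be satisfied.
        if self.floor_price <= 0:
            raise ValueError("floor_price must be positive")
        if self.ceiling_price < self.floor_price:
            raise ValueError("ceiling_price must be >= floor_price")


def enforce_rules(
    rules: PricingRules,
    stay_date: date,
    base_price: float,
    dynamic_price: float,
) -> float:
    """Return the price to publish for stay_date under the manager's rules."""
    # Assumption: on blackout dates the nightly base price is used as-is.
    candidate = base_price if stay_date in rules.blackout_dates else dynamic_price
    return min(max(candidate, rules.floor_price), rules.ceiling_price)


rules = PricingRules(80.0, 200.0, frozenset({date(2025, 12, 31)}))
print(enforce_rules(rules, date(2025, 12, 30), base_price=120.0, dynamic_price=250.0))
print(enforce_rules(rules, date(2025, 12, 31), base_price=120.0, dynamic_price=180.0))
```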

Scaling & Bottlenecks

The Redis Price Store is the hot path: about 2,604 price writes/second plus 10,000 price reads/second from search. At ~50,000 operations/second per shard, a 4-shard Redis Cluster offers roughly 200,000 operations/second, well above the ~12,600 required. The 18 GB dataset works out to about 4.5 GB per shard, which fits comfortably in 32 GB per shard. The Kafka topic for price update events has 20 partitions, consumed by the search index updater (Elasticsearch document updates for affected properties).

The competitor price scraping fleet is the main source of operational complexity. 20 million scrapes per 30 minutes = 11,111/second, which requires on the order of 1,000 concurrent scraper instances (each doing 10–15 scrapes/second). Playwright-based scrapers running in Kubernetes pods with residential IP proxies handle this volume. Failed scrapes are retried; properties with consistent scrape failures fall back to the last-known price.
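The fleet size follows from the required scrape rate and the per-instance throughput. The sketch below works through the 10–15 scrapes/second range; the retry_overhead parameter is a hypothetical allowance for retried scrapes, not a figure from this design.

```python
import math

data_points = 2_000_000 * 10   # properties × competitor prices each
refresh_seconds = 30 * 60      # 30-minute refresh interval
required_rate = data_points / refresh_seconds  # ≈ 11,111 scrapes/second


def instances_needed(
    rate_per_second: float,
    scrapes_per_instance_per_second: float,
    retry_overhead: float = 0.0,
) -> int:
    """Scraper instances needed to sustain the rate, rounded up."""
    effective_rate = rate_per_second * (1 + retry_overhead)
    return math.ceil(effective_rate / scrapes_per_instance_per_second)


best_case = instances_needed(required_rate, 15)
worst_case = instances_needed(required_rate, 10)
with_retries = instances_needed(required_rate, 10, retry_overhead=0.10)
print(best_case, worst_case, with_retries)
```

At 15 scrapes/second per instance the fleet needs 741 instances, and at 10 scrapes/second it needs 1,112, so the 1,000+ figure sits toward the conservative end of that range.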

Key Trade-offs

Demand-based vs. competitive-based pricing: pure demand optimization maximizes own revenue but can price a property out of the market if competitors are lower; a blend of 70% demand signals + 30% competitive signals is a common starting point
Granularity of price updates: per-booking updates give maximum responsiveness, while per-hour (batch) updates reduce Elasticsearch update load; per-booking is used for last-5-rooms scenarios and per-hour for low-occupancy properties
Price stability vs. optimization: frequent large price changes confuse consumers and reduce booking confidence; a maximum price change per day (20% cap) and a minimum price stability window (2 hours) are policy guardrails
Transparency to property managers: fully automated, opaque pricing makes managers dependent on the platform without understanding it; showing managers the demand forecast and the reasoning behind recommendations builds trust and reduces the override rate
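The 70/30 blend and the stability guardrails can be combined in a few lines. This is a sketch under assumptions: the function names are hypothetical, the blend is taken to be a weighted average of a demand-derived price and a competitor-derived price, and the 20% daily cap is measured against the price at the start of the day.

```python
def blended_price(
    demand_price: float,
    competitive_price: float,
    demand_weight: float = 0.7,
) -> float:
    """Weighted average of the demand-based and competitor-based prices."""
    return demand_weight * demand_price + (1 - demand_weight) * competitive_price


def apply_guardrails(
    proposed_price: float,
    day_open_price: float,
    current_price: float,
    seconds_since_last_change: float,
    max_daily_change: float = 0.20,
    stability_window_seconds: float = 2 * 3600,
) -> float:
    """Hold the price during the stability window, then cap the daily move."""
    if seconds_since_last_change < stability_window_seconds:
        return current_price  # too soon after the previous change

    lower = day_open_price * (1 - max_daily_change)
    upper = day_open_price * (1 + max_daily_change)
    return min(max(proposed_price, lower), upper)


proposed = blended_price(demand_price=200.0, competitive_price=100.0)
print(apply_guardrails(proposed, day_open_price=100.0, current_price=100.0,
                       seconds_since_last_change=3 * 3600))
print(apply_guardrails(proposed, day_open_price=100.0, current_price=105.0,
                       seconds_since_last_change=600))
```

In the first call the blended price of about 170 is cut to the 20% cap; in the second the price is held because the previous change was only ten minutes earlier.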