System Design: Booking.com
Design Booking.com's accommodation marketplace covering multi-property search, rate/availability sync, and reservation management across millions of hotel partners.
Requirements
Functional Requirements:
- Travelers search properties by destination, dates, guest count, and filters
- Real-time availability and rate display from 2+ million properties worldwide
- Instant confirmation for most bookings; request-based for some boutique properties
- Multi-room, multi-night bookings with complex rate rules (non-refundable, early bird)
- Property management system (PMS) integration via channel manager APIs
- Cancellation and modification handling with policy-based fee calculation
Non-Functional Requirements:
- Search returns results in under 300ms at the 95th percentile
- Availability sync from property PMSes within 60 seconds of update
- Support 2 million properties; 550 million guest reviews; 1.5 million room nights/day
- Overbooking rate must be below 0.1%, because Booking.com compensates guests who are walked (relocated after arriving at an overbooked property)
- 99.99% uptime for the booking path
Scale Estimation
Inventory: 2 million properties × an average of 50 rooms × 365 nights = 36.5 billion room-night inventory records. Bookings: 1.5 million/day ≈ 17 bookings/second on average, 150/second at peak. Search: ~2 billion searches/year ≈ 63 searches/second on average, 2,000/second at peak. Availability sync: PMSes update rates and availability throughout the day; 2 million properties sending an average of 50 updates/day is 100 million updates/day ≈ 1,160 updates/second on average, with higher bursts when many PMSes push batch syncs at once.
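The estimates can be reproduced with a few lines of arithmetic. This is a sketch that uses only the inputs stated in this section; the variable names are illustrative. Multiplying out the stated sync inputs (2 million properties × 50 updates/day) gives 100 million updates/day.

```python
SECONDS_PER_DAY = 24 * 60 * 60
DAYS_PER_YEAR = 365

# Inputs taken from the requirements and scale estimation text.
properties = 2_000_000
avg_rooms_per_property = 50
bookings_per_day = 1_500_000
searches_per_year = 2_000_000_000
updates_per_property_per_day = 50

room_night_records = properties * avg_rooms_per_property * DAYS_PER_YEAR
bookings_per_second = bookings_per_day / SECONDS_PER_DAY
searches_per_second = searches_per_year / (DAYS_PER_YEAR * SECONDS_PER_DAY)
updates_per_day = properties * updates_per_property_per_day
updates_per_second = updates_per_day / SECONDS_PER_DAY

print(f"room-night records : {room_night_records:,}")
print(f"bookings/second    : {bookings_per_second:.1f}")
print(f"searches/second    : {searches_per_second:.1f}")
print(f"sync updates/day   : {updates_per_day:,}")
print(f"sync updates/second: {updates_per_second:.0f}")
```

These are averages; the peak figures in the text are separate assumptions and are not derived here.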
High-Level Architecture
Booking.com's core challenge is synchronizing availability and rates from 2 million independent property systems while serving sub-second search results to millions of concurrent travelers.
The Inventory Management layer connects to property PMSes via Channel Manager APIs (SiteMinder, STAAH, direct PMS integrations). Updates flow into a Rate & Availability (RnA) Processor that normalizes data from 500+ different PMS formats into a canonical availability model. Normalized data is written to the Inventory Store (a custom distributed store optimized for availability-range queries) and published to the Search Index via a CDC (Change Data Capture) pipeline.
The Search Service queries Elasticsearch for property discovery (filters, text, geo) and the Inventory Store for real-time availability checking of candidate properties. A Rate Calculator Service computes the final price for each combination of dates, room type, and rate plan (with promotion codes, loyalty discounts, and currency conversion).
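To make the Rate Calculator Service concrete, here is a minimal sketch of pricing one stay for a single room type and rate plan. Everything in it is an assumption for illustration: the function and field names are hypothetical, discounts are applied multiplicatively (promotion first, then loyalty), and the total is rounded once after currency conversion. A real rate engine would also handle taxes, occupancy-based pricing, and per-plan restrictions.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from decimal import Decimal, ROUND_HALF_UP


@dataclass(frozen=True)
class Quote:
    nights: int
    total: Decimal
    currency: str


def stay_nights(checkin: date, checkout: date) -> list[date]:
    """Nights of the stay; the checkout date itself is not charged."""
    return [checkin + timedelta(days=i) for i in range((checkout - checkin).days)]


def quote_stay(nightly_rates, checkin, checkout, promo_pct=Decimal("0"),
               loyalty_pct=Decimal("0"), fx_rate=Decimal("1"), currency="EUR") -> Quote:
    """Price a stay from per-night rates (hypothetical discount order and rounding)."""
    nights = stay_nights(checkin, checkout)
    if not nights:
        raise ValueError("checkout must be after checkin")
    missing = [d for d in nights if d not in nightly_rates]
    if missing:
        raise KeyError(f"no rate loaded for {missing[0].isoformat()}")
    base = sum((nightly_rates[d] for d in nights), Decimal("0"))
    discounted = base * (1 - promo_pct / 100) * (1 - loyalty_pct / 100)
    total = (discounted * fx_rate).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
    return Quote(nights=len(nights), total=total, currency=currency)


# Usage: three nights with a 10% promotion, 5% loyalty discount, EUR -> USD at 1.10.
rates = {
    date(2025, 7, 1): Decimal("100"),
    date(2025, 7, 2): Decimal("100"),
    date(2025, 7, 3): Decimal("120"),
}
quote = quote_stay(rates, date(2025, 7, 1), date(2025, 7, 4),
                   promo_pct=Decimal("10"), loyalty_pct=Decimal("5"),
                   fx_rate=Decimal("1.10"), currency="USD")
print(quote)
```

Decimal is used instead of float so that repeated discounting and conversion do not accumulate binary rounding error in displayed prices.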
Core Components
Rate & Availability Processor
Channel managers post availability updates via REST webhooks. Each update specifies: property_id, room_type_id, date_range, availability (rooms_left), and rate_plan_id → price. The RnA Processor validates, deduplicates (idempotent by checksum), and writes to the Inventory Store with compare-and-swap semantics (only apply update if it's newer than stored version). A Kafka CDC stream carries normalized updates to the Elasticsearch search index and analytics pipelines.
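The validate, deduplicate, and compare-and-swap steps can be sketched as follows. This is an in-memory illustration, not the production design: the class names, the per-night granularity (the text describes date ranges), and the integer version field supplied by the sender are all assumptions, and a real store would perform the version check atomically on the server side.

```python
import hashlib
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class AvailabilityUpdate:
    property_id: int
    room_type_id: int
    night: str          # ISO date; one night per update for simplicity
    rooms_left: int
    rate_plan_id: int
    price: str          # decimal amount as a string
    version: int        # hypothetical monotonically increasing version

    def key(self):
        return (self.property_id, self.room_type_id, self.night, self.rate_plan_id)

    def checksum(self) -> str:
        payload = json.dumps(asdict(self), sort_keys=True).encode()
        return hashlib.sha256(payload).hexdigest()


class InMemoryInventoryStore:
    """Stand-in for the Inventory Store; only models versioned writes."""

    def __init__(self):
        self._rows = {}

    def compare_and_swap(self, update: AvailabilityUpdate) -> bool:
        current = self._rows.get(update.key())
        if current is not None and current.version >= update.version:
            return False  # stored row is as new or newer; drop the update
        self._rows[update.key()] = update
        return True

    def get(self, key):
        return self._rows.get(key)


class RnAProcessor:
    def __init__(self, store: InMemoryInventoryStore):
        self.store = store
        self._seen_checksums = set()

    def process(self, update: AvailabilityUpdate) -> str:
        if update.rooms_left < 0:
            return "rejected"                 # validation
        digest = update.checksum()
        if digest in self._seen_checksums:
            return "duplicate"                # idempotent redelivery
        self._seen_checksums.add(digest)
        return "applied" if self.store.compare_and_swap(update) else "stale"
```

Returning a status string instead of raising keeps redelivered or out-of-order webhooks cheap to handle, which matters when channel managers retry aggressively.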
Inventory Store
The custom Inventory Store is a distributed system (inspired by Booking.com's open-sourced approach) that stores availability as compressed bitmaps per (property_id, room_type_id). Each bit represents one day (set when the room type still has rooms to sell), so a multi-night availability query reduces to fast bitwise AND operations. Rate data (price per night per room_type_id per rate_plan_id) is stored in a columnar format optimized for range scans by date. The store is sharded by property_id and replicated across 3 data centers in Europe, US, and APAC.
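The bitmap idea can be illustrated with plain Python integers. This is a sketch under stated assumptions: bit i stands for the day i days after a hypothetical EPOCH date, a set bit means at least one room is sellable that night, and compression is ignored entirely.

```python
from datetime import date

EPOCH = date(2025, 1, 1)  # hypothetical day zero for bit offsets


def day_offset(d: date) -> int:
    offset = (d - EPOCH).days
    if offset < 0:
        raise ValueError("date is before the bitmap epoch")
    return offset


def set_available(bitmap: int, d: date, available: bool) -> int:
    """Return a new bitmap with the bit for day d set or cleared."""
    bit = 1 << day_offset(d)
    return (bitmap | bit) if available else (bitmap & ~bit)


def stay_mask(checkin: date, checkout: date) -> int:
    """Mask with one bit per night of the stay (checkout night excluded)."""
    nights = (checkout - checkin).days
    if nights <= 0:
        raise ValueError("checkout must be after checkin")
    return ((1 << nights) - 1) << day_offset(checkin)


def is_available(bitmap: int, checkin: date, checkout: date) -> bool:
    """True only if every night of the stay has its bit set."""
    mask = stay_mask(checkin, checkout)
    return (bitmap & mask) == mask


# Usage: open 10-12 January, then sell out the 11th.
bitmap = 0
for day in (10, 11, 12):
    bitmap = set_available(bitmap, date(2025, 1, day), True)
print(is_available(bitmap, date(2025, 1, 10), date(2025, 1, 13)))
bitmap = set_available(bitmap, date(2025, 1, 11), False)
print(is_available(bitmap, date(2025, 1, 10), date(2025, 1, 13)))
```

A single AND plus a comparison answers the whole stay, regardless of how many nights it spans, which is what makes the layout attractive for search-time filtering.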
Search & Ranking Service
Elasticsearch stores property-level documents (location, amenities, category, review score, star rating) and pre-computed availability windows (compressed date-range bitmaps synced from the Inventory Store). Search queries filter by geo bounding box, date availability, and guest capacity. The initial result set (top 200 properties) is then enriched with real-time rates from the Inventory Store and re-ranked by a gradient-boosted ranking model that considers price competitiveness, review score, booking conversion rate, and personalization signals.
Database Design
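Property master data lives in PostgreSQL, sharded by property_id: (property_id, name, address, location_geom, star_rating, amenities JSONB, cancellation_policy). Booking records live in MySQL with strict ACID guarantees, partitioned by created_at month. The booking table has a unique constraint on (property_id, room_type_id, date, booking_status) to prevent overbooking via database-level integrity. Note that this constraint allows only one row per room type, date, and status, so on its own it is correct only when a room type has a single sellable unit; room types with several rooms also need a per-date counter (or a unit identifier in the key) checked in the same transaction. Availability bitmaps are kept in the custom Inventory Store (RocksDB-backed). Reviews are stored in Cassandra: (property_id, review_id, guest_id, rating, text, helpful_votes, created_at), partitioned by property_id.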
API Design
- GET /v1/search?destination={}&checkin={}&checkout={}&rooms={}&adults={}: Returns a ranked property list with prices and availability checked in real time for the top candidates
- POST /v1/bookings: Creates a booking by validating availability with SELECT ... FOR UPDATE, charging the card, confirming the reservation, and sending a confirmation email to the guest and the property
- PUT /v1/properties/{id}/availability: Channel manager submits an availability/rate update; it is processed asynchronously and reflected in the search index within 60 seconds
- DELETE /v1/bookings/{booking_id}: Guest cancels; applies the cancellation policy (full refund, partial refund, or no refund) and processes any refund via the original payment method
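The availability check inside POST /v1/bookings can be sketched as one all-or-nothing transaction. To stay runnable without a database server, this example swaps in SQLite and a conditional UPDATE on a per-night counter instead of MySQL row locks; the room_inventory table and the reserve function are hypothetical names, and payment and email steps are left out.

```python
import sqlite3


def create_schema(conn: sqlite3.Connection) -> None:
    conn.execute(
        """
        CREATE TABLE room_inventory (
            property_id  INTEGER NOT NULL,
            room_type_id INTEGER NOT NULL,
            night        TEXT    NOT NULL,
            rooms_left   INTEGER NOT NULL CHECK (rooms_left >= 0),
            PRIMARY KEY (property_id, room_type_id, night)
        )
        """
    )


def reserve(conn: sqlite3.Connection, property_id: int, room_type_id: int,
            nights: list[str]) -> bool:
    """Take one room for every night of the stay, or change nothing at all."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            for night in nights:
                cur = conn.execute(
                    "UPDATE room_inventory SET rooms_left = rooms_left - 1 "
                    "WHERE property_id = ? AND room_type_id = ? "
                    "AND night = ? AND rooms_left > 0",
                    (property_id, room_type_id, night),
                )
                if cur.rowcount != 1:
                    raise LookupError(f"no rooms left on {night}")
    except LookupError:
        return False
    return True


# Usage: one room left on the first night, two on the second.
conn = sqlite3.connect(":memory:")
create_schema(conn)
with conn:
    conn.executemany(
        "INSERT INTO room_inventory VALUES (?, ?, ?, ?)",
        [(1, 10, "2025-07-01", 1), (1, 10, "2025-07-02", 2)],
    )
print(reserve(conn, 1, 10, ["2025-07-01", "2025-07-02"]))  # first guest
print(reserve(conn, 1, 10, ["2025-07-02", "2025-07-01"]))  # second guest
```

The `rooms_left > 0` predicate makes the check and the decrement a single statement, so two concurrent bookings cannot both take the last room; the same pattern works in MySQL, where SELECT ... FOR UPDATE is the explicit-lock alternative.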
Scaling & Bottlenecks
Inventory sync load peaks during holiday season, when thousands of properties push availability updates simultaneously after a batch PMS sync (many PMSes run nightly batch exports). The RnA Processor runs as a Kafka consumer group reading a 50-partition topic to parallelize processing. For the most active properties (the top 1%, which generate 50% of bookings), availability is cached in Redis with a 30-second TTL and eagerly refreshed on each update, keeping the search path fast.
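The caching policy for hot properties (a 30-second TTL plus an eager refresh on every update) can be sketched with an in-memory stand-in for Redis. The class and function names are hypothetical, and the injectable clock exists only so the expiry behavior can be exercised without sleeping.

```python
import time


class TTLCache:
    """Tiny in-memory stand-in for a Redis cache with per-key expiry."""

    def __init__(self, ttl_seconds: float = 30.0, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}

    def put(self, key, value) -> None:
        self._entries[key] = (value, self._clock() + self._ttl)

    def get(self, key):
        entry = self._entries.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._entries[key]  # expired: force a reload
            return None
        return value


def read_availability(cache: TTLCache, key, load_from_store):
    """Search path: serve from cache, fall back to the Inventory Store."""
    cached = cache.get(key)
    if cached is not None:
        return cached
    value = load_from_store(key)
    cache.put(key, value)
    return value


def on_availability_update(cache: TTLCache, key, rooms_left: int, is_hot: bool) -> None:
    """Sync path: eagerly overwrite the cached value for hot properties."""
    if is_hot:
        cache.put(key, rooms_left)
```

The eager refresh keeps hot entries current between expiries, while the TTL bounds staleness if an update is ever missed.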
Search scaling uses multi-region Elasticsearch clusters: the EU cluster serves European searches, the US cluster serves the Americas, and the APAC cluster serves Asia. Property documents are replicated to all three clusters. Real-time availability enrichment (calling the Inventory Store for the top-200 results) is the latency bottleneck; a parallel fan-out with a 150ms timeout (returning the best available results if the Inventory Store is slow) keeps availability latency from degrading the search SLA.
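The fan-out with a deadline can be sketched with asyncio. This is an illustration only: fetch_live_rate is a hypothetical stand-in for an Inventory Store call with simulated latency, and falling back to the price already held in the search index is one assumed reading of "best available results".

```python
import asyncio


async def fetch_live_rate(property_id: int, latency_s: float) -> dict:
    """Stand-in for one Inventory Store call; latency_s simulates its delay."""
    await asyncio.sleep(latency_s)
    return {"property_id": property_id, "price": 100 + property_id, "live": True}


async def enrich(candidates, latencies, indexed_prices, timeout_s=0.150):
    """Query all candidates in parallel; keep what returns before the deadline."""
    tasks = {
        asyncio.create_task(fetch_live_rate(pid, latencies[pid])): pid
        for pid in candidates
    }
    done, pending = await asyncio.wait(list(tasks), timeout=timeout_s)
    for task in pending:
        task.cancel()  # stop waiting on slow shards
    await asyncio.gather(*pending, return_exceptions=True)

    live = {tasks[t]: t.result() for t in done if t.exception() is None}
    results = []
    for pid in candidates:
        # Fall back to the (possibly stale) indexed price when no live answer arrived.
        fallback = {"property_id": pid, "price": indexed_prices.get(pid), "live": False}
        results.append(live.get(pid, fallback))
    return results


# Usage: property 3 is too slow and is served from the indexed price.
results = asyncio.run(
    enrich([1, 2, 3], {1: 0.0, 2: 0.01, 3: 1.0}, {1: 90, 2: 95, 3: 99})
)
for row in results:
    print(row)
```

Because the deadline applies to the whole batch rather than to each call, total enrichment time is bounded by the timeout no matter how many candidates are slow.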
Key Trade-offs
- Real-time availability vs. eventual consistency in search: checking live availability for all 2 million properties on every search is infeasible, so Booking.com filters on cached availability windows in Elasticsearch (updated within 60 seconds) and reserves live Inventory Store checks for the top-200 candidates and the final booking step
- Overbooking risk vs. sync delay: accepting the 60-second sync delay means a property can still be shown as available up to a minute after its last room was taken; the <0.1% overbooking rate is maintained by real-time final checks at booking and strict enforcement of property SLAs
- Instant confirmation vs. request: 95% of properties are instant-confirm and the remaining 5% (boutique or unique properties) are request-based; the hybrid model maximizes inventory while accommodating property preferences
- Commission model: Booking.com charges a commission of around 15%, which creates a misalignment because properties would rather drive direct bookings; staying competitive requires constant investment in search quality and loyalty programs