
System Design: Uber

A deep dive into designing Uber's ride-hailing platform covering real-time matching, GPS tracking, surge pricing, and global scale. Essential reading for engineers preparing for system design interviews.

18 min read · Updated Jan 15, 2025
Tags: system-design · uber · ride-hailing · real-time · geospatial · matching

Requirements

Functional Requirements:

  • Riders can request a ride by specifying pickup and dropoff locations
  • System matches riders with nearby available drivers in real time
  • Drivers and riders can track each other's location on a live map
  • System calculates fare estimates before and final cost after the ride
  • Surge pricing activates automatically during high demand periods
  • Riders and drivers can rate each other after trip completion

Non-Functional Requirements:

  • Match latency under 2 seconds for 99th percentile requests
  • 99.99% availability — downtime means lost revenue and stranded riders
  • Support 20 million concurrent users across 70+ countries
  • Location updates processed at under 500ms end-to-end latency
  • Strong consistency for payments; eventual consistency acceptable for location data

Scale Estimation

Uber processes roughly 25 million trips per day globally. With ~5 million active drivers, each sending GPS pings every 4 seconds, that's ~1.25 million location updates per second. Each update is ~100 bytes, yielding ~125 MB/s inbound location data. Trip records at ~1 KB each produce ~25 GB/day of trip data. A 5-year retention policy demands ~45 TB of trip storage, easily managed with columnar storage on S3 + Redshift for analytics.
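The estimates above can be reproduced with a quick back-of-envelope script; the input figures are exactly the assumptions stated in this section:

```python
# Back-of-envelope scale estimation using the figures above.
TRIPS_PER_DAY = 25_000_000
ACTIVE_DRIVERS = 5_000_000
PING_INTERVAL_S = 4
BYTES_PER_PING = 100
BYTES_PER_TRIP = 1_000
RETENTION_YEARS = 5

updates_per_sec = ACTIVE_DRIVERS / PING_INTERVAL_S            # 1.25M updates/s
location_mb_per_sec = updates_per_sec * BYTES_PER_PING / 1e6  # 125 MB/s inbound
trip_gb_per_day = TRIPS_PER_DAY * BYTES_PER_TRIP / 1e9        # 25 GB/day
trip_tb_retained = trip_gb_per_day * 365 * RETENTION_YEARS / 1e3  # ~45.6 TB

print(f"{updates_per_sec:,.0f} updates/s, {location_mb_per_sec:.0f} MB/s, "
      f"{trip_gb_per_day:.0f} GB/day, {trip_tb_retained:.1f} TB retained")
```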

High-Level Architecture

Uber's architecture is organized around three core flows: supply tracking (driver locations), demand intake (rider requests), and the matching engine that joins them. Drivers run a mobile SDK that emits location pings over a persistent WebSocket or gRPC connection to a fleet of Location Ingestion Services. These services write to a geospatial index (Redis geohash sets, sharded across nodes with Ringpop, Uber's open-source consistent-hashing membership library) so that at any moment the system knows which drivers are within a given bounding box.

When a rider submits a request, the Dispatch Service queries the geospatial index for nearby available drivers, scores them by ETA (computed via a routing engine such as OSRM or Google Maps Platform), and issues a match offer to the best candidate. The driver app receives the offer via a push notification or persistent connection and accepts or declines within a timeout window; on acceptance, the trip FSM (finite state machine) transitions through CREATED → MATCHING → ACCEPTED → ARRIVING → IN_PROGRESS → COMPLETED.

A separate Pricing Service runs surge calculations every 60 seconds by comparing supply/demand ratios in each geofenced zone. Completed trips flow into a Payment Service that charges the rider's stored card through Stripe or Braintree, then queues a driver payout via ACH or Instant Pay.
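The periodic surge calculation can be sketched as a per-zone function of the demand/supply ratio. The formula and cap below are illustrative assumptions, not Uber's actual pricing model:

```python
# Hypothetical surge calculation for one geofenced zone. The linear
# ramp and the 3.0x cap are illustrative, not Uber's real model.
def surge_multiplier(open_requests: int, available_drivers: int,
                     cap: float = 3.0) -> float:
    """Return a surge multiplier >= 1.0 based on the demand/supply ratio."""
    if available_drivers == 0:
        return cap  # no supply at all: maximum surge
    ratio = open_requests / available_drivers
    # No surge while supply covers demand; scale up linearly above that.
    return min(cap, max(1.0, ratio))

# A zone with 90 open requests and 60 free drivers surges to 1.5x.
print(surge_multiplier(90, 60))  # 1.5
```

Running this every 60 seconds per zone, as the text describes, keeps pricing responsive without recomputing on every individual request.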

Core Components

Geospatial Index

The geospatial index is the beating heart of Uber's matching. Drivers are partitioned into geohash cells (roughly 1.2 km × 0.6 km at precision 6). Redis GEOADD/GEOSEARCH commands provide O(N + log M) nearest-neighbor queries. At Uber's scale, the H3 hexagonal grid (Uber's open-source library) shards the planet into cells of ~0.74 km² at resolution 8 — on the order of 700 million cells globally — each owning a small sorted set of driver IDs. Writes are extremely hot, so the index is sharded across multiple Redis clusters behind a consistent-hash ring, with read replicas absorbing query load.
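To make the cell-partitioning idea concrete, here is a minimal geohash encoder. A production system would rely on Redis GEOADD (which geohashes internally) or the H3 library rather than hand-rolling this:

```python
# Minimal geohash encoder: interleaves longitude/latitude bisection
# bits and emits base32 characters. Nearby points share a prefix, so
# a cell is simply "all drivers whose hash starts with X".
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"

def geohash(lat: float, lng: float, precision: int = 6) -> str:
    lat_range, lng_range = [-90.0, 90.0], [-180.0, 180.0]
    hashed: list[str] = []
    bits, ch, even = 0, 0, True  # even-numbered bits encode longitude
    while len(hashed) < precision:
        rng = lng_range if even else lat_range
        val = lng if even else lat
        mid = (rng[0] + rng[1]) / 2
        ch <<= 1
        if val >= mid:
            ch |= 1
            rng[0] = mid  # keep the upper half
        else:
            rng[1] = mid  # keep the lower half
        even = not even
        bits += 1
        if bits == 5:  # 5 bits per base32 character
            hashed.append(BASE32[ch])
            bits, ch = 0, 0
    return "".join(hashed)

print(geohash(37.7749, -122.4194))  # San Francisco, precision-6 cell
```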

Dispatch & Matching Engine

The Dispatch Service runs as a stateless horizontally scaled microservice. On each rider request, it fans out to the geospatial index for a candidate set (typically top-20 nearest drivers), computes ETA for each via a pre-computed road graph, ranks by ETA + acceptance rate + rating, and sends an offer to the top candidate. If declined or timed out (8 seconds), it moves to the next candidate. The entire flow is orchestrated with a saga pattern to handle partial failures.
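The ranking and sequential-offer loop described above can be sketched as follows. The scoring weights, the candidate data, and the offer transport are stand-in assumptions; a real dispatcher would call out to the geospatial index and the routing service:

```python
# Sketch of the dispatch loop: rank candidates, then offer the ride to
# one driver at a time until someone accepts or the list is exhausted.
from dataclasses import dataclass

@dataclass
class Candidate:
    driver_id: str
    eta_s: float            # routing-engine ETA to pickup, in seconds
    acceptance_rate: float  # historical accept rate, 0.0-1.0
    rating: float           # average star rating, 1.0-5.0

def rank(candidates: list[Candidate]) -> list[Candidate]:
    # Lower score is better: ETA dominates, softened by driver quality.
    # These weights are illustrative assumptions, not production values.
    def score(c: Candidate) -> float:
        return c.eta_s - 60 * c.acceptance_rate - 10 * (c.rating - 4.0)
    return sorted(candidates, key=score)

def dispatch(candidates, send_offer, timeout_s: float = 8.0):
    """Offer sequentially; send_offer returns True if accepted in time."""
    for c in rank(candidates):
        if send_offer(c.driver_id, timeout_s):
            return c.driver_id
    return None  # nobody accepted; the trip re-enters matching

drivers = [Candidate("d1", 240, 0.9, 4.8), Candidate("d2", 180, 0.5, 4.2)]
print(dispatch(drivers, lambda d, t: d == "d1"))  # d2 declines, d1 accepts
```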

Trip State Machine

Each trip is a document in a distributed key-value store (Cassandra for scale) with a well-defined FSM: CREATED → MATCHING → ACCEPTED → ARRIVING → IN_PROGRESS → COMPLETED | CANCELLED. State transitions are idempotent and write-ahead-logged. Kafka topics carry transition events downstream to analytics, billing, and notification services.
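The FSM from this section can be sketched as an explicit transition table. Encoding the legal edges and treating a re-applied transition as a no-op is one simple way to get the idempotence the text calls for:

```python
# Trip FSM with the states from this section. The table enforces legal
# edges; replaying a transition that already happened is a no-op, which
# makes duplicate event delivery (e.g. Kafka redelivery) safe.
TRANSITIONS = {
    "CREATED":     {"MATCHING", "CANCELLED"},
    "MATCHING":    {"ACCEPTED", "CANCELLED"},
    "ACCEPTED":    {"ARRIVING", "CANCELLED"},
    "ARRIVING":    {"IN_PROGRESS", "CANCELLED"},
    "IN_PROGRESS": {"COMPLETED"},
    "COMPLETED":   set(),  # terminal
    "CANCELLED":   set(),  # terminal
}

def transition(current: str, target: str) -> str:
    if target == current:
        return current  # duplicate event: idempotent no-op
    if target not in TRANSITIONS[current]:
        raise ValueError(f"illegal transition {current} -> {target}")
    return target

state = "CREATED"
for event in ["MATCHING", "ACCEPTED", "ACCEPTED",  # duplicate is harmless
              "ARRIVING", "IN_PROGRESS", "COMPLETED"]:
    state = transition(state, event)
print(state)  # COMPLETED
```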

Database Design

Trip data lives in Cassandra partitioned by trip_id (UUID), with denormalized query tables keyed by rider_id and driver_id for history lookups (Cassandra secondary indexes degrade badly at this scale). Location history uses a time-series approach: each driver gets a row in Cassandra keyed by (driver_id, date) with a map of timestamp→coordinates, capped at 24 hours before archival to S3 as Parquet. User profiles and payment methods live in a sharded relational cluster (CockroachDB, or Vitess over MySQL) for ACID guarantees. Geospatial queries go through the Redis geohash layer rather than PostGIS to keep latency low.
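The (driver_id, date) time-series keying described above can be sketched as a small bucketing function. Daily buckets keep partitions bounded: at one ping every 4 seconds, a driver writes at most 21,600 entries per day:

```python
# Sketch of the (driver_id, date) partitioning for location history:
# all of a driver's pings for one day land in one daily partition.
from datetime import datetime, timezone

def location_row_key(driver_id: str, ts: datetime) -> tuple[str, str]:
    """Partition key for a location ping: a daily bucket per driver."""
    return (driver_id, ts.strftime("%Y-%m-%d"))

ping_time = datetime(2025, 1, 15, 8, 30, tzinfo=timezone.utc)
print(location_row_key("driver-42", ping_time))  # ('driver-42', '2025-01-15')
```

Bounding each partition to one day is what makes the 24-hour cap and the archival sweep to S3 a simple per-partition job.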

API Design

  • POST /v1/rides — Rider submits pickup_lat, pickup_lng, dropoff_lat, dropoff_lng; returns ride_id, estimated fare, and driver ETA
  • GET /v1/rides/{ride_id} — Polls trip state, driver location, and dynamic ETA; used by the rider app to update the live map
  • PATCH /v1/drivers/location — Driver SDK sends current lat/lng + heading + speed every 4 seconds via an authenticated bulk-update endpoint
  • POST /v1/rides/{ride_id}/rating — Either party submits a 1–5 star rating with optional text after trip completion
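As a concrete illustration of the first endpoint, the POST /v1/rides request body can be modeled with the fields listed above; the bounds-checking shown is an assumption about server-side validation, not a documented contract:

```python
# Illustrative request shape for POST /v1/rides. Field names follow the
# endpoint description above; the validation rules are assumptions.
from dataclasses import dataclass

@dataclass
class RideRequest:
    pickup_lat: float
    pickup_lng: float
    dropoff_lat: float
    dropoff_lng: float

    def validate(self) -> None:
        """Reject coordinates outside valid lat/lng ranges."""
        for lat in (self.pickup_lat, self.dropoff_lat):
            if not -90 <= lat <= 90:
                raise ValueError(f"latitude out of range: {lat}")
        for lng in (self.pickup_lng, self.dropoff_lng):
            if not -180 <= lng <= 180:
                raise ValueError(f"longitude out of range: {lng}")

req = RideRequest(37.7749, -122.4194, 37.8044, -122.2712)  # SF -> Oakland
req.validate()  # raises ValueError on malformed coordinates
print("ok")
```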

Scaling & Bottlenecks

The geospatial index is the primary write bottleneck at 1.25 million updates/second. Mitigation involves sharding Redis clusters by geographic region (Americas, EMEA, APAC) and, within each region, by city, so no single Redis instance handles more than ~50k updates/second. Read replicas serve ETA fan-out queries. Each client batches its GPS samples into a single request per 4-second ping to reduce connection overhead.
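The two-level routing described above can be sketched as region selection followed by a stable hash onto that region's shards. The shard counts and city-to-region map here are illustrative assumptions:

```python
# Sketch of two-level sharding: route an update first by region, then
# hash the city onto that region's Redis shards. Shard counts and the
# city->region map below are made-up illustrative values.
import hashlib

REGION_SHARDS = {"americas": 32, "emea": 24, "apac": 28}
CITY_REGION = {"san_francisco": "americas", "london": "emea",
               "singapore": "apac"}

def shard_for(city: str) -> str:
    region = CITY_REGION[city]
    n = REGION_SHARDS[region]
    # Stable hash (not Python's randomized hash()) so the same city
    # always maps to the same shard across processes and restarts.
    h = int(hashlib.md5(city.encode()).hexdigest(), 16)
    return f"{region}-redis-{h % n}"

print(shard_for("san_francisco"))
```

Keeping one city's drivers on one shard also means a single GEOSEARCH can serve most match queries without cross-shard fan-out.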

The matching engine scales horizontally since it is stateless — each Dispatch pod handles ~2,000 concurrent match attempts. The routing ETA service caches road-graph segments in memory (OSRM pre-computes contraction hierarchies), so ETA lookups are sub-millisecond. Kafka decouples the high-throughput event stream from downstream consumers such as analytics and billing, so backpressure from slower services never affects trip latency.

Key Trade-offs

  • Eventual consistency for location vs. strong consistency for payments — location data can lag 4 seconds acceptably, but double-charges are catastrophic, so payments use synchronous ACID transactions
  • ETA accuracy vs. computation cost — pre-computed contraction hierarchies sacrifice real-time traffic precision for speed; a live traffic layer (Waze data feed) is blended in as a correction factor
  • Push vs. poll for driver offers — persistent WebSocket connections reduce latency but require more server-side state; connection pools are managed by a dedicated gateway service
  • Geohash precision vs. index size — finer precision (higher resolution H3 cells) improves match quality near cell boundaries but multiplies index storage; precision 8 (~0.7 km² cells) is the production sweet spot
