
System Design: Ride Matching Algorithm

A detailed breakdown of designing a real-time ride matching system — covering geospatial indexing, ETA computation, ranking algorithms, and multi-constraint optimization for ride-hailing platforms.

15 min read · Updated Jan 15, 2025
Tags: system-design · ride-matching · algorithms · geospatial · optimization · real-time

Requirements

Functional Requirements:

  • Match a rider request to the nearest available driver within seconds
  • Consider driver rating, acceptance rate, and vehicle type in ranking
  • Support pool/shared ride matching (multiple riders per vehicle)
  • Handle driver preferences (maximum pickup distance, preferred zones)
  • Provide estimated pickup time before match is confirmed
  • Re-match automatically if the initial driver cancels

Non-Functional Requirements:

  • Match latency: median under 500ms, 99th percentile under 2 seconds
  • System must handle 50,000 concurrent match requests during peak hours
  • Matching quality: minimize average rider wait time across all active requests
  • High availability — a matching outage directly causes revenue loss
  • Scalable to millions of drivers across hundreds of cities

Scale Estimation

Consider a platform with 5 million active drivers globally and 10 million daily ride requests. During peak (6–9 PM), ~500,000 requests/hour = ~140 requests/second. Each match attempt queries the geospatial index for ~20 candidate drivers and runs ~20 parallel ETA calculations. That's 2,800 ETA computations/second at peak. The geospatial index receives 5 million driver updates every 4 seconds = 1.25 million writes/second sustained.
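These figures can be reproduced with a few lines of arithmetic; the sketch below simply restates the assumptions from this section as constants and derives the same rates:

    # Back-of-envelope check of the peak-load figures quoted above.
    PEAK_REQUESTS_PER_HOUR = 500_000
    CANDIDATES_PER_REQUEST = 20
    ACTIVE_DRIVERS = 5_000_000
    UPDATE_INTERVAL_S = 4          # each driver reports its location every 4 s

    requests_per_second = PEAK_REQUESTS_PER_HOUR / 3600                        # ~140 rps
    eta_computations_per_second = requests_per_second * CANDIDATES_PER_REQUEST # ~2,800/s
    index_writes_per_second = ACTIVE_DRIVERS / UPDATE_INTERVAL_S               # 1.25M writes/s

    print(f"{requests_per_second:.0f} rps, "
          f"{eta_computations_per_second:.0f} ETA calcs/s, "
          f"{index_writes_per_second:,.0f} index writes/s")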

High-Level Architecture

The ride matching system sits between two data sources: the live driver supply index (continuously updated by the Location Service) and the rider demand queue (fed by the API gateway when riders submit requests). The Matching Engine reads from both to produce driver-rider pairs.

The Location Service maintains a geospatial index backed by Redis GEO or a custom H3-based structure. Each driver update triggers an upsert to the index. The index supports radius queries returning driver IDs sorted by distance within a given bounding box.
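A rough sketch of the Redis GEO variant is shown below, assuming the redis-py client and an illustrative per-city key name (drivers:geo:sf); the exact geoadd argument format differs slightly across redis-py versions:

    import redis

    r = redis.Redis(host="localhost", port=6379)

    # Driver location update: upsert into the city's GEO index.
    # Key name "drivers:geo:sf" is an illustrative choice, not part of the design above.
    r.geoadd("drivers:geo:sf", (-122.4194, 37.7749, "driver:12345"))

    # Radius query: driver IDs within 5 km of the pickup, nearest first.
    nearby = r.georadius(
        "drivers:geo:sf",
        -122.4089, 37.7837,        # pickup longitude, latitude
        5, unit="km",
        withdist=True, sort="ASC", count=20,
    )
    print(nearby)                  # [(b'driver:12345', 1.23), ...] -> candidate list for ranking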

The Matching Engine is an event-driven service. When a ride request enters the Demand Queue (Kafka topic), a Matching Worker picks it up, queries the supply index, computes ETAs via the Routing Service, applies the ranking model, and dispatches an offer. The result (match or timeout) is written back to Kafka for downstream processing (notifications, trip creation).
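The worker loop is shaped roughly as below, using kafka-python; the topic names and the four helper functions are placeholders standing in for the supply index, Routing Service, ranking model, and dispatch step described above:

    import json
    from kafka import KafkaConsumer, KafkaProducer

    def query_supply(pickup, radius_km=5):
        """Placeholder for the supply-index lookup (Redis / H3)."""
        return ["driver:1", "driver:2"]

    def compute_etas(drivers, pickup):
        """Placeholder for Routing Service calls; returns pickup ETA in seconds per driver."""
        return {d: 240 for d in drivers}

    def rank(drivers, etas, request):
        """Placeholder for the scoring model: lowest ETA first."""
        return sorted(drivers, key=lambda d: etas[d])

    def dispatch_offer(ranked, request):
        """Placeholder: offer the trip to the top-ranked driver and await acceptance."""
        return {"status": "MATCHED", "driver_id": ranked[0]}

    consumer = KafkaConsumer(
        "ride.demand",                         # demand topic name is an assumption
        group_id="matching-workers",
        bootstrap_servers="kafka:9092",
        value_deserializer=lambda v: json.loads(v),
    )
    producer = KafkaProducer(
        bootstrap_servers="kafka:9092",
        value_serializer=lambda v: json.dumps(v).encode(),
    )

    for message in consumer:
        request = message.value                # {"request_id": ..., "pickup": {...}, ...}
        candidates = query_supply(request["pickup"])
        ranked = rank(candidates, compute_etas(candidates, request["pickup"]), request)
        outcome = dispatch_offer(ranked, request) if ranked else {"status": "FAILED"}
        producer.send("ride.match.results", {**outcome, "request_id": request["request_id"]})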

Core Components

Geospatial Supply Index

Drivers are indexed using H3 hexagonal cells at resolution 8 (~0.74 km² per cell). Each cell maps to a Redis sorted set of (driver_id, last_updated_timestamp) pairs. A radius query for drivers within 5 km of a pickup resolves to a set of H3 cells at resolution 8 and fetches their sorted sets. Writes use a pipelined batch (ZADD + EXPIRE) to reduce round-trips. Index sharding is by continent → country → city cluster to keep individual Redis instances under 50k writes/second.
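A sketch of the upsert and query paths under those assumptions, using h3-py (v4 function names) and redis-py; the supply:{cell} key naming, the TTL value, and the ring count are illustrative:

    import time
    import h3          # h3-py v4 names: latlng_to_cell / grid_disk
    import redis

    r = redis.Redis()
    RES = 8                        # ~0.74 km^2 hexagons, as above
    TTL_S = 30                     # evict drivers that stop reporting (illustrative value)

    def upsert_driver(driver_id: str, lat: float, lng: float) -> None:
        cell = h3.latlng_to_cell(lat, lng, RES)
        key = f"supply:{cell}"                     # illustrative key name
        pipe = r.pipeline()                        # batch ZADD + EXPIRE in one round-trip
        pipe.zadd(key, {driver_id: time.time()})   # score = last_updated_timestamp
        pipe.expire(key, TTL_S)
        pipe.execute()

    def nearby_drivers(lat: float, lng: float, rings: int = 6) -> list[str]:
        """Candidates from the pickup cell plus `rings` of neighbours.

        At resolution 8 each ring adds roughly 0.8 km, so ~6 rings
        approximates the 5 km search radius described above.
        """
        cells = h3.grid_disk(h3.latlng_to_cell(lat, lng, RES), rings)
        cutoff = time.time() - TTL_S
        pipe = r.pipeline()
        for cell in cells:
            pipe.zrangebyscore(f"supply:{cell}", cutoff, "+inf")   # only fresh entries
        return [d.decode() for members in pipe.execute() for d in members]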

ETA Computation Service

Given a driver's current location and a rider's pickup point, the ETA Service computes driving time using a pre-loaded road graph (contraction hierarchy). For 20 candidates per request at 140 requests/second, that's 2,800 ETA computations/second. Each ETA call is sub-millisecond on a warm contraction hierarchy, so a pool of 50 routing workers handles this load easily. ETAs are adjusted for current traffic using segment speed data from the traffic pipeline (refreshed every 2 minutes).
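A full contraction-hierarchy query is out of scope for a short example, but the shape of a traffic-adjusted ETA lookup can be shown with a plain Dijkstra over a toy segment graph, where each segment's free-flow time is scaled by a live congestion ratio (both the graph and the ratios are made-up placeholders):

    import heapq

    # Road graph: node -> [(neighbour, free_flow_seconds, segment_id), ...]  (toy data)
    GRAPH = {
        "A": [("B", 60, "s1"), ("C", 90, "s2")],
        "B": [("D", 120, "s3")],
        "C": [("D", 45, "s4")],
        "D": [],
    }
    # Live traffic: segment_id -> (free-flow time / current time) multiplier, refreshed ~every 2 min
    TRAFFIC_RATIO = {"s1": 1.0, "s2": 1.4, "s3": 2.0, "s4": 1.1}

    def eta_seconds(src: str, dst: str) -> float:
        """Traffic-adjusted shortest travel time; stands in for the CH query."""
        dist = {src: 0.0}
        heap = [(0.0, src)]
        while heap:
            t, node = heapq.heappop(heap)
            if node == dst:
                return t
            if t > dist.get(node, float("inf")):
                continue
            for nbr, free_flow, seg in GRAPH[node]:
                cost = free_flow * TRAFFIC_RATIO.get(seg, 1.0)   # slow down congested segments
                if t + cost < dist.get(nbr, float("inf")):
                    dist[nbr] = t + cost
                    heapq.heappush(heap, (t + cost, nbr))
        return float("inf")

    print(eta_seconds("A", "D"))   # 90*1.4 + 45*1.1 = 175.5 s via A -> C -> D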

Ranking & Scoring Model

After ETA computation, candidates are scored: Score = w1 × (1/ETA) + w2 × acceptance_rate + w3 × rating + w4 × (1/distance_to_destination). Weights are tuned offline via a bandit algorithm that A/B tests weight combinations and optimizes for rider wait time and driver utilization. For shared rides (pool), a separate optimizer solves a multi-objective assignment problem: minimize total detour for existing riders while maximizing vehicle fill rate.
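A direct transcription of that scoring formula, with ETA in minutes and distance in kilometres; the weight values below are placeholders, since the real ones come from the offline bandit tuning:

    from dataclasses import dataclass

    @dataclass
    class Candidate:
        driver_id: str
        eta_min: float               # pickup ETA in minutes
        acceptance_rate: float       # 0..1
        rating: float                # e.g. 1..5
        dist_to_dest_km: float       # distance to the rider's destination

    # Placeholder weights -- the real values come from offline bandit tuning.
    W1, W2, W3, W4 = 1.0, 0.5, 0.2, 0.3

    def score(c: Candidate) -> float:
        return (W1 * (1.0 / c.eta_min)
                + W2 * c.acceptance_rate
                + W3 * c.rating
                + W4 * (1.0 / c.dist_to_dest_km))

    def rank(candidates: list[Candidate]) -> list[Candidate]:
        return sorted(candidates, key=score, reverse=True)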

Database Design

The supply index lives entirely in Redis (in-memory geospatial). Active ride requests are stored in a Redis hash (request_id → JSON blob) with a TTL of 10 minutes to auto-expire stale requests. Match outcomes (driver_id, rider_id, match_timestamp, ETA) are written to Cassandra for audit and analytics. A PostgreSQL table tracks driver preferences (max_pickup_distance, preferred_zones, vehicle_type) queried once per match attempt and cached per driver_id in Redis with a 5-minute TTL.
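The driver-preference lookup described above amounts to a read-through cache; a sketch follows, in which the connection strings, key name, and table name are assumptions (only the column names and the 5-minute TTL come from the design):

    import json
    import redis
    import psycopg2

    r = redis.Redis()
    pg = psycopg2.connect("dbname=rides")          # connection string is an assumption
    PREFS_TTL_S = 300                              # 5-minute cache, per the design above

    def driver_preferences(driver_id: str) -> dict:
        """Read-through cache: try Redis first, fall back to PostgreSQL."""
        key = f"driver:prefs:{driver_id}"          # illustrative key name
        cached = r.get(key)
        if cached is not None:
            return json.loads(cached)
        with pg.cursor() as cur:
            cur.execute(
                "SELECT max_pickup_distance, preferred_zones, vehicle_type "
                "FROM driver_preferences WHERE driver_id = %s",   # table name is an assumption
                (driver_id,),
            )
            row = cur.fetchone()
        prefs = {
            "max_pickup_distance": row[0],
            "preferred_zones": row[1],
            "vehicle_type": row[2],
        } if row else {}
        r.setex(key, PREFS_TTL_S, json.dumps(prefs))
        return prefs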

API Design

  • POST /v1/match/request — Internal API called by the Ride Request Service; accepts rider_id, pickup coordinates, ride_type; returns match_id and initial status
  • GET /v1/match/{match_id}/status — Returns current state: SEARCHING, MATCHED (with driver details and ETA), or FAILED (no drivers available)
  • POST /v1/drivers/{driver_id}/availability — Driver app updates availability state (AVAILABLE, ON_TRIP, OFFLINE) which triggers supply index update
  • POST /v1/match/{match_id}/outcome — Driver app reports ACCEPTED or DECLINED; triggers re-match flow on decline
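For illustration, an internal caller might exercise the first two endpoints as below; the host name and the exact JSON field names are assumptions based on the parameters listed above:

    import time
    import requests

    BASE = "http://matching.internal"            # internal host name is an assumption

    # Submit a match request on behalf of a rider.
    resp = requests.post(f"{BASE}/v1/match/request", json={
        "rider_id": "rider-42",
        "pickup": {"lat": 37.7837, "lng": -122.4089},
        "ride_type": "standard",
    })
    match_id = resp.json()["match_id"]

    # Poll for the outcome: SEARCHING -> MATCHED or FAILED.
    while True:
        status = requests.get(f"{BASE}/v1/match/{match_id}/status").json()
        if status["status"] in ("MATCHED", "FAILED"):
            print(status)                        # driver details + ETA on success
            break
        time.sleep(0.5)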

Scaling & Bottlenecks

The primary bottleneck is the geospatial index write throughput (1.25 million updates/second). Mitigation: shard by geography (each city-cluster gets its own Redis primary); batch driver updates on the client side (send every 4 seconds, not more); use Redis pipelines to reduce RTT overhead. A city with 100,000 active drivers produces only 25,000 updates/second on its dedicated Redis — well within single-instance limits.
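One simple way to express the geographic sharding is a static routing table from city cluster to its dedicated Redis primary, as in this sketch (cluster names and endpoints are illustrative):

    import redis

    # Illustrative routing table: city cluster -> dedicated Redis primary.
    SHARDS = {
        "us-west/sf-bay": "redis://supply-sfbay:6379",
        "us-east/nyc":    "redis://supply-nyc:6379",
        "eu-west/london": "redis://supply-london:6379",
    }
    _clients = {name: redis.from_url(url) for name, url in SHARDS.items()}

    def supply_shard(city_cluster: str) -> redis.Redis:
        """Route every supply read/write for a driver or pickup to its city cluster's shard."""
        return _clients[city_cluster]

    # e.g. supply_shard("us-west/sf-bay").zadd("supply:<h3_cell>", {"driver:1": 0})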

The matching workers are stateless and horizontally scalable via Kafka consumer groups. Adding partitions to the demand topic and workers to the consumer group linearly increases throughput. The routing service is the other CPU-intensive component; it scales horizontally with geographic graph sharding (each shard covers one continent loaded into memory on dedicated routing servers).

Key Trade-offs

  • Global optimum vs. greedy matching — a global optimizer (Hungarian algorithm) minimizes total wait time but is O(n³); greedy nearest-driver matching is O(n log n) and only slightly suboptimal at scale (a greedy sketch follows this list)
  • Timeout vs. match quality — a longer driver offer timeout (15s vs. 8s) increases the chance of acceptance but delays re-match; 8 seconds is the empirically validated sweet spot
  • Single offer vs. broadcast — broadcasting to multiple drivers simultaneously speeds matching but causes double-bookings requiring cancellation; Uber and Lyft both use sequential single-offer dispatch
  • Pool matching delay — pool rides wait up to 90 seconds to batch multiple riders in the same direction, improving vehicle utilization at the cost of rider wait time
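The greedy strategy from the first trade-off can be sketched as follows: repeatedly take the lowest-ETA (rider, driver) pair and remove both from the pool. The ETA matrix is assumed to have been computed already for the candidate set:

    import heapq

    def greedy_match(eta: dict[tuple[str, str], float]) -> list[tuple[str, str]]:
        """Greedily pair riders and drivers by ascending pickup ETA.

        `eta` maps (rider_id, driver_id) -> ETA in seconds for every candidate
        pair. The heap operations dominate, so this is O(n log n) in the number
        of pairs, versus O(n^3) for a Hungarian-style global optimum.
        """
        heap = [(t, rider, driver) for (rider, driver), t in eta.items()]
        heapq.heapify(heap)
        matched_riders, matched_drivers, pairs = set(), set(), []
        while heap:
            t, rider, driver = heapq.heappop(heap)
            if rider in matched_riders or driver in matched_drivers:
                continue                          # one of the pair is already taken
            matched_riders.add(rider)
            matched_drivers.add(driver)
            pairs.append((rider, driver))
        return pairs

    print(greedy_match({
        ("r1", "d1"): 120, ("r1", "d2"): 300,
        ("r2", "d1"): 150, ("r2", "d2"): 200,
    }))   # [('r1', 'd1'), ('r2', 'd2')]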
