SYSTEM_DESIGN
System Design: Multiplayer Matchmaking System
Design a scalable multiplayer matchmaking system that groups players into balanced game sessions based on skill rating, latency, and preferences, with minimal wait times and high match quality.
Requirements
Functional Requirements:
- Match players into game sessions based on skill rating (MMR/Elo), latency (ping to server region), game mode, and player preferences
- Support group queue: parties of 2-5 players queue together and match as a unit
- Dynamic skill expansion: widen acceptable skill range as wait time increases (quality-time trade-off)
- Backfill: replace players who disconnect before match start with suitable players from queue
- Leaver penalty: track and penalize players who frequently abandon queued matches
- Ranked and unranked modes with different matching algorithms
Non-Functional Requirements:
- Median match wait time under 45 seconds during peak hours
- Match skill imbalance (average team MMR difference) under 100 points for 90% of matches
- System must handle 500,000 concurrent players in queue
- Matchmaking decisions must be made and communicated within 2 seconds of a viable match forming
- Regional servers: separate matching pools per region (NA, EU, Asia) with cross-region fallback
Scale Estimation
500k concurrent players in queue globally across 5 regions = 100k per region on average. For a 5v5 game, we need to form groups of 10 balanced players. If median wait time is 45 seconds and we have 100k players in queue, throughput is 100k / 45 seconds × 10 players per match = ~2,200 matches formed/second per region. The matchmaker must evaluate match quality for candidate groups — a naive O(N²) comparison of all player pairs is infeasible at 100k players. The matchmaker uses a sorted structure (by MMR) to limit comparison to players within the current skill window.
High-Level Architecture
The matchmaking system is a set of regional matching services, each managing its own queue pool. Players enter a queue via the queue API (with their MMR, latency to regional servers, and preferences). The queue API adds the player to a Redis Sorted Set (score = MMR) within their region's queue and publishes a queue-entry event to the regional matching engine.
The matching engine is a stateful loop running every 500ms. Each iteration: (1) read all players from the Redis queue sorted by MMR, (2) apply a sliding window algorithm to find groups of 10 players within the current skill tolerance (starting at ±50 MMR, expanding by 10 MMR every 10 seconds of wait), (3) for each candidate group, verify latency compatibility (all players must have sub-100ms ping to the assigned server region), (4) lock the matched players (atomic Redis multi-key lock), (5) allocate a game server session (via the game server allocation API), (6) notify all matched players of their session details via WebSocket push. If a player doesn't acknowledge within 10 seconds, the match is cancelled and non-accepting players are penalized.
Group queuing (parties) is handled by treating the party as a single queue entity with the average MMR of the party members and the highest individual latency. The matchmaker fills the opposing team with 5 individual players or a matching party, applying an MMR adjustment to account for party coordination bonuses.
Core Components
Queue Manager
The queue manager maintains per-region Redis Sorted Sets for each game mode and rank tier. Players are inserted with a composite key encoding their MMR and queue entry timestamp. The entry_time is stored as a Redis hash field alongside the player's full queue entry (MMR, latency map, preferences, party_id if applicable). The queue manager handles: queue entry validation (not already queued, no active ban), duplicate detection (reconnecting client re-queuing), and queue removal (player cancels, disconnects, or is matched). Queue state is authoritative in Redis with a 10-minute TTL per entry to auto-remove stale connections.
Match Quality Evaluator
The match quality evaluator runs within the matching loop to score candidate matches before committing. Quality score components: (1) MMR balance score — 1 - (|team_A_avg_mmr - team_B_avg_mmr| / max_tolerance), (2) intra-team MMR variance score — lower variance = more consistent team, (3) latency score — average server latency across all players, (4) wait time fairness — players who have waited longest get priority. The composite quality score (weighted sum) must exceed a threshold before a match is formed. This prevents the system from forming technically valid but low-quality matches when slightly better matches are 10 seconds away.
Skill Rating (MMR) Service
Post-match, the MMR service computes Elo-style rating updates based on match outcome and pre-match win probability. Win probability is calculated as: P(A wins) = 1 / (1 + 10^((MMR_B - MMR_A) / 400)). The K-factor (how much MMR changes per match) scales with match count — new players have K=32 for faster calibration, stabilizing at K=16 after 30 matches. The service updates MMR in PostgreSQL (authoritative) and pushes the new value to Redis (for the queue). It also maintains separate ranked and casual MMR tracks — ranked MMR is the matchmaking basis; casual MMR converges toward ranked but updates more slowly.
Database Design
Redis: queue:{region}:{game_mode}:{tier} (Sorted Set: player_id → MMR), queue:player:{player_id} (hash: MMR, latency_map_json, party_id, queued_at, expanded_range), match_lock:{match_id} (string with TTL: locks players during match acceptance phase). PostgreSQL: player_mmr (player_id, game_mode, mmr, uncertainty, match_count, updated_at), matches (match_id, region, game_mode, started_at, team_a_players[], team_b_players[], team_a_mmr_avg, team_b_mmr_avg, outcome), queue_metrics (hourly aggregates: avg_wait_time, avg_mmr_diff, match_count per region/mode), leaver_penalty (player_id, penalty_count, last_penalty_at, ban_until). ClickHouse: match_quality_events (match_id, quality_score components) for matchmaker tuning analytics.
API Design
POST /queue/join— body:{player_id, game_mode, region_preferences: ["NA","EU"]}, validates, adds to queue, returns{queue_id, estimated_wait_seconds}; WebSocket connection maintained for match notificationDELETE /queue/leave— body:{player_id}, removes from queue; applies leaver penalty if match was already in acceptance phase- WebSocket
/queue/notifications?player_id={id}— server pushes:MatchFound{session_id, server_endpoint, team_assignment},MatchCancelled{reason},WaitTimeUpdate{estimated_seconds} POST /queue/accept— body:{match_id, player_id}, registers acceptance; after all 10 players accept, game server is provisionedGET /queue/status?region={r}&game_mode={m}— public endpoint: returns current queue size and estimated wait time by tier
Scaling & Bottlenecks
The matching loop's O(N) scan of the queue sorted set (reading 100k players per iteration every 500ms) is expensive at 200k reads/500ms = 400k Redis ops/second per region — within Redis limits but approaching single-instance capacity. Optimize with tier-based segmentation: divide the MMR range into 10 tiers (Bronze through Diamond) and run independent matching loops per tier. Each tier has ~10k players, reducing scan size by 10× and enabling parallel loop execution.
Match acceptance (10 players must confirm within 10 seconds) creates a thundering herd on the game server allocation service when many matches form simultaneously. Buffer allocation requests in a queue (SQS) to prevent spikes from overwhelming the Agones allocation API. If a player doesn't accept within 10 seconds, immediately return the other 9 players to the front of their queue (priority re-queue), not the back — they should get the next available match without losing their wait time.
Key Trade-offs
- Skill balance vs. wait time: Strict skill matching minimizes MMR imbalance but causes long queues for players with extreme ratings (top 1% or bottom 1%); expanding search range over time is a continuous trade-off. Publishing the expansion policy to players ("expanding search in 30s") manages expectations.
- Regional vs. global pools: Regional pools minimize latency but create thin queues for niche game modes in small regions; cross-region fallback (accepting 20ms extra latency for faster matching) is a pragmatic solution for small player populations.
- Deterministic vs. approximate matching: Solving the optimal matching problem (minimize total MMR imbalance across all concurrent matches) is NP-hard; the sliding window greedy approach provides near-optimal results in O(N log N) per iteration.
- Party MMR averaging vs. individual matching: Averaging party MMR means a high-MMR player can "carry" low-MMR friends into higher-tier matches; applying an MMR premium to parties (add 50 to team MMR for each party member advantage) reduces this but penalizes legitimate friend groups.
GO DEEPER
Master this topic in our 12-week cohort
Our Advanced System Design cohort covers this and 11 other deep-dive topics with live sessions, assignments, and expert feedback.