
System Design: Game Replay System

Design a scalable game replay system that records, stores, and streams match replays for review, highlight sharing, and spectating with frame-accurate playback at any speed.


Requirements

Functional Requirements:

  • Record all game sessions as replay files: complete authoritative input log and initial state snapshot
  • Players can watch their own replays, share replay links publicly, and bookmark specific timestamps
  • Replay playback at variable speeds (0.25×, 1×, 2×, 4×) with scrubbing to any point
  • Highlight clips: players or the system can mark specific timestamp ranges for sharing as short clips
  • Professional spectator mode: low-latency live spectating of ongoing matches
  • Anti-cheat use: analysts can review replays in detail, including per-player input traces

Non-Functional Requirements:

  • Replay files stored for 90 days; shared/bookmarked replays stored indefinitely
  • Recording overhead on the game server: under 3% CPU
  • Support 500,000 replay viewers simultaneously
  • Replay file size: under 50 MB for a 30-minute match
  • Scrubbing latency: reaching any point in a replay within 2 seconds

Scale Estimation

At peak, 100k concurrent game sessions each record ~20 KB/second of replay data (the compressed input log itself is ~1 KB/second — negligible next to game bandwidth), for roughly 2 GB/second of replay ingestion globally. For storage, assuming ~100k retained sessions per day at 30 minutes average, that's 100k × 50 MB × 90 days = 450 TB of replay data; with deduplication and a further ~2× compression of input logs, practical storage is closer to 200 TB. Replay playback is simulation-based (deterministic replay): the server re-runs the match simulation from the initial state, applying the recorded inputs tick by tick. This means replay playback costs about the same as live game simulation — replay servers need the same game server binary and the same game state capacity as live servers.
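
As a sanity check, the figures above can be reproduced with a quick back-of-envelope script (the inputs are the assumptions stated in this section, not measurements):

```python
# Back-of-envelope check of the figures quoted above (assumptions, not measurements).
CONCURRENT_SESSIONS = 100_000      # peak concurrent matches
RECORD_RATE_KB_S = 20              # replay data recorded per session, KB/s
SESSIONS_PER_DAY = 100_000         # retained sessions per day
REPLAY_SIZE_MB = 50                # per 30-minute match
RETENTION_DAYS = 90

ingest_gb_s = CONCURRENT_SESSIONS * RECORD_RATE_KB_S / 1_000_000                 # KB/s -> GB/s
raw_storage_tb = SESSIONS_PER_DAY * REPLAY_SIZE_MB * RETENTION_DAYS / 1_000_000  # MB -> TB

print(f"peak ingestion: {ingest_gb_s:.1f} GB/s")                         # ~2.0 GB/s
print(f"raw 90-day storage: {raw_storage_tb:.0f} TB")                    # ~450 TB
print(f"after ~2x input-log compression: {raw_storage_tb / 2:.0f} TB")   # ~225 TB before deduplication
```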

High-Level Architecture

Replay recording is built directly into the game server: the recording module is a passive observer that writes every input packet received and every authoritative game event (spawn, death, objective capture) to a replay buffer. The replay consists of: (1) an initial state snapshot (the full game world state at tick 0), (2) a tick-indexed input log (all player inputs at each tick), and (3) a keyframe log (full state snapshots every 5 seconds, enabling near-constant-time scrubbing to arbitrary timestamps without replaying from the beginning). The recording module writes into an in-process ring buffer (zero-copy from the game loop's input queues) and flushes it every 5 seconds using background I/O threads, with the upload to S3 also happening asynchronously — keeping overhead on the game loop's critical path effectively zero.
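
A minimal sketch of that replay structure; the Python dataclasses and field names are illustrative, not the actual on-disk format:

```python
from dataclasses import dataclass, field

@dataclass
class TickInputs:
    tick: int
    inputs: dict[int, bytes]          # player_id -> raw input packet for this tick

@dataclass
class Keyframe:
    tick: int
    world_state: bytes                # full serialized game state, written every ~5 s

@dataclass
class ReplayFile:
    session_id: str
    tick_rate: int                    # simulation ticks per second
    initial_snapshot: bytes           # full world state at tick 0
    input_log: list[TickInputs] = field(default_factory=list)
    keyframes: list[Keyframe] = field(default_factory=list)
    events: list[dict] = field(default_factory=list)   # spawn/death/objective events

    def nearest_keyframe(self, tick: int) -> Keyframe | None:
        """Latest keyframe at or before `tick`, used as a scrub starting point."""
        candidates = [k for k in self.keyframes if k.tick <= tick]
        return max(candidates, key=lambda k: k.tick) if candidates else None
```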

Replay storage is tiered: hot replays (last 7 days, high-demand shared replays) are stored in S3 Standard. Cold replays (7-90 days, infrequently accessed) migrate to S3 Infrequent Access. Replays older than 90 days are deleted unless the player has bookmarked them, in which case they move to S3 Glacier. A metadata database (PostgreSQL) stores the replay index: session_id, player_ids, duration, game_mode, file_size, status (public/private), share_token.
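
Much of this tiering can be expressed as an S3 lifecycle policy. The sketch below uses boto3; the bucket name, prefixes, and the convention of copying bookmarked replays under a separate prefix are assumptions — the delete-unless-bookmarked decision itself stays in the application:

```python
import boto3

s3 = boto3.client("s3")

# Hypothetical layout: live replays under replays/, bookmarked copies under bookmarked/.
s3.put_bucket_lifecycle_configuration(
    Bucket="game-replays",                            # assumed bucket name
    LifecycleConfiguration={
        "Rules": [
            {
                "ID": "replays-tiering",
                "Filter": {"Prefix": "replays/"},
                "Status": "Enabled",
                "Transitions": [{"Days": 7, "StorageClass": "STANDARD_IA"}],
                "Expiration": {"Days": 90},           # non-bookmarked replays expire
            },
            {
                "ID": "bookmarked-to-glacier",
                "Filter": {"Prefix": "bookmarked/"},
                "Status": "Enabled",
                "Transitions": [{"Days": 90, "StorageClass": "GLACIER"}],
            },
        ]
    },
)
```

Keeping bookmarked copies under their own prefix keeps the expiration rule simple; the application copies a replay there when a player bookmarks it.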

Replay playback is simulation-based server-side, not client-side video streaming. A dedicated replay server fleet (separate from live game servers) receives playback requests, downloads the replay file from S3 (or from local cache for hot replays), loads the initial state snapshot, and begins ticking through the input log, advancing the simulation. The replay server streams the resulting state delta packets to the viewer's client using the same game state sync protocol as live gameplay. This approach has two major advantages: the replay file is tiny (inputs only, not video) and the viewer can switch between player perspectives freely (since the full world state is computed server-side).

Core Components

Replay Recording Module

The recording module is integrated into the game server as a passive component. It taps into two data streams: the input demultiplexer (which receives all player input packets before they enter the game loop) and the game event bus (which carries authoritative events like player spawn/death). These are written to an in-memory circular buffer (256 MB, roughly two minutes of data at ~2 MB/second). A background thread reads from the buffer, writes compressed chunks to a local NVMe SSD (fast local staging), and asynchronously uploads them to S3. The local NVMe copy provides fast access for the first ~10 minutes after the match (used for post-match highlight generation without waiting for the S3 upload). Keyframes (full state snapshots) are written every 5 seconds by the recording module, enabling random access for scrubbing. The module uses LZ4 compression (fast, lower ratio) for real-time recording to minimize CPU overhead, then recompresses to Zstandard (higher ratio) asynchronously for S3 storage.
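
A sketch of that recording path, assuming the python-lz4 package (lz4.frame) for the real-time compression step; the queue-backed buffer, sizes, and flush interval are illustrative stand-ins for the in-process ring buffer:

```python
import queue
import threading
import time
import lz4.frame   # assumed: python-lz4 package

class ReplayRecorder:
    """Passive recorder: the game loop only enqueues; compression and I/O happen off-thread."""

    def __init__(self, staging_path: str, flush_interval: float = 5.0):
        self._buffer: queue.Queue = queue.Queue(maxsize=100_000)  # stand-in for the ring buffer
        self._staging = open(staging_path, "ab")                  # local NVMe staging file
        self._flush_interval = flush_interval
        self._writer = threading.Thread(target=self._drain_loop, daemon=True)
        self._writer.start()

    def record(self, packet: bytes) -> None:
        """Called from the game loop's input/event path: must be cheap and non-blocking."""
        try:
            self._buffer.put_nowait(packet)
        except queue.Full:
            pass   # in production: count drops and apply backpressure, never stall the game loop

    def _drain_loop(self) -> None:
        while True:
            time.sleep(self._flush_interval)
            chunk = bytearray()
            while not self._buffer.empty():
                chunk += self._buffer.get_nowait()
            if chunk:
                # Fast LZ4 compression on the hot path; re-compressed with Zstandard later.
                self._staging.write(lz4.frame.compress(bytes(chunk)))
                self._staging.flush()
                # S3 multipart upload of the staged chunks happens in a separate async step.
```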

Replay Playback Server

Replay playback servers are identical to game servers in terms of game binary and simulation capability, but run in "replay mode" — no live player connections; the simulation is driven by the recorded input log instead of live inputs. On a playback request, the server downloads the replay file from S3 to local SSD, loads the initial state snapshot into memory, and begins the simulation loop. Playback at 2× speed runs the tick loop at twice the normal tick rate; scrubbing to timestamp T locates the nearest keyframe before T and fast-forwards from there at 10× speed — with keyframes every 5 seconds, this typically completes in under 2 seconds. The playback server broadcasts state deltas to the requesting client using the standard game sync protocol, and the client's game engine renders the simulation output just as it would a live game.
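
A sketch of the scrub-to-timestamp path, reusing the illustrative ReplayFile structure from the architecture section; Simulation is a stand-in for the real deterministic game simulation:

```python
class Simulation:
    """Stand-in for the deterministic game simulation used by the playback server."""

    def __init__(self, world_state: bytes):
        self.state = world_state
        self.tick = 0

    def apply_tick(self, tick_inputs) -> None:
        # Advance the deterministic simulation by one tick using the recorded inputs.
        self.tick += 1


def seek(replay: "ReplayFile", target_tick: int) -> Simulation:
    """Scrub: restore the nearest prior keyframe, then fast-forward to the target tick."""
    keyframe = replay.nearest_keyframe(target_tick)
    if keyframe is not None:
        sim = Simulation(keyframe.world_state)
        sim.tick = keyframe.tick
    else:
        sim = Simulation(replay.initial_snapshot)   # no keyframe yet: start from tick 0

    # Fast-forward: run ticks as fast as the CPU allows (no real-time pacing, no broadcasting).
    inputs_by_tick = {t.tick: t for t in replay.input_log}
    for tick in range(sim.tick + 1, target_tick + 1):
        sim.apply_tick(inputs_by_tick.get(tick))
    return sim
```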

Highlight Clip Service

Highlight clips are short segments of a replay (10-60 seconds). They come from two sources: manual (a player bookmarks a timestamp range) and automatic (the system detects highlight-worthy moments — multi-kill streaks, game-winning shots — from the game event log). For manual clips, the clip service trims the relevant replay segment (from the nearest prior keyframe to the end of the clip), compresses it into a shareable replay fragment, and stores it in S3 with a public share URL. For video exports (for social media sharing), a separate video rendering service runs the replay playback at 1× speed with a headless game renderer and encodes the output as an MP4. Video rendering is asynchronous (2-5 minutes for a 30-second clip); the result is stored in S3 and linked from the clip page.
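
A sketch of manual clip extraction against the same illustrative ReplayFile structure; the fragment keeps everything from the nearest prior keyframe, so a playback server can seek straight to the clip start:

```python
def extract_clip(replay: "ReplayFile", start_tick: int, end_tick: int) -> "ReplayFile":
    """Trim a shareable replay fragment covering [start_tick, end_tick]."""
    keyframe = replay.nearest_keyframe(start_tick)
    base_tick = keyframe.tick if keyframe else 0

    return ReplayFile(
        session_id=replay.session_id,
        tick_rate=replay.tick_rate,
        # The fragment's "initial snapshot" is the keyframe just before the clip.
        initial_snapshot=keyframe.world_state if keyframe else replay.initial_snapshot,
        input_log=[t for t in replay.input_log if base_tick < t.tick <= end_tick],
        keyframes=[k for k in replay.keyframes if base_tick <= k.tick <= end_tick],
        events=[e for e in replay.events if base_tick <= e.get("tick", 0) <= end_tick],
    )
```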

Database Design

PostgreSQL tables:

  • replays (replay_id, session_id, game_mode, region, recorded_at, duration_seconds, file_size_bytes, s3_key, status, keyframe_index_s3_key)
  • replay_access (replay_id, player_id, access_type[recorded/shared/spectated], last_accessed_at)
  • highlights (highlight_id, replay_id, player_id, start_tick, end_tick, auto_generated, share_token, view_count, created_at)
  • replay_metadata (replay_id, player_ids[], team_data_json, final_score_json)

Redis keys:

  • replay:hot:{replay_id} — cached S3 key and metadata for hot replays
  • playback:session:{session_id} — active playback server assignment for a viewer session
  • replay:viewcount:{replay_id} — real-time view counter, synced to the database hourly

S3 objects: raw replay files, keyframe index files, highlight clip fragments, rendered MP4 exports.
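
A short sketch of how the Redis keys above might be used with the redis-py client; the host name and TTLs are illustrative assumptions:

```python
import json
import redis   # assumed: redis-py client

r = redis.Redis(host="replay-cache", decode_responses=True)

def cache_hot_replay(replay_id: str, s3_key: str, metadata: dict) -> None:
    # replay:hot:{replay_id} — metadata + S3 key for frequently watched replays.
    r.set(f"replay:hot:{replay_id}", json.dumps({"s3_key": s3_key, **metadata}), ex=7 * 24 * 3600)

def assign_playback_server(session_id: str, server_endpoint: str) -> None:
    # playback:session:{session_id} — which playback server is serving this viewer.
    r.set(f"playback:session:{session_id}", server_endpoint, ex=4 * 3600)

def record_view(replay_id: str) -> int:
    # replay:viewcount:{replay_id} — incremented per view, flushed to PostgreSQL hourly.
    return r.incr(f"replay:viewcount:{replay_id}")
```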

API Design

  • GET /replays/{replay_id} — returns replay metadata, player list, duration, share status; includes signed S3 URL for direct client download (for offline analysis tools)
  • POST /replays/{replay_id}/play — allocates a replay playback server, returns {playback_session_id, server_endpoint} for the WebSocket connection (see the sketch after this list)
  • POST /replays/{replay_id}/highlights — body: {start_tick, end_tick, title}, creates highlight clip; async, returns highlight_id
  • GET /highlights/{share_token} — public endpoint for shared highlight pages; returns highlight metadata and video URL if rendered
  • GET /sessions/{session_id}/live-spectate — returns spectator WebSocket endpoint for an ongoing session; live stream with 5-second delay
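
A minimal sketch of the playback-allocation endpoint referenced above, using FastAPI as an assumed framework; the server pool and allocation policy are placeholders:

```python
import uuid
from fastapi import FastAPI, HTTPException

app = FastAPI()

# Placeholder pool of replay playback servers with spare capacity.
AVAILABLE_PLAYBACK_SERVERS = ["replay-worker-1.internal:7000", "replay-worker-2.internal:7000"]

@app.post("/replays/{replay_id}/play")
def start_playback(replay_id: str) -> dict:
    """Allocate a playback server for this replay and hand the client a WebSocket endpoint."""
    if not AVAILABLE_PLAYBACK_SERVERS:
        raise HTTPException(status_code=503, detail="no playback capacity available")

    server = AVAILABLE_PLAYBACK_SERVERS[0]          # real allocator: least-loaded / replay-affinity
    playback_session_id = str(uuid.uuid4())
    # In the real system: record the assignment in Redis (playback:session:{id}) and
    # tell the chosen server to pre-fetch the replay file from S3 or its local cache.
    return {
        "playback_session_id": playback_session_id,
        "server_endpoint": f"wss://{server}/playback/{playback_session_id}",
    }
```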

Scaling & Bottlenecks

Replay ingestion at 2 GB/second aggregate is handled by S3's virtually unlimited ingestion capacity, and per-session upload bandwidth (~20 KB/second) is negligible from the game server's point of view. S3 multi-part upload (5 MB parts, accumulating roughly every 250 seconds at that rate) is appropriate for replay file sizes. Hot replay cache: with 10% of replays getting 90% of views, caching the top 10k replay files (S3 plus local SSD on playback servers) handles most playback requests without S3 fetches.

Replay playback server scaling: 500k simultaneous viewers each require a dedicated playback server process (same resource cost as a live game session). A pool of 5,000 replay server instances (each handling 100 viewers on different replays) is required at peak, auto-scaling based on active playback session count. Video rendering (for highlight exports) uses a separate GPU fleet (one render job per CPU or GPU core at 1× speed) — queue-based with expected render time displayed to users.

Key Trade-offs

  • Input-based replay vs. video recording: Input-based replay files are tiny (50 MB for 30 minutes) and allow perspective switching, but require the game simulation to be deterministic (the same inputs must always produce the same output) — hard to guarantee with floating-point physics. Video recording is always accurate but weighs in at roughly 2 GB per 30 minutes and locks the perspective.
  • Server-side playback vs. client-side: Server-side playback re-uses the game simulation on a dedicated server and streams state to the viewer, providing perspective freedom and anti-cheat analysis capability; client-side playback (sending the replay file to the client to simulate locally) saves server resources but exposes the full game state (including hidden information) to the client.
  • 5-second live spectator delay vs. zero delay: A 5-second delay for spectators prevents real-time scouting (spectators relaying live enemy positions to playing teammates) while still providing an exciting spectator experience; competitive events may require longer delays (30-60 seconds).
  • Keyframe density vs. storage overhead: Keyframes every 5 seconds allow fast scrubbing but add 10-20% overhead to replay file size; keyframes every 30 seconds reduce storage cost but make scrubbing slower (potentially 25-second fast-forward to reach the target timestamp).
