
System Design: Google Maps

Learn how to design Google Maps — covering map tile serving, routing algorithms, real-time traffic, and search at planetary scale. A must-read for senior engineering interviews.

20 min read · Updated Jan 15, 2025
Tags: system-design · google-maps · routing · geospatial · tile-serving · real-time-traffic

Requirements

Functional Requirements:

  • Users can search for places, addresses, and businesses by name or category
  • System renders interactive map tiles at any zoom level worldwide
  • Turn-by-turn navigation with real-time traffic rerouting
  • Users can report incidents (accidents, road closures, speed traps)
  • ETA calculations account for historical and live traffic data
  • Offline maps downloadable for specific regions

Non-Functional Requirements:

  • Map tiles served in under 100ms globally via CDN edge nodes
  • Routing computation returns results in under 2 seconds for any global pair
  • 99.99% availability — navigation failures are safety-critical
  • Support 1 billion active users; 100 million concurrent navigation sessions
  • Real-time traffic updates propagated to all clients within 60 seconds

Scale Estimation

Google Maps serves ~1 billion users/month. At peak, ~100 million users are navigating concurrently. Each navigation session emits a location ping every 3 seconds (~33 million pings/second system-wide). Map tiles: at zoom 21 the entire planet requires ~4.5 trillion tiles; popular tiles (downtown cores, highways) are cached at edge nodes. Tile cache hit rate exceeds 99% for the top 0.1% of tiles that account for 80% of traffic. The tile storage footprint is ~20 PB; routing graph for the whole road network compresses to ~500 GB using contraction hierarchies.
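
These figures can be sanity-checked with quick arithmetic. The sketch below assumes an illustrative ~100-byte probe payload, which is not a figure from the article:

```python
# Back-of-envelope check of the scale numbers above.
# The per-probe payload size is an assumption for illustration, not a measured figure.

concurrent_sessions = 100_000_000            # peak concurrent navigation sessions
ping_interval_s = 3                          # one location ping per session every 3 s
pings_per_second = concurrent_sessions / ping_interval_s
print(f"location pings/s: {pings_per_second:,.0f}")     # ~33 million/s

zoom = 21
tiles_at_zoom = 4 ** zoom                    # each zoom level quadruples the tile count
print(f"tiles at zoom {zoom}: {tiles_at_zoom:,}")        # ~4.4 trillion

probe_bytes = 100                            # assumed payload per probe
ingest_gbps = pings_per_second * probe_bytes * 8 / 1e9
print(f"probe ingest: ~{ingest_gbps:,.1f} Gbit/s")
```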

High-Level Architecture

Google Maps decomposes into four major subsystems: Tile Pipeline (map rendering), Search & Geocoding, Routing Engine, and Real-Time Traffic. These are served through a unified API gateway to mobile and web clients.

The Tile Pipeline ingests raw geospatial data (OpenStreetMap contributions, satellite imagery, Street View imagery, and licensed data) through a batch processing pipeline built on Flume/Dataflow. Vector data is rendered into raster and vector tiles at each zoom level (0–21) and stored in a distributed object store (Colossus). Tiles are pushed to a global CDN (Google's own edge network, 200+ PoPs) so that cache-hit requests are served in under 20ms. Dynamic tiles (traffic overlays, business hours) bypass the static cache and are composed at a tile composition layer.
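
Tile addressing follows the standard Web Mercator ("slippy map") scheme, where a latitude/longitude pair maps to a (zoom, x, y) tile. A minimal sketch of that conversion (the helper name is ours):

```python
import math

def latlon_to_tile(lat_deg: float, lon_deg: float, zoom: int) -> tuple[int, int]:
    """Map a WGS84 coordinate to its Web Mercator tile (x, y) at the given zoom level."""
    lat_rad = math.radians(lat_deg)
    n = 2 ** zoom                                    # tiles per axis at this zoom
    x = int((lon_deg + 180.0) / 360.0 * n)
    y = int((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    return x, y

# Times Square at street-level zoom:
print(latlon_to_tile(40.758, -73.9855, 21))
```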

The Routing Engine maintains a compressed road graph updated daily from authoritative map data plus near-real-time edits. Contraction hierarchies preprocess the graph so that any A→B shortest path query over 1 billion nodes runs in milliseconds. A separate Traffic Model layer blends historical speed profiles (by hour-of-week) with live probe data (anonymous GPS pings from Android devices) to produce predicted travel times per road segment.
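
One way to picture the historical side of this model: each road segment carries a 168-slot hour-of-week speed profile that the router turns into a time-dependent edge weight. A sketch under that assumption (the field names and numbers are illustrative, not Google's schema):

```python
from datetime import datetime

# Hypothetical per-segment record: length plus a 168-entry hour-of-week speed profile.
segment = {
    "length_m": 850,
    "speed_profile_kmh": [48.0] * 168,     # e.g. learned from historical probe data
}
segment["speed_profile_kmh"][8] = 22.0     # Monday 08:00 rush hour is much slower

def predicted_travel_time_s(seg: dict, when: datetime) -> float:
    """Edge weight = segment length divided by the historical speed for that hour-of-week."""
    slot = when.weekday() * 24 + when.hour             # 0..167
    speed_ms = seg["speed_profile_kmh"][slot] / 3.6
    return seg["length_m"] / speed_ms

print(predicted_travel_time_s(segment, datetime(2025, 1, 13, 8, 30)))  # a Monday morning
```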

Core Components

Tile Serving Infrastructure

Tiles are identified by (zoom, x, y) tuples and stored in Bigtable with a space-filling curve (Hilbert curve) key ordering for spatial locality. Hot tiles live in an in-memory Memcached layer. CDN edge nodes cache tiles with a 7-day TTL for static base tiles and a 60-second TTL for traffic-overlay tiles. When a client pans or zooms, the app pre-fetches a ring of adjacent tiles (typically 3×3 minus center = 8 tiles) to mask loading latency.
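
Computing the prefetch ring is straightforward given the tile the viewport is centered on; a minimal sketch (the function name is ours), wrapping x at the antimeridian and clamping y at the poles:

```python
def prefetch_ring(x: int, y: int, zoom: int) -> list[tuple[int, int, int]]:
    """Return the up-to-8 tiles surrounding (zoom, x, y)."""
    n = 2 ** zoom
    ring = []
    for dx in (-1, 0, 1):
        for dy in (-1, 0, 1):
            if dx == 0 and dy == 0:
                continue                   # center tile is already on screen
            tx = (x + dx) % n              # x wraps around the antimeridian
            ty = y + dy
            if 0 <= ty < n:                # y does not wrap past the poles
                ring.append((zoom, tx, ty))
    return ring

print(prefetch_ring(617, 755, 11))         # 8 neighbors to warm the client cache
```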

Routing Engine

The road network is modeled as a directed weighted graph: nodes are road intersections, edges are road segments with attributes (speed limit, road class, turn restrictions, toll cost). Contraction hierarchies replace a plain Dijkstra search, which costs O((V + E) log V) with a binary heap and is far too slow on a continent-scale graph, with a bidirectional search over the preprocessed hierarchy, bringing typical city-to-city queries down to sub-millisecond latency. The engine considers multiple objectives: fastest, shortest, and eco-routing (minimizing fuel consumption). For multi-modal routing (drive + transit), a combined graph merges road segments with GTFS transit schedules.
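
The query phase of contraction hierarchies can be sketched as two "upward-only" Dijkstra searches that meet in the middle. The toy graph below stands in for a preprocessed hierarchy (node ranks and edge lists are hand-picked for illustration, not real map data):

```python
import heapq

# Toy contracted graph. Ranks come from preprocessing; the "upward" edge lists below
# were built so every edge points to a higher-ranked node (shortcut edges would
# already be folded in). All data here is illustrative.
rank = {"A": 1, "C": 2, "E": 3, "B": 4, "D": 5}
up_fwd = {            # forward edges leading to higher-ranked nodes
    "A": [("B", 5)],
    "C": [("B", 2), ("D", 7)],
    "E": [("D", 4), ("B", 6)],
    "B": [("D", 3)],
    "D": [],
}
up_bwd = {            # reverse-graph edges leading to higher-ranked nodes
    "A": [("B", 4)],
    "C": [("D", 1)],
    "E": [("B", 2), ("D", 9)],
    "B": [("D", 2)],
    "D": [],
}

def upward_dijkstra(graph: dict, source: str) -> dict:
    """Plain Dijkstra restricted to the upward edge lists."""
    dist, heap = {source: 0}, [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue
        for v, w in graph[u]:
            if d + w < dist.get(v, float("inf")):
                dist[v] = d + w
                heapq.heappush(heap, (d + w, v))
    return dist

def ch_query(s: str, t: str) -> float:
    """Bidirectional upward search; the cheapest meeting node gives the shortest distance."""
    fwd, bwd = upward_dijkstra(up_fwd, s), upward_dijkstra(up_bwd, t)
    return min((fwd[v] + bwd[v] for v in fwd.keys() & bwd.keys()), default=float("inf"))

print(ch_query("A", "C"))   # meets at D: 5 + 3 (A->B->D) plus 1 (D->C) = 9
```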

Real-Time Traffic System

Android devices with location consent passively send anonymized probe data (GPS coordinates, speed, heading) to a Traffic Ingestion Service at ~30 million pings/second. A stream processing layer (Apache Beam on Dataflow) aggregates pings by road segment using map-matching (snapping GPS traces to the road graph) and computes observed speeds. These are fused with historical speed models using a Kalman filter, producing a current speed estimate per segment updated every 2 minutes. Incident reports from users trigger manual speed overrides for affected segments.
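
The fusion step can be pictured as a one-dimensional Kalman update per segment: the historical profile supplies the prior, the latest window of map-matched probes supplies the observation, and each is weighted by the inverse of its variance. A sketch with made-up variances:

```python
def fuse_speed(prior_kmh: float, prior_var: float,
               observed_kmh: float, observed_var: float) -> tuple[float, float]:
    """One scalar Kalman update: blend the historical prior with the latest probe
    observation, trusting whichever has the lower variance more."""
    gain = prior_var / (prior_var + observed_var)
    fused = prior_kmh + gain * (observed_kmh - prior_kmh)
    fused_var = (1 - gain) * prior_var
    return fused, fused_var

# Historical model says ~55 km/h at this hour, but live probes (many samples, low
# variance) report ~20 km/h -- the estimate moves strongly toward the live observation.
speed, var = fuse_speed(prior_kmh=55.0, prior_var=100.0,
                        observed_kmh=20.0, observed_var=16.0)
print(round(speed, 1), round(var, 1))   # ~24.8 km/h
```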

Database Design

The core road graph is stored in Spanner for transactional updates from the map editing pipeline, then exported to a read-optimized binary format (custom compressed adjacency lists) loaded into Routing Service memory. Place data (businesses, landmarks) lives in a Spanner table with full-text search backed by a custom inverted index. User data (saved places, history, preferences) is sharded across Spanner by user_id. Tile metadata in Bigtable uses row key = (zoom_level)(hilbert_index) for range scans that match client viewport requests.
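
A sketch of how such a row key could be built, using the standard Hilbert curve xy-to-index conversion; the exact key layout (zero-padded zoom followed by the index) is our illustration, not Google's actual encoding:

```python
def hilbert_index(order: int, x: int, y: int) -> int:
    """Map (x, y) tile coordinates to a position along a Hilbert curve covering a
    2**order by 2**order grid (order = zoom level)."""
    n = 1 << order
    d, s = 0, n >> 1
    while s > 0:
        rx = 1 if x & s else 0
        ry = 1 if y & s else 0
        d += s * s * ((3 * rx) ^ ry)
        if ry == 0:                        # rotate the quadrant so locality is preserved
            if rx == 1:
                x, y = n - 1 - x, n - 1 - y
            x, y = y, x
        s >>= 1
    return d

def tile_row_key(zoom: int, x: int, y: int) -> str:
    """Row key = zero-padded zoom level + zero-padded Hilbert index, so tiles that are
    near each other on the map sort into nearby rows and viewport reads become range scans."""
    return f"{zoom:02d}{hilbert_index(zoom, x, y):013d}"

print(tile_row_key(11, 617, 755))
```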

API Design

  • GET /maps/api/directions/json?origin=…&destination=…&mode=driving — Returns step-by-step route, polyline, total distance, and ETA with traffic
  • GET /vt/pb?x={x}&y={y}&z={z} — Tile server endpoint returns protocol-buffer vector tile; cached aggressively at CDN
  • GET /maps/api/place/textsearch/json?query=… — Searches for places by name/category, returns list with coordinates and metadata
  • POST /maps/api/traffic/report — Accepts user-submitted incident reports with type, coordinates, and optional photo
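
A hypothetical client call against the directions endpoint above; the host, any parameters beyond those listed, and the response shape are simplified assumptions for illustration:

```python
import requests  # third-party HTTP client (pip install requests)

# Illustrative call only: maps.example.com is a stand-in host, and the JSON fields
# read below are an assumed response shape, not a documented contract.
resp = requests.get(
    "https://maps.example.com/maps/api/directions/json",
    params={
        "origin": "47.6062,-122.3321",        # Seattle
        "destination": "45.5152,-122.6784",   # Portland
        "mode": "driving",
    },
    timeout=5,
)
resp.raise_for_status()
route = resp.json()["routes"][0]              # assumed top-level shape
print(route["summary"], route["legs"][0]["duration_in_traffic"])
```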

Scaling & Bottlenecks

Tile serving scales through CDN edge caching — the origin fleet only handles cache misses and dynamic tile composition. Since >99% of base tile traffic is served from edge, origin scale is modest (~10,000 servers for the long tail). Dynamic traffic tiles are the scaling challenge: they must be composed fresh every 60 seconds for millions of viewport combinations. Google solves this by pre-rendering traffic tiles for the top-K most-viewed viewports and computing others on demand with aggressive negative caching.
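
One way to picture that dynamic-tile path: a cache-aside lookup with a short TTL, where "nothing to draw here" results are cached too, so repeated misses for empty viewports never reach the composer. A sketch with illustrative TTLs and a stand-in composition function:

```python
import time

TRAFFIC_TTL_S = 60        # dynamic traffic tiles go stale quickly
_cache: dict[tuple[int, int, int], tuple[float, bytes | None]] = {}

def compose_traffic_tile(zoom: int, x: int, y: int) -> bytes | None:
    """Stand-in for the expensive composition call; returns None when the tile has
    no traffic data worth overlaying (e.g. open ocean)."""
    ...

def get_traffic_tile(zoom: int, x: int, y: int) -> bytes | None:
    key = (zoom, x, y)
    entry = _cache.get(key)
    if entry and time.time() - entry[0] < TRAFFIC_TTL_S:
        return entry[1]                      # fresh hit, including cached empty results
    tile = compose_traffic_tile(zoom, x, y)  # miss: compose on demand
    _cache[key] = (time.time(), tile)        # None is cached too -- negative caching
    return tile
```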

Routing at scale uses a hierarchical approach: long-distance queries run their middle leg on a highway-only graph (contracted to ~50 million edges from the full ~8-billion-edge graph), with local first- and last-mile segments appended at each end. This reduces query time from seconds to under 500ms. The routing fleet is CPU-bound; Google runs ~50,000 dedicated routing servers globally, each holding a contracted continent shard in RAM (~500 GB compressed per shard).
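
A toy illustration of that decomposition, with stand-in searches for each tier (in the real system each stand-in would be a graph search on its own tier):

```python
def local_search(point: str) -> tuple[list[str], str]:
    """Stand-in for a full local-graph search from a point to its nearest highway ramp."""
    ramp = f"ramp_near_{point}"
    return [point, ramp], ramp

def highway_search(entry: str, exit_: str) -> list[str]:
    """Stand-in for a search on the contracted highway-only graph (~50M edges, not ~8B)."""
    return [entry, "I-5", exit_]

def plan_long_route(origin: str, destination: str) -> list[str]:
    first_mile, entry = local_search(origin)       # local graph: origin -> nearest ramp
    _, exit_ = local_search(destination)           # local graph: destination -> nearest ramp
    backbone = highway_search(entry, exit_)        # highway graph: ramp -> ramp
    return first_mile + backbone[1:] + [destination]

print(plan_long_route("seattle_pike_st", "portland_burnside"))
```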

Key Trade-offs

  • Raster vs. vector tiles — vector tiles are smaller and scale well on retina displays but require client-side rendering; Google serves both, defaulting to vector on capable devices
  • Precomputed routes vs. on-demand — contraction hierarchies require expensive preprocessing (hours for a continent) but deliver millisecond query times; this trade-off favors precomputation given query volume
  • Privacy vs. traffic accuracy — more probe data = better traffic; Google anonymizes traces with differential privacy noise, accepting slight accuracy degradation
  • Freshness vs. CDN cost — longer tile TTLs save origin cost but delay map updates (new roads, business changes); Google uses cache invalidation for specific tiles when map edits are published
