
Designing a Maps & Navigation System at Scale: Google Maps Architecture, Routing & Live Traffic

Google Maps serves over 1 billion users monthly, answering routing queries in milliseconds, rendering interactive maps with billions of tiles, and incorporating live traffic from millions of GPS-traced devices. This guide covers the complete system: from how map tiles are generated and cached, to routing graph algorithms, ETA prediction with ML, and geocoding at global scale.

Md Sanwar Hossain April 6, 2026 20 min read System Design

TL;DR — Core Architecture Decisions

A maps platform needs: (1) a tile rendering pipeline that pre-generates billions of PNG/WebP map tiles at zoom levels 0–21 and serves them from a global CDN, (2) a road graph (OpenStreetMap-scale: 1B+ nodes) pre-processed with CH (Contraction Hierarchies) for sub-millisecond routing, (3) live traffic ingestion from GPS probes via Kafka into a stream processor that updates edge weights in real time, (4) an ETA model combining graph travel time + ML corrections for traffic patterns, and (5) a geocoding service that resolves place names ↔ coordinates using a multi-layer index (exact → prefix → fuzzy).

Table of Contents

  1. System Architecture Overview
  2. Map Data Ingestion & Processing Pipeline
  3. Tile Rendering & CDN Delivery
  4. Road Graph & Routing Algorithms
  5. Live Traffic Ingestion & Graph Updates
  6. ETA Prediction: From Dijkstra to ML
  7. Geocoding & Reverse Geocoding
  8. Places Search & POI Data
  9. Turn-by-Turn Navigation
  10. Scale, Fault Tolerance & Cost
  11. Capacity Estimation & Conclusion

1. System Architecture Overview

A maps platform consists of several distinct subsystems that share geographic data but serve fundamentally different user needs:

Complete maps & navigation platform architecture — from raw geo data to real-time turn-by-turn guidance. Source: mdsanwarhossain.me

2. Map Data Ingestion & Processing Pipeline

The foundation of any maps platform is geographic data. Google Maps combines dozens of data sources into a unified, authoritative representation of the physical world.

Data Sources

Data Processing Pipeline

All data sources feed into a batch processing pipeline (Google uses MapReduce/Dataflow) that: normalizes formats → detects conflicts between sources → applies authority ranking (official data beats crowdsourced) → validates topology (roads must connect, no orphan nodes) → generates the canonical graph and place database. This pipeline runs continuously; changes from Street View ML detectors or user edits trigger incremental re-processing of affected regions.
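One of the pipeline's validation steps, flagging orphan nodes, can be sketched in a few lines (function name and data shapes are illustrative, not Google's internal code):

```python
def orphan_nodes(nodes, edges):
    """Topology validation step from the pipeline: return nodes that no
    road segment touches. Orphans must be fixed or dropped before the
    canonical graph is published."""
    connected = {n for a, b in edges for n in (a, b)}
    return set(nodes) - connected

# Node "c" has no incident road segment, so it gets flagged:
print(orphan_nodes({"a", "b", "c"}, [("a", "b")]))  # {'c'}
```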

3. Tile Rendering & CDN Delivery

The classic map tile system divides the world into a grid of square images at each zoom level. At zoom level 0, the entire world fits in a single 256×256 pixel tile. Each zoom level quadruples the number of tiles. At zoom level 21 (street level), there are 4.4 trillion tiles.
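The zoom/x/y tile scheme above follows the standard Web Mercator convention; the coordinate math is the well-known "slippy map" formula (this sketch is not Google's internal code):

```python
import math

def tiles_at_zoom(z):
    """Each zoom level quadruples the tile count: 4**z tiles total."""
    return 4 ** z

def latlon_to_tile(lat, lon, z):
    """Web Mercator lat/lon -> (x, y) tile coordinates at zoom z."""
    n = 2 ** z
    x = int((lon + 180.0) / 360.0 * n)
    y = int((1.0 - math.asinh(math.tan(math.radians(lat))) / math.pi) / 2.0 * n)
    return x, y

print(tiles_at_zoom(21))                        # ~4.4 trillion tiles at street level
print(latlon_to_tile(37.4224, -122.0849, 12))   # tile containing Mountain View
```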

Tile Generation Pipeline

  1. Style definition: Cartographers define styling rules: which features appear at which zoom level, colors, fonts. Google Maps uses a proprietary style language; open-source systems use MapboxGL styles.
  2. Batch rendering: A MapReduce job iterates over all (zoom, x, y) tile coordinates for the changed region, renders each tile using the geographic data + style, outputs PNG/WebP images.
  3. Tile storage: Tiles are stored in a hierarchical key-value store keyed by zoom/x/y. At zoom >15, only populated tiles are stored — empty ocean tiles are skipped, cutting storage by 60%+.
  4. CDN push: Changed tiles are pushed to the CDN. With 200+ PoPs, a map tile for any city is cached <50ms from most users.

Vector Tiles: The Modern Approach

Google Maps shifted to vector tiles on mobile in 2013. Instead of sending pre-rendered images, the server sends raw geographic data (road outlines, building footprints) as binary-encoded vectors. The client renders them locally using WebGL (MapboxGL) or native GL. Benefits: 10× smaller payload, smooth rotation/tilt, runtime style changes, and 3D extrusion of buildings. The tile data is still served from CDN; rendering computation shifts to the client GPU.

Road graph routing with Contraction Hierarchies, live traffic edge-weight updates, and ML-based ETA correction. Source: mdsanwarhossain.me

4. Road Graph & Routing Algorithms

Routing is a shortest-path problem on a directed weighted graph where nodes are intersections and edges are road segments. The challenge: the global road network has 1 billion+ nodes — naive Dijkstra runs in O((V + E) log V) and cannot return results fast enough for global queries.

Contraction Hierarchies (CH)

Contraction Hierarchies is the industry-standard technique for fast routing on large road networks. It precomputes a hierarchical overlay graph:

  1. Node ordering: Assign each node an "importance" rank (based on degree, edge density, hierarchy in the road network). Highways and major roads get higher ranks.
  2. Contraction: Remove low-rank nodes one by one, adding "shortcut" edges that preserve the shortest path through the removed node. The graph gains shortcut edges but becomes much sparser at higher ranks.
  3. Bidirectional Dijkstra: At query time, run Dijkstra forward from the source and backward from the destination, relaxing only edges to higher-rank nodes. The two searches meet at a "meeting node," yielding the shortest path in milliseconds — even across continents.
Algorithm                 | Preprocessing                         | Query Time        | Traffic Updates
Dijkstra (basic)          | None                                  | Seconds (global)  | Trivial
A* with heuristic         | None                                  | 100ms–2s (global) | Trivial
Contraction Hierarchies   | Hours (one-time)                      | <10ms (global)    | Complex (full re-weight)
CH + Customizable (CCH)   | Hours (structure), seconds (weights)  | <10ms (global)    | Fast (re-customize weights)
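The query procedure from step 3 can be sketched on a toy graph. This is a minimal illustration: ranks are hand-assigned and the example graph needs no shortcut edges; a real implementation would add them during preprocessing (step 2).

```python
import heapq

def ch_query(fwd, bwd, rank, src, dst):
    """Bidirectional Dijkstra that only relaxes edges leading to
    higher-rank nodes, as in a Contraction Hierarchies query.
    fwd/bwd: adjacency dicts node -> [(neighbor, weight)];
    bwd holds the reversed edges for the backward search."""
    INF = float("inf")
    dist = {0: {src: 0}, 1: {dst: 0}}          # 0 = forward, 1 = backward
    pq = [(0, src, 0), (0, dst, 1)]
    graphs = (fwd, bwd)
    best = INF
    while pq:
        d, u, side = heapq.heappop(pq)
        if d > dist[side].get(u, INF):
            continue                            # stale queue entry
        other = dist[1 - side].get(u)
        if other is not None:                   # settled by both searches:
            best = min(best, d + other)         # candidate meeting node
        for v, w in graphs[side].get(u, []):
            if rank[v] > rank[u] and d + w < dist[side].get(v, INF):
                dist[side][v] = d + w           # only search "upward"
                heapq.heappush(pq, (d + w, v, side))
    return best

# Toy graph: the major road B has the highest rank, as a highway would.
fwd = {"A": [("B", 2), ("C", 10)], "B": [("C", 3)]}
bwd = {"C": [("B", 3), ("A", 10)], "B": [("A", 2)]}
rank = {"A": 1, "C": 2, "B": 3}
print(ch_query(fwd, bwd, rank, "A", "C"))  # 5 (A -> B -> C, meeting at B)
```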

5. Live Traffic Ingestion & Graph Updates

Google Maps computes live traffic by aggregating anonymized GPS traces from Android devices with location sharing enabled. On busy roads, hundreds of cars per minute serve as live speed sensors — far more granular than fixed traffic cameras.

Traffic Processing Pipeline

# Traffic ingestion flow
Millions of Android devices → GPS trace batches (every 30s)
         │
         ▼
Kafka topic "gps-traces" (500k events/sec)
         │
         ▼
Stream processor (Flink/Dataflow):
  1. Map-match: snap GPS coordinates to nearest road segment
  2. Compute speed: distance between consecutive points / time
  3. Aggregate: median speed per road segment per 1-minute window
  4. Detect incidents: speed < 10% of baseline → flag potential incident
         │
         ▼
Traffic graph store (Redis): edge_id → current_speed, congestion_level
         │
         ▼
Routing service reads edge weights from traffic graph at query time
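Steps 2 and 3 of the stream processor (per-device speed computation and windowed median aggregation) can be sketched as a batch function; the data shapes are illustrative, and map-matching is assumed to have already happened:

```python
import math
from collections import defaultdict
from statistics import median

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance between two GPS fixes, in meters."""
    r = 6_371_000
    dp = math.radians(lat2 - lat1)
    dl = math.radians(lon2 - lon1)
    a = (math.sin(dp / 2) ** 2
         + math.cos(math.radians(lat1)) * math.cos(math.radians(lat2))
         * math.sin(dl / 2) ** 2)
    return 2 * r * math.asin(math.sqrt(a))

def segment_speeds(traces, window_s=60):
    """traces: (device_id, segment_id, ts_seconds, lat, lon) tuples,
    already map-matched. Returns {(segment_id, window): median m/s}."""
    by_device = defaultdict(list)
    for t in sorted(traces, key=lambda t: (t[0], t[2])):
        by_device[t[0]].append(t)
    samples = defaultdict(list)
    for pts in by_device.values():
        for a, b in zip(pts, pts[1:]):          # consecutive fixes per device
            dt = b[2] - a[2]
            if dt <= 0 or a[1] != b[1]:
                continue                        # skip segment changes / bad clocks
            speed = haversine_m(a[3], a[4], b[3], b[4]) / dt
            samples[(b[1], b[2] // window_s)].append(speed)
    return {k: median(v) for k, v in samples.items()}
```

A production version runs this incrementally inside Flink/Dataflow windows and writes the medians to the traffic graph store instead of returning a dict.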

6. ETA Prediction: From Dijkstra to ML

ETA prediction is a multi-layer problem. The base estimate comes from graph traversal (sum of edge travel times on the computed path), but this is systematically biased — traffic changes over the course of a trip, drivers stop at traffic lights, and historical patterns matter.

ETA Correction Layers

  1. Base graph ETA: Sum of (edge_length / current_speed) for all edges on the route. Includes typical turn delay constants (e.g., left turns average 12 seconds more than right turns in US cities).
  2. Traffic progression model: Predict how traffic will change during the trip using historical patterns by time-of-day, day-of-week, and special events. A trip starting at 4:55pm might encounter rush-hour traffic that doesn't exist when the query is made.
  3. ML correction: A gradient-boosted tree model trained on historical trip records adjusts the graph estimate based on: distance, time of day, day of week, origin/destination types, predicted weather, and error patterns for that specific route.
  4. Real-time updates: As the trip progresses, the ETA is continuously recalculated with fresh traffic data for remaining segments.
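Layer 1, the base graph ETA with turn-delay constants, is just a sum over route edges (data shapes here are hypothetical):

```python
def base_eta_seconds(route, speeds, turn_penalty):
    """Layer-1 ETA: sum of edge traversal times plus turn-delay constants.
    route: (edge_id, length_m, turn_type) tuples along the computed path;
    speeds: edge_id -> current speed in m/s (from the traffic graph store);
    turn_penalty: turn_type -> seconds (e.g. left turns cost extra)."""
    total = 0.0
    for edge_id, length_m, turn_type in route:
        total += length_m / speeds[edge_id]
        total += turn_penalty.get(turn_type, 0.0)
    return total

route = [("e1", 1000, "straight"), ("e2", 500, "left")]
print(base_eta_seconds(route, {"e1": 10.0, "e2": 5.0}, {"left": 12.0}))  # 212.0
```

Layers 2–4 then adjust this number; the graph sum is the anchor that the ML correction is trained against.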

7. Geocoding & Reverse Geocoding

Geocoding converts a human-readable address ("1600 Amphitheatre Pkwy, Mountain View, CA") to coordinates (37.4224, -122.0849). Reverse geocoding goes the other way. Both must handle abbreviations, typos, international address formats, and partial addresses.
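The exact → prefix → fuzzy cascade can be sketched with a naive in-memory index. Class and method names are hypothetical, and production systems use tries and edit-distance automata rather than difflib:

```python
from difflib import get_close_matches

class GeocodingIndex:
    """Sketch of a multi-layer geocoding lookup (illustrative only)."""

    def __init__(self, places):
        self.places = places  # normalized address -> (lat, lon)

    @staticmethod
    def normalize(query):
        """Lowercase, strip commas, collapse whitespace."""
        return " ".join(query.lower().replace(",", " ").split())

    def geocode(self, query):
        q = self.normalize(query)
        if q in self.places:                          # layer 1: exact
            return self.places[q]
        for name, coords in self.places.items():      # layer 2: prefix
            if name.startswith(q):
                return coords
        close = get_close_matches(q, self.places, n=1, cutoff=0.8)
        return self.places[close[0]] if close else None  # layer 3: fuzzy

idx = GeocodingIndex({"1600 amphitheatre pkwy mountain view ca": (37.4224, -122.0849)})
print(idx.geocode("1600 Amphitheatre Pkwy, Mountain View, CA"))  # exact hit
print(idx.geocode("1600 amphitheater pkwy mountain view ca"))    # fuzzy hit (typo)
```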

Multi-Layer Geocoding Index

Queries cascade through progressively looser layers: an exact match on the normalized address, then a prefix match for partial input, then a fuzzy match that tolerates typos and abbreviations. Most queries resolve in the first layer; the fuzzy layer is the expensive fallback.

8. Places Search & POI Data

Google Maps indexes hundreds of millions of places (businesses, landmarks, public facilities). A "coffee shop near me" query must combine text search with geographic proximity and quality ranking in under 100ms.

Places Search Architecture
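A minimal sketch of combining the three ranking signals (text relevance, geographic proximity, and quality) into one score; the linear scoring formula and the data shapes are purely illustrative:

```python
import math

def rank_places(places, query_tokens, user_lat, user_lon, radius_m=2000):
    """places: dicts with name/lat/lon/rating (shape hypothetical).
    Score = token overlap * rating, decayed linearly by distance."""
    def dist_m(lat, lon):
        # Equirectangular approximation: accurate enough at city scale.
        dx = math.radians(lon - user_lon) * math.cos(math.radians(user_lat))
        dy = math.radians(lat - user_lat)
        return 6_371_000 * math.hypot(dx, dy)

    results = []
    for p in places:
        text = len(set(p["name"].lower().split()) & query_tokens) / len(query_tokens)
        if text == 0:
            continue                       # no text match at all
        d = dist_m(p["lat"], p["lon"])
        if d > radius_m:
            continue                       # outside the search radius
        results.append((text * p["rating"] * (1 - d / radius_m), p["name"]))
    return [name for _, name in sorted(results, reverse=True)]

places = [
    {"name": "Blue Coffee Shop", "lat": 0.001, "lon": 0.0, "rating": 4.5},
    {"name": "Java Coffee Shop", "lat": 0.010, "lon": 0.0, "rating": 5.0},
    {"name": "Pizza Place",      "lat": 0.0005, "lon": 0.0, "rating": 4.9},
]
print(rank_places(places, {"coffee", "shop"}, 0.0, 0.0))
```

In a real system the text layer is an inverted index, proximity comes from a spatial index (geohash/S2 cells), and ranking is a learned model rather than a hand-tuned formula.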

9. Turn-by-Turn Navigation

Navigation differs from routing: instead of returning a single path, the system must guide the user in real time, detecting when they deviate and re-routing instantly.

Client-Server Split

The initial route and turn-by-turn instructions are downloaded to the client at session start. The client executes navigation logic locally (position matching to the route, upcoming-turn detection) without server calls — this enables offline navigation and reduces server load. During active navigation, the server is involved only to refresh traffic for the remaining route and to compute a new route when the client reports a deviation.
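The client-side deviation check can be sketched crudely as follows; a production matcher projects the fix onto route segments and tracks progress along the polyline, and the names here are illustrative:

```python
import math

def off_route(pos, route_points, threshold_m=50):
    """Return True if the GPS fix is farther than threshold_m from every
    point of the downloaded route polyline — the trigger for a re-route
    request. (Crude: checks vertices only, not segment projections.)"""
    lat0, lon0 = pos
    def dist_m(lat, lon):
        # Equirectangular approximation, fine for a ~50 m threshold.
        dx = math.radians(lon - lon0) * math.cos(math.radians(lat0))
        dy = math.radians(lat - lat0)
        return 6_371_000 * math.hypot(dx, dy)
    return all(dist_m(lat, lon) > threshold_m for lat, lon in route_points)

route = [(0.0, 0.0), (0.001, 0.0)]
print(off_route((0.0, 0.0001), route))   # on route: ~11 m from a vertex
print(off_route((0.0005, 0.01), route))  # ~1.1 km off: request a re-route
```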

10. Scale, Fault Tolerance & Cost

Tile Serving

  • 99%+ of tile requests served from CDN cache
  • Cache TTL: 24h for static tiles; 5min for tiles with live traffic overlays
  • Tile storage: ~10 PB (all zoom levels, all regions)
  • On CDN miss: origin serves from GCS; re-renders only if data changed

Routing Service

  • Road graph loaded entirely in RAM per routing server (~50 GB)
  • Routing servers are stateless; any server can answer any query
  • Queries: 10M+ routing requests/day; peak ~500 QPS per server
  • Routing graph updates: edge weights re-computed from traffic every 2min

11. Capacity Estimation & Conclusion

Back-of-Envelope: Tile Requests

  • 1 billion MAU × 5 map sessions/day × 20 tile requests/session = 100 billion tile requests/day
  • 100B ÷ 86,400 sec = ~1.16 million tile requests/second
  • With 99% CDN cache hit ratio: 11,600 tile requests/second reach origin
  • Each tile ≈ 15 KB (vector tile) → 1.16M × 15KB = 17.4 GB/s egress from CDN
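The back-of-envelope numbers above can be checked in a few lines:

```python
mau = 1_000_000_000
sessions_per_day, tiles_per_session = 5, 20
daily = mau * sessions_per_day * tiles_per_session   # 100 billion tile requests/day
qps = daily / 86_400                                 # requests per second
origin_qps = qps * (1 - 0.99)                        # 99% CDN cache hit ratio
egress_gb_s = qps * 15_000 / 1e9                     # 15 KB per vector tile

print(f"{qps/1e6:.2f}M req/s; {origin_qps:,.0f} origin req/s; {egress_gb_s:.1f} GB/s egress")
```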

A maps platform is uniquely interesting because it is read-dominated at massive scale (tile serving) yet contains critical write-intensive subsystems (traffic ingestion). The CDN absorbs the read load; the routing system's use of Contraction Hierarchies is one of the most elegant graph-algorithm applications in production. The ETA prediction system — combining graph algorithms with ML corrections — is a template for how mathematical models and machine learning can be layered productively.
