Real-time leaderboard system design - gaming score ranking at global scale
Md Sanwar Hossain - Software Engineer

System Design · March 19, 2026 · 20 min read · System Design at Scale Series

Designing a Real-Time Leaderboard at Global Scale: Redis Sorted Sets, Sharding, and Consistency Trade-offs

The World Cup of the most popular mobile game in 2024 attracted 10 million simultaneous players. For 90 minutes, every point scored needed to appear on a global ranking visible to all players. The engineering team's single Redis instance — which had handled previous events just fine — began experiencing 800ms write latency within the first five minutes. The leaderboard froze. Players complained. The problem was not Redis per se, but a fundamental misunderstanding of single-node limits and the lack of any sharding or read path optimization. This post is the system design guide the team needed.

Table of Contents

  1. Requirements Analysis
  2. Redis Sorted Sets Internals
  3. Single Node Limits and Failure Points
  4. Sharding Strategies for Leaderboards
  5. Geo-Distributed Leaderboards
  6. Caching the Read Path
  7. Write Path Optimization
  8. Consistency Trade-offs
  9. Failure Scenarios
  10. Scaling Beyond Redis
  11. Key Takeaways

1. Requirements Analysis

Functional requirements: Submit a score update for a user; query the global top-N players; query a specific user's rank; query the N players surrounding a given user on the leaderboard; support time-windowed leaderboards (daily, weekly, all-time).

Non-functional requirements: 10 million concurrent active users; 1 million score updates per second during peak; sub-50ms p99 read latency for top-100 queries; sub-100ms p99 for user rank lookup; 99.99% availability; support global distribution across 5 regions.

Data estimates: 10M users × ~50 bytes per entry (user ID + score + metadata) ≈ 500MB for a single leaderboard. With 3 time-window leaderboards (daily, weekly, all-time) that is ~1.5GB in total, comfortably within the memory of a single Redis node. The binding constraint is therefore the 1M ops/sec write throughput, not storage.

2. Redis Sorted Sets Internals

A Redis Sorted Set (zset) is implemented as a dual data structure: a skiplist for ordered rank operations and a hash table for O(1) score lookup by member. This hybrid gives excellent performance characteristics for leaderboard operations:

- ZADD / ZINCRBY (insert or update a score): O(log N)
- ZSCORE (read a member's score): O(1)
- ZRANK / ZREVRANK (find a member's rank): O(log N)
- ZRANGE / ZREVRANGE (read the top N or a rank range): O(log N + M) for M returned members

For the "show me the players just above and below me" use case, combine ZREVRANK with ZREVRANGE using the computed rank as an offset — both operations together still run in O(log N + M) time, fast enough for sub-5ms response at modest scale.

// Spring Boot with the Lettuce Redis client (via Spring Data Redis)
import java.util.Collections;
import java.util.Set;

import org.springframework.data.redis.core.RedisTemplate;
import org.springframework.data.redis.core.ZSetOperations;
import org.springframework.stereotype.Service;

@Service
public class LeaderboardService {

    private final RedisTemplate<String, String> redisTemplate;
    private static final String GLOBAL_KEY = "leaderboard:global";

    public LeaderboardService(RedisTemplate<String, String> redisTemplate) {
        this.redisTemplate = redisTemplate;
    }

    // Add/update a score - atomic, O(log N)
    public void updateScore(String userId, double delta) {
        redisTemplate.opsForZSet().incrementScore(GLOBAL_KEY, userId, delta);
    }

    // Get the top N players with scores - O(log N + N)
    public Set<ZSetOperations.TypedTuple<String>> getTopN(int n) {
        return redisTemplate.opsForZSet()
            .reverseRangeWithScores(GLOBAL_KEY, 0, n - 1);
    }

    // Get a user's rank (0-based, 0 = #1) - O(log N)
    public Long getUserRank(String userId) {
        return redisTemplate.opsForZSet().reverseRank(GLOBAL_KEY, userId);
    }

    // Get the window of players around a user - two O(log N + M) calls
    public Set<ZSetOperations.TypedTuple<String>> getNeighbors(
            String userId, int windowSize) {
        Long rank = getUserRank(userId);
        if (rank == null) return Collections.emptySet();
        long start = Math.max(0, rank - windowSize);
        long end = rank + windowSize;
        return redisTemplate.opsForZSet()
            .reverseRangeWithScores(GLOBAL_KEY, start, end);
    }
}

3. Single Node Limits and Failure Points

A single Redis instance has practical limits that become the ceiling for any non-trivial leaderboard:

- Command execution is single-threaded, so one node tops out at roughly 100K-200K sorted-set writes per second no matter how many cores the machine has.
- All data must fit in the memory of one machine, and every write flows through a single replication stream.
- Persistence (RDB snapshot forks, AOF rewrites) competes with foreground traffic and causes periodic latency spikes, typically when load is highest.
- The node is a single point of failure: replicas add read capacity and failover, but never write capacity.

4. Sharding Strategies for Leaderboards

Hash-Based Sharding (Anti-Pattern for Rankings)

Hash the user ID modulo number of shards: shard = hash(userId) % N. Each shard holds a fraction of users. Write throughput scales linearly. However, computing global rank requires querying all shards for the top scores and merging them — an O(N_shards) scatter-gather on every rank read. For top-100 reads at 10,000 reads/sec, this creates massive read amplification.
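To make the read-amplification cost concrete, here is a minimal sketch (illustrative names, with in-memory lists standing in for the shards) of hash routing plus the k-way scatter-gather merge a global top-N read requires. In a real deployment each per-shard list would come from a ZREVRANGE against a separate Redis node:

```java
import java.util.*;

public class ScatterGatherTopN {

    // Route a user to a shard: shard = hash(userId) % N, kept non-negative.
    public static int shardFor(String userId, int shardCount) {
        return Math.floorMod(userId.hashCode(), shardCount);
    }

    // K-way merge of per-shard top lists (each already sorted by score
    // descending) into a single global top-N. Every global read touches
    // every shard - this is the scatter-gather cost.
    public static List<Map.Entry<String, Double>> mergeTopN(
            List<List<Map.Entry<String, Double>>> shardTops, int n) {
        // Max-heap over the current head of each shard list: {shardIdx, pos}.
        PriorityQueue<int[]> heap = new PriorityQueue<>((a, b) ->
            Double.compare(shardTops.get(b[0]).get(b[1]).getValue(),
                           shardTops.get(a[0]).get(a[1]).getValue()));
        for (int s = 0; s < shardTops.size(); s++) {
            if (!shardTops.get(s).isEmpty()) heap.add(new int[]{s, 0});
        }
        List<Map.Entry<String, Double>> out = new ArrayList<>();
        while (!heap.isEmpty() && out.size() < n) {
            int[] head = heap.poll();
            out.add(shardTops.get(head[0]).get(head[1]));
            if (head[1] + 1 < shardTops.get(head[0]).size()) {
                heap.add(new int[]{head[0], head[1] + 1});
            }
        }
        return out;
    }
}
```

Every global read pays O(shards) fan-out plus the merge, which is why hierarchical aggregation moves this merge off the read path.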

Hierarchical Aggregation (Recommended)

Partition users across N write shards by hash. Maintain a separate aggregation layer: a periodic (e.g., every 1–5 seconds) job merges the top-K scores from each write shard into a global leaderboard shard. This global shard serves all rank reads. The trade-off: the global leaderboard is eventually consistent with a lag equal to the aggregation interval — typically acceptable for games where millisecond rank precision is not required.

For user rank lookups (not just top-N), maintain a sorted structure per shard and compute approximate global rank as: global_rank ≈ Σ(count of users with higher scores in each shard). This approach scales writes horizontally while keeping reads fast.
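A sketch of that approximation, assuming each shard can report how many of its members score above a given value (on a real Redis shard this is ZCOUNT key (score +inf, with an exclusive lower bound); the helper names are illustrative:

```java
import java.util.*;

public class ApproximateGlobalRank {

    // Count scores strictly greater than `score` in a descending-sorted array
    // (binary search; stands in for ZCOUNT with an exclusive lower bound).
    static int countHigher(double[] descScores, double score) {
        int lo = 0, hi = descScores.length;
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (descScores[mid] > score) lo = mid + 1; else hi = mid;
        }
        return lo;
    }

    // global_rank ≈ Σ(count of users with higher scores in each shard),
    // 0-based, so 0 means #1.
    public static long approxRank(List<double[]> shardScoresDesc, double userScore) {
        long rank = 0;
        for (double[] shard : shardScoresDesc) rank += countHigher(shard, userScore);
        return rank;
    }
}
```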

Time-Window Leaderboards

Maintain separate sorted sets per time window: leaderboard:daily:2026-03-19, leaderboard:weekly:2026-W12, leaderboard:alltime. Use EXPIRE on daily and weekly keys to auto-evict stale data. Write each score update to all applicable keys atomically via a Lua script or pipeline to avoid partial updates.

-- Lua script for atomic multi-leaderboard update
local globalKey = KEYS[1]
local dailyKey = KEYS[2]
local weeklyKey = KEYS[3]
local userId = ARGV[1]
local delta = tonumber(ARGV[2])

redis.call('ZINCRBY', globalKey, delta, userId)
redis.call('ZINCRBY', dailyKey, delta, userId)
redis.call('ZINCRBY', weeklyKey, delta, userId)
return redis.call('ZREVRANK', globalKey, userId)
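For reference, the window key names passed as KEYS[2] and KEYS[3] can be derived on the application side like this; the helper class is hypothetical, but the naming scheme matches the keys used in this section:

```java
import java.time.LocalDate;
import java.time.temporal.IsoFields;

public class WindowKeys {

    // leaderboard:daily:yyyy-MM-dd (LocalDate.toString is ISO-8601)
    public static String dailyKey(LocalDate date) {
        return "leaderboard:daily:" + date;
    }

    // leaderboard:weekly:yyyy-Www using the ISO week-based year and week
    public static String weeklyKey(LocalDate date) {
        int week = date.get(IsoFields.WEEK_OF_WEEK_BASED_YEAR);
        int year = date.get(IsoFields.WEEK_BASED_YEAR);
        return String.format("leaderboard:weekly:%d-W%02d", year, week);
    }
}
```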

5. Geo-Distributed Leaderboards

For global events, maintaining a single leaderboard in one region forces all writes to cross the internet. A player in Singapore writing to a US-East Redis primary experiences 150–200ms of write latency — unacceptable for real-time gameplay.

Regional shards accept writes locally and sync to a global aggregation layer asynchronously. Each region maintains its own leaderboard (e.g., leaderboard:apac, leaderboard:emea, leaderboard:amer). A global aggregation service — running in each region — periodically merges regional top-scores into leaderboard:global. Players always write to their regional shard (sub-10ms), and global rankings reflect the aggregated state with a configurable staleness window (1–10 seconds).

Conflict resolution: Since scores are monotonically increasing (in most games, you cannot lose points), the merge strategy is simple: take the maximum score per user across all regional shards. For games where scores can decrease, use a last-writer-wins strategy with timestamp comparison, or keep a per-event log (event sourcing) and recompute scores from it.
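The max-merge rule can be sketched in a few lines, with regional shard contents stubbed as in-memory maps (real input would be member/score dumps from leaderboard:apac, leaderboard:emea, leaderboard:amer):

```java
import java.util.*;

public class RegionalMerge {

    // Global view of monotonically increasing scores: the maximum score
    // per user across all regional shards wins, regardless of merge order.
    public static Map<String, Double> mergeMax(List<Map<String, Double>> regions) {
        Map<String, Double> global = new HashMap<>();
        for (Map<String, Double> region : regions) {
            region.forEach((user, score) -> global.merge(user, score, Math::max));
        }
        return global;
    }
}
```

Because max is commutative and idempotent, the aggregation job can re-run or process regions in any order without corrupting the global leaderboard.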

6. Caching the Read Path

The top-100 leaderboard is read orders of magnitude more often than it is updated. Pre-compute and cache a serialized snapshot: a background job serializes the current top-100 every second, application servers hold it in a short-TTL in-process cache (e.g., Caffeine with a 1-second TTL), and the vast majority of reads never touch Redis at all. At 10,000 top-100 reads/sec, a 1-second TTL turns 10,000 Redis reads into roughly one.
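The snapshot idea can be sketched as a tiny in-process cache that refreshes from its loader at most once per TTL; this is a hand-rolled stand-in for what a real caching library like Caffeine provides, and the loader here would be the Redis top-100 query:

```java
import java.util.function.Supplier;

public class SnapshotCache<T> {
    private final Supplier<T> loader;
    private final long ttlNanos;
    private volatile T snapshot;
    private volatile long loadedAt;

    public SnapshotCache(Supplier<T> loader, long ttlMillis) {
        this.loader = loader;
        this.ttlNanos = ttlMillis * 1_000_000L;
    }

    // Serve the cached snapshot; hit the loader only when the TTL expires.
    // synchronized gives single-flight refresh: concurrent readers never
    // stampede the backing store.
    public synchronized T get() {
        long now = System.nanoTime();
        if (snapshot == null || now - loadedAt > ttlNanos) {
            snapshot = loader.get();
            loadedAt = now;
        }
        return snapshot;
    }
}
```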

7. Write Path Optimization

Pipelining: Instead of one round-trip per score update, batch multiple ZADD/ZINCRBY commands in a single pipeline. The Lettuce Redis client supports automatic pipelining. Batching 100 updates in one pipeline reduces network round-trips by 100x.

// Batch pipeline with Lettuce
public void batchUpdateScores(Map<String, Double> scoreDeltas) {
    redisTemplate.executePipelined((RedisCallback<Object>) connection -> {
        scoreDeltas.forEach((userId, delta) ->
            connection.zIncrBy(
                GLOBAL_KEY.getBytes(),
                delta,
                userId.getBytes()));
        return null;
    });
}

Kafka buffer: For extreme write volumes, publish score events to Kafka and have a Kafka consumer aggregate scores in memory per time window (e.g., 100ms tumbling window) before flushing to Redis. This converts 1M individual Redis writes/sec into 10K batched pipeline flushes/sec — a 100x write reduction. The trade-off is increased write latency (100ms aggregation window) and operational complexity.
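The aggregation step can be sketched independently of Kafka itself: coalesce deltas per user inside the window, then hand off one batch per flush. The class below is illustrative (no consumer shown); record() would be called per consumed event, and the returned batch would go to Redis in a single pipeline:

```java
import java.util.*;

public class WindowAggregator {
    private final Map<String, Double> pending = new HashMap<>();

    // Called for each consumed score event; deltas for the same user
    // within the window collapse into one entry.
    public void record(String userId, double delta) {
        pending.merge(userId, delta, Double::sum);
    }

    // Called at the end of each tumbling window (e.g., every 100 ms).
    // One ZINCRBY per user per window replaces one write per event.
    public Map<String, Double> flush() {
        Map<String, Double> batch = new HashMap<>(pending);
        pending.clear();
        return batch;
    }
}
```

This is correct for leaderboards precisely because ZINCRBY deltas are additive: summing them in memory first changes the write count, not the final scores.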

8. Consistency Trade-offs

The CAP theorem forces a choice between consistency and availability for leaderboards across network partitions. In practice, eventual consistency with bounded staleness is the right model for almost all leaderboard use cases:

| Storage | Write Throughput | Read Latency | Rank Query | Best For |
| --- | --- | --- | --- | --- |
| Redis Sorted Set | 100K-200K/s (single node) | <1ms | O(log N) | <5M users, single region |
| Redis Cluster (sharded) | 1M+/s | <5ms | O(shards × log N) | 10M+ users, high write load |
| PostgreSQL (window fn) | 10K-50K/s | 10-100ms | O(N log N) | Audit trail, complex queries |
| Apache Flink + Redis | 10M+/s | <10ms (cached) | Pre-aggregated | Streaming, time-windowed |

9. Failure Scenarios

Redis primary failure: Redis Sentinel or Redis Cluster promotes a replica automatically, typically within ~30 seconds. During the failover window, writes fail; the application should queue score events in Kafka and replay them once the new primary is up. Reads can continue from a replica (which may be slightly stale) by issuing the READONLY command on the replica connection in Redis Cluster.

Split-brain in Redis Cluster: Redis Cluster requires a majority of master nodes to agree; a primary isolated on the minority side of a partition stops accepting writes, though it can still serve stale reads. Mitigate by setting cluster-require-full-coverage to no, so the majority side keeps serving the hash slots it owns, or simply accept the brief write outage.

During outage — serve last known snapshot: Maintain a materialized snapshot of the top-100 leaderboard in a durable store (PostgreSQL or S3) that is refreshed every 10 seconds. During Redis downtime, serve this stale snapshot with a "Last updated: X seconds ago" indicator rather than returning an error.
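A sketch of that fallback read path, with the live store and the durable snapshot abstracted as suppliers (all names here are illustrative):

```java
import java.util.function.Supplier;

public class FallbackReader {

    public static final class Result {
        public final String payload;
        public final long ageSeconds;  // how stale the payload is
        public final boolean stale;    // true when served from the snapshot

        Result(String payload, long ageSeconds, boolean stale) {
            this.payload = payload;
            this.ageSeconds = ageSeconds;
            this.stale = stale;
        }
    }

    // Try the live store first; on failure serve the durable snapshot
    // along with its age, so the client can show "Last updated: X s ago"
    // instead of an error.
    public static Result read(Supplier<String> live,
                              Supplier<String> snapshot,
                              long snapshotAgeSeconds) {
        try {
            return new Result(live.get(), 0, false);
        } catch (RuntimeException redisDown) {
            return new Result(snapshot.get(), snapshotAgeSeconds, true);
        }
    }
}
```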

10. Scaling Beyond Redis

At true internet scale — hundreds of millions of users with complex analytics needs — Redis alone becomes insufficient:

- Hand the event firehose to a stream processor (e.g., Apache Flink) that pre-aggregates scores per window, so Redis stores only materialized results (the 10M+/s row in the table above).
- Keep the authoritative score history in a durable store (PostgreSQL, object storage) for audit trails and recovery, with Redis serving reads only.
- Materialize time-windowed and per-segment leaderboards in the streaming layer instead of computing them on read.

"A leaderboard is one of the simplest real-time data structures to implement and one of the hardest to scale. Redis gives you the primitives; architecture gives you the scale."

11. Key Takeaways

- Redis sorted sets give O(log N) score updates and rank lookups, but a single node caps out around 100K-200K writes/sec.
- Hash-shard the write path and serve rank reads from a periodically merged global shard; a few seconds of staleness is acceptable for games.
- Keep writes regional for latency and merge globally; for monotonically increasing scores, taking the per-user maximum resolves conflicts.
- Batch the write path (pipelines, windowed aggregation behind Kafka) and cache the read path (short-TTL snapshot of the top-100).
- Plan for failure up front: queue writes during failover and serve a stale snapshot with an age indicator instead of an error.

Architecture Diagram Idea

Three-tier diagram: (1) Write tier — Game Servers → Kafka topic → Score Consumer → 8x Redis Write Shards (each holding 1/8 of users). (2) Aggregation tier — Aggregation Service polls all write shards every 2s, computes top-10K globally, writes to Redis Global Leaderboard shard. (3) Read tier — API Gateway → App Cache (Caffeine, 1s TTL) → Redis Global Shard → Fallback: PostgreSQL snapshot. Show regional replication arrows for APAC, EMEA, AMER.
