Distributed Caching Patterns: Invalidation, Cache Stampedes & CQRS Integration

Caching is the oldest trick in computer science: remember expensive computations so you do not repeat them. But distributed caching—where multiple servers compete for a shared cache—introduces subtle problems that do not exist in single-machine systems. Cache invalidation, the famous joke says, is one of the hardest problems in computer science. When you combine that with thousands of servers pounding a cache, cache stampedes appear: a single expired cache entry causes 1000 simultaneous database hits. This article dissects these patterns, their failure modes, and the strategies that production systems use to handle them.

Md Sanwar Hossain

Software Engineer · System Design · Performance

Caching · March 26, 2026 · 20 min read · System Design Series

The Cache Stampede Incident

A video streaming platform cached popular movie metadata (title, duration, recommendations) with a 1-hour TTL. At 15:00 UTC, millions of instances had the same cache entry. At 16:00 UTC, the entry expired. In the next 100 milliseconds, 50,000 concurrent requests asked "what is the metadata for movie X?" All found the cache empty. All queried the database simultaneously. The database, expecting a few hundred metadata queries per second, received 50,000 in 100ms and crashed. For the next 30 minutes, the entire platform was down. The fix: request coalescing—when a cache miss occurs, the first request computes the value while subsequent requests wait for it.

Cache Patterns: Aside, Through, Behind

Cache-Aside Pattern (Lazy Loading)

Application code is responsible for populating the cache. On a cache miss, the application loads the value from the source (database) and writes it to the cache; subsequent requests are served directly from the cache.

public Product getProduct(String id) {
    // Check cache first
    Product cached = cache.get("product:" + id);
    if (cached != null) return cached;
    
    // Cache miss: fetch from database
    Product product = database.getProduct(id);
    
    // Populate cache with 1-hour TTL
    cache.set("product:" + id, product, Duration.ofHours(1));
    return product;
}

Pros: Simple, works with any database, only caches accessed data.

Cons: Cache misses cause database hits, vulnerable to stampedes, requires cache invalidation logic in application code.

Write-Through Pattern

On writes, data is written to both the cache and the database simultaneously. Reads hit the cache (which is kept warm).

public void saveProduct(Product product) {
    // Write to cache and database together
    cache.set("product:" + product.id, product);
    database.saveProduct(product);
}

Pros: Cache stays consistent with the database, reads are always fast, and freshly written keys never miss (no stampedes on recently written data).

Cons: Writes are slower (dual write), and the cache fills with data that may never be read (cache pollution).

Write-Behind Pattern (Write-Back)

Writes go to the cache immediately (fast) and are asynchronously flushed to the database later (eventual consistency).

public void saveProduct(Product product) {
    // Write only to cache (instant)
    cache.set("product:" + product.id, product);
    
    // Async flush to database after 5 seconds or when batch reaches 100 items
    flushService.enqueue(product);
}

Pros: Writes are extremely fast, and batched asynchronous flushes let the database absorb very high write throughput.

Cons: Data loss if cache crashes before flushing, eventual consistency can cause user-facing inconsistencies, complex failure handling.

Solving the Cache Stampede

Solution 1: Request Coalescing

When multiple threads detect a cache miss simultaneously, only one computes the value while others wait for it. This serializes the computation and prevents multiple database queries.

private final Map<String, CompletableFuture<Product>> inFlight = new ConcurrentHashMap<>();

public Product getProduct(String id) {
    String cacheKey = "product:" + id;
    
    // Check cache
    Product cached = cache.get(cacheKey);
    if (cached != null) return cached;
    
    // Coalesce: only the first thread for a key runs the computation;
    // concurrent callers reuse the same in-flight future
    CompletableFuture<Product> future = inFlight.computeIfAbsent(cacheKey, key ->
        CompletableFuture.supplyAsync(() -> {
            Product product = database.getProduct(id);
            cache.set(key, product, Duration.ofHours(1));
            return product;
        }).whenComplete((result, error) -> inFlight.remove(key))); // computing thread cleans up
    
    return future.join(); // Wait for the shared result (join avoids checked exceptions)
}

Solution 2: Probabilistic Early Expiration (Xfetch)

Instead of expiring at a fixed TTL, start refreshing before expiry with some probability. If a key is accessed within 5 minutes of expiry and random() < probability, refresh it from the database. This spreads the load and prevents thundering herds.

public Product getProduct(String id) {
    CachedValue cached = cache.get("product:" + id);
    
    if (cached == null) {
        return fetch(id); // Complete miss
    }
    
    long secondsUntilExpiry = cached.expiresAt - now();
    
    // Proactively refresh if close to expiry and probability check passes
    if (secondsUntilExpiry < 300 && Math.random() < 0.01) {
        CompletableFuture.runAsync(() -> fetch(id)); // Non-blocking refresh
    }
    
    return cached.value;
}

Solution 3: Stale-While-Revalidate

Serve stale data from the cache while recomputing in the background. Users see data instantly; eventual consistency is tolerated for non-critical data.

public Product getProduct(String id) {
    CachedValue cached = cache.get("product:" + id);
    
    if (cached != null && !cached.isExpired()) {
        return cached.value; // Fresh
    }
    
    if (cached != null && cached.isStale()) {
        // Return stale but trigger refresh
        CompletableFuture.runAsync(() -> refresh(id));
        return cached.value; // Return stale immediately
    }
    
    // Complete miss: must compute
    return fetch(id);
}

Cache Invalidation Strategies

Time-Based Expiry (TTL)

Entries expire after a fixed duration. Simple, but readers may see data that is stale by up to the full TTL window.

// Cache entries expire after 1 hour
cache.set("user:123", user, Duration.ofHours(1));

Event-Based Invalidation

When data changes (e.g., a user updates their profile), publish an event. Cache subscribers hear the event and invalidate the entry.

@Service
public class UserService {
    
    public void updateUser(User user) {
        database.saveUser(user);
        
        // Publish invalidation event
        eventBus.publish(new UserUpdatedEvent(user.id));
    }
}

@Component
public class CacheInvalidator {
    
    @EventListener
    public void onUserUpdated(UserUpdatedEvent event) {
        cache.delete("user:" + event.userId);
    }
}

Pattern: Active-Active Invalidation

In a distributed system, different cache nodes may hold copies of the same entry. A single write must invalidate the entry on all nodes, which requires a shared invalidation channel (typically pub/sub) that every node subscribes to, idempotent deletes so that duplicated or replayed messages are harmless, and a backstop (such as a modest TTL) for nodes that miss a message.
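The mechanics can be sketched in-process (a toy stand-in for a real channel such as Redis pub/sub; the class and method names here are illustrative):

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;

// In-process sketch of active-active invalidation: every node keeps a local
// cache and is subscribed to a shared invalidation channel. A write publishes
// the key; each subscriber deletes its local copy. In production the channel
// would be Redis pub/sub or a message broker.
public class InvalidationBus {
    private final List<Map<String, Object>> nodeCaches = new CopyOnWriteArrayList<>();

    // Each cache node registers its local store as a subscriber.
    public Map<String, Object> registerNode() {
        Map<String, Object> localCache = new ConcurrentHashMap<>();
        nodeCaches.add(localCache);
        return localCache;
    }

    // Publish an invalidation: every node drops its copy of the key.
    public void invalidate(String key) {
        for (Map<String, Object> cache : nodeCaches) {
            cache.remove(key); // idempotent: removing a missing key is a no-op
        }
    }
}
```

The delete being idempotent is what makes at-least-once delivery acceptable for the channel.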

CQRS Integration: Separating Reads and Writes

Command Query Responsibility Segregation (CQRS) splits your data model into write and read models. The write model is optimized for transactions; the read model is optimized for caching and queries.

// Write model: normalized, transactional
@Entity
public class User {
    @Id Long id;
    String email;
    String name;
}

// Read model: denormalized, cached
@Data
public class UserReadModel {
    Long id;
    String email;
    String name;
    List<Order> recentOrders;         // Denormalized from Order table
    int orderCount;                   // Precomputed aggregate
    String membershipTier;            // Denormalized
}

// Reads hit the read model (which is in Redis)
@Service
public class UserQuery {
    public UserReadModel getUserProfile(Long id) {
        return redisCache.get("user:profile:" + id, UserReadModel.class);
    }
}

// Writes go to the write model; changes are published to update read models
@Service
public class UserCommand {
    
    public void updateUserEmail(Long id, String email) {
        // Load and update the transactional write model
        User user = writeDatabase.findById(id); // findById: illustrative repository lookup
        user.setEmail(email);
        writeDatabase.save(user);
        
        // Publish a domain event
        eventBus.publish(new UserEmailChangedEvent(id, email));
    }
}

// Read model is kept in sync via event subscribers
@Component
public class UserReadModelUpdater {
    
    @EventListener
    public void onUserEmailChanged(UserEmailChangedEvent event) {
        UserReadModel model = readModel.get(event.userId);
        model.setEmail(event.newEmail);
        redisCache.set("user:profile:" + event.userId, model);
    }
}

Consistency Models in Distributed Caches

Strong Consistency

All readers see the same value immediately after a write. Achieved via synchronous replication or write-through patterns. Slow but correct.

Eventual Consistency

After a write, readers may see stale data for a brief period. Data converges to the correct value after the update propagates. Fast but temporarily inconsistent. Acceptable for most non-critical data (user preferences, recommendations, product metadata).

Causal Consistency

Operations that are causally related appear in order. If A writes X and then B reads X and writes Y, all readers see X before Y. Complex to implement but valuable for data with dependencies.
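A heavily simplified illustration of the idea, using monotonically increasing version numbers (real systems use vector clocks or session tokens; all names here are illustrative): a client remembers the newest version it has observed and treats any older cached value as a miss, so it never reads its own causal past out of order.

```java
// Minimal sketch of a causal-consistency read guard. Each write carries a
// version; a client remembers the highest version it has observed and
// treats any cached value older than that as a miss (forcing a re-fetch).
public class CausalReader {
    private long highestSeenVersion = 0;

    public static class Versioned {
        public final long version;
        public final String value;
        public Versioned(long version, String value) {
            this.version = version;
            this.value = value;
        }
    }

    // Returns the cached value only if it is at least as new as anything
    // this client has already seen; null means "stale, re-fetch from source".
    public String read(Versioned cached) {
        if (cached == null || cached.version < highestSeenVersion) {
            return null; // older than our causal history: fall through to the source
        }
        highestSeenVersion = cached.version;
        return cached.value;
    }
}
```

This is per-client session causality only; full causal consistency across clients requires tracking dependencies between writes.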

TTL Optimization: Balancing Freshness and Load

A short TTL (5 minutes) bounds staleness but makes cache misses frequent. A long TTL (24 hours) reduces misses but increases staleness. In production, TTL should vary by data type: rapidly changing data (inventory, prices) gets seconds to minutes, slowly changing data (product metadata) gets hours, and near-static data (category trees, configuration) can be cached for a day or more.
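One way to encode this is a per-data-type TTL table (a sketch; the categories and durations are illustrative examples, not recommendations):

```java
import java.time.Duration;
import java.util.Map;

// Illustrative per-data-type TTL policy: volatile data gets short TTLs,
// near-static data gets long ones.
public class TtlPolicy {
    private static final Map<String, Duration> TTLS = Map.of(
        "inventory", Duration.ofSeconds(30),   // changes constantly
        "price", Duration.ofMinutes(5),        // changes occasionally
        "product", Duration.ofHours(1),        // changes rarely
        "category", Duration.ofHours(24)       // nearly static
    );

    public static Duration ttlFor(String dataType) {
        // Unknown types get a conservative middle-ground default
        return TTLS.getOrDefault(dataType, Duration.ofMinutes(10));
    }
}
```

Centralizing the policy in one table also makes TTL tuning an operational change rather than a code hunt.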

Monitoring Cache Health

Observe: hit ratio (hits / total lookups), eviction rate, miss latency at p99, key-space memory usage, and the rate of invalidation events. A falling hit ratio or a rising eviction rate usually means the cache is undersized or TTLs are too short.
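Hit ratio, the single most telling metric, is cheap to track in-process (a sketch; a production service would export these counters to a metrics backend such as Micrometer or Prometheus):

```java
import java.util.concurrent.atomic.LongAdder;

// Minimal thread-safe hit-ratio tracker. LongAdder is preferred over
// AtomicLong for write-heavy counters because it reduces contention.
public class CacheStats {
    private final LongAdder hits = new LongAdder();
    private final LongAdder misses = new LongAdder();

    public void recordHit() { hits.increment(); }
    public void recordMiss() { misses.increment(); }

    // Fraction of lookups served from cache; 0.0 when no lookups have occurred.
    public double hitRatio() {
        long h = hits.sum();
        long m = misses.sum();
        long total = h + m;
        return total == 0 ? 0.0 : (double) h / total;
    }
}
```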

Caching is not optional in modern systems; it is essential for performance. But caching incorrectly is worse than no caching at all. Master these patterns, and your system will handle millions of concurrent requests.

Key Takeaways

Pick the caching pattern by its failure mode: cache-aside is simple but stampede-prone, write-through trades write latency for consistency, and write-behind trades durability for speed. Defend against stampedes with request coalescing, probabilistic early expiration, or stale-while-revalidate. Invalidate by events where freshness matters and rely on TTLs where bounded staleness is acceptable, and use CQRS to keep a denormalized, cache-friendly read model in sync with the transactional write model.

Tags:

caching patterns · cache invalidation · cache stampede · CQRS · Redis


Last updated: March 2026 — Written by Md Sanwar Hossain