Redis Caching Patterns for Microservices: Cache-Aside, Write-Through & Distributed Locking 2026
Distributed microservices architecture lives and dies by latency. A cache miss that triggers five downstream service calls plus a database query can spike your p99 from 30ms to 800ms. This production-grade guide covers every Redis caching pattern used in high-traffic microservices systems — from lazy-loading cache-aside to distributed locking with Redisson — with real Spring Boot code, tradeoff tables, and operational guidance for 2026.
TL;DR — Caching Strategy in One Sentence
"Start with cache-aside (lazy loading + @Cacheable) for read-heavy services. Use write-through when cache consistency is critical. Add Redisson distributed locks to prevent cache stampede. Graduate to Redis Cluster only after your data exceeds a single node's memory capacity or you need multi-AZ HA."
Table of Contents
- Why Caching Matters in Microservices
- Redis Architecture Overview
- Cache-Aside Pattern: Lazy Loading with @Cacheable
- Write-Through Pattern: Keeping Cache & DB in Sync
- Write-Behind (Write-Back) Pattern: Async Writes
- Read-Through Pattern with Spring Cache Abstraction
- Cache Invalidation Strategies
- Distributed Locking with Redisson & Lettuce
- Session Caching with Spring Session + Redis
- Redis Cluster for High Availability
- Cache Serialization: JSON vs Java Serialization
- Monitoring & Eviction Policies
1. Why Caching Matters in Microservices
In a monolithic application, an in-process cache is cheap and effective. In a microservices architecture, the picture changes dramatically. Each service runs in its own process (often on separate pods/nodes), so in-process caches cannot be shared. A user profile read that was previously a single hash-map lookup now travels the network to a dedicated User Service, which in turn queries a PostgreSQL replica, parses the result, and serializes it for the network response — all under the pressure of hundreds of concurrent requests.
The Latency Problem
Consider an e-commerce checkout flow that calls six microservices: Product, Inventory, Pricing, User, Promotions, and Tax. Even at 20ms per uncached service call (p50), the sequential total is 120ms before any business logic runs. Fan-out parallelism helps but introduces orchestration complexity. A Redis cache with sub-millisecond reads collapses those 20ms calls to 0.3–1ms, bringing end-to-end checkout latency well under 50ms.
Database Load Reduction
Without caching, popular endpoints create a thundering herd against your primary database. A viral product page with 10,000 concurrent users generates 10,000 identical SELECT queries for the same product row. A Redis cache with a 60-second TTL means that SELECT runs once per minute regardless of traffic. At scale, this is the difference between a healthy database at 30% CPU and a melting database at 95% CPU triggering cascading failures across every service sharing that database host.
Cascade Failure Protection
When a downstream service goes down in a microservices mesh, the cache becomes a critical resilience layer. A well-designed caching strategy can serve stale data for seconds to minutes (using stale-while-revalidate semantics), giving the failing service time to recover without user-visible errors. Circuit breakers integrated with cache fallbacks can transform hard outages into degraded-but-functional experiences — a foundational practice in services with five-nines SLAs.
Why Redis Wins
Redis dominates distributed caching for microservices in 2026 because of its combination of speed (sub-millisecond operations), rich data structures (strings, hashes, sorted sets, streams), Lua scripting for atomic multi-key operations, built-in pub/sub for event-driven invalidation, cluster mode for horizontal scaling, and first-class Spring Boot integration via Spring Data Redis and Spring Cache abstraction. Alternatives like Memcached offer simplicity but lack data structures, pub/sub, persistence, and cluster-aware clients that Redis provides.
2. Redis Architecture Overview
Before selecting a caching pattern, you must choose the correct Redis deployment topology. The wrong topology is the single most common cause of production Redis failures.
Standalone Mode
A single Redis instance — appropriate only for development, staging, or low-traffic services where data loss on node failure is acceptable. Maximum dataset size is bounded by a single machine's RAM (typically 32–256 GB in production). Failover is manual. Never use standalone mode for production caches with write-through semantics or session storage.
Sentinel Mode
Redis Sentinel provides automated failover for a primary + replica topology. A quorum of Sentinel processes monitors the primary and promotes a replica when the primary fails (typically within 10–30 seconds). Sentinel is the right choice when your dataset fits on a single node but you need HA. AWS ElastiCache in non-cluster mode uses Sentinel-compatible behavior with Multi-AZ replication.
- Minimum quorum: 3 Sentinel processes (odd number to avoid split-brain)
- Failover time: 10–30 seconds by default (tunable via down-after-milliseconds and failover-timeout; min-replicas-to-write controls write safety during partitions)
- Client requirement: Sentinel-aware client (Spring Data Redis supports this natively)
- Scale ceiling: Single primary's memory; reads scale via replicas
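For reference, a minimal Sentinel-aware Spring Boot connection can be configured with standard properties; a sketch (host names and the master name mymaster are placeholders):

```yaml
# application.yml — Sentinel-aware connection (Spring Boot 3.x property names)
spring:
  data:
    redis:
      sentinel:
        master: mymaster          # master name registered with the Sentinels
        nodes:
          - sentinel-1:26379
          - sentinel-2:26379
          - sentinel-3:26379
      password: ${REDIS_PASSWORD:}
```

Spring Data Redis discovers the current primary through the Sentinels and follows promotions automatically after a failover.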
Cluster Mode
Redis Cluster shards data across 16,384 hash slots distributed across multiple primary nodes, each with their own replicas. It enables both horizontal scaling (beyond a single machine's memory) and HA (each primary has at least one replica for automatic failover without Sentinel). Cluster mode is covered in depth in Section 10.
Memory Model
Redis stores all data in RAM with optional persistence via RDB snapshots or AOF (Append-Only File). The memory model has critical implications for caching:
- jemalloc allocator: Redis uses jemalloc by default, which can hold fragmented memory. Monitor mem_fragmentation_ratio — above 1.5 indicates significant fragmentation.
- Object encoding: Redis automatically selects compact encodings (ziplist/listpack, intset) for small collections. A hash with ≤128 fields, each ≤64 bytes, uses the compact encoding, consuming ~10× less memory than a full hash table.
- maxmemory: Always set this. Without it, Redis will consume all available RAM until the OOM killer terminates it. Set to 70–80% of available RAM to leave headroom for fragmentation and OS use.
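A cache-oriented redis.conf fragment reflecting these rules might look like this (the sizes are illustrative, not a recommendation):

```conf
# redis.conf — cache-oriented memory settings
maxmemory 6gb                   # ~75% of an 8 GB node; headroom for fragmentation and OS
maxmemory-policy allkeys-lru    # evict least-recently-used keys across the whole keyspace
# If some keys must never be evicted, use volatile-lru instead and
# set TTLs only on the keys that are safe to drop.
```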
3. Cache-Aside Pattern: Lazy Loading with @Cacheable
Cache-aside (also called lazy loading) is the most widely used caching pattern in microservices. The application is responsible for managing the cache — it is not automatically populated. On a cache miss, the application fetches data from the database, writes it to the cache, and returns it. Subsequent requests hit the cache directly.
How Cache-Aside Works
- Application receives a read request for product:42.
- Application checks Redis: key exists? → return cached value (cache hit).
- Key not found (cache miss) → query database → store result in Redis with TTL → return to caller.
- On write: update database → invalidate (delete) the cache key (do not update — prevents stale dual-write races).
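The steps above can be sketched in plain Java; a ConcurrentHashMap stands in for Redis, and the loadFromDb/saveToDb functions for the repository calls (all names here are illustrative, and TTL handling is omitted for brevity):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.BiConsumer;
import java.util.function.Function;

// Minimal cache-aside sketch: the application owns the cache lifecycle.
public class CacheAsideSketch {
    private final Map<Long, String> cache = new ConcurrentHashMap<>();
    private final Function<Long, String> loadFromDb;   // stand-in for repository read
    private final BiConsumer<Long, String> saveToDb;   // stand-in for repository write

    public CacheAsideSketch(Function<Long, String> loadFromDb,
                            BiConsumer<Long, String> saveToDb) {
        this.loadFromDb = loadFromDb;
        this.saveToDb = saveToDb;
    }

    public String get(Long id) {
        String cached = cache.get(id);
        if (cached != null) return cached;      // cache hit
        String fresh = loadFromDb.apply(id);    // miss: query the database
        cache.put(id, fresh);                   // populate for subsequent readers
        return fresh;
    }

    public void update(Long id, String newValue) {
        saveToDb.accept(id, newValue);          // 1. write the source of truth first
        cache.remove(id);                       // 2. invalidate, never overwrite in place
    }
}
```

The eviction in update() is what prevents the stale dual-write race: the next reader repopulates the cache from the database rather than trusting a concurrently written cache value.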
Spring Boot Configuration
Add the dependencies and wire up RedisCacheManager with per-cache TTL configuration:
// pom.xml dependencies
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-data-redis</artifactId>
</dependency>
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-cache</artifactId>
</dependency>
// CacheConfig.java
@Configuration
@EnableCaching
public class CacheConfig {
@Bean
public RedisCacheManager cacheManager(RedisConnectionFactory connectionFactory) {
// Default TTL: 10 minutes
RedisCacheConfiguration defaults = RedisCacheConfiguration.defaultCacheConfig()
.entryTtl(Duration.ofMinutes(10))
.serializeKeysWith(RedisSerializationContext.SerializationPair
.fromSerializer(new StringRedisSerializer()))
.serializeValuesWith(RedisSerializationContext.SerializationPair
.fromSerializer(new GenericJackson2JsonRedisSerializer()))
.disableCachingNullValues();
// Per-cache TTL overrides
Map<String, RedisCacheConfiguration> cacheConfigs = new HashMap<>();
cacheConfigs.put("products", defaults.entryTtl(Duration.ofMinutes(30)));
cacheConfigs.put("userProfiles", defaults.entryTtl(Duration.ofMinutes(5)));
cacheConfigs.put("inventory", defaults.entryTtl(Duration.ofSeconds(30)));
cacheConfigs.put("sessions", defaults.entryTtl(Duration.ofHours(2)));
return RedisCacheManager.builder(connectionFactory)
.cacheDefaults(defaults)
.withInitialCacheConfigurations(cacheConfigs)
.transactionAware()
.build();
}
}
Using @Cacheable, @CachePut, @CacheEvict
@Service
@RequiredArgsConstructor
public class ProductService {
private final ProductRepository productRepository;
// Cache-aside read: returns cached value or fetches from DB on miss
@Cacheable(value = "products", key = "#productId",
unless = "#result == null")
public ProductDto getProduct(Long productId) {
return productRepository.findById(productId)
.map(ProductMapper::toDto)
.orElseThrow(() -> new ProductNotFoundException(productId));
}
// Write + update cache (use sparingly — prefer eviction)
@CachePut(value = "products", key = "#result.id")
public ProductDto updateProduct(Long productId, UpdateProductRequest req) {
Product product = productRepository.findById(productId)
.orElseThrow(() -> new ProductNotFoundException(productId));
ProductMapper.applyUpdate(product, req);
return ProductMapper.toDto(productRepository.save(product));
}
// Evict cache on delete
@CacheEvict(value = "products", key = "#productId")
public void deleteProduct(Long productId) {
productRepository.deleteById(productId);
}
// Evict entire cache (use with caution in production)
@CacheEvict(value = "products", allEntries = true)
public void clearProductCache() { }
}
TTL Strategy for Cache-Aside
TTL selection is the most important operational decision for cache-aside. Too short and your hit rate collapses; too long and users see stale data after writes. Use this heuristic:
- Reference data (product categories, country codes): 60–120 minutes — changes rarely, high read volume
- User profiles: 5–15 minutes — changes infrequently, staleness for a few minutes is acceptable
- Inventory counts: 15–60 seconds — must be reasonably fresh to avoid overselling
- Pricing: 30 seconds to 5 minutes — depends on pricing volatility; use event-driven invalidation for flash sales
- Session data: match your application session timeout (30 minutes to 2 hours)
A critical production tip: add jitter (±10–20% random variance) to your TTLs to prevent the cache expiration thundering herd — when thousands of keys set with the same TTL all expire simultaneously, causing a spike of database queries.
// TTL with jitter to prevent mass simultaneous expiration
private Duration ttlWithJitter(Duration base) {
long jitterMs = (long)(base.toMillis() * 0.15 * Math.random());
return base.plusMillis(jitterMs);
}
4. Write-Through Pattern: Keeping Cache & DB in Sync
In the write-through pattern, every write updates both the cache and the database synchronously before the write is acknowledged to the caller. Unlike cache-aside where writes evict the cache, write-through keeps the cache always warm with the latest data.
When to Use Write-Through
- Data that is read immediately after it is written (e.g., user submits a form and immediately views the result)
- Services where cache misses are very expensive (complex joins, cross-service aggregations)
- Systems where cache consistency is more important than write throughput
- Reference data management services where all readers must see the same version
Dual-Write Risks and Mitigation
Write-through introduces the classic dual-write consistency problem: what happens if the database write succeeds but the cache write fails (or vice versa)? You have three options:
Option 1: DB-first, cache-second (recommended)
Write to the database first. If it succeeds, write to cache. If the cache write fails, log and continue — the worst case is a cache miss on the next read, which is safely handled by cache-aside fallback. This preserves DB as the source of truth. Never use cache-first — a cache write success followed by a DB failure creates an inconsistent cache with no error recovery path.
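The DB-first ordering can be sketched as follows; dbWrite and cacheWrite are placeholder hooks standing in for the repository save and the RedisTemplate write, not a real API:

```java
import java.util.function.Consumer;

// Write-through, DB-first: the database write must succeed before the cache is
// touched, and a cache failure is swallowed — cache-aside repopulates on the
// next read. A DB failure propagates to the caller unchanged.
public class WriteThroughSketch {
    private final Consumer<String> dbWrite;
    private final Consumer<String> cacheWrite;
    private boolean lastCacheWriteFailed;

    public WriteThroughSketch(Consumer<String> dbWrite, Consumer<String> cacheWrite) {
        this.dbWrite = dbWrite;
        this.cacheWrite = cacheWrite;
    }

    public void write(String value) {
        dbWrite.accept(value);            // 1. source of truth first; exceptions propagate
        try {
            cacheWrite.accept(value);     // 2. best-effort cache refresh
            lastCacheWriteFailed = false;
        } catch (RuntimeException e) {
            lastCacheWriteFailed = true;  // log-and-continue: worst case is one cache miss
        }
    }

    public boolean lastCacheWriteFailed() { return lastCacheWriteFailed; }
}
```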
Option 2: Transactional cache write with Spring Cache + @Transactional
Use transactionAware = true on RedisCacheManager (already shown in the config above). With this flag, @CachePut and @CacheEvict operations are deferred until the Spring transaction commits. If the transaction rolls back, the cache operation is discarded. This eliminates the "successful cache write + rolled-back DB write" inconsistency.
Option 3: CDC-based synchronization (Debezium)
Use Change Data Capture (Debezium + Kafka) to stream database changes to a cache-updating consumer. The application writes only to the DB; the cache is updated asynchronously by the CDC pipeline. This fully decouples write path from cache management and guarantees eventual consistency. See the Outbox Pattern post for implementation details.
Write-Through with Spring Data Redis (Manual)
@Service
@RequiredArgsConstructor
@Transactional
public class UserProfileService {
private final UserRepository userRepository;
private final RedisTemplate<String, UserProfileDto> redisTemplate;
private static final Duration PROFILE_TTL = Duration.ofMinutes(10);
public UserProfileDto updateProfile(Long userId, UpdateProfileRequest req) {
// 1. Write to database (inside transaction)
User user = userRepository.findById(userId)
.orElseThrow(() -> new UserNotFoundException(userId));
UserMapper.applyUpdate(user, req);
UserProfileDto dto = UserMapper.toProfileDto(userRepository.save(user));
// 2. Write-through to cache after successful DB save
// (transactionAware cache manager defers this until commit)
String cacheKey = "userProfiles::" + userId;
redisTemplate.opsForValue().set(cacheKey, dto, PROFILE_TTL);
return dto;
}
}
5. Write-Behind (Write-Back) Pattern: Async Writes
Write-behind (also called write-back) inverts the persistence order: the application writes to the cache first and acknowledges the write immediately. The cache asynchronously flushes dirty data to the database at a later point — in batches or after a delay. This dramatically improves write throughput but introduces data loss risk.
When Write-Behind Makes Sense
- High-frequency counters: page view counts, like counts, analytics events — losing a few counts on crash is acceptable
- Rate limiting state: token bucket counters can tolerate brief inconsistency
- Session activity tracking: "last seen" timestamps, browsing history accumulation
- Leaderboard scores: real-time sorted set updates with periodic persistence to RDBMS
Data Loss Risks
Write-behind is unsuitable for financial transactions, order state, or any data where loss of a single write has business impact. If Redis crashes between the cache write and the database flush, that data is gone (unless Redis persistence is enabled). Mitigations:
- Enable AOF with appendfsync always to persist every write to disk — reduces the write throughput advantage significantly
- Use Redis Streams as a durable write buffer that the DB consumer processes sequentially, with consumer group acknowledgements
- Set a short flush interval (1–5 seconds) to bound potential data loss window
// Write-behind with Redis INCR (atomic counter accumulation)
// High-frequency view counter — write to Redis first, flush to DB every 60s
@Component
@RequiredArgsConstructor
public class ViewCounterService {
private final RedisTemplate<String, Long> redisTemplate;
private static final String KEY_PREFIX = "view_count:product:";
public void incrementView(Long productId) {
// Atomic increment — sub-millisecond, no DB hit
redisTemplate.opsForValue().increment(KEY_PREFIX + productId);
}
public Long getViewCount(Long productId) {
Long count = redisTemplate.opsForValue().get(KEY_PREFIX + productId);
return count != null ? count : 0L;
}
}
// Scheduled flusher — flush accumulated counts to DB every 60 seconds
@Component
@RequiredArgsConstructor
public class ViewCountFlusher {
private final ViewCounterService viewCounterService;
private final ProductRepository productRepository;
private final RedisTemplate<String, Long> redisTemplate;
@Scheduled(fixedRate = 60_000)
public void flushCountsToDatabase() {
// NOTE: KEYS is O(N) and blocks Redis; prefer SCAN for large keyspaces in production
Set<String> keys = redisTemplate.keys("view_count:product:*");
if (keys == null || keys.isEmpty()) return;
for (String key : keys) {
Long delta = redisTemplate.opsForValue().getAndDelete(key);
if (delta != null && delta > 0) {
Long productId = Long.parseLong(key.replace("view_count:product:", ""));
productRepository.incrementViewCount(productId, delta);
}
}
}
}
6. Read-Through Pattern with Spring Cache Abstraction
Read-through differs from cache-aside in who is responsible for the cache population on a miss. In cache-aside, the application code manages cache reads explicitly. In read-through, the cache itself (or a cache loader configured in the cache manager) transparently fetches from the backing store on a miss and populates itself — the caller sees only a unified interface.
In practice, Spring's @Cacheable annotation implements read-through semantics from the application's perspective — the method body acts as the data loader invoked on a cache miss. The key difference from pure cache-aside is that the application code does not need to handle "check cache → miss → load → store" explicitly; Spring's AOP proxy handles the full flow.
Programmatic Cache Loader with CacheLoader Interface
// Caffeine (local) + Redis (distributed) two-level read-through cache
@Configuration
@EnableCaching
public class TwoLevelCacheConfig {
// L1: fast local Caffeine cache (per-JVM, 1000 entries, 30s TTL)
@Bean
public CaffeineCache caffeineProductCache() {
return new CaffeineCache("products-l1",
Caffeine.newBuilder()
.expireAfterWrite(30, TimeUnit.SECONDS)
.maximumSize(1000)
.recordStats()
.build());
}
// L2: Redis distributed cache (all pods share this)
@Bean
public RedisCacheManager redisCacheManager(RedisConnectionFactory cf) {
RedisCacheConfiguration config = RedisCacheConfiguration.defaultCacheConfig()
.entryTtl(Duration.ofMinutes(10))
.serializeValuesWith(RedisSerializationContext.SerializationPair
.fromSerializer(new GenericJackson2JsonRedisSerializer()));
return RedisCacheManager.builder(cf)
.cacheDefaults(config).build();
}
}
// Service using Spring's @Cacheable for transparent read-through
@Service
@RequiredArgsConstructor
public class CatalogService {
private final ProductRepository productRepository;
@Cacheable(value = "products", key = "#sku",
cacheManager = "redisCacheManager")
public ProductDto getProductBySku(String sku) {
// This method body is the "data loader" — only called on cache miss
log.info("Cache miss for SKU {}; loading from database", sku);
return productRepository.findBySku(sku)
.map(ProductMapper::toDto)
.orElseThrow(() -> new ProductNotFoundException(sku));
}
}
7. Cache Invalidation Strategies
Phil Karlton's famous quip — "There are only two hard things in Computer Science: cache invalidation and naming things" — is especially true in microservices where a single business entity (e.g., a Product) may be cached across five different services. Getting invalidation right is the difference between a fast, consistent system and one that serves subtly wrong data.
Strategy 1: TTL-Based Expiration
The simplest strategy: set a TTL and accept eventual consistency up to that window. Every cached entry automatically expires. No invalidation code required. The trade-off is a bounded staleness window — during TTL period, stale data may be served. Suitable for reference data, static content, and anything where eventually consistent is acceptable.
Strategy 2: Event-Driven Invalidation
The owning service publishes an event when data changes (e.g., ProductUpdatedEvent on a Kafka topic). All services that cache that data subscribe and invalidate their local cache entries on receipt. This achieves near-real-time consistency while keeping services decoupled.
// Publisher: Product Service
@Service
@RequiredArgsConstructor
public class ProductCommandService {
private final ProductRepository productRepository;
private final KafkaTemplate<String, ProductUpdatedEvent> kafkaTemplate;
private final CacheManager cacheManager;
@Transactional
public ProductDto updateProduct(Long productId, UpdateProductRequest req) {
Product product = productRepository.findById(productId)
.orElseThrow(() -> new ProductNotFoundException(productId));
ProductMapper.applyUpdate(product, req);
Product saved = productRepository.save(product);
// Invalidate local cache
Cache productCache = cacheManager.getCache("products");
if (productCache != null) productCache.evict(productId);
// Publish event for distributed invalidation
kafkaTemplate.send("product-events",
new ProductUpdatedEvent(productId, saved.getVersion()));
return ProductMapper.toDto(saved);
}
}
// Subscriber: Any service caching product data
@Component
@RequiredArgsConstructor
public class ProductCacheInvalidationListener {
private final CacheManager cacheManager;
@KafkaListener(topics = "product-events",
groupId = "pricing-service-cache-invalidation")
public void onProductUpdated(ProductUpdatedEvent event) {
Cache productCache = cacheManager.getCache("products");
if (productCache != null) {
productCache.evict(event.getProductId());
log.info("Evicted product {} from cache due to update event v{}",
event.getProductId(), event.getVersion());
}
}
}
Strategy 3: Versioned Cache Keys
Embed a version number or hash into the cache key. When data changes, increment the version — old keys become orphaned and expire via TTL. No explicit eviction needed. Works well for deployments where you want to immediately invalidate all cached data on a new application version:
// Key format: "v{appVersion}:products:{productId}"
// Changing APP_VERSION in your config immediately "invalidates" all old entries
@Cacheable(value = "products",
key = "'v' + @appVersion + ':' + #productId")
public ProductDto getProduct(Long productId) { ... }
// In application.properties:
// app.cache.version=42 (increment on breaking data structure changes)
8. Distributed Locking with Redisson & Lettuce
Two of the most dangerous caching failure modes in microservices are the cache stampede and the thundering herd. Both occur when a popular cached key expires and many concurrent requests simultaneously attempt to rebuild it — each triggering an expensive database query, overwhelming the database, and defeating the purpose of caching.
Cache Stampede: The Problem
Consider 500 concurrent requests for product:hotdeal:99. The key expires. All 500 threads call getProduct(99), see a cache miss, and simultaneously query the database. PostgreSQL now receives 500 identical queries in a few milliseconds — enough to spike CPU to 100% and trigger connection pool exhaustion. The solution is a distributed lock that ensures only one thread rebuilds the cache while others wait (or serve stale data).
Redisson Distributed Lock: Production Implementation
// pom.xml
<dependency>
<groupId>org.redisson</groupId>
<artifactId>redisson-spring-boot-starter</artifactId>
<version>3.27.2</version>
</dependency>
@Service
@RequiredArgsConstructor
public class StampedeProtectedProductService {
private final RedissonClient redissonClient;
private final RedisTemplate<String, ProductDto> redisTemplate;
private final ProductRepository productRepository;
private static final Duration CACHE_TTL = Duration.ofMinutes(10);
private static final Duration LOCK_TTL = Duration.ofSeconds(5);
private static final Duration LOCK_WAIT = Duration.ofSeconds(3);
public ProductDto getProduct(Long productId) {
String cacheKey = "products::" + productId;
String lockKey = "lock:products::" + productId;
// Fast path: return from cache
ProductDto cached = redisTemplate.opsForValue().get(cacheKey);
if (cached != null) return cached;
// Slow path: acquire lock to rebuild cache
RLock lock = redissonClient.getLock(lockKey);
try {
boolean acquired = lock.tryLock(
LOCK_WAIT.toMillis(), LOCK_TTL.toMillis(), TimeUnit.MILLISECONDS);
if (acquired) {
try {
// Double-checked locking: another thread may have rebuilt while we waited
ProductDto doubleCheck = redisTemplate.opsForValue().get(cacheKey);
if (doubleCheck != null) return doubleCheck;
// Only this thread reaches the database
ProductDto dto = productRepository.findById(productId)
.map(ProductMapper::toDto)
.orElseThrow(() -> new ProductNotFoundException(productId));
redisTemplate.opsForValue().set(cacheKey, dto,
ttlWithJitter(CACHE_TTL));
return dto;
} finally {
lock.unlock();
}
} else {
// Lock not acquired — serve stale or throw (configurable behavior)
log.warn("Could not acquire cache rebuild lock for product {}; "
+ "returning stale or empty", productId);
// Optionally: return stale value from a secondary "stale" key
throw new CacheLockTimeoutException(
"Cache rebuild in progress for product " + productId);
}
} catch (InterruptedException e) {
Thread.currentThread().interrupt();
throw new CacheException("Interrupted waiting for cache lock", e);
}
}
private Duration ttlWithJitter(Duration base) {
long jitter = (long)(base.toMillis() * 0.15 * Math.random());
return base.plusMillis(jitter);
}
}
Probabilistic Early Expiration (PER) — Zero-Lock Alternative
An elegant lockless alternative to distributed locks for stampede prevention. Instead of every reader waiting for a fixed TTL to lapse, each cache read decides whether to refresh early: refresh when now − Δ × β × ln(rand()) ≥ expiry_time, where Δ is the measured cost of recomputing the value and rand() is uniform in (0, 1]. Because ln(rand()) is negative, the effective expiry moves earlier by a random amount proportional to Δ, so some reader almost always rebuilds the value before it actually expires and expiration never produces a mass-concurrent miss. The β parameter (typically 1.0) controls how eagerly early recomputation starts.
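A minimal sketch of the refresh decision (the class and method names are my own; beta and the measured recompute cost delta follow the formula described above):

```java
import java.util.Random;

// Probabilistic early recomputation: returns true when this reader should
// refresh the cached value before its real expiry. deltaMillis is the
// measured cost of recomputing the value; beta (≈1.0) controls eagerness.
public final class ProbabilisticExpiry {
    private final Random random = new Random();
    private final double beta;

    public ProbabilisticExpiry(double beta) { this.beta = beta; }

    public boolean shouldRecompute(long nowMillis, long expiryMillis, long deltaMillis) {
        // 1.0 - nextDouble() is uniform in (0, 1], so -ln(...) is a positive,
        // exponentially distributed "advance" added to the current time.
        double advance = deltaMillis * beta * -Math.log(1.0 - random.nextDouble());
        return nowMillis + advance >= expiryMillis;
    }
}
```

Expensive-to-rebuild values (large Δ) start refreshing earlier, cheap ones closer to their actual expiry, which is exactly the behavior you want without any lock coordination.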
9. Session Caching with Spring Session + Redis
In a microservices deployment with multiple replicas behind a load balancer, HTTP session state cannot be stored in JVM memory — the next request may hit a different pod. Spring Session with Redis solves this by storing session data in a shared Redis store, making sessions available across all pods transparently.
Configuration
// pom.xml
<dependency>
<groupId>org.springframework.session</groupId>
<artifactId>spring-session-data-redis</artifactId>
</dependency>
// application.yml
spring:
session:
store-type: redis
redis:
namespace: myapp:session
flush-mode: on-save # or 'immediate' for eager flush
timeout: 30m # session TTL
data:
redis:
host: ${REDIS_HOST:localhost}
port: 6379
password: ${REDIS_PASSWORD:}
lettuce:
pool:
max-active: 20
max-idle: 10
min-idle: 5
max-wait: 1000ms
// Enable Spring Session with Redis
@SpringBootApplication
@EnableRedisHttpSession(maxInactiveIntervalInSeconds = 1800)
public class ApiGatewayApplication {
public static void main(String[] args) {
SpringApplication.run(ApiGatewayApplication.class, args);
}
}
Session Security Considerations
- Encrypt session data: Session objects serialized to Redis may contain sensitive user data. Encrypt at rest using a custom RedisSerializer that wraps AES-256 encryption around the serialized payload.
- Session fixation protection: Spring Security's SessionFixationProtectionStrategy rotates session IDs on login. Ensure the Redis session store propagates the new ID correctly before the old one expires.
- TLS to Redis: All session data travels over the network to Redis. Enforce TLS with spring.data.redis.ssl.enabled=true and use ElastiCache in-transit encryption on AWS.
- Namespace isolation: Use distinct Redis key namespaces (myapp:session:, myapp:cache:, myapp:lock:) to avoid key collisions between session storage and application caches sharing the same Redis cluster.
10. Redis Cluster for High Availability
Redis Cluster is the production-grade solution for datasets that exceed a single node's memory or require sub-second automatic failover without Sentinel. Understanding hash slots is fundamental to designing cluster-compatible caching strategies.
Hash Slots and Sharding
Redis Cluster divides the keyspace into 16,384 hash slots. Each key is assigned to a slot via CRC16(key) % 16384. Hash slots are distributed evenly across primary nodes. With 3 primaries, each holds ~5,461 slots. When you add a fourth primary, Redis re-balances slots with zero downtime (live resharding).
Hash tags allow you to force related keys to the same slot: wrap the meaningful part of the key in curly braces — {user:42}:profile and {user:42}:settings both hash to the slot of user:42, enabling multi-key operations like MGET and Lua scripts across them.
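The slot computation is simple enough to reproduce; a self-contained sketch of CRC16 (the CCITT/XModem variant the Redis Cluster spec mandates) plus hash-tag extraction:

```java
// Redis Cluster slot: CRC16 over the key — or over the hash-tag substring
// inside the first {...} if one is present and non-empty — modulo 16384.
public final class ClusterSlot {
    // CRC16-CCITT (XModem): polynomial 0x1021, initial value 0, MSB-first
    public static int crc16(byte[] data) {
        int crc = 0;
        for (byte b : data) {
            crc ^= (b & 0xFF) << 8;
            for (int i = 0; i < 8; i++) {
                crc = ((crc & 0x8000) != 0) ? (crc << 1) ^ 0x1021 : crc << 1;
                crc &= 0xFFFF;
            }
        }
        return crc;
    }

    public static int slot(String key) {
        int open = key.indexOf('{');
        if (open >= 0) {
            int close = key.indexOf('}', open + 1);
            // Only a non-empty tag between the FIRST '{' and the next '}' counts
            if (close > open + 1) {
                key = key.substring(open + 1, close);
            }
        }
        return crc16(key.getBytes(java.nio.charset.StandardCharsets.UTF_8)) % 16384;
    }
}
```

Computing slots client-side like this is how cluster-aware clients (Lettuce, Jedis) route each command directly to the node that owns the key.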
Replica Failover
Each primary in a Redis Cluster has one or more replicas. When a primary fails, the cluster automatically elects a replica as the new primary (typically within 1–3 seconds) without Sentinel involvement. The cluster uses a gossip protocol (CLUSTER PING/PONG) for failure detection — a primary is declared failed when a majority of primaries agree it is unreachable.
Spring Boot Cluster Configuration
# application.yml — Redis Cluster configuration
spring:
data:
redis:
cluster:
nodes:
- redis-cluster-node-1:6379
- redis-cluster-node-2:6379
- redis-cluster-node-3:6379
- redis-cluster-node-4:6379
- redis-cluster-node-5:6379
- redis-cluster-node-6:6379
max-redirects: 3 # MOVED / ASK redirect limit
password: ${REDIS_CLUSTER_PASSWORD}
ssl:
enabled: true
lettuce:
cluster:
refresh:
adaptive: true # Dynamic topology refresh
period: 30s # Periodic topology refresh
pool:
max-active: 50
max-idle: 20
min-idle: 10
max-wait: 2000ms
// AWS ElastiCache Cluster configuration bean
@Configuration
public class RedisClusterConfig {
@Value("${spring.data.redis.cluster.nodes}")
private List<String> clusterNodes;
@Bean
public LettuceConnectionFactory redisConnectionFactory() {
RedisClusterConfiguration clusterConfig =
new RedisClusterConfiguration(clusterNodes);
clusterConfig.setMaxRedirects(3);
LettuceClientConfiguration clientConfig = LettuceClientConfiguration.builder()
.readFrom(ReadFrom.REPLICA_PREFERRED) // Read from replicas to reduce primary load
.clientOptions(ClusterClientOptions.builder()
.autoReconnect(true)
.topologyRefreshOptions(ClusterTopologyRefreshOptions.builder()
.enableAdaptiveRefreshTrigger(
RefreshTrigger.MOVED_REDIRECT,
RefreshTrigger.PERSISTENT_RECONNECTS)
.adaptiveRefreshTriggersTimeout(Duration.ofSeconds(30))
.enablePeriodicRefresh(Duration.ofSeconds(30))
.build())
.build())
.build();
return new LettuceConnectionFactory(clusterConfig, clientConfig);
}
}
Cluster Limitations to Design Around
- Multi-key commands: MGET, MSET, DEL with multiple keys must all target the same hash slot. Use hash tags to group related keys, or avoid multi-key commands across different slots.
- Lua scripts: All keys referenced in a Lua script must reside in the same slot. Design scripts to operate on hash-tagged key groups.
- Database index: Redis Cluster supports only database 0 (SELECT is disabled). Namespacing via key prefixes replaces DB index separation.
- SCAN complexity: SCAN in cluster mode must be run against each node individually. Use Lettuce's cluster-wide scan support or Redisson's cluster-aware key scan.
11. Cache Serialization: JSON vs Java Serialization
Every object stored in Redis must be serialized to bytes and deserialized on read. The serialization choice has major implications for performance, debuggability, schema evolution, and security.
Java Serialization (JdkSerializationRedisSerializer)
Spring Data Redis uses Java serialization by default if you don't configure otherwise. Avoid this in production. Java serialization:
- Produces opaque binary blobs — impossible to inspect in Redis CLI
- Tightly couples serialized format to Java class structure — any field rename or class move breaks deserialization of existing cache entries
- Significantly larger payload than JSON (~3–5× for typical DTOs)
- Is a well-known attack vector (deserialization gadget chains) — never deserialize untrusted data with Java serialization
JSON Serialization (GenericJackson2JsonRedisSerializer)
The recommended default for application caches. Human-readable in the Redis CLI, survives most refactoring (field additions are ignored by older readers), and typically 3–5× smaller than Java-serialized output. Configure Jackson carefully:
@Bean
public RedisSerializer<Object> redisSerializer() {
ObjectMapper mapper = new ObjectMapper();
// Include type information so deserialization works without explicit class knowledge
mapper.activateDefaultTyping(
mapper.getPolymorphicTypeValidator(),
ObjectMapper.DefaultTyping.NON_FINAL,
JsonTypeInfo.As.PROPERTY);
mapper.disable(SerializationFeature.WRITE_DATES_AS_TIMESTAMPS);
mapper.registerModule(new JavaTimeModule());
// Ignore unknown fields (schema evolution tolerance)
mapper.configure(DeserializationFeature.FAIL_ON_UNKNOWN_PROPERTIES, false);
return new GenericJackson2JsonRedisSerializer(mapper);
}
@Bean
public RedisTemplate<String, Object> redisTemplate(
RedisConnectionFactory connectionFactory) {
RedisTemplate<String, Object> template = new RedisTemplate<>();
template.setConnectionFactory(connectionFactory);
template.setKeySerializer(new StringRedisSerializer());
template.setHashKeySerializer(new StringRedisSerializer());
template.setValueSerializer(redisSerializer());
template.setHashValueSerializer(redisSerializer());
template.afterPropertiesSet();
return template;
}
Performance Comparison
| Serializer | Payload Size | Speed | Debuggable | Schema Evolution |
|---|---|---|---|---|
| Java Serialization | Large (~3–5×) | Moderate | ❌ Opaque binary | ❌ Brittle |
| JSON (Jackson) | Medium | Good | ✅ Human-readable | ✅ Flexible |
| MessagePack | Small (~30% vs JSON) | Excellent | ⚠️ Binary | ✅ Good |
| Protocol Buffers | Smallest (~20% vs JSON) | Excellent | ⚠️ Binary | ✅ Schema-registry |
| Kryo | Small | Fastest | ❌ Opaque | ⚠️ Fragile |
Recommendation: Use JSON (Jackson) for most caches. Switch to MessagePack for high-throughput caches where serialization CPU is measurable in profiling. Use Protobuf only if you already have schema management infrastructure. Never use Java serialization or Kryo in multi-service distributed caches — schema fragility will cause deployment headaches.
12. Monitoring & Eviction Policies
A Redis cache you cannot observe is a liability. Production Redis monitoring requires understanding three categories: cache effectiveness, memory health, and client behavior.
Key Metrics to Monitor
- keyspace_hits / keyspace_misses: Cache hit rate = hits / (hits + misses). Target ≥85% for read-heavy services. A sudden drop signals a cold cache (restart?) or misconfigured keys.
- evicted_keys: Keys evicted due to maxmemory pressure. Any evictions indicate your maxmemory budget is too low or your dataset has grown. Evictions on a write-through cache cause data loss.
- mem_fragmentation_ratio: Ratio of RSS to allocated memory. Above 1.5 indicates significant fragmentation — schedule a MEMORY PURGE or rolling restart.
- connected_clients: Compare to your connection pool max-active × pod count. If connected_clients approaches maxclients (default 10,000), new connections will be refused.
- blocked_clients: Clients blocked on BLPOP, BRPOP, BZPOPMIN. High counts indicate queue consumers are slow.
- instantaneous_ops_per_sec: Baseline this metric. Spikes during deployments (cold cache restart) are expected; unexpected spikes indicate stampede events.
- rdb_last_bgsave_status / aof_last_write_status: Monitor persistence health. A failed bgsave means your last restore point is older than you think.
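The hit-rate arithmetic is simple to compute from the stats section of `INFO`. A small stdlib-only sketch — `CacheHitRate` is a hypothetical helper; in production these counters normally arrive via your Redis client or a Prometheus exporter:

```java
import java.util.*;

public class CacheHitRate {
    // Parse a "name:value" line from a Redis INFO stats section.
    static long statValue(String info, String name) {
        for (String line : info.split("\r?\n")) {
            if (line.startsWith(name + ":")) {
                return Long.parseLong(line.substring(name.length() + 1).trim());
            }
        }
        throw new NoSuchElementException(name);
    }

    // Hit rate = hits / (hits + misses); 0 when there is no traffic yet.
    static double hitRate(String info) {
        long hits = statValue(info, "keyspace_hits");
        long misses = statValue(info, "keyspace_misses");
        long total = hits + misses;
        return total == 0 ? 0.0 : (double) hits / total;
    }

    public static void main(String[] args) {
        String info = "keyspace_hits:9000\nkeyspace_misses:1000\n"; // sample INFO excerpt
        System.out.printf(Locale.ROOT, "hit rate: %.1f%%%n", hitRate(info) * 100); // → 90.0%
    }
}
```

Note that `keyspace_hits`/`keyspace_misses` are cumulative since the last restart, so dashboards should compute the rate over a sliding window rather than the lifetime ratio.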
Eviction Policies
When Redis reaches maxmemory, it evicts keys based on the configured maxmemory-policy. Choose carefully — the wrong policy can silently destroy cache correctness:
| Policy | Behavior | Best For |
|---|---|---|
| noeviction | Return errors on writes when full | Persistent data stores (not pure caches) |
| allkeys-lru | Evict least-recently-used keys from all keys | General-purpose caches (recommended default) |
| volatile-lru | Evict LRU keys only among keys with TTL set | Mixed persistent + cache data in same instance |
| allkeys-lfu | Evict least-frequently-used keys from all | Skewed access patterns (hot key workloads) |
| volatile-ttl | Evict keys with shortest remaining TTL first | When you want to preserve long-lived entries |
| allkeys-random | Evict a random key from all keys | Uniform access patterns (rarely optimal) |
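For a dedicated cache instance, the recommended default maps to a redis.conf fragment like the following (the 4gb budget is illustrative; size it to your working set):

```
# redis.conf — dedicated cache instance (illustrative values)
maxmemory 4gb
maxmemory-policy allkeys-lru

# LFU alternative for skewed, hot-key workloads:
# maxmemory-policy allkeys-lfu
```

If the same instance also holds persistent data, switch to volatile-lru and make sure every cache entry carries a TTL, or the persistent keys become eviction candidates.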
Spring Boot Actuator + Micrometer Redis Metrics
# application.yml — expose Redis metrics to Prometheus
management:
endpoints:
web:
exposure:
include: health,info,metrics,prometheus
metrics:
tags:
application: ${spring.application.name}
environment: ${APP_ENV:prod}
cache:
redis:
enable-statistics: true # Enables cache hit/miss metrics
# Useful Prometheus queries for Grafana dashboards:
# Cache hit rate:
# sum(rate(cache_gets_total{result="hit"}[5m]))
# / sum(rate(cache_gets_total[5m]))
#
# Redis memory usage:
# redis_memory_used_bytes / redis_memory_max_bytes
#
# Redis ops/sec:
# rate(redis_commands_processed_total[1m])
Caching Patterns Comparison Table
| Pattern | Write Path | Read Path | Consistency | Write Latency | Best For |
|---|---|---|---|---|---|
| Cache-Aside | DB only → evict cache | App manages miss | Eventual (TTL) | Low | Read-heavy, general purpose |
| Write-Through | DB + cache (sync) | Always cache hit | Strong | Higher (dual write) | Read-after-write consistency |
| Write-Behind | Cache only → async DB | Always cache hit | Eventual + data loss risk | Lowest | Write-heavy, loss-tolerant |
| Read-Through | DB only → evict cache | Cache auto-loads on miss | Eventual (TTL) | Low | Clean abstraction via @Cacheable |
Grafana Dashboard for Redis in Microservices
Build a dedicated Redis dashboard in Grafana with these panels as a minimum viable observability setup:
- Cache Hit Rate % — line chart, alert when below 80%
- Memory Used / Max — gauge with warning at 75%, critical at 90%
- Evicted Keys/sec — any evictions on write-through caches should alert immediately
- Commands/sec by Type — GET, SET, DEL, EXPIRE breakdown to spot unusual patterns
- Connected Clients — alert when approaching maxclients
- Replication Lag (ms) — for Sentinel/Cluster; alert when replica lag exceeds 1 second
- Slow Log Queries — Redis commands exceeding slowlog-log-slower-than (default 10ms)
- Keyspace by Prefix — track key counts per namespace to detect unbounded key growth
Import Grafana dashboard ID 763 (Redis Dashboard by Prometheus community) as a starting point, then add application-specific panels for cache hit rate per cache name using the Micrometer metrics exposed by Spring Boot Actuator.
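The alert thresholds above can be codified as Prometheus alerting rules. A sketch assuming the Micrometer `cache_gets_total` metric (from enable-statistics) and redis_exporter's `redis_evicted_keys_total`; names and thresholds are illustrative, adjust them to your exporter and SLOs:

```yaml
groups:
  - name: redis-cache
    rules:
      - alert: CacheHitRateLow
        expr: |
          sum(rate(cache_gets_total{result="hit"}[5m]))
            / sum(rate(cache_gets_total[5m])) < 0.80
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "Cache hit rate below 80% for 10 minutes"
      - alert: RedisEvictions
        expr: rate(redis_evicted_keys_total[5m]) > 0
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "Redis is evicting keys (maxmemory pressure)"
```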