
Advanced Design Patterns in Microservices & Spring Boot: Production Engineering Guide

Md Sanwar Hossain January 12, 2026 22 min read
TL;DR: Microservices introduce distributed system problems that GoF patterns don't address: partial failures, eventual consistency, distributed transactions. The patterns in this guide — Circuit Breaker, Saga, CQRS, Outbox, Bulkhead — are the production toolkit for building reliable distributed systems in Spring Boot.

1. Why GoF Patterns Are Not Enough for Microservices

The original 23 GoF patterns were designed for object-oriented design within a single process. They assume shared memory, synchronous execution, and transactional consistency. When you break a monolith into microservices, all three of these assumptions vanish simultaneously.

Microservices introduce a fundamentally different problem space that requires a separate catalog of distributed systems patterns:

| Problem in Microservices | Pattern That Addresses It |
| --- | --- |
| Cascading failures from slow downstream | Circuit Breaker |
| Multi-service transaction without 2PC | Saga (Choreography / Orchestration) |
| Read/write performance mismatch | CQRS |
| Dual-write inconsistency (DB + Kafka) | Outbox Pattern |
| Thread starvation from slow dependency | Bulkhead |
| Migrating legacy monolith | Strangler Fig |
| Cross-cutting concerns (mTLS, metrics) | Sidecar |

2. Circuit Breaker Pattern — Failing Fast in Distributed Systems

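The Circuit Breaker pattern prevents a service from repeatedly calling a downstream dependency that is failing or timing out. Named after electrical circuit breakers, it has three states: Closed (calls pass through while failures are counted), Open (calls fail fast without reaching the dependency), and Half-Open (a limited number of trial calls decide whether the circuit closes again or reopens).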

import io.github.resilience4j.circuitbreaker.annotation.CircuitBreaker;
import io.github.resilience4j.timelimiter.annotation.TimeLimiter;
import java.util.concurrent.CompletableFuture;
import lombok.RequiredArgsConstructor;
import lombok.extern.slf4j.Slf4j;

// BAD: No circuit breaker. A payment service failure cascades to the order service.
@Service
public class OrderService {

    private final RestTemplate restTemplate;

    public PaymentResult processPayment(PaymentRequest req) {
        // If payment service is down, this thread blocks for 30s then throws exception
        // 100 concurrent users = 100 threads blocked = thread pool exhaustion
        return restTemplate.postForObject(
            "http://payment-service/api/payments", req, PaymentResult.class);
    }
}

// GOOD: Resilience4j @CircuitBreaker with fallback
@Slf4j // Lombok: supplies the `log` field used in the fallback
@Service
@RequiredArgsConstructor
public class OrderService {

    private final PaymentClient paymentClient;
    // Assumed collaborator: the original snippet uses it without declaring it
    private final PaymentQueueService paymentQueueService;

    @CircuitBreaker(name = "paymentService", fallbackMethod = "paymentFallback")
    @TimeLimiter(name = "paymentService")
    public CompletableFuture<PaymentResult> processPayment(PaymentRequest req) {
        return CompletableFuture.supplyAsync(() -> paymentClient.charge(req));
    }

    // Fallback: same parameters as the protected method plus a trailing Throwable
    public CompletableFuture<PaymentResult> paymentFallback(PaymentRequest req, Throwable ex) {
        log.warn("Payment service unavailable, queuing for retry: {}", ex.getMessage());
        paymentQueueService.enqueue(req);
        return CompletableFuture.completedFuture(PaymentResult.queued(req.orderId()));
    }
}
# application.yml — Resilience4j Circuit Breaker configuration
resilience4j:
  circuitbreaker:
    instances:
      paymentService:
        failure-rate-threshold: 50          # Open when 50% of calls fail
        wait-duration-in-open-state: 30s    # Stay open for 30 seconds
        sliding-window-size: 10             # Evaluate last 10 requests
        permitted-number-of-calls-in-half-open-state: 3
  timelimiter:
    instances:
      paymentService:
        timeout-duration: 3s
💡 Production Tip: Expose circuit breaker state and events via Spring Boot Actuator (for example /actuator/circuitbreakers and /actuator/circuitbreakerevents) and alert when a circuit opens in production. An open circuit is a symptom of a downstream problem, so it should page an on-call engineer.
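
To make the Closed, Open, and Half-Open transitions concrete, here is a dependency-free sketch of a breaker. It is not how Resilience4j works internally: it trips on consecutive failures rather than a failure rate over a sliding window, and the class name `SimpleCircuitBreaker` and the injected clock are assumptions made for illustration and testing.

```java
import java.time.Duration;
import java.util.function.LongSupplier;
import java.util.function.Supplier;

/** Illustrative breaker: opens after N consecutive failures (hypothetical class). */
class SimpleCircuitBreaker {
    enum State { CLOSED, OPEN, HALF_OPEN }

    private final int failureThreshold;   // consecutive failures that open the circuit
    private final long openNanos;         // how long the circuit stays open
    private final LongSupplier nanoClock; // injectable clock, e.g. System::nanoTime
    private State state = State.CLOSED;
    private int consecutiveFailures;
    private long openedAtNanos;

    SimpleCircuitBreaker(int failureThreshold, Duration openDuration, LongSupplier nanoClock) {
        this.failureThreshold = failureThreshold;
        this.openNanos = openDuration.toNanos();
        this.nanoClock = nanoClock;
    }

    synchronized State state() {
        return state;
    }

    /** Runs the action if the circuit allows it; otherwise returns the fallback. */
    <T> T call(Supplier<T> action, Supplier<T> fallback) {
        if (!allowCall()) {
            return fallback.get(); // fail fast: the dependency is never touched
        }
        try {
            T result = action.get();
            recordSuccess();
            return result;
        } catch (RuntimeException e) {
            recordFailure();
            return fallback.get();
        }
    }

    private synchronized boolean allowCall() {
        if (state != State.OPEN) {
            return true;
        }
        if (nanoClock.getAsLong() - openedAtNanos >= openNanos) {
            state = State.HALF_OPEN; // wait elapsed: let trial calls through
            return true;
        }
        return false;
    }

    private synchronized void recordSuccess() {
        consecutiveFailures = 0;
        state = State.CLOSED;
    }

    private synchronized void recordFailure() {
        consecutiveFailures++;
        // A failed trial call in HALF_OPEN reopens the circuit immediately
        if (state == State.HALF_OPEN || consecutiveFailures >= failureThreshold) {
            state = State.OPEN;
            openedAtNanos = nanoClock.getAsLong();
        }
    }
}
```

In production code the clock would typically be `System::nanoTime`. Unlike this sketch, a real breaker also limits how many trial calls run concurrently in the half-open state, which is what `permitted-number-of-calls-in-half-open-state` configures above.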

3. Saga Pattern — Distributed Transactions Without 2PC

Two-Phase Commit (2PC) requires all participating services to hold locks for the duration of the transaction. In a microservices environment with potentially dozens of participants, this causes severe lock contention, makes the transaction sensitive to any network failure, and turns the coordinator into a single point of failure (SPOF). Sagas replace 2PC with a sequence of local transactions, each publishing an event to trigger the next step.

There are two styles: Choreography (services react to events autonomously, no central coordinator) and Orchestration (a central Saga orchestrator drives the workflow and decides the next step).

// BAD: Distributed transaction attempt. Locks across services, fails on partial commit.
@Transactional
public void placeOrder(PlaceOrderCommand cmd) {
    orderService.createOrder(cmd);        // Service 1 TX
    inventoryService.reserveStock(cmd);   // Service 2 TX: if this fails...
    paymentService.chargeCustomer(cmd);   // Service 3 TX: Service 1 may already be committed
    // No clean rollback across service boundaries without 2PC!
}

// GOOD: Orchestration Saga with compensation steps
@Component
@RequiredArgsConstructor
public class PlaceOrderSaga {

    private final OrderRepository orderRepository;
    private final InventoryClient inventoryClient;
    private final PaymentClient paymentClient;
    private final ApplicationEventPublisher events;

    public SagaResult execute(PlaceOrderCommand cmd) {
        Order order = null;
        String reservationId = null;
        try {
            // Step 1: Create order
            order = orderRepository.save(Order.pending(cmd));

            // Step 2: Reserve inventory (compensate: release reservation)
            reservationId = inventoryClient.reserve(cmd.items());

            // Step 3: Charge payment (compensate: refund)
            PaymentResult payment = paymentClient.charge(cmd.customerId(), cmd.total());

            // All steps succeeded: confirm
            order.confirm(payment.transactionId());
            orderRepository.save(order);
            events.publishEvent(new OrderConfirmedEvent(order.getId()));
            return SagaResult.success(order.getId());

        } catch (InventoryException e) {
            // Compensate step 1
            if (order != null) orderRepository.delete(order);
            return SagaResult.failed("Inventory unavailable: " + e.getMessage());

        } catch (PaymentException e) {
            // Compensate steps 1 and 2
            if (reservationId != null) inventoryClient.release(reservationId);
            if (order != null) orderRepository.delete(order);
            return SagaResult.failed("Payment declined: " + e.getMessage());
        }
    }
}
⚠️ Saga Trade-offs: Sagas achieve eventual consistency, not immediate consistency. Compensation logic (rollback) is business logic — it must be explicitly coded for each failure scenario. For long-running workflows, consider Temporal.io or Apache Camel, which provide persistent saga state and automatic retry.
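
The hand-written catch blocks above grow with every step added. One way to generalize them, sketched here under the assumption that each step can be paired with its own compensation, is to keep a stack of completed steps and unwind it in reverse on failure. `SagaSketch` and `Step` are hypothetical names, not part of Spring or any saga framework.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;

/** Minimal in-process saga runner (hypothetical; no persistence, no retries). */
class SagaSketch {

    /** One saga step: a forward action and the compensation that undoes it. */
    static final class Step {
        final String name;
        final Runnable action;
        final Runnable compensation;

        Step(String name, Runnable action, Runnable compensation) {
            this.name = name;
            this.action = action;
            this.compensation = compensation;
        }
    }

    /**
     * Runs the steps in order. If one fails, the compensations of the steps
     * that already completed run in reverse order. Returns true on success.
     */
    static boolean run(List<Step> steps) {
        Deque<Step> completed = new ArrayDeque<>();
        for (Step step : steps) {
            try {
                step.action.run();
                completed.push(step); // most recent step ends up on top
            } catch (RuntimeException e) {
                while (!completed.isEmpty()) {
                    completed.pop().compensation.run();
                }
                return false;
            }
        }
        return true;
    }
}
```

This sketch assumes compensations never fail and the process never crashes mid-saga. Covering those cases requires persisting the saga state, which is the gap that workflow engines such as Temporal.io fill.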

4. CQRS Pattern — Separating Reads and Writes

Command Query Responsibility Segregation (CQRS) separates the model for reading data from the model for writing data. The write model (commands) is optimized for consistency and business rule enforcement; the read model (queries) is optimized for query performance, often using denormalized projections.

// BAD: Same JPA entity for reads and writes. Leads to N+1 queries for list views.
@RestController
public class OrderController {

    @GetMapping("/orders")
    public List<Order> getOrders() {
        // Loads full Order with all lazy associations: N+1 on OrderItems, Customer, Address
        return orderRepository.findAll();
    }

    @PostMapping("/orders")
    public Order createOrder(@RequestBody CreateOrderCommand cmd) {
        return orderService.createOrder(cmd); // writes need full entity for validation
    }
}

// GOOD: Separate Command model and Query projection model

// Command side: full JPA entity with business logic
@Entity
@Table(name = "orders")
public class Order {
    @Id
    private String id;

    @Enumerated(EnumType.STRING)
    private OrderStatus status;

    @OneToMany(cascade = CascadeType.ALL)
    private List<OrderItem> items;

    @Embedded
    private ShippingAddress shippingAddress;

    // Domain methods: confirm(), ship(), cancel()
}

// Query side: flat projection DTO, no lazy loading
public interface OrderSummary {
    String getId();
    String getCustomerName();
    BigDecimal getTotal();
    String getStatus();
    LocalDateTime getCreatedAt();
}

// Spring Data projection query: single optimized SQL
public interface OrderQueryRepository extends JpaRepository<Order, String> {

    @Query("SELECT o.id AS id, c.fullName AS customerName, "
         + "o.total AS total, o.status AS status, o.createdAt AS createdAt "
         + "FROM Order o JOIN o.customer c WHERE o.customerId = :customerId")
    List<OrderSummary> findSummariesByCustomerId(@Param("customerId") String customerId);
}

// Separate handlers per concern
@RestController
@RequiredArgsConstructor
public class OrderController {

    private final OrderCommandService commandService;
    private final OrderQueryRepository queryRepository;

    @PostMapping("/orders")
    public ResponseEntity<String> createOrder(@Valid @RequestBody CreateOrderCommand cmd) {
        String orderId = commandService.handle(cmd);
        return ResponseEntity.created(URI.create("/orders/" + orderId)).body(orderId);
    }

    @GetMapping("/orders")
    public List<OrderSummary> getOrders(@RequestParam String customerId) {
        return queryRepository.findSummariesByCustomerId(customerId);
    }
}
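
The example above keeps both models in one database and separates them at the query level. When the read model lives in its own store, it is usually maintained as a denormalized projection that is updated as the write side reports changes. The following is a rough in-memory sketch of that idea; `OrderSummaryProjection`, its `Row` type, and the handler names are hypothetical, and a map stands in for the read table.

```java
import java.math.BigDecimal;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** In-memory stand-in for a denormalized read table (all names hypothetical). */
class OrderSummaryProjection {

    /** Flat row shaped for the list view: no joins needed at query time. */
    static final class Row {
        final String orderId;
        final String customerId;
        final String customerName;
        final BigDecimal total;
        final String status;

        Row(String orderId, String customerId, String customerName,
            BigDecimal total, String status) {
            this.orderId = orderId;
            this.customerId = customerId;
            this.customerName = customerName;
            this.total = total;
            this.status = status;
        }
    }

    private final Map<String, Row> rowsByOrderId = new LinkedHashMap<>();

    /** Applied when the write side reports a new order. */
    synchronized void onOrderCreated(String orderId, String customerId,
                                     String customerName, BigDecimal total) {
        rowsByOrderId.put(orderId, new Row(orderId, customerId, customerName, total, "PENDING"));
    }

    /** Applied when the write side reports a status change; unknown ids are ignored. */
    synchronized void onStatusChanged(String orderId, String newStatus) {
        Row old = rowsByOrderId.get(orderId);
        if (old != null) {
            rowsByOrderId.put(orderId,
                new Row(old.orderId, old.customerId, old.customerName, old.total, newStatus));
        }
    }

    /** Query side: a linear scan here; a real read store would index by customer. */
    synchronized List<Row> findByCustomer(String customerId) {
        List<Row> result = new ArrayList<>();
        for (Row row : rowsByOrderId.values()) {
            if (row.customerId.equals(customerId)) {
                result.add(row);
            }
        }
        return result;
    }
}
```

Because the projection is updated after the write commits, queries can briefly return stale rows. That lag is the eventual consistency cost of splitting the two models.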

5. Outbox Pattern — Guaranteed Event Publishing

The dual-write problem: when you need to both save data to a database and publish an event to Kafka, doing them as two separate operations means one can succeed and the other can fail, leaving your system in an inconsistent state. The Outbox pattern solves this by writing the event to an outbox table in the same database transaction, then reading from that table asynchronously to publish to Kafka.

// BAD: Dual-write. Kafka publish after DB save. If Kafka is down, event is permanently lost.
@Transactional
public Order createOrder(CreateOrderCommand cmd) {
    Order order = orderRepository.save(Order.from(cmd));
    kafkaTemplate.send("orders", new OrderCreatedEvent(order.getId())); // LOST if Kafka is down
    return order;
}

// GOOD: Outbox pattern. Event goes into same DB transaction as domain data.
@Entity
@Table(name = "outbox_events")
@Getter
@Setter // Lombok: generates the setters used below
public class OutboxEvent {
    @Id
    @GeneratedValue
    private Long id;
    private String aggregateType;
    private String aggregateId;
    private String eventType;
    @Column(columnDefinition = "TEXT")
    private String payload;
    private LocalDateTime createdAt;
    private boolean processed;
}

@Service
@RequiredArgsConstructor
public class OrderCommandService {

    private final OrderRepository orderRepository;
    private final OutboxEventRepository outboxRepository;
    private final ObjectMapper objectMapper;

    @Transactional // Both saves are atomic
    public String createOrder(CreateOrderCommand cmd) throws JsonProcessingException {
        Order order = orderRepository.save(Order.from(cmd));

        OutboxEvent outboxEvent = new OutboxEvent();
        outboxEvent.setAggregateType("Order");
        outboxEvent.setAggregateId(order.getId());
        outboxEvent.setEventType("OrderCreated");
        outboxEvent.setPayload(objectMapper.writeValueAsString(
            new OrderCreatedEvent(order.getId(), order.getCustomerId(), order.getTotal())));
        outboxEvent.setCreatedAt(LocalDateTime.now());
        outboxRepository.save(outboxEvent); // same TX as order save

        return order.getId();
    }
}

// Simpler alternative: @TransactionalEventListener (Spring-native)
// Note: the event lives only in memory, so it is lost if the process crashes
// or the send fails after the commit.
@Service
@RequiredArgsConstructor
public class OrderEventRelay {

    private final KafkaTemplate<String, Object> kafkaTemplate;

    @TransactionalEventListener(phase = TransactionPhase.AFTER_COMMIT)
    public void onOrderCreated(OrderCreatedEvent event) {
        // This runs AFTER the DB transaction commits successfully
        kafkaTemplate.send("order-events", event.orderId(), event);
    }
}
💡 Debezium CDC: For high-throughput systems, use Debezium Change Data Capture to tail the outbox table's binlog/WAL and publish events to Kafka automatically. This removes the need for a polling thread and gives you sub-second delivery latency.
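
For systems that do not run Debezium, the "reading from that table asynchronously" half of the pattern is a relay that polls for unprocessed rows. The sketch below shows only the core loop, with an in-memory list standing in for the outbox table and a `Consumer<String>` standing in for a blocking Kafka send. `OutboxRelaySketch`, `OutboxRow`, and `relayOnce` are hypothetical names.

```java
import java.util.List;
import java.util.function.Consumer;

/** Core loop of a polling outbox relay (hypothetical; storage and broker are stubbed). */
class OutboxRelaySketch {

    /** Mirrors the id, payload, and processed columns of the outbox table. */
    static final class OutboxRow {
        final long id;
        final String payload;
        boolean processed;

        OutboxRow(long id, String payload) {
            this.id = id;
            this.payload = payload;
        }
    }

    /**
     * Publishes unprocessed rows in list order and returns how many were sent.
     * Stops at the first failure so later events never overtake earlier ones;
     * the failed row stays unprocessed and is retried on the next poll.
     */
    static int relayOnce(List<OutboxRow> outbox, Consumer<String> publisher) {
        int published = 0;
        for (OutboxRow row : outbox) {
            if (row.processed) {
                continue;
            }
            try {
                publisher.accept(row.payload); // stand-in for a blocking broker send
            } catch (RuntimeException e) {
                break; // broker unavailable: keep the row for the next poll
            }
            row.processed = true; // mark only after the broker accepted the event
            published++;
        }
        return published;
    }
}
```

In a Spring application this loop would typically sit in a scheduled method that queries the outbox repository. If the process dies between the send and the `processed` update, the event is published again on restart, so delivery is at-least-once and consumers need to be idempotent.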

6. Bulkhead Pattern — Isolating Failure Domains

Named after the watertight compartments in a ship's hull, the Bulkhead pattern isolates failure domains by assigning separate thread pools or connection pools to different downstream dependencies. This ensures that a slow or failing dependency consumes only its allocated resources and cannot starve other parts of the system.

import io.github.resilience4j.bulkhead.annotation.Bulkhead;
import io.github.resilience4j.circuitbreaker.annotation.CircuitBreaker;

// BAD: Shared Tomcat thread pool. A slow DB query starves all HTTP requests.
@Service
public class ProductService {

    public Product getProduct(String id) {
        return productRepository.findById(id) // slow query blocks a Tomcat thread
            .orElseThrow(() -> new ProductNotFoundException(id));
    }

    public List<Recommendation> getRecommendations(String userId) {
        return recommendationClient.get(userId); // 3rd-party API also blocks a Tomcat thread
        // If recommendations API is slow, ALL Tomcat threads may get consumed here
    }
}

// GOOD: Resilience4j Bulkhead with separate thread pool per downstream
@Service
@RequiredArgsConstructor
public class ProductService {

    // Collaborators used below; the type names are assumed from the field names
    private final ProductRepository productRepository;
    private final RecommendationClient recommendationClient;

    @Bulkhead(name = "productDb", type = Bulkhead.Type.THREADPOOL)
    public CompletableFuture<Product> getProduct(String id) {
        return CompletableFuture.supplyAsync(() ->
            productRepository.findById(id)
                .orElseThrow(() -> new ProductNotFoundException(id)));
    }

    @Bulkhead(name = "recommendationApi", type = Bulkhead.Type.THREADPOOL)
    @CircuitBreaker(name = "recommendationApi", fallbackMethod = "emptyRecommendations")
    public CompletableFuture<List<Recommendation>> getRecommendations(String userId) {
        return CompletableFuture.supplyAsync(() -> recommendationClient.get(userId));
    }

    public CompletableFuture<List<Recommendation>> emptyRecommendations(
            String userId, Throwable t) {
        return CompletableFuture.completedFuture(Collections.emptyList());
    }
}
# application.yml — Bulkhead thread pool configuration
resilience4j:
  thread-pool-bulkhead:
    instances:
      productDb:
        max-thread-pool-size: 10
        core-thread-pool-size: 5
        queue-capacity: 20
      recommendationApi:
        max-thread-pool-size: 5
        core-thread-pool-size: 2
        queue-capacity: 10
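
Resilience4j also offers a semaphore-based bulkhead, which caps concurrent calls without a dedicated thread pool. The mechanism is small enough to sketch with the JDK alone. The class below is an illustration of the idea, not Resilience4j's implementation, and `SemaphoreBulkhead` is a hypothetical name.

```java
import java.util.concurrent.Semaphore;
import java.util.function.Supplier;

/** Caps concurrent calls to one dependency (hypothetical, JDK-only sketch). */
class SemaphoreBulkhead {
    private final Semaphore permits;

    SemaphoreBulkhead(int maxConcurrentCalls) {
        this.permits = new Semaphore(maxConcurrentCalls);
    }

    /**
     * Runs the action if a permit is free; otherwise returns the fallback
     * immediately instead of queueing, so callers never pile up on a slow dependency.
     */
    <T> T call(Supplier<T> action, Supplier<T> fallback) {
        if (!permits.tryAcquire()) {
            return fallback.get(); // bulkhead full: reject without blocking
        }
        try {
            return action.get();
        } finally {
            permits.release(); // always hand the permit back, even on exceptions
        }
    }

    int availablePermits() {
        return permits.availablePermits();
    }
}
```

The trade-off against the thread pool variant configured above: a semaphore bulkhead runs the call on the caller's thread, so it limits concurrency but cannot stop a slow call from holding that thread. Pairing it with a timeout covers that gap.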

7. Strangler Fig Pattern — Migrating Legacy Monoliths

Named after the strangler fig tree that gradually envelops its host, this pattern enables incremental migration of a monolith to microservices without a big-bang rewrite. A proxy layer intercepts requests and routes them: new features go to new microservices, existing features still go to the monolith. Over time, the monolith is replaced module by module.

Real scenario: A team at a fintech company migrated an 8-year-old Spring MVC monolith to microservices over 18 months. They identified 6 bounded contexts, started with the most-changed module (payment processing), deployed it as a service, and routed traffic via Spring Cloud Gateway. No downtime, no big-bang risk.

// Spring Cloud Gateway routing: new services intercept specific paths
@Configuration
public class GatewayConfig {

    @Bean
    public RouteLocator routeLocator(RouteLocatorBuilder builder) {
        return builder.routes()
            // New payment microservice handles /api/payments
            .route("payment-service", r -> r
                .path("/api/payments/**")
                .filters(f -> f.rewritePath("/api/payments/(?<segment>.*)", "/${segment}"))
                .uri("lb://payment-service"))
            // New inventory microservice handles /api/inventory
            .route("inventory-service", r -> r
                .path("/api/inventory/**")
                .uri("lb://inventory-service"))
            // Everything else still goes to the monolith
            .route("legacy-monolith", r -> r
                .path("/**")
                .uri("http://legacy-monolith:8080"))
            .build();
    }
}
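
The routing decision behind the gateway configuration above is simple: migrated path prefixes win, and everything else falls through to the monolith. A plain-Java sketch of that decision follows, so the migration steps can be reasoned about (and unit-tested) without a gateway. `StranglerRouter` and the `/api/orders/` route in the usage note are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Prefix-based routing table with a legacy fallback (hypothetical sketch). */
class StranglerRouter {
    private final Map<String, String> targetByPrefix = new LinkedHashMap<>();
    private final String legacyTarget;

    StranglerRouter(String legacyTarget) {
        this.legacyTarget = legacyTarget;
    }

    /** Registers a path prefix that has been carved out of the monolith. */
    StranglerRouter migrate(String pathPrefix, String target) {
        targetByPrefix.put(pathPrefix, target);
        return this;
    }

    /** The first registered prefix that matches wins; unmatched paths go to the monolith. */
    String resolve(String path) {
        for (Map.Entry<String, String> entry : targetByPrefix.entrySet()) {
            if (path.startsWith(entry.getKey())) {
                return entry.getValue();
            }
        }
        return legacyTarget;
    }
}
```

Each migration step then amounts to one more `migrate(...)` call, for example `router.migrate("/api/orders/", "lb://order-service")`, and the monolith can be retired once no traffic reaches the fallback.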

8. Sidecar Pattern — Cross-Cutting Concerns in Kubernetes

The Sidecar pattern deploys a helper container alongside the main application container in the same Kubernetes Pod. The sidecar handles cross-cutting concerns (mTLS, log collection, metrics scraping, retries) without modifying the application code.

An Envoy proxy running as a sidecar (as in the Istio service mesh) intercepts all inbound and outbound network traffic and handles TLS termination, circuit breaking, retries, tracing header injection, and metrics collection at the infrastructure layer.


9. Anti-Patterns in Microservices

Recognizing anti-patterns is as valuable as knowing the patterns. These are the five most common mistakes teams make when building microservices:

1. Distributed Monolith

Microservices that still share a single database. Services are independently deployable in theory, but any schema change requires coordinating deployments of all services. You get the complexity of microservices without the benefits. Fix: Each service owns its own database schema. Use the API to share data, never direct DB access.

2. Chatty Services

20 API calls to render one page. Synchronous call chains increase latency proportionally with depth. Fix: Use API aggregation (BFF pattern), GraphQL for flexible queries, or asynchronous event-driven communication for non-blocking operations.

3. Synchronous Chain Without Timeouts

Service A calls B calls C calls D synchronously. If D takes 5 seconds, A's total latency is at minimum 5 seconds, and all threads in A, B, and C are blocked waiting. Fix: Set aggressive timeouts at every hop. Use async patterns for non-critical paths. Apply Circuit Breakers and Bulkheads.

4. Shared Mutable Database

Multiple services writing to the same database tables. This couples services at the data layer — a change to the schema by one team breaks all other services. Fix: Database per service. Expose data via events or read APIs, not direct DB access.

5. Missing API Gateway

Exposing all services directly to clients. Clients must know about all service URLs, handle their own authentication for each service, and deal with version mismatches. Fix: Spring Cloud Gateway or Kong as a single entry point handling auth, rate limiting, routing, and SSL termination.
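
The fix for the synchronous-chain anti-pattern (item 3 above) calls for a timeout at every hop. With only the JDK (Java 9 or later), a per-call timeout with a fallback can be sketched as follows. `TimeoutSketch` and `callWithTimeout` are hypothetical names, and the sketch assumes the remote call is wrapped in a `Supplier`.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

/** Per-call timeout with a fallback value (hypothetical, JDK-only sketch). */
class TimeoutSketch {

    /**
     * Runs the call on the common pool and waits at most timeoutMillis for it.
     * Returns the fallback if the call is too slow or throws.
     */
    static <T> T callWithTimeout(Supplier<T> remoteCall, long timeoutMillis, T fallback) {
        return CompletableFuture.supplyAsync(remoteCall)
            .completeOnTimeout(fallback, timeoutMillis, TimeUnit.MILLISECONDS)
            .exceptionally(ex -> fallback)
            .join();
    }
}
```

The caller is released after the timeout, but the slow task itself keeps running on its pool thread until it finishes. That is why the HTTP client should carry its own connect and read timeouts as well, and why a Bulkhead is still needed to cap how many such tasks can pile up.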

10. Pattern Selection Guide

| Problem | Pattern | Spring Boot Tool |
| --- | --- | --- |
| Downstream service failing or slow | Circuit Breaker | Resilience4j @CircuitBreaker |
| Multi-service distributed transaction | Saga | Temporal.io / Apache Camel |
| Read and write optimization mismatch | CQRS | Spring Data projections + read replica |
| Guaranteed event delivery to Kafka | Outbox Pattern | Debezium / @TransactionalEventListener |
| Thread starvation from slow dependency | Bulkhead | Resilience4j @Bulkhead |
| Incremental monolith migration | Strangler Fig | Spring Cloud Gateway routing |
| Cross-cutting infra concerns (mTLS, tracing) | Sidecar | Istio / Envoy proxy |

11. Interview Insights & FAQ

Q: What's the difference between Choreography and Orchestration sagas?

Choreography: Each service listens for events and autonomously decides what to do next. No central coordinator. Simple for small flows but hard to trace and debug for complex workflows. Orchestration: A central Saga orchestrator drives the workflow, calling each service in sequence and handling compensation. Easier to reason about and trace, but introduces a single orchestrator component that must be resilient. For complex multi-step workflows (>3 steps), prefer orchestration.

Q: When should you use CQRS?

Use CQRS when your read and write loads have significantly different characteristics: high-volume reads that need denormalized data, or complex domain models that need consistency on writes. Don't apply CQRS to every service by default — it adds significant complexity (two models, eventual consistency between them). A good heuristic: if you're seeing N+1 query problems or your JPA entities are getting dozens of annotations to serve different views, CQRS will help.

Q: How does @TransactionalEventListener differ from @EventListener for the Outbox pattern?

@EventListener fires during the transaction, before commit. If the event listener throws an exception, the transaction rolls back. @TransactionalEventListener with AFTER_COMMIT fires only after the transaction successfully commits. This makes it much safer for publishing to Kafka: you only publish the event if the database commit succeeded. The remaining risk is that the listener itself can fail after the commit, so combine with retry logic or persistent outbox for guaranteed delivery.

Q: How is Circuit Breaker different from Retry?

Retry handles transient failures by retrying the same operation. Circuit Breaker handles systematic failures by stopping attempts altogether when failure rate exceeds a threshold. They are complementary: use Retry for transient glitches (network hiccup) and Circuit Breaker for persistent failures (downstream service down). Always configure retry with exponential backoff and jitter to avoid thundering herd when the downstream recovers.
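
As a rough illustration of "exponential backoff and jitter", the sketch below doubles the delay cap on each attempt and then picks a uniformly random delay below that cap (often called full jitter). It uses only the JDK. `RetrySketch` and its method names are hypothetical, and in a Spring Boot service the same behavior would more likely come from Resilience4j's retry module than from hand-written code.

```java
import java.util.concurrent.ThreadLocalRandom;
import java.util.function.Supplier;

/** Retry with exponential backoff and full jitter (hypothetical, JDK-only sketch). */
class RetrySketch {

    /** Upper bound of the delay for a 0-based attempt: base * 2^attempt, capped at maxMillis. */
    static long backoffCapMillis(int attempt, long baseMillis, long maxMillis) {
        long exponential = baseMillis << Math.min(attempt, 20); // bound the shift to avoid overflow
        return Math.min(exponential, maxMillis);
    }

    /** Full jitter: a uniformly random delay between 0 and the cap, inclusive. */
    static long jitteredDelayMillis(int attempt, long baseMillis, long maxMillis) {
        long cap = backoffCapMillis(attempt, baseMillis, maxMillis);
        return ThreadLocalRandom.current().nextLong(cap + 1);
    }

    /** Calls the action up to maxAttempts times and rethrows the last failure. */
    static <T> T retry(Supplier<T> action, int maxAttempts, long baseMillis, long maxMillis) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        RuntimeException last = null;
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            try {
                return action.get();
            } catch (RuntimeException e) {
                last = e;
                if (attempt < maxAttempts - 1) {
                    sleep(jitteredDelayMillis(attempt, baseMillis, maxMillis));
                }
            }
        }
        throw last;
    }

    private static void sleep(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
            throw new IllegalStateException("interrupted while backing off", e);
        }
    }
}
```

The randomness is what prevents the thundering herd: without it, every client that failed at the same moment would also retry at the same moment.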

FAQ

Q: Can I use CQRS without Event Sourcing?
Absolutely. CQRS and Event Sourcing are independent patterns that complement each other but are not required together. Most production CQRS implementations use a relational database for the write model and a read replica or materialized view for the query model.

Q: Is the Outbox pattern overkill for small systems?
For systems with fewer than 1,000 events per day and where losing a few events is acceptable, @TransactionalEventListener alone may suffice. The full Outbox pattern, with either a polling relay or Debezium reading the outbox table, is necessary when you need guaranteed at-least-once delivery and cannot tolerate event loss (financial transactions, inventory changes, etc.).

Q: Should every microservice implement Circuit Breaker?
Every synchronous outbound call to an external service should be protected by a Circuit Breaker and a timeout. This is non-negotiable for production systems. The Bulkhead is important when one slow dependency could starve resources needed by other, healthier dependencies.

Q: What's the minimum pattern set for a new microservice?
At minimum: Circuit Breaker + timeout on all outbound calls, structured logging with correlation ID, health check endpoint, and graceful shutdown. These four practices prevent the most common production incidents in microservices.

Last updated: April 10, 2026