Java Reactive Programming with Project Reactor: From Mono/Flux to Production
Reactive programming in Java has matured from an academic curiosity into a production necessity for high-throughput, low-latency services. But it comes with a steep learning curve, subtle bugs that only manifest under load, and a genuine trade-off against code readability. This guide covers the full journey from Mono/Flux fundamentals to production-grade WebFlux services.
The Reactive Manifesto and Why Reactive Matters
The Reactive Manifesto (2014) defines four properties of reactive systems: Responsive (consistent, predictable response times), Resilient (stays responsive under failure), Elastic (stays responsive under varying workload), and Message-driven (uses asynchronous message-passing as the foundation). These properties are not achieved through any single technology — but non-blocking I/O and backpressure-aware data flows are key enabling mechanisms.
The fundamental problem that reactive programming solves is blocking I/O at scale. In a traditional Spring MVC application, each HTTP request occupies a thread from the moment the request arrives until the response is sent. If the request handler makes a database query that takes 50ms, the thread is blocked for 50ms doing nothing. With a thread pool of 200 threads (typical for Tomcat), the maximum concurrent request capacity is 200 — even though the CPU is idle 90% of the time waiting for I/O. At 100 requests/second and 50ms average latency, you need only 5 threads (Little's Law: N = λ × W = 100 × 0.05). But at 10,000 requests/second with 50ms latency, you need 500 threads — a scale at which per-thread stack memory and context-switching overhead become significant costs on the JVM.
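The thread-count arithmetic above follows directly from Little's Law and can be checked with a trivial sketch (the 100 req/s and 10,000 req/s figures are the ones from the text):

```java
public class LittlesLaw {
    // Little's Law: N = λ × W — the average number of in-flight requests equals
    // the arrival rate times the average time each request spends in the system.
    static long threadsNeeded(double requestsPerSecond, long latencyMillis) {
        return (long) Math.ceil(requestsPerSecond * latencyMillis / 1000.0);
    }

    public static void main(String[] args) {
        System.out.println(threadsNeeded(100, 50));    // 5   — a tiny pool suffices
        System.out.println(threadsNeeded(10_000, 50)); // 500 — thread-per-request strains
    }
}
```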
Non-blocking I/O solves this by releasing the thread during I/O waits. Instead of blocking a thread while waiting for a database response, the runtime registers a callback and returns the thread to the pool, free to serve other requests. When the database response arrives, a pool thread processes the completion. This model can serve 10,000 concurrent requests with the same 200-thread pool, because threads are occupied only while actually processing, never while waiting on I/O.
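The callback-registration model is easiest to see with the JDK's own CompletableFuture — a simplified stand-in for what Reactor and Netty do under the hood. The "database query" here is simulated with a sleep:

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class NonBlockingSketch {
    // Simulated async database call: returns immediately; the "response"
    // arrives 50ms later on the I/O pool and completes the future.
    static CompletableFuture<String> queryAsync(ExecutorService ioPool) {
        return CompletableFuture.supplyAsync(() -> {
            try { Thread.sleep(50); } catch (InterruptedException e) { throw new RuntimeException(e); }
            return "row-42";
        }, ioPool);
    }

    public static void main(String[] args) {
        ExecutorService ioPool = Executors.newFixedThreadPool(4);

        // Register a completion callback instead of blocking on get(); the
        // calling thread is free to serve other requests in the meantime.
        queryAsync(ioPool)
                .thenAccept(row -> System.out.println("got " + row))
                .join(); // demo only: keep the JVM alive until completion
        ioPool.shutdown();
    }
}
```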
Mono and Flux: The Core Abstractions
Project Reactor provides two core publisher types. Mono<T> represents a stream of 0 or 1 element — analogous to Optional<T> but asynchronous. It completes with either a value, an empty completion, or an error. Flux<T> represents a stream of 0 to N elements — analogous to List<T> or Stream<T> but asynchronous with backpressure support.
// Mono examples
Mono<User> findById = userRepository.findById(userId); // 0 or 1 user
Mono<Void> deleteUser = userRepository.deleteById(userId); // completion signal only
Mono<User> fromValue = Mono.just(new User("alice"));
Mono<User> empty = Mono.empty();
Mono<User> error = Mono.error(new UserNotFoundException(userId));
// Flux examples
Flux<User> allUsers = userRepository.findAll(); // N users
Flux<Integer> range = Flux.range(1, 100);
Flux<String> fromList = Flux.fromIterable(List.of("a", "b", "c"));
// Key operator: map vs flatMap
Mono<String> username = findById
.map(user -> user.getName()); // synchronous transformation
Mono<Order> latestOrder = findById
.flatMap(user -> orderRepository.findLatestByUserId(user.getId())); // async chaining
The most important conceptual rule: use map for synchronous transformations (no I/O), and flatMap for asynchronous transformations (returns Mono/Flux). Calling a method that returns Mono<T> inside a map gives you Mono<Mono<T>> — a wrapped publisher that never subscribes to the inner publisher. This is the single most common bug in Reactor code written by developers transitioning from imperative Java.
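The same distinction exists in the JDK's CompletableFuture as thenApply versus thenCompose, which makes the nested-publisher bug easy to demonstrate without Reactor on the classpath (the latestOrderId lookup below is a hypothetical stand-in for an async repository call):

```java
import java.util.concurrent.CompletableFuture;

public class MapVsFlatMap {
    // Hypothetical async repository call
    static CompletableFuture<String> latestOrderId(String userId) {
        return CompletableFuture.completedFuture("order-for-" + userId);
    }

    public static void main(String[] args) {
        CompletableFuture<String> user = CompletableFuture.completedFuture("alice");

        // thenApply (like map): the inner async call gets WRAPPED, not flattened
        CompletableFuture<CompletableFuture<String>> nested =
                user.thenApply(MapVsFlatMap::latestOrderId); // nested future — the bug

        // thenCompose (like flatMap): the inner future is flattened and chained
        CompletableFuture<String> flat =
                user.thenCompose(MapVsFlatMap::latestOrderId);

        System.out.println(flat.join()); // order-for-alice
    }
}
```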
Essential Operators
map / flatMap / concatMap: flatMap subscribes to inner publishers eagerly and concurrently — output order may not match input order, but throughput is maximized. concatMap subscribes to inner publishers one at a time — output order is preserved, but throughput is limited because only one inner publisher is in flight at any moment. Use flatMap when order doesn't matter (enriching a stream of products with inventory counts in parallel), concatMap when order matters (processing ordered events).
// flatMap: concurrent, unordered — faster
Flux<ProductWithInventory> enriched = productFlux
.flatMap(product ->
inventoryService.getCount(product.getId())
.map(count -> new ProductWithInventory(product, count)),
16 // concurrency parameter — max 16 concurrent inner subscriptions
);
// concatMap: sequential, ordered — predictable
Flux<ProcessedEvent> processed = eventFlux
.concatMap(event -> eventProcessor.process(event));
// zipWith: combine two streams element by element
Mono<OrderSummary> summary = orderMono
.zipWith(customerMono)
.map(tuple -> new OrderSummary(tuple.getT1(), tuple.getT2()));
// Flux.merge vs Flux.concat: merge is concurrent, concat is sequential
Flux<Data> merged = Flux.merge(source1, source2, source3); // concurrent
Flux<Data> concatenated = Flux.concat(source1, source2, source3); // sequential
Backpressure Strategies
Backpressure is the mechanism by which a subscriber signals to the publisher how many elements it is ready to consume. Without backpressure, a fast publisher can overwhelm a slow subscriber, causing unbounded queue growth and eventual OutOfMemoryError. Reactive Streams (the specification that Project Reactor implements) mandates backpressure support at every stage of the pipeline.
In practice, backpressure manifests as the request(n) signal flowing upstream — a subscriber requests N items, the publisher produces at most N items, the subscriber processes them, then requests more. For most in-process pipelines, this happens automatically. Problems arise when integrating with non-reactive sources (blocking queues, legacy APIs) or network I/O where backpressure crosses process boundaries.
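The request(n) handshake can be observed with the JDK's own java.util.concurrent.Flow API, the standard-library version of the Reactive Streams interfaces that Reactor's types also implement. In this sketch a subscriber requests exactly one element at a time, so the publisher never runs ahead of demand:

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Flow;
import java.util.concurrent.SubmissionPublisher;
import java.util.concurrent.atomic.AtomicInteger;

public class RequestNDemo {
    static int runDemo() throws InterruptedException {
        AtomicInteger received = new AtomicInteger();
        CountDownLatch done = new CountDownLatch(1);

        try (SubmissionPublisher<Integer> publisher = new SubmissionPublisher<>()) {
            publisher.subscribe(new Flow.Subscriber<Integer>() {
                private Flow.Subscription subscription;

                public void onSubscribe(Flow.Subscription s) {
                    subscription = s;
                    s.request(1); // initial demand: exactly one element
                }
                public void onNext(Integer item) {
                    received.incrementAndGet();
                    subscription.request(1); // finished one, ask for the next
                }
                public void onError(Throwable t) { done.countDown(); }
                public void onComplete() { done.countDown(); }
            });
            for (int i = 1; i <= 5; i++) publisher.submit(i); // 5 elements upstream
        } // close() triggers onComplete once buffered items are delivered

        done.await();
        return received.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(runDemo()); // 5 — every element flowed under explicit demand
    }
}
```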
// Backpressure overflow strategies for hot publishers
Flux<SensorReading> sensorStream = hotSensor.asFlux()
.onBackpressureBuffer(1000) // buffer up to 1000 elements, then error
// Alternatives:
// .onBackpressureDrop() // discard incoming elements the subscriber can't keep up with
// .onBackpressureLatest() // keep only the latest element
// .onBackpressureError() // signal error immediately on overflow
.publishOn(Schedulers.boundedElastic()); // process on separate thread pool
Choose backpressure strategy based on data semantics: buffer for financial transactions where every element matters, drop for telemetry/sensor data where old readings are stale, latest for UI state updates where only the most current value is relevant.
Error Handling
Reactive error handling differs fundamentally from imperative try/catch. Errors are signals in the reactive stream — they terminate the stream (unless caught) and propagate downstream through the operator chain. Reactor provides several error handling operators:
// onErrorReturn: emit a fallback value on error
Mono<UserProfile> profile = userService.getProfile(userId)
.onErrorReturn(UserNotFoundException.class,
UserProfile.anonymous());
// onErrorResume: switch to a fallback publisher on error
Mono<Config> config = primaryConfigService.get()
.onErrorResume(ServiceUnavailableException.class,
ex -> fallbackConfigService.get());
// onErrorMap: transform the error type
Mono<Data> result = externalApi.fetch()
.onErrorMap(HttpClientErrorException.class,
ex -> new DomainException("External API error: " + ex.getStatusCode()));
// doOnError: side-effect on error (logging) without affecting stream
Mono<Order> order = orderRepository.findById(orderId)
.doOnError(ex -> log.error("Failed to fetch order {}: {}", orderId, ex.getMessage()))
.onErrorResume(ex -> Mono.error(new OrderNotFoundException(orderId)));
// retry with backoff
Mono<Response> resilient = externalService.call()
.retryWhen(Retry.backoff(3, Duration.ofMillis(500))
.maxBackoff(Duration.ofSeconds(5))
.jitter(0.25)
.filter(ex -> ex instanceof TransientException));
Schedulers and Threading
One of the most misunderstood aspects of Project Reactor is threading. By default, a Reactor pipeline executes on the thread that initiates the subscription — which is usually the I/O event loop thread in Netty-based WebFlux. Blocking operations on the event loop thread are catastrophic — they prevent the event loop from processing other I/O events and cause cascading latency increases.
Use publishOn to switch which thread processes downstream operators. Use subscribeOn to switch which thread performs the subscription (affects where the entire pipeline runs, typically used with blocking sources). The cardinal rule: never block on an event loop thread.
// WRONG: blocking call on event loop thread
Flux<Data> wrong = Flux.range(1, 100)
.map(i -> blockingDatabase.query(i)); // blocks the event loop!
// CORRECT: offload blocking to boundedElastic scheduler
Flux<Data> correct = Flux.range(1, 100)
.publishOn(Schedulers.boundedElastic())
.map(i -> blockingDatabase.query(i)) // safe: runs on boundedElastic thread
.publishOn(Schedulers.parallel()); // back to parallel for CPU work
// Schedulers reference:
// Schedulers.parallel() — CPU-bound work, N threads = CPU cores
// Schedulers.boundedElastic() — I/O-bound blocking work, dynamic pool with cap
// Schedulers.single() — single-threaded, for ordered serial execution
// Schedulers.immediate() — no-op scheduler: runs work on the caller thread, no switch
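The offloading rule is not Reactor-specific. The same pattern with a plain ExecutorService shows the shape of what publishOn(Schedulers.boundedElastic()) achieves — blocking work is confined to a dedicated pool so the caller's thread stays free (blockingQuery is a simulated blocking call):

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class OffloadBlocking {
    // Simulated blocking call (stands in for JDBC or a legacy HTTP client)
    static String blockingQuery(int id) {
        try { Thread.sleep(20); } catch (InterruptedException e) { throw new RuntimeException(e); }
        return "row-" + id;
    }

    public static void main(String[] args) {
        // Dedicated pool for blocking work — the moral equivalent of boundedElastic
        ExecutorService blockingPool = Executors.newFixedThreadPool(8);

        // The caller (imagine an event loop thread) only schedules and moves on;
        // the sleep happens on a blockingPool thread, never on the caller.
        CompletableFuture<String> result =
                CompletableFuture.supplyAsync(() -> blockingQuery(1), blockingPool);

        System.out.println(result.join()); // row-1
        blockingPool.shutdown();
    }
}
```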
Spring WebFlux Integration
Spring WebFlux is the reactive web framework that uses Project Reactor as its reactive library and Netty as its default server. A WebFlux controller returns Mono or Flux instead of plain objects:
@RestController
@RequestMapping("/api/users")
public class UserController {
private final UserService userService;
@GetMapping("/{id}")
public Mono<ResponseEntity<UserDto>> getUser(@PathVariable String id) {
return userService.findById(id)
.map(user -> ResponseEntity.ok(UserDto.from(user)))
.defaultIfEmpty(ResponseEntity.notFound().build());
}
@GetMapping
public Flux<UserDto> getAllUsers(
@RequestParam(defaultValue = "0") int page,
@RequestParam(defaultValue = "20") int size) {
return userService.findAll(PageRequest.of(page, size))
.map(UserDto::from);
}
@PostMapping
@ResponseStatus(HttpStatus.CREATED)
public Mono<UserDto> createUser(@RequestBody @Valid Mono<CreateUserRequest> request) {
return request
.flatMap(userService::create)
.map(UserDto::from);
}
}
R2DBC provides reactive database access for relational databases. With R2DBC, database queries return Mono or Flux instead of blocking until the result set is complete. Spring Data R2DBC repositories follow the same repository pattern as JPA but with reactive return types:
public interface UserRepository extends ReactiveCrudRepository<User, String> {
Flux<User> findByStatus(String status);
Mono<User> findByEmail(String email);
Mono<Long> countByCreatedAtAfter(Instant after);
}
Testing Reactive Code with StepVerifier
StepVerifier is the testing utility for reactive streams in Project Reactor. It subscribes to a publisher and provides a fluent assertion API for verifying the sequence of elements, errors, and completion signals:
@Test
void getUserReturnsUserWhenFound() {
User expectedUser = new User("alice", "alice@example.com");
when(userRepository.findById("alice-id")).thenReturn(Mono.just(expectedUser));
StepVerifier.create(userService.findById("alice-id"))
.expectNext(expectedUser)
.verifyComplete();
}
@Test
void getUserReturnsEmptyWhenNotFound() {
when(userRepository.findById("unknown")).thenReturn(Mono.empty());
StepVerifier.create(userService.findById("unknown"))
.verifyComplete(); // no elements, just completion
}
@Test
void processAllUsersEmitsInOrder() {
Flux<User> users = Flux.just(
new User("alice"), new User("bob"), new User("charlie")
);
StepVerifier.create(users.map(User::getName))
.expectNext("alice")
.expectNext("bob")
.expectNext("charlie")
.verifyComplete();
}
@Test
void retryOnTransientError() {
AtomicInteger attempts = new AtomicInteger();
Mono<String> flaky = Mono.fromCallable(() -> {
if (attempts.getAndIncrement() < 2) throw new TransientException("retry me");
return "success";
}).retryWhen(Retry.fixedDelay(3, Duration.ofMillis(10)));
StepVerifier.create(flaky)
.expectNext("success")
.verifyComplete();
}
When NOT to Use Reactive
Reactive programming is not universally superior to imperative programming — it trades code readability and debuggability for throughput at high concurrency. There are scenarios where reactive is actively the wrong choice.
CPU-bound workloads do not benefit from reactive I/O. A service that performs complex computations (image processing, ML inference, cryptographic operations) with minimal I/O should use traditional thread-per-request with a sized thread pool. Reactive adds overhead (operator chains, context propagation) without any throughput benefit when I/O is not the bottleneck.
Simple CRUD services with moderate load (under 500 concurrent requests) are better served by Spring MVC with virtual threads (Java 21+). Virtual threads eliminate the blocking I/O bottleneck of platform threads at a fraction of the code complexity — see Java virtual threads for the comparison. A service that serves 200 requests/second with <20ms database queries does not need reactive; it needs JPA and a reasonably-sized thread pool.
Legacy codebases with extensive blocking library use (Hibernate ORM, JDBC, synchronous HTTP clients) cannot adopt reactive incrementally. Calling a blocking API inside a reactive chain without explicit publishOn(Schedulers.boundedElastic()) wrapping is worse than not using reactive at all — it blocks event loop threads, causing worse performance than the imperative original.
For the broader Java performance context, see Java concurrency, core Java performance, and JVM performance tuning. For Spring Boot integration patterns, see Spring Boot microservices.
Key Takeaways
- Reactive shines at high I/O concurrency: Services with thousands of concurrent requests and significant I/O wait time (database, external APIs) benefit most. Low-concurrency services should use virtual threads instead.
- flatMap for async, map for sync: This distinction is non-negotiable. The wrong choice produces nested publishers that never execute.
- Never block on the event loop thread: Any blocking operation (JDBC, legacy HTTP client, Thread.sleep) inside a reactive pipeline must be wrapped with publishOn(Schedulers.boundedElastic()).
- Use StepVerifier for all reactive tests: Standard JUnit assertions fail for async publishers. StepVerifier is the correct tool for asserting reactive stream behavior.
- Error handling operators, not try/catch: Use onErrorReturn, onErrorResume, and onErrorMap for in-pipeline error handling. Uncaught errors propagate to the subscriber and surface as 500 errors in WebFlux.
- Profile before adopting reactive: Measure your current blocking I/O utilization before migrating. If your P99 latency is 20ms and your average concurrency is 50, reactive adds complexity without measurable benefit.