Software Engineer · Java · Spring Boot · Microservices
Java Memory Model (JMM) Explained: Happens-Before, Volatile, Synchronization & Visibility Guarantees
Concurrent bugs caused by memory visibility failures are among the most pernicious in Java: they are non-deterministic, hardware-dependent, and often disappear under the debugger. The Java Memory Model (JMM), defined in JLS Chapter 17, is the formal specification that governs which values a thread is guaranteed to see when reading a variable written by another thread. Mastering the JMM — its happens-before rules, volatile semantics, synchronized guarantees, and safe publication patterns — is essential for every Java engineer building correct concurrent systems.
Table of Contents
- Why the Java Memory Model Matters: The Silent Source of Concurrent Bugs
- The Java Memory Model: Main Memory vs Working Memory
- Happens-Before: The Foundation of JMM
- volatile: Memory Visibility and Reordering Prevention
- synchronized: Mutual Exclusion and Memory Visibility
- Safe Publication Patterns
- Data Races and How to Detect Them
- Memory Barriers and JVM Implementation
- Key Takeaways
- Conclusion
1. Why the Java Memory Model Matters: The Silent Source of Concurrent Bugs
Modern CPUs do not execute instructions in program order. Compilers reorder instructions to fill pipeline bubbles, CPUs execute instructions out-of-order to hide memory latency, and each CPU core has multiple layers of cache (L1, L2, L3) that may hold stale copies of variables written by another core. A write on CPU core 0 may not be visible to CPU core 1 until a cache coherence protocol flushes the write to main memory and invalidates the corresponding cache line on core 1.
Without a formal memory model, Java programmers would need to reason about every CPU's specific memory ordering guarantees — an impossible task given that Java runs on x86, ARM, RISC-V, SPARC, and other architectures with radically different ordering semantics. The JMM abstracts these details into a single specification: a set of rules that define which actions are guaranteed to be visible across threads, independent of the underlying hardware.
Consider a classic store-reordering scenario. Thread A writes a result and then sets a flag; Thread B reads the flag and then reads the result. Without a memory barrier, the compiler or CPU may reorder Thread A's two stores (a store-store reordering), so Thread B sees the flag set but reads a stale result:
// Store-store reordering example — broken without volatile
class StoreStoreExample {
    int result = 0;
    boolean ready = false; // NOT volatile — visibility/ordering not guaranteed

    // Thread A
    void writer() {
        result = 42;  // store 1
        ready = true; // store 2 — compiler/CPU may reorder stores 1 and 2
    }

    // Thread B
    void reader() {
        while (!ready) { } // spin-wait
        // BUG: result may still be 0 if store 1 is not yet visible!
        System.out.println(result);
    }
}
On ARM processors, which have a weaker memory model than x86, this bug is readily reproducible. On x86, the Total Store Order (TSO) model forbids store-store reordering at the hardware level (only StoreLoad reordering is permitted), although the JIT compiler remains free to reorder or hoist plain stores even there. This is why this class of bug often hides on developer workstations but surfaces in production on cloud instances running ARM-based Graviton CPUs.
2. The Java Memory Model: Main Memory vs Working Memory (CPU Cache)
The JMM defines an abstract model with two memory regions. Main memory is the authoritative store shared by all threads — analogous to DRAM. Each thread also has its own working memory — an abstract representation of the CPU registers and cache lines that thread uses. A thread never reads or writes to main memory directly; instead, it operates on its local working memory copy and the JMM defines precisely when those copies must be synchronized with main memory.
The JMM specifies eight atomic memory operations that define the interaction between threads and memory:
- read: transfer a variable's value from main memory to working memory.
- load: put the read value into the working-memory copy of the variable.
- use: pass the working-memory value to the execution engine.
- assign: put a result from the execution engine into working memory.
- store: transfer the working-memory value to main memory.
- write: put the stored value into the main-memory variable.
- lock: acquire the monitor of a main-memory variable for exclusive use by one thread.
- unlock: release the monitor, flushing working-memory values to main memory.
The following example demonstrates a classic visibility bug: a background thread sets a running flag to false to signal the worker thread to stop. Without volatile, the worker thread may cache the flag in a register and loop forever:
// Visibility bug: worker thread may loop forever
class VisibilityBug {
    // BUG: not volatile — JIT may hoist the read out of the loop
    private boolean running = true;

    void startWorker() {
        Thread worker = new Thread(() -> {
            // JIT compiler may transform this into:
            //   if (running) { while (true) { doWork(); } }
            // because it doesn't see running modified in this thread
            while (running) {
                doWork();
            }
        });
        worker.start();
    }

    void stop() {
        running = false; // this write may NEVER be seen by the worker thread!
    }

    void doWork() { /* ... */ }
}

// FIXED: declare running as volatile
class VisibilityFixed {
    private volatile boolean running = true; // guaranteed visibility

    void startWorker() {
        new Thread(() -> {
            while (running) { doWork(); }
        }).start();
    }

    void stop() { running = false; }

    void doWork() { /* ... */ }
}
The modern JMM was introduced by JSR-133 and shipped with Java 5: it repaired the semantics of volatile and final fields and established the formal happens-before relation. Before Java 5, the JMM was widely considered broken for many concurrent use cases.
3. Happens-Before: The Foundation of JMM
The happens-before relation is the cornerstone of the JMM. Formally: if action A happens-before action B, then all effects of A — every write to every variable — are guaranteed to be visible to B when B executes. Crucially, happens-before is about visibility, not time. An action that "happened" earlier in wall-clock time does not necessarily happen-before another action unless a formal happens-before edge exists between them.
The JMM defines the following happens-before rules (for practical purposes this list is complete — if your code cannot establish a happens-before chain through one of these rules, there is no visibility guarantee):
| Happens-Before Rule | Example / Context |
|---|---|
| Program Order Rule | Each action in a thread happens-before every subsequent action in the same thread. (Within a single thread, code runs in order.) |
| Monitor Lock Rule | An unlock on a monitor happens-before every subsequent lock on that same monitor. Used by synchronized. |
| Volatile Variable Rule | A write to a volatile field happens-before every subsequent read of that same field by any thread. |
| Thread Start Rule | Every action a thread takes before calling thread.start() happens-before any action in the started thread. |
| Thread Join Rule | All actions in the joined thread happen-before thread.join() returns in the joining thread. |
| Thread Interruption Rule | A thread calling interrupt() on another thread happens-before the interrupted thread detects the interrupt. |
| Object Finalizer Rule | The end of an object's constructor happens-before the start of its finalizer. |
| Transitivity | If A happens-before B, and B happens-before C, then A happens-before C. |
The following code demonstrates why a happens-before edge is required and how volatile establishes one:
// Happens-before: broken without volatile, fixed with volatile
class HappensBeforeDemo {
    int data = 0;
    volatile boolean published = false; // the happens-before anchor

    // Thread A: write data, then publish
    void producer() {
        data = 100;       // write 1 — program order: before the volatile write
        published = true; // volatile write — happens-before any subsequent volatile read
        // By the program order rule: data = 100 happens-before published = true
        // By the volatile rule: published = true happens-before the read of published in consumer()
        // By transitivity: data = 100 happens-before consumer's read of data ✓
    }

    // Thread B: wait for publish, then read data
    void consumer() {
        while (!published) { } // volatile read — spin until the write is visible
        // Guaranteed: data == 100 because of the happens-before chain
        System.out.println(data); // always prints 100
    }
}
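volatile is not the only way to build such a chain: the Thread Start and Thread Join rules from the table above provide the same guarantee with no volatile fields at all. A minimal sketch (the StartJoinDemo class name is ours, not from the article):

```java
// Happens-before via Thread.start() and Thread.join() — no volatile needed
class StartJoinDemo {
    int data = 0; // deliberately NOT volatile

    int run() {
        data = 1; // written BEFORE start() — visible in the new thread (start rule)
        Thread t = new Thread(() -> data = data + 41); // sees 1, writes 42
        t.start();
        try {
            t.join(); // join rule: all of t's writes are visible after join() returns
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return data; // guaranteed 42 — both edges are formal happens-before edges
    }
}
```

Remove the start() edge (e.g., hand the thread a pre-started queue of work) or the join() edge, and the same reads become data races.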
4. volatile: Memory Visibility and Reordering Prevention
The volatile keyword provides two distinct guarantees under the JMM. First, visibility: a write to a volatile variable is made visible to all threads, and any subsequent read of that variable observes the latest write rather than a stale cached copy. Second, ordering: the compiler and CPU may not move writes that precede a volatile write to after it, nor move reads and writes that follow a volatile read to before it (release/acquire semantics). Consequently, all writes made before a volatile write are visible to any thread that subsequently reads that volatile variable.
What volatile does not guarantee is atomicity of compound operations. The classic example is the increment operator — i++ is a read-modify-write sequence: read current value, add 1, write back. Even with volatile int i, two threads executing i++ simultaneously can both read the same value, both compute the same result, and produce a final count that undercounts by one:
// volatile does NOT make compound operations atomic
class VolatileAtomicityBug {
    volatile int counter = 0; // volatile only guarantees visibility

    void increment() {
        counter++; // NOT atomic! Equivalent to:
        //   int tmp = counter;  (read)
        //   tmp = tmp + 1;      (modify)
        //   counter = tmp;      (write)
        // Two threads can interleave and lose an increment
    }
}

// CORRECT: use AtomicInteger for atomic read-modify-write
import java.util.concurrent.atomic.AtomicInteger;

class AtomicCounterCorrect {
    private final AtomicInteger counter = new AtomicInteger(0);

    void increment() {
        counter.incrementAndGet(); // CAS-based atomic increment
    }

    int get() { return counter.get(); }
}
One of the most critical applications of volatile is the Double-Checked Locking (DCL) singleton pattern. Without volatile, DCL is broken because object construction can be partially visible to other threads — the reference to the partially-constructed object may be written to the field before the constructor completes:
// BROKEN DCL — do NOT use this
class BrokenSingleton {
    private static BrokenSingleton instance; // NOT volatile — broken!

    public static BrokenSingleton getInstance() {
        if (instance == null) {                   // check 1 — no synchronization
            synchronized (BrokenSingleton.class) {
                if (instance == null) {           // check 2 — inside lock
                    instance = new BrokenSingleton();
                    // JIT/CPU may reorder constructor writes AFTER the
                    // reference assignment. Another thread reading instance
                    // at check 1 sees a non-null but incompletely initialized object!
                }
            }
        }
        return instance;
    }
}

// CORRECT DCL — volatile ensures safe publication
class CorrectSingleton {
    private static volatile CorrectSingleton instance; // volatile required!

    private CorrectSingleton() { /* initialize fields */ }

    public static CorrectSingleton getInstance() {
        if (instance == null) {                    // check 1 — fast path, no lock
            synchronized (CorrectSingleton.class) {
                if (instance == null) {            // check 2 — inside lock
                    instance = new CorrectSingleton();
                    // volatile write: all constructor writes happen-before
                    // this assignment, which happens-before any subsequent
                    // volatile read of instance by any thread. Safe!
                }
            }
        }
        return instance;
    }
}
volatile is NOT a replacement for synchronized when you need atomicity of compound operations. Use volatile when: (1) only one thread writes and others only read, (2) you need a simple visibility/publication flag, or (3) you need to prevent reordering around a state transition. Use synchronized or AtomicXxx when multiple threads perform read-modify-write operations on the same variable.
5. synchronized: Mutual Exclusion and Memory Visibility
synchronized provides both mutual exclusion (only one thread executes the critical section at a time) and memory visibility (all writes made inside a synchronized block are visible to any thread that subsequently acquires the same lock). This dual guarantee is what distinguishes synchronized from volatile: volatile gives visibility without exclusion, while synchronized gives both.
The JMM's Monitor Lock Rule states: an unlock of monitor M happens-before every subsequent lock of monitor M. This means all writes performed while holding a lock are flushed to main memory on unlock, and a thread acquiring the same lock will see all those writes. This is why a properly synchronized counter never loses increments:
import java.util.concurrent.locks.ReentrantLock;

class SynchronizationExamples {
    // 1. Synchronized method — lock is the instance (this)
    private int count = 0;

    public synchronized void increment() {
        count++; // atomic under the lock + visibility guaranteed on unlock
    }

    public synchronized int getCount() {
        return count; // guaranteed to see the latest value
    }

    // 2. Synchronized block — more granular locking
    private final Object lock = new Object();
    private int value = 0;

    public void setValue(int v) {
        synchronized (lock) {
            value = v;
        } // unlock: writes made visible to the next acquirer of lock
    }

    public int getValue() {
        synchronized (lock) {
            return value; // lock: sees all writes from the previous holder
        }
    }

    // 3. ReentrantLock — explicit lock with tryLock and timeout
    private final ReentrantLock reentrantLock = new ReentrantLock();
    private int resource = 0;

    public boolean tryUpdate(int newValue) {
        if (reentrantLock.tryLock()) { // non-blocking attempt
            try {
                resource = newValue;
                return true;
            } finally {
                reentrantLock.unlock(); // ALWAYS unlock in finally!
            }
        }
        return false; // lock was held by another thread
    }
}
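One feature that often motivates ReentrantLock over synchronized is support for multiple condition variables on a single lock. As a hedged sketch (the OneSlotHandoff class and its names are ours, not from the examples above), here is a one-slot blocking handoff that uses separate notFull and notEmpty conditions:

```java
// Sketch: ReentrantLock with two Conditions — a one-slot blocking handoff
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

class OneSlotHandoff {
    private final ReentrantLock lock = new ReentrantLock();
    private final Condition notEmpty = lock.newCondition();
    private final Condition notFull = lock.newCondition();
    private Integer slot = null;

    void put(int v) throws InterruptedException {
        lock.lock();
        try {
            while (slot != null) notFull.await(); // wait until the slot is free
            slot = v;
            notEmpty.signal();                    // wake a waiting taker
        } finally {
            lock.unlock();
        }
    }

    int take() throws InterruptedException {
        lock.lock();
        try {
            while (slot == null) notEmpty.await(); // wait until a value arrives
            int v = slot;
            slot = null;
            notFull.signal();                      // wake a waiting putter
            return v;
        } finally {
            lock.unlock();
        }
    }
}
```

With intrinsic locks, both waiters would share one wait set (Object.wait/notifyAll); separate conditions let the lock wake exactly the right kind of waiter.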
Choosing between synchronized, volatile, and AtomicXxx depends on the specific concurrency requirement:
- Use volatile for simple flags or reference publication where only one thread writes and visibility is the only concern.
- Use AtomicInteger / AtomicReference / etc. for single-variable compound operations (increment, compare-and-set, get-and-set) without holding a broader lock.
- Use synchronized when you need to protect a compound invariant spanning multiple fields, or when both atomicity and visibility across multiple variables are required.
- Use ReentrantLock when you need features beyond synchronized: timed locking, interruptible locking, multiple condition variables, or fairness guarantees.
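The "compound invariant spanning multiple fields" case deserves a concrete illustration. A minimal sketch (the Range class is ours, added for illustration): the invariant lo <= hi involves two fields, so marking each field volatile would not help — a reader could still observe lo from one update paired with hi from another.

```java
// Sketch: synchronized guarding an invariant across TWO fields (lo <= hi).
// volatile on each field separately could NOT protect this invariant.
class Range {
    private int lo = 0;
    private int hi = 0;

    synchronized void set(int newLo, int newHi) {
        if (newLo > newHi) throw new IllegalArgumentException("lo > hi");
        lo = newLo; // both writes become visible together:
        hi = newHi; // no reader can see a mixed (lo, hi) pair
    }

    synchronized boolean contains(int v) {
        return lo <= v && v <= hi; // reads a consistent (lo, hi) pair
    }
}
```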
6. Safe Publication Patterns
Safe publication means making an object reference visible to other threads in a way that guarantees those threads also see the object's fully constructed state — not a partially-initialized view. This is more subtle than it sounds: publishing an object reference (storing it in a shared field) does not automatically make the object's fields visible to readers unless a happens-before edge connects the constructor to the reader.
The JMM (via JSR-133) defines four mechanisms for safe publication:
- Static initializer: an object initialized in a static initializer is safely published to all threads. The JVM guarantees static initializers run under class-loading synchronization.
- volatile field: storing a reference in a volatile field safely publishes it. The volatile write happens-before any subsequent volatile read, which happens-before the reader accesses the object's fields.
- synchronized block: publishing inside a synchronized block and reading inside a synchronized block on the same lock establishes the happens-before chain.
- final field (JSR-133 freeze guarantee): if an object's fields are final and the reference does not leak from the constructor, any thread that obtains a reference to the object is guaranteed to see those fields' values as set by the constructor.
// Pattern 1: BROKEN lazy singleton — unsafe publication
class BrokenLazySingleton {
    private static BrokenLazySingleton instance;

    static BrokenLazySingleton get() {
        if (instance == null)
            instance = new BrokenLazySingleton(); // data race!
        return instance;
    }
}

// Pattern 2: Initialization-on-demand holder idiom — class-level lazy init
// Safe because class loading is synchronized by the JVM
class HolderSingleton {
    private HolderSingleton() {}

    private static class Holder {
        // Initialized when the Holder class is first loaded (lazy)
        // Class loading is synchronized — safe publication for free
        static final HolderSingleton INSTANCE = new HolderSingleton();
    }

    public static HolderSingleton getInstance() {
        return Holder.INSTANCE;
    }
}

// Pattern 3: Enum singleton — the safest approach (Bloch's Effective Java)
enum EnumSingleton {
    INSTANCE;

    public void serve() { /* ... */ }
    // Enum instances are guaranteed to be initialized exactly once
    // by the JVM's class loading mechanism — thread-safe by spec
}

// Pattern 4: final fields — always safely published
class SafePoint {
    private final int x;
    private final int y;

    SafePoint(int x, int y) {
        this.x = x;
        this.y = y;
        // The reference must NOT escape the constructor before it completes
        // (e.g., don't pass 'this' to another thread in the constructor)
    }

    // Any thread that sees a reference to SafePoint is guaranteed
    // to see x and y as initialized — JSR-133 final field freeze guarantee
}
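The "reference must not escape" caveat is worth its own example. The following sketch (EventBus and LeakyListener are illustrative names, not from the patterns above) shows the classic this-escape that voids even the final-field guarantee:

```java
// Anti-pattern sketch: 'this' escapes the constructor before it completes.
import java.util.ArrayList;
import java.util.List;

class EventBus {
    interface Listener { int id(); }
    private final List<Listener> listeners = new ArrayList<>();
    void register(Listener l) { listeners.add(l); } // other threads may see l here
}

class LeakyListener implements EventBus.Listener {
    private final int id;

    LeakyListener(EventBus bus, int id) {
        bus.register(this); // BUG: 'this' escapes BEFORE 'id' is assigned.
                            // A thread that obtains the listener through the bus
                            // may call id() and observe 0, even though id is final:
                            // the JSR-133 freeze only protects references obtained
                            // after the constructor completes without leaking 'this'.
        this.id = id;
    }

    @Override public int id() { return id; }
}
```

The fix is to finish construction first and register afterward, e.g. via a static factory method that constructs the listener and then calls bus.register().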
7. Data Races and How to Detect Them
A data race occurs when two or more threads access the same variable, at least one of the accesses is a write, and the accesses are not ordered by a happens-before relationship. In the presence of a data race, the JMM gives only minimal guarantees about what value a reader will observe: reads may return arbitrarily stale values, the program's behavior becomes effectively unpredictable, and for non-volatile long and double fields a reader may even see a torn value that no thread ever wrote.
Data races lead to consequences ranging from stale reads (seeing an old value), torn reads (seeing a value that was never stored — possible for 64-bit long and double on 32-bit JVMs), to infinite loops caused by JIT hoisting of reads out of loops:
// Data race example — two threads increment without synchronization
class DataRaceExample {
    int sharedCounter = 0; // not volatile, not synchronized

    void run() throws InterruptedException {
        Thread t1 = new Thread(() -> {
            for (int i = 0; i < 100_000; i++) sharedCounter++;
        });
        Thread t2 = new Thread(() -> {
            for (int i = 0; i < 100_000; i++) sharedCounter++;
        });
        t1.start(); t2.start();
        t1.join(); t2.join();
        // Expected: 200000. Actual: typically well short of 200000 —
        // every interleaved read-modify-write loses an increment
        System.out.println(sharedCounter);
    }
}
// How to capture lock contention with JFR (JDK 11+)
// Enable from the command line:
//   java -XX:StartFlightRecording=filename=race.jfr,settings=profile MyApp
//
// Or via code using the JFR API:
import jdk.jfr.Recording;

class JfrRaceDetection {
    static void captureContention() throws Exception {
        try (Recording recording = new Recording()) {
            recording.enable("jdk.JavaMonitorEnter")
                     .withThreshold(java.time.Duration.ofMillis(10));
            recording.enable("jdk.ThreadPark");
            recording.start();
            // ... run workload under test ...
            Thread.sleep(5000);
            recording.stop();
            recording.dump(java.nio.file.Path.of("contention.jfr"));
        }
        // Open contention.jfr in JDK Mission Control to visualize lock contention
    }
}
For native (JNI) code or when running Java through a native test harness, ThreadSanitizer (TSan) is the gold-standard dynamic race detector. For pure Java, the primary tools are: jcstress (OpenJDK's concurrency stress-testing harness, purpose-built for exposing races and reorderings), async-profiler with lock contention profiling (-e lock), and JFR with the jdk.JavaMonitorEnter and jdk.JavaMonitorWait events. Note that the @Contended annotation (usable in application code with -XX:-RestrictContended) mitigates false sharing by padding fields; it does not detect races. Static analysis tools such as SpotBugs can also flag common data race patterns, like inconsistent synchronization of a field, at build time.
8. Memory Barriers and JVM Implementation
The JMM's happens-before guarantees are implemented at the JVM/JIT level by inserting memory barriers (also called memory fences) into the generated machine code. A memory barrier is a CPU instruction that prevents certain types of instruction reordering across it and ensures that memory operations above or below the barrier are completed before execution continues past it.
There are four canonical barrier types:
- LoadLoad: ensures all loads before the barrier complete before any load after the barrier.
- StoreStore: ensures all stores before the barrier complete before any store after the barrier.
- LoadStore: ensures all loads before the barrier complete before any store after the barrier.
- StoreLoad: the most expensive — ensures all stores before the barrier are visible to all processors before any load after the barrier. This is the full fence.
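Since Java 9, approximations of these barriers are exposed directly to Java code as static fence methods on java.lang.invoke.VarHandle (fullFence, acquireFence, releaseFence, loadLoadFence, storeStoreFence). A sketch of the mapping (the FenceDemo class is ours, for illustration; in application code you would normally reach for volatile rather than raw fences):

```java
// Java 9+ explicit fences via java.lang.invoke.VarHandle
import java.lang.invoke.VarHandle;

class FenceDemo {
    static int a, b;

    static void publish() {
        a = 1;
        VarHandle.storeStoreFence(); // StoreStore: the store to 'a' is ordered before 'b'
        b = 1;
    }

    static int consume() {
        int rb = b;
        VarHandle.loadLoadFence();   // LoadLoad: the load of 'b' is ordered before 'a'
        int ra = a;
        return rb == 1 ? ra : -1;    // if b's write was seen, a's write must be too
    }

    static void fullBarrier() {
        VarHandle.fullFence();       // full fence, including StoreLoad ordering
    }
}
```

There is no loadStoreFence method; acquireFence covers LoadLoad plus LoadStore, and releaseFence covers LoadStore plus StoreStore.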
For volatile fields, the JIT (following the conservative JSR-133 Cookbook strategy) inserts a StoreStore barrier before every volatile write, ensuring all prior writes are committed first, and a StoreLoad barrier after it, ensuring the volatile write is globally visible before any subsequent read. After every volatile read it inserts LoadLoad and LoadStore barriers, ensuring the read completes before any subsequent loads or stores. synchronized follows the same pattern: acquiring a monitor has the barrier semantics of a volatile read (acquire), and releasing it has the semantics of a volatile write (release).
// To inspect JIT-generated assembly and memory barrier placement
// (requires the hsdis — HotSpot Disassembler — library):
//
// Compile:
//   javac VolatileBarrier.java
//
// Run with PrintAssembly:
//   java -XX:+UnlockDiagnosticVMOptions \
//        -XX:+PrintAssembly \
//        -XX:CompileCommand=compileonly,VolatileBarrier::writer \
//        VolatileBarrier
//
// On x86, a volatile write emits:
//   mov [addr], value     ; the store
//   lock addl $0, (%rsp)  ; StoreLoad fence (MFENCE equivalent)
//
// On ARM64 (AArch64), a volatile write emits:
//   stlr x0, [x1]         ; store-release — prior loads/stores ordered before it
// and a volatile read emits:
//   ldar x0, [x1]         ; load-acquire — subsequent loads/stores ordered after it
// (paired stlr/ldar also provide the required StoreLoad ordering)
class VolatileBarrier {
    static volatile int guard = 0;
    static int data = 0;

    static void writer() {
        data = 42; // regular store
        guard = 1; // volatile store — StoreStore before, StoreLoad after
    }

    static void reader() {
        int g = guard; // volatile load — LoadLoad + LoadStore after
        if (g == 1) {
            System.out.println(data); // guaranteed to see 42
        }
    }
}
Key Takeaways
- The JMM is the contract between the Java programmer and the JVM/CPU. Without an explicit happens-before edge, there is no visibility guarantee — period.
- happens-before is about visibility, not time. Code running earlier in wall-clock time does not guarantee visibility unless a formal happens-before edge (volatile, lock, thread start/join) connects the writer to the reader.
- volatile guarantees visibility and ordering, not atomicity. Use AtomicXxx for compound read-modify-write operations; use volatile for simple flags and safe publication.
- synchronized provides both mutual exclusion and visibility. The unlock happens-before the next lock on the same monitor — all writes under the lock are visible to the next thread that acquires it.
- Double-Checked Locking requires volatile. Without volatile, a thread may see a non-null reference to a partially-constructed object. Prefer the Initialization-on-demand Holder idiom or enum singleton instead.
- Safe publication requires a happens-before edge. Four mechanisms: static initializer, volatile field, synchronized block, or final fields (JSR-133 freeze guarantee).
- Data races have undefined behavior in the JMM. Use JFR, async-profiler lock profiling, and SpotBugs to detect them statically and dynamically.
- x86 hides JMM violations that ARM exposes. Test on production architecture; never assume that passing on x86 means correct on ARM Graviton.
Conclusion
The Java Memory Model is the invisible foundation on which all correct concurrent Java programs rest. Most Java developers write concurrent code for years without consciously thinking about the JMM — and many of them ship subtle bugs that manifest only under load on specific hardware, or that are accidentally masked by the stronger memory model of x86. Understanding happens-before, the semantics of volatile and synchronized, safe publication patterns, and how the JVM implements these guarantees with memory barriers is not academic knowledge — it is operational knowledge required to build reliable concurrent systems.
The practical takeaway is simple: every shared mutable variable needs an explicit synchronization strategy, and that strategy must establish a happens-before chain between every writer and every reader. Whether you choose volatile, synchronized, AtomicXxx, ReentrantLock, or higher-level concurrency utilities from java.util.concurrent, the JMM is the lens through which you should evaluate the correctness of your choice. With ARM-based infrastructure now ubiquitous in cloud environments, the cost of getting this wrong has never been higher — or the value of getting it right more evident.