When should I use WebSocket instead of REST?

Use WebSocket when you need persistent bidirectional communication: real-time chat, live dashboards, collaborative editing, multiplayer games, stock price feeds. For one-directional server-to-client updates (notifications, log streaming), Server-Sent Events (SSE) is simpler. For request-response with occasional pushes, REST + webhooks or REST + SSE is usually sufficient.

How do I scale WebSocket connections across multiple servers?

WebSocket connections are stateful and sticky — a client stays connected to one server. To scale: (1) Use a message broker relay (RabbitMQ STOMP plugin or Redis pub/sub) so all servers can publish to any connected client. (2) Spring Cloud Gateway or ALB with sticky sessions routes a client consistently to the same server. The message broker relay is preferred as it avoids sticky session dependency.

What is STOMP and why use it over raw WebSocket?

STOMP (Simple Text Oriented Messaging Protocol) is a messaging protocol that runs over WebSocket, providing pub/sub topic routing, message headers, receipts, and heartbeats. Raw WebSocket is just a byte stream — you'd need to implement routing yourself. STOMP gives you @MessageMapping, /topic/ pub/sub, /user/ queue per-user messages, and SockJS fallback in Spring — all production-ready.

What happens to WebSocket connections during a rolling deployment?

Active WebSocket connections are terminated when a pod/instance shuts down. Clients should implement automatic reconnection with exponential backoff. Use a STOMP heartbeat (25s interval) to detect dead connections. For zero-downtime, drain connections gracefully: stop accepting new connections first, wait for existing ones to disconnect (or set a timeout), then shut down the instance. Kubernetes preStop hooks help with this.

How do I authenticate WebSocket connections with JWT?

Authenticate at the STOMP CONNECT frame level using a ChannelInterceptor. Extract the JWT from the STOMP CONNECT headers (not HTTP headers, as WebSocket upgrade headers have limited support in browsers). Validate the JWT and set the authenticated Principal on the message header. This runs once per WebSocket session (at CONNECT), not per message.

Spring Boot April 11, 2026 · 21 min read

WebSocket in Spring Boot: Real-Time Chat, Notifications & Horizontal Scaling (2026)

Build production-grade real-time applications with Spring Boot and WebSocket: STOMP protocol, SockJS fallback, JWT authentication at the STOMP layer, user-specific queues, Redis pub/sub for multi-instance scaling, presence indicators, and production tuning.

Md Sanwar Hossain
Senior Java & Backend Engineer

WebSocket Spring Boot Real-Time Guide 2026

TL;DR: WebSocket is the right choice for persistent bidirectional communication — chat, live dashboards, collaborative editing. Use STOMP over WebSocket for pub/sub topic routing in Spring Boot, and Redis pub/sub to broadcast messages across multiple server instances.

1. WebSocket vs SSE vs Long Polling

Technique	Direction	Protocol	Use Case	Server Load
Long Polling	Server → Client	HTTP	Simple notifications	High (N open connections)
SSE	Server → Client only	HTTP/2	Live feeds, dashboards	Low
WebSocket	⚡ Bidirectional	WS (upgrade from HTTP)	Chat, collab, gaming	Low (persistent conn)

2. Spring Boot Setup: STOMP Config

// ❌ BAD: Simple in-memory broker (no clustering, lost messages on restart)

@Override
public void configureMessageBroker(MessageBrokerRegistry registry) {
    registry.enableSimpleBroker("/topic", "/queue");  // In-memory only!
    registry.setApplicationDestinationPrefixes("/app");
}

// ✅ GOOD: Full STOMP config with SockJS + external broker relay

@Configuration
@EnableWebSocketMessageBroker
public class WebSocketConfig implements WebSocketMessageBrokerConfigurer {

    @Override
    public void configureMessageBroker(MessageBrokerRegistry registry) {
        // External broker relay (RabbitMQ with STOMP plugin) — survives restarts, scales
        registry.enableStompBrokerRelay("/topic", "/queue")
            .setRelayHost("rabbitmq.internal")
            .setRelayPort(61613)
            .setClientLogin("ws-user")
            .setClientPasscode("ws-password")
            .setSystemLogin("system")
            .setSystemPasscode("system-password")
            .setVirtualHost("/")
            .setHeartbeatSendInterval(10000)
            .setHeartbeatReceiveInterval(10000);
        registry.setApplicationDestinationPrefixes("/app");
        registry.setUserDestinationPrefix("/user");
    }

    @Override
    public void registerStompEndpoints(StompEndpointRegistry registry) {
        registry.addEndpoint("/ws")
            .setAllowedOriginPatterns("https://*.myapp.com")
            .withSockJS()    // fallback for browsers that don't support WebSocket
            .setHeartbeatTime(25_000);
    }
}

3. STOMP Protocol: Message Flow

SUBSCRIBE /topic/chat-room-1 → receive all messages published to that topic
SEND /app/chat.send → routed to @MessageMapping("/chat.send") handler
@SendTo("/topic/chat-room-1") → broadcast to all subscribers of that topic
@SendToUser("/queue/reply") → sends only to the authenticated user who sent the message
SimpMessagingTemplate → programmatically send to any destination from any Spring component

4. Real-Time Chat Application

// ✅ GOOD: ChatController + WebSocket event listener

@Controller
public class ChatController {
    @Autowired private SimpMessagingTemplate messagingTemplate;
    @Autowired private MessageRepository messageRepository;

    @MessageMapping("/chat.send")
    @SendTo("/topic/chat-room-{roomId}")    // broadcast to room
    public ChatMessage sendMessage(@DestinationVariable String roomId,
                                   @Payload ChatMessage message,
                                   Principal principal) {
        message.setSenderId(principal.getName());
        message.setTimestamp(Instant.now());
        // Async persist to DB (don't block WebSocket thread)
        CompletableFuture.runAsync(() -> messageRepository.save(message));
        return message;
    }

    @MessageMapping("/chat.typing")
    public void typingIndicator(@DestinationVariable String roomId,
                                 @Payload TypingEvent event, Principal principal) {
        event.setUserId(principal.getName());
        messagingTemplate.convertAndSend("/topic/chat-room-" + roomId + "/typing", event);
    }
}

@Component
public class WebSocketEventListener {
    @Autowired private SimpMessageSendingOperations messagingTemplate;

    @EventListener
    public void handleWebSocketConnectListener(SessionConnectedEvent event) {
        StompHeaderAccessor sha = StompHeaderAccessor.wrap(event.getMessage());
        log.info("User connected: {}", sha.getUser().getName());
    }

    @EventListener
    public void handleWebSocketDisconnectListener(SessionDisconnectEvent event) {
        StompHeaderAccessor sha = StompHeaderAccessor.wrap(event.getMessage());
        ChatMessage leaveMessage = ChatMessage.builder()
            .type(MessageType.LEAVE)
            .senderId(sha.getUser().getName())
            .build();
        messagingTemplate.convertAndSend("/topic/public", leaveMessage);
    }
}

5. JWT Authentication for WebSocket

// ✅ GOOD: ChannelInterceptor validates JWT on STOMP CONNECT frame

@Component
public class JwtChannelInterceptor implements ChannelInterceptor {
    @Autowired private JwtDecoder jwtDecoder;

    @Override
    public Message<?> preSend(Message<?> message, MessageChannel channel) {
        StompHeaderAccessor accessor = MessageHeaderAccessor.getAccessor(
            message, StompHeaderAccessor.class);

        if (StompCommand.CONNECT.equals(accessor.getCommand())) {
            // JWT passed in STOMP CONNECT header (not HTTP Authorization header)
            String token = accessor.getFirstNativeHeader("Authorization");
            if (token != null && token.startsWith("Bearer ")) {
                try {
                    Jwt jwt = jwtDecoder.decode(token.substring(7));
                    UsernamePasswordAuthenticationToken auth =
                        new UsernamePasswordAuthenticationToken(
                            jwt.getSubject(), null, extractAuthorities(jwt));
                    accessor.setUser(auth);  // set Principal for @SendToUser
                } catch (JwtException e) {
                    throw new MessagingException("Invalid JWT: " + e.getMessage());
                }
            } else {
                throw new MessagingException("Missing JWT in CONNECT frame");
            }
        }
        return message;
    }

    @Override
    public void configureClientInboundChannel(ChannelRegistration registration) {
        registration.interceptors(new JwtChannelInterceptor());
    }
}

6. User-Specific Queues: Private Messages

// ✅ GOOD: Direct message to a specific user using /user/ prefix

// Client subscribes to: /user/queue/direct-messages
// (STOMP prefix /user + username + /queue/direct-messages)

@MessageMapping("/dm.send")
public void sendDirectMessage(@Payload DirectMessage dm, Principal sender) {
    dm.setSenderId(sender.getName());
    dm.setTimestamp(Instant.now());

    // Send to recipient only (even if connected to different server instance)
    messagingTemplate.convertAndSendToUser(
        dm.getRecipientId(),         // username
        "/queue/direct-messages",    // destination
        dm                           // payload
    );

    // Also send back to sender (for their own chat window)
    messagingTemplate.convertAndSendToUser(
        sender.getName(),
        "/queue/direct-messages",
        dm
    );
    // Async persist
    messageRepository.save(dm);
}

7. Horizontal Scaling with Redis Pub/Sub

The problem: User A connects to Server 1. User B connects to Server 2. If User A sends a message, Server 1 can only deliver it to clients connected to Server 1 — User B never receives it.

Solution: All servers subscribe to a Redis pub/sub channel. When any server receives a message, it publishes to Redis; all servers receive it and forward to their locally connected clients.

// ✅ GOOD: Redis pub/sub relay for WebSocket horizontal scaling

@Configuration
public class RedisWebSocketRelay {
    @Autowired private SimpMessagingTemplate messagingTemplate;

    @Bean
    public MessageListenerAdapter messageListenerAdapter() {
        return new MessageListenerAdapter(new RedisMessageDelegate(messagingTemplate));
    }

    @Bean
    public RedisMessageListenerContainer redisListenerContainer(
            RedisConnectionFactory connectionFactory,
            MessageListenerAdapter listenerAdapter) {
        RedisMessageListenerContainer container = new RedisMessageListenerContainer();
        container.setConnectionFactory(connectionFactory);
        // Subscribe to all chat channel patterns
        container.addMessageListener(listenerAdapter, new PatternTopic("chat:*"));
        return container;
    }
}

@Service
public class ChatBroadcastService {
    @Autowired private RedisTemplate<String, ChatMessage> redisTemplate;
    @Autowired private SimpMessagingTemplate wsTemplate;

    public void broadcast(String roomId, ChatMessage message) {
        // Publish to Redis — all instances will receive and forward to WS clients
        redisTemplate.convertAndSend("chat:" + roomId, message);
    }
}

class RedisMessageDelegate {
    private final SimpMessagingTemplate wsTemplate;

    public void handleMessage(ChatMessage message, String channel) {
        String roomId = channel.replace("chat:", "");
        wsTemplate.convertAndSend("/topic/chat-room-" + roomId, message);
    }
}

8. Presence & Typing Indicators

// ✅ GOOD: Presence tracking with Redis + TTL for crash recovery

@Service
public class PresenceService {
    @Autowired private RedisTemplate<String, String> redisTemplate;
    private static final int PRESENCE_TTL_SECONDS = 35;  // slightly > heartbeat interval

    public void userConnected(String roomId, String userId) {
        String key = "presence:" + roomId;
        redisTemplate.opsForSet().add(key, userId);
        redisTemplate.expire(key, PRESENCE_TTL_SECONDS, TimeUnit.SECONDS);  // auto-cleanup on crash
        broadcastPresence(roomId);
    }

    public void userDisconnected(String roomId, String userId) {
        redisTemplate.opsForSet().remove("presence:" + roomId, userId);
        broadcastPresence(roomId);
    }

    // Called on each heartbeat to refresh TTL
    public void refreshPresence(String roomId, String userId) {
        redisTemplate.opsForSet().add("presence:" + roomId, userId);
        redisTemplate.expire("presence:" + roomId, PRESENCE_TTL_SECONDS, TimeUnit.SECONDS);
    }

    public Set<String> getOnlineUsers(String roomId) {
        return redisTemplate.opsForSet().members("presence:" + roomId);
    }
}

9. Production Tuning & Connection Limits

Config	Default	Production	Notes
Max text message size	64KB	128KB	For rich messages with metadata
Max binary message size	512KB	1MB	Adjust per use case
Connections per JVM	~65K theoretical	20-30K safe	~10KB memory per idle conn
SockJS heartbeat	25s	25s	Keep alive through proxies/LBs

10. Interview Questions & Production Checklist

✅ WebSocket Production Checklist

Use external broker relay for production clustering
JWT auth on STOMP CONNECT frame
Redis pub/sub for multi-instance message delivery
Presence tracking with Redis TTL
Client reconnect with exponential backoff
SockJS fallback for load balancer compatibility
Heartbeat to keep connections alive through proxies
Graceful shutdown drain before pod restart

11. At BRAC IT: Real-Time Loan Status Notifications

At BRAC IT, we manage a microfinance platform that processes loan applications for hundreds of thousands of borrowers across Bangladesh. When we first launched the platform, loan applicants had no way to know the status of their application without repeatedly polling our REST API. The mobile app refreshed every 30 seconds — adding unnecessary server load and still leaving users with stale data. During peak periods our polling endpoints were handling 400+ req/sec of status-check traffic, with 95% of those returning "no change."

We replaced polling with a WebSocket notification system. When a loan officer approves or rejects an application, or when the automated scoring engine produces a result, a domain event is published to Kafka. A notification microservice consumes the event and pushes the update directly to the applicant's open WebSocket connection. The mobile app now shows a real-time status banner — "Your loan application has been approved!" — the moment the decision is made.

System Architecture

At peak usage we sustain 15,000 concurrent WebSocket connections — one per active loan applicant session. We run three notification service pods, each holding roughly 5,000 connections. The key architectural challenge: when a loan decision event arrives at any pod, the borrower's WebSocket connection might be on a different pod. A simple in-process SimpMessagingTemplate.convertAndSendToUser() would only work if the event arrived at the same pod that holds that user's connection.

The solution: configure Spring's message broker relay using Redis pub/sub as the shared message bus. Every pod subscribes to the same Redis channel. When any pod receives a domain event, it publishes to Redis, and all pods receive it. Each pod checks whether it holds a connection for that user — if yes, it delivers locally. This removes the need for sticky session routing at the load balancer level.

// ✅ BRAC IT: Redis pub/sub relay — any pod can push to any connected client

@Configuration
@EnableWebSocketMessageBroker
public class LoanNotificationWebSocketConfig implements WebSocketMessageBrokerConfigurer {

    @Value("${redis.host}") private String redisHost;
    @Value("${redis.port}") private int    redisPort;

    @Override
    public void configureMessageBroker(MessageBrokerRegistry registry) {
        // External Redis relay — shared across all pods
        registry.enableStompBrokerRelay("/topic", "/queue")
            .setRelayHost(redisHost)
            .setRelayPort(redisPort)
            .setClientLogin("borrower-svc")
            .setClientPasscode("${REDIS_PASSWORD}")
            .setSystemLogin("borrower-svc-system")
            .setSystemPasscode("${REDIS_PASSWORD}")
            // Heartbeat between gateway and Redis broker
            .setSystemHeartbeatSendInterval(10_000)
            .setSystemHeartbeatReceiveInterval(10_000);

        registry.setApplicationDestinationPrefixes("/app");
        registry.setUserDestinationPrefix("/user");
    }

    @Override
    public void registerStompEndpoints(StompEndpointRegistry registry) {
        registry.addEndpoint("/ws/notifications")
            .setAllowedOriginPatterns("https://*.bracit.com.bd")
            .withSockJS()
            .setHeartbeatTime(25_000)
            .setDisconnectDelay(5_000);
    }
}

// Kafka consumer that pushes events to WebSocket clients
@Component
@Slf4j
public class LoanDecisionEventConsumer {

    @Autowired private SimpMessagingTemplate messagingTemplate;
    @Autowired private LoanNotificationMapper mapper;

    @KafkaListener(topics = "loan.decisions", groupId = "notification-service",
                   containerFactory = "loanDecisionListenerFactory")
    public void handleLoanDecision(LoanDecisionEvent event) {
        LoanStatusNotification notification = mapper.toNotification(event);

        // Push to the specific borrower — works regardless of which pod holds the connection
        // Spring + Redis relay routes it to the correct pod automatically
        messagingTemplate.convertAndSendToUser(
            event.getBorrowerId(),          // Spring prefixes this to /user/{borrowerId}/queue/
            "/queue/loan-status",
            notification
        );

        log.info("Pushed loan decision to borrower={} status={} loanId={}",
            event.getBorrowerId(), event.getStatus(), event.getLoanId());
    }
}

Session Affinity: Why We Moved Away From It

Our first scaling attempt used sticky sessions at the AWS ALB level — each client was pinned to one pod via a cookie. This created uneven load distribution: some pods had 8,000 connections while others had 2,000. More critically, when a pod was restarted for a deployment, all 8,000 sticky connections dropped simultaneously, causing a reconnection storm. The Redis relay approach distributes reconnections evenly across pods since there's no affinity requirement.

12. WebSocket vs SSE vs Long Polling: When to Use Each

The choice between WebSocket, Server-Sent Events (SSE), and long polling has significant implications for server resource usage, browser compatibility, proxy behavior, and development complexity. Use this comparison table to make an informed decision before reaching for WebSocket by default:

Dimension	WebSocket	Server-Sent Events (SSE)	Long Polling
Protocol	ws:// / wss:// (TCP upgrade)	HTTP/1.1 or HTTP/2 streaming	Standard HTTP
Direction	Full-duplex bidirectional	Server → Client only	Server → Client (via poll)
Browser Support	All modern browsers (with SockJS fallback for older)	All modern; IE needs polyfill	Universal — plain HTTP
Proxy / Firewall	Some corporate proxies block WS upgrades; SockJS mitigates	Works through all HTTP proxies	Works through all HTTP proxies
Server Load	One persistent connection per client; ~10KB idle memory	One persistent HTTP stream per client	One thread blocked per client poll (high for many clients)
Latency	Sub-millisecond (persistent socket)	Sub-millisecond (persistent stream)	Up to poll timeout (1–30s)
Best For	Chat, collaborative editing, gaming, trading dashboards	Live feeds, notifications, log streaming, progress bars	Legacy environments, simple notifications with rare updates
Spring Support	Full: `@EnableWebSocketMessageBroker`, STOMP, SockJS	Full: `SseEmitter` in MVC, `Flux<ServerSentEvent>` in WebFlux	Standard MVC: `DeferredResult`, `Callable`

Recommendation: Use WebSocket when you need client-to-server messages in addition to server-to-client (e.g., chat, where clients send messages back). Use SSE when traffic is strictly one-directional from server to client and you want the simplest possible implementation — SSE reuses standard HTTP, works through every proxy, and reconnects automatically on the client side without any client-side library. Use long polling only as a last resort for legacy environments or when you need maximum compatibility with no persistent connection infrastructure.

// SSE in Spring Boot — simpler than WebSocket for server-only push

// Spring MVC SSE — great for progress bars, notifications, log tailing
@GetMapping(value = "/api/loans/{id}/status-stream",
            produces = MediaType.TEXT_EVENT_STREAM_VALUE)
public SseEmitter streamLoanStatus(@PathVariable String id) {
    SseEmitter emitter = new SseEmitter(300_000L); // 5 min timeout
    loanStatusService.subscribe(id, emitter);
    return emitter;
}

// Spring WebFlux SSE — fully reactive, backpressure-aware
@GetMapping(value = "/api/loans/{id}/status-stream",
            produces = MediaType.TEXT_EVENT_STREAM_VALUE)
public Flux<ServerSentEvent<LoanStatus>> streamLoanStatusReactive(
        @PathVariable String id) {
    return loanStatusService.getStatusFlux(id)
        .map(status -> ServerSentEvent.<LoanStatus>builder()
            .id(UUID.randomUUID().toString())
            .event("loan-status-update")
            .data(status)
            .build());
}

13. Handling Reconnection and Message Delivery Guarantees

WebSocket connections are not TCP streams — they can drop at any moment due to network flaps, mobile device sleep, battery optimization, or backend pod restarts. A production WebSocket system must be designed with the assumption that connections will drop and implement both client-side reconnection and server-side message buffering to prevent message loss.

Client-Side: Exponential Backoff Reconnection

Never reconnect immediately on disconnect — a "thundering herd" of clients all reconnecting at the same moment (e.g., after a pod restart) can overwhelm the server. Implement exponential backoff with jitter:

// JavaScript: ReconnectableWebSocketClient with exponential backoff + jitter

class ReconnectableStompClient {
    constructor(url, token, onMessage) {
        this.url        = url;
        this.token      = token;
        this.onMessage  = onMessage;
        this.attempt    = 0;
        this.maxDelay   = 30_000;   // cap at 30 seconds
        this.baseDelay  = 500;      // start at 500ms
        this.client     = null;
        this.connect();
    }

    connect() {
        const socket = new SockJS(this.url);
        this.client  = Stomp.over(socket);
        this.client.debug = null; // suppress debug logs in production

        const headers = { Authorization: `Bearer ${this.token}` };

        this.client.connect(headers,
            (frame) => {
                console.log('Connected:', frame);
                this.attempt = 0; // reset backoff on success
                this.client.subscribe('/user/queue/loan-status',
                    (msg) => this.onMessage(JSON.parse(msg.body)));
            },
            (error) => {
                console.warn('Connection lost, reconnecting...', error);
                this.scheduleReconnect();
            }
        );
    }

    scheduleReconnect() {
        // Exponential backoff: 500ms, 1s, 2s, 4s, 8s, 16s, 30s (capped)
        const exponential = this.baseDelay * Math.pow(2, this.attempt);
        const jitter      = Math.random() * 1000;   // up to 1s random jitter
        const delay       = Math.min(exponential + jitter, this.maxDelay);
        this.attempt++;
        console.log(`Reconnecting in ${Math.round(delay)}ms (attempt ${this.attempt})`);
        setTimeout(() => this.connect(), delay);
    }
}

Server-Side: Message Buffering for Reconnecting Clients

If a loan decision arrives while a borrower's connection is dropped, the message would be lost without buffering. We store pending messages in Redis with a 5-minute TTL. When a client reconnects and sends a STOMP CONNECT frame, an interceptor checks Redis for any buffered messages and drains them immediately:

// ✅ Server-side message buffer: store missed messages in Redis for 5 minutes

@Service
@Slf4j
public class MessageBufferService {

    private static final Duration BUFFER_TTL = Duration.ofMinutes(5);
    private final RedisTemplate<String, LoanStatusNotification> redisTemplate;
    private final SimpMessagingTemplate messagingTemplate;

    // Called when no active WebSocket connection for this user
    public void bufferMessage(String userId, LoanStatusNotification notification) {
        String key = "ws:buffer:" + userId;
        redisTemplate.opsForList().rightPush(key, notification);
        redisTemplate.expire(key, BUFFER_TTL);
        log.debug("Buffered notification for offline user={} loanId={}",
            userId, notification.getLoanId());
    }

    // Called when user reconnects — drain buffer to their session
    public void drainBuffer(String userId) {
        String key = "ws:buffer:" + userId;
        List<LoanStatusNotification> buffered =
            redisTemplate.opsForList().range(key, 0, -1);

        if (buffered != null && !buffered.isEmpty()) {
            log.info("Delivering {} buffered messages to reconnected user={}",
                buffered.size(), userId);
            buffered.forEach(notification ->
                messagingTemplate.convertAndSendToUser(
                    userId, "/queue/loan-status", notification));
            redisTemplate.delete(key);  // clear buffer after delivery
        }
    }
}

// Channel interceptor — drain buffer on CONNECT
@Component
public class ReconnectDrainInterceptor implements ChannelInterceptor {

    @Autowired private MessageBufferService bufferService;

    @Override
    public Message<?> preSend(Message<?> message, MessageChannel channel) {
        StompHeaderAccessor accessor = MessageHeaderAccessor.getAccessor(
            message, StompHeaderAccessor.class);

        if (StompCommand.CONNECT.equals(accessor.getCommand())) {
            // JWT validation happens in JwtChannelInterceptor (runs first)
            // At this point, accessor.getUser() is already set
        }
        if (StompCommand.CONNECTED.equals(accessor.getCommand())
                && accessor.getUser() != null) {
            // Drain any buffered messages after successful authentication
            bufferService.drainBuffer(accessor.getUser().getName());
        }
        return message;
    }
}

14. WebSocket Security: Beyond JWT

JWT authentication on the STOMP CONNECT frame is the first line of defence, but a hardened production WebSocket service needs additional security controls. Three areas that are commonly overlooked: CSRF for WebSocket endpoints, rate limiting per connection to prevent message flooding, and payload size limits to prevent memory exhaustion attacks.

CSRF for WebSocket

Cross-origin WebSocket connections are not subject to the same-origin policy in the same way as XHR requests, but browsers do send cookies (including session cookies) on the WebSocket HTTP upgrade request. A malicious site could therefore initiate a WebSocket connection to your server using the victim's session cookies. The defense: validate the Origin header on the WebSocket upgrade request and use token-based authentication (JWT in STOMP CONNECT) rather than cookie-based sessions.

// ✅ Origin validation on WebSocket upgrade + disable CSRF for WS (using JWT instead)

@Configuration
@EnableWebSecurity
public class WebSocketSecurityConfig {

    @Bean
    public SecurityFilterChain filterChain(HttpSecurity http) throws Exception {
        http
            // CSRF for WebSocket endpoints is handled via Origin check + JWT
            // not via CSRF tokens (which can't be injected into WS handshake easily)
            .csrf(csrf -> csrf
                .ignoringRequestMatchers("/ws/**"))
            .headers(headers -> headers
                .frameOptions(f -> f.deny()));
        return http.build();
    }
}

// Validate Origin in the WebSocket config itself
@Override
public void registerStompEndpoints(StompEndpointRegistry registry) {
    registry.addEndpoint("/ws/notifications")
        // ONLY allow connections from our own domains — blocks cross-origin WS abuse
        .setAllowedOriginPatterns(
            "https://*.bracit.com.bd",
            "https://mobileapp.bracit.com.bd"
        )
        .withSockJS();
}

Rate Limiting Per Connection: Preventing Message Flooding

A connected client can send STOMP frames as fast as the connection allows. Without rate limiting, a single malicious or buggy client could flood your server with messages, consuming thread pool capacity and degrading service for all other users. We implement a per-user message rate limiter using a sliding window counter stored in Redis:

// ✅ ChannelInterceptor: rate limit inbound STOMP messages per user (10/sec)

@Component
@Slf4j
public class MessageRateLimitingInterceptor implements ChannelInterceptor {

    private static final int MAX_MESSAGES_PER_SECOND = 10;
    private final RedisTemplate<String, Long> redisTemplate;

    @Override
    public Message<?> preSend(Message<?> message, MessageChannel channel) {
        StompHeaderAccessor accessor = MessageHeaderAccessor.getAccessor(
            message, StompHeaderAccessor.class);

        // Only rate-limit SEND frames (client sending messages to server)
        if (!StompCommand.SEND.equals(accessor.getCommand())) {
            return message;
        }

        if (accessor.getUser() == null) {
            throw new MessagingException("Unauthenticated message rejected");
        }

        String userId = accessor.getUser().getName();
        if (isRateLimited(userId)) {
            log.warn("Message rate limit exceeded for user={}", userId);
            throw new MessagingException("Rate limit exceeded: max "
                + MAX_MESSAGES_PER_SECOND + " messages per second");
        }

        return message;
    }

    private boolean isRateLimited(String userId) {
        String key      = "ws:ratelimit:" + userId;
        long   now      = System.currentTimeMillis();
        long   windowMs = 1000L;

        // Sliding window: increment counter, set TTL on first message
        Long count = redisTemplate.opsForValue().increment(key);
        if (count != null && count == 1L) {
            redisTemplate.expire(key, Duration.ofMillis(windowMs));
        }
        return count != null && count > MAX_MESSAGES_PER_SECOND;
    }
}

// Register interceptor in WebSocket config
@Override
public void configureClientInboundChannel(ChannelRegistration registration) {
    registration.interceptors(
        jwtChannelInterceptor,          // auth first
        messageRateLimitingInterceptor  // then rate limit
    );
}

Additionally, configure Spring's built-in message size limits to prevent individual messages from exhausting heap memory. A single large message can cause an OOM error if the buffer size is not capped:

// ✅ Message size limits to prevent memory exhaustion attacks

@Override
public void configureWebSocketTransport(WebSocketTransportRegistration registration) {
    registration
        .setMessageSizeLimit(64 * 1024)         // max 64KB per STOMP message
        .setSendBufferSizeLimit(512 * 1024)      // max 512KB per-session send buffer
        .setSendTimeLimit(20_000)                // disconnect if send takes >20s
        .setTimeToFirstMessage(30_000);          // disconnect if no CONNECT within 30s
}

15. Performance Tuning WebSocket Connections

Sustaining 15,000 concurrent connections at BRAC IT required careful tuning of heartbeats, thread pools, buffer sizes, and JVM parameters. The defaults are designed for correctness at low scale; production needs deliberate configuration for high connection density.

Heartbeat Configuration

STOMP heartbeats keep connections alive through load balancers and proxies that have idle connection timeouts. AWS ALB drops idle connections after 60 seconds by default. Our heartbeat interval of 25 seconds ensures the connection is never idle long enough to be dropped:

# application.yml — BRAC IT WebSocket production tuning

spring:
  websocket:
    # No direct YAML config for message broker — use Java config
    # These affect the underlying Tomcat/Netty connector

server:
  tomcat:
    threads:
      max: 400         # Handle WS upgrade HTTP requests + REST endpoints
      min-spare: 20
    connection-timeout: 20000
    max-connections: 20000     # Tomcat max WS + HTTP connections

---
# JVM flags for high connection density (add to JAVA_OPTS in pod spec)
# -Xms2g -Xmx4g
# -XX:+UseG1GC
# -XX:MaxGCPauseMillis=100
# -Djava.net.preferIPv4Stack=true
# Increase file descriptor limit in container: ulimit -n 100000

// Java config: heartbeat, thread pools, buffer sizes

@Override
public void configureMessageBroker(MessageBrokerRegistry registry) {
    registry.enableSimpleBroker("/topic", "/queue")
        // STOMP heartbeat: server sends every 10s, expects client every 10s
        // If client misses 2 consecutive heartbeats (20s) → session dropped
        .setHeartbeatValue(new long[]{10_000, 10_000})
        // Thread pool for broadcasting messages to subscribers
        .setTaskScheduler(heartbeatTaskScheduler());

    registry.setApplicationDestinationPrefixes("/app");
}

@Bean
public TaskScheduler heartbeatTaskScheduler() {
    ThreadPoolTaskScheduler scheduler = new ThreadPoolTaskScheduler();
    // Size based on: concurrent heartbeat + scheduled tasks
    // Rule of thumb: 1 thread per 5,000 connections for heartbeat checks
    scheduler.setPoolSize(5);
    scheduler.setThreadNamePrefix("ws-heartbeat-");
    return scheduler;
}

@Override
public void configureClientInboundChannel(ChannelRegistration registration) {
    // Thread pool that processes STOMP frames from clients
    // Under load: one thread can handle ~500 msgs/sec of simple routing
    registration.taskExecutor()
        .corePoolSize(8)
        .maxPoolSize(20)
        .queueCapacity(500)
        .keepAliveSeconds(60);
}

@Override
public void configureClientOutboundChannel(ChannelRegistration registration) {
    // Thread pool for sending messages to clients
    // Should be at least as large as inbound to avoid back-pressure
    registration.taskExecutor()
        .corePoolSize(8)
        .maxPoolSize(20)
        .queueCapacity(1000)
        .keepAliveSeconds(60);
}

Performance Summary: BRAC IT Numbers

Config	Default	Our Production	Notes
STOMP heartbeat interval	0 (disabled)	10,000ms	Detect dead connections; stay under ALB 60s timeout
Max message size	64KB	32KB	Our loan notifications are <1KB; smaller cap prevents abuse
Send buffer per session	512KB	128KB	Reduced to save memory with 15K connections (~1.9GB total)
Inbound thread pool	1 core	8 core / 20 max	Handles concurrent STOMP frame processing from all sessions
Connections per pod	No explicit limit	~5,000	Load balanced; ~10KB memory per idle connection
Reconnection jitter window	N/A (client-side)	0–1,000ms random	Prevents thundering herd on pod restarts

One critical operational lesson: monitor the WebSocket outbound send buffer queue depth. When a slow client (poor network) can't consume messages fast enough, the outbound buffer fills up. Without a send time limit, the server thread stays blocked trying to send to that one slow client. Setting setSendTimeLimit(20_000) ensures that a slow client gets disconnected rather than blocking server resources indefinitely. The client reconnects with exponential backoff and receives any missed messages from the Redis buffer.

Tags:

websocket spring boot spring boot stomp real time chat spring boot 2026 websocket horizontal scaling redis websocket jwt auth sockjs spring boot

System Design

Real-Time Notification System

Microservices

Redis Caching Patterns

Spring Boot

Spring WebFlux vs MVC