Md Sanwar Hossain - Software Engineer

Software Engineer · Java · Spring Boot · Microservices

Load Testing Spring Boot with k6 & Gatling: Benchmarking APIs Under Real Traffic

Your Spring Boot API passes all unit and integration tests — but how does it behave when 1,000 users hit it simultaneously? Load testing bridges the gap between a green CI pipeline and production confidence. With k6 and Gatling, you can benchmark your APIs under realistic traffic, catch bottlenecks before they become incidents, and validate SLOs with hard data.

Table of Contents

  1. Why Load Testing is Critical Before Production Deployment
  2. Getting Started with k6 for Spring Boot APIs
  3. Advanced k6: Scenarios, Checks & Thresholds
  4. Gatling for Java Engineers: Simulation DSL
  5. Interpreting Results: Bottleneck Identification
  6. Integrating Load Tests in CI/CD Pipelines
  7. Spring Boot Optimization Based on Load Test Results

1. Why Load Testing is Critical Before Production Deployment

Figure 1: Load testing architecture — generators drive traffic into the Spring Boot API; metrics flow to Prometheus and Grafana for real-time visibility.

In 2023, a major e-commerce platform's Black Friday deployment failed within 11 minutes of go-live. The culprit? A perfectly functioning checkout API that had never been tested beyond 50 concurrent users. Under 800 concurrent shoppers it exhausted its HikariCP connection pool in 40 seconds, triggering a cascade that brought down order, inventory, and notification services simultaneously.

The distinction between test types matters at 2 AM during an incident:

| Test Type | Goal | Pattern | When to Run |
|---|---|---|---|
| Load Test | Validate SLOs at expected peak | Ramp to target, hold 30 min | Before every release |
| Stress Test | Find breaking point | Increase load until failure | Capacity planning |
| Spike Test | Validate handling of sudden traffic surges | Instant 10× load for 60 s | Marketing campaigns |
| Soak Test | Uncover memory leaks, slow degradation | Moderate load for 8–24 h | Monthly regression |

Define your SLOs as load test thresholds before writing a single script. A concrete target: p99 latency < 200 ms at 1,000 RPS with an error rate below 0.5%. Everything else flows from that definition.
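That SLO maps one-to-one onto k6 threshold expressions. The helper below is an illustrative sketch (the `sloToThresholds` function is my own convention, not a k6 API; the threshold strings themselves use real k6 syntax):

```javascript
// Hypothetical helper: turn SLO numbers into k6 threshold expressions.
// 'p(99)<200' and 'rate<0.005' are valid k6 threshold syntax.
function sloToThresholds({ p99Ms, maxErrorRate }) {
  return {
    http_req_duration: [`p(99)<${p99Ms}`],       // latency SLO
    http_req_failed:   [`rate<${maxErrorRate}`], // error-rate SLO
  };
}

const thresholds = sloToThresholds({ p99Ms: 200, maxErrorRate: 0.005 });
console.log(thresholds.http_req_duration[0]); // → "p(99)<200"
```

In a k6 script you would then write `export const options = { thresholds };`, and the run fails (non-zero exit code) whenever the SLO is breached.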

| Tool | Language | Runtime | Reporting | Best For |
|---|---|---|---|---|
| k6 | JavaScript/TS | Go | Grafana / InfluxDB | Cloud-native, DevOps teams |
| Gatling | Scala / Java | JVM / Netty | HTML (built-in) | Java teams, Maven/Gradle CI |
| JMeter | XML / GUI | JVM | Dashboard plugin | Legacy enterprise, GUI users |

2. Getting Started with k6 for Spring Boot APIs

Install k6 in seconds on any platform. On macOS/Linux:

# macOS
brew install k6

# Linux (Debian/Ubuntu)
sudo gpg -k
sudo gpg --no-default-keyring \
  --keyring /usr/share/keyrings/k6-archive-keyring.gpg \
  --keyserver hkp://keyserver.ubuntu.com:80 \
  --recv-keys C5AD17C747E3415A3642D57D77C6C491D6AC1D69
echo "deb [signed-by=/usr/share/keyrings/k6-archive-keyring.gpg] \
  https://dl.k6.io/deb stable main" | sudo tee /etc/apt/sources.list.d/k6.list
sudo apt-get update && sudo apt-get install k6

# Docker (no install required)
docker run --rm -i grafana/k6 run - <script.js

Here is a complete first k6 script targeting a Spring Boot REST API. It ramps up to 100 virtual users, holds there for two minutes, and enforces p95 and error-rate thresholds:

// k6-springboot-baseline.js
import http from 'k6/http';
import { check, sleep } from 'k6';
import { Rate } from 'k6/metrics';

const BASE_URL = __ENV.BASE_URL || 'http://localhost:8080';
const errorRate = new Rate('errors');

export const options = {
  stages: [
    { duration: '30s', target: 20  },  // warm-up
    { duration: '1m',  target: 100 },  // ramp to 100 VUs
    { duration: '2m',  target: 100 },  // hold at 100 VUs
    { duration: '30s', target: 0   },  // ramp down
  ],
  thresholds: {
    'http_req_duration': ['p(95)<500', 'p(99)<1000'],
    'http_req_failed':   ['rate<0.01'],  // <1% error rate
    'errors':            ['rate<0.01'],
  },
};

export default function () {
  const res = http.get(`${BASE_URL}/api/v1/products`, {
    headers: { 'Accept': 'application/json' },
  });

  const ok = check(res, {
    'status is 200': (r) => r.status === 200,
    'response time < 500ms': (r) => r.timings.duration < 500,
    'has products array': (r) => r.status === 200 && JSON.parse(r.body).length > 0,
  });

  errorRate.add(!ok);
  sleep(1);
}

Run it and generate an HTML report:

# Run against local Spring Boot
k6 run --out json=results.json k6-springboot-baseline.js

# Built-in web dashboard with HTML export (k6 v0.49+)
K6_WEB_DASHBOARD=true K6_WEB_DASHBOARD_EXPORT=report.html k6 run k6-springboot-baseline.js

# Run against staging with env variable
BASE_URL=https://staging.myapp.com k6 run k6-springboot-baseline.js

# Export to InfluxDB for Grafana
k6 run --out influxdb=http://localhost:8086/k6 k6-springboot-baseline.js

3. Advanced k6: Scenarios, Checks & Thresholds

Real APIs require multiple concurrent user journeys. k6 scenarios let you model this precisely, each with independent executors, VU pools, and thresholds. The following script simulates anonymous browsing, authenticated checkout, and a background admin polling job simultaneously:

// k6-advanced-scenarios.js
import http from 'k6/http';
import { check, sleep, group } from 'k6';
import { Counter, Trend } from 'k6/metrics';

const BASE_URL = __ENV.BASE_URL || 'http://localhost:8080';
const loginErrors  = new Counter('login_errors');
const checkoutTime = new Trend('checkout_duration', true);

export const options = {
  scenarios: {
    browse: {
      executor: 'ramping-vus',
      startVUs: 0,
      stages: [
        { duration: '30s', target: 50  },
        { duration: '2m',  target: 200 },
        { duration: '30s', target: 0   },
      ],
      exec: 'browseScenario',
    },
    checkout: {
      executor: 'constant-vus',
      vus: 20,
      duration: '3m',
      exec: 'checkoutScenario',
      startTime: '30s',
    },
    adminPoll: {
      executor: 'constant-arrival-rate',
      rate: 5,
      timeUnit: '1s',
      duration: '3m',
      preAllocatedVUs: 5,
      exec: 'adminScenario',
    },
  },
  thresholds: {
    'http_req_duration{scenario:browse}':    ['p(95)<300'],
    'http_req_duration{scenario:checkout}':  ['p(95)<800'],
    'checkout_duration':                      ['p(99)<2000'],
    'login_errors':                           ['count<5'],
    'http_req_failed':                        ['rate<0.005'],
  },
};

function getAuthToken() {
  const res = http.post(`${BASE_URL}/api/auth/login`,
    JSON.stringify({ username: 'testuser', password: 'Test@1234' }),
    { headers: { 'Content-Type': 'application/json' } }
  );
  if (res.status !== 200) { loginErrors.add(1); return null; }
  return JSON.parse(res.body).accessToken;
}

export function browseScenario() {
  group('browse products', () => {
    const r = http.get(`${BASE_URL}/api/v1/products?page=0&size=20`);
    check(r, { 'products 200': (res) => res.status === 200 });
    sleep(2);

    const products = JSON.parse(r.body).content || [];
    if (products.length > 0) {
      const detail = http.get(`${BASE_URL}/api/v1/products/${products[0].id}`);
      check(detail, { 'detail 200': (res) => res.status === 200 });
    }
  });
  sleep(1);
}

export function checkoutScenario() {
  const token = getAuthToken();
  if (!token) { sleep(2); return; }

  const headers = {
    'Authorization': `Bearer ${token}`,
    'Content-Type': 'application/json',
  };

  const start = Date.now();
  const order = http.post(
    `${BASE_URL}/api/v1/orders`,
    JSON.stringify({ productId: 42, quantity: 1 }),
    { headers }
  );
  checkoutTime.add(Date.now() - start);
  check(order, { 'order created': (r) => r.status === 201 });
  sleep(3);
}

export function adminScenario() {
  const r = http.get(`${BASE_URL}/actuator/health`);
  check(r, { 'health UP': (res) => res.status === 200 });
}
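A useful companion to scenarios is k6's `handleSummary` hook, which runs once after the test and lets you control exactly what gets written. The sketch below omits the `export` keyword so it reads as plain JavaScript; in a real k6 script it must be `export function handleSummary(data)`, and the `data` object shown is an abbreviated stand-in for what k6 actually passes:

```javascript
// Sketch of k6's handleSummary hook: the returned object maps
// file names (or 'stdout') to the content k6 should write.
function handleSummary(data) {
  const p95 = data.metrics.http_req_duration.values['p(95)'];
  return {
    'load-summary.json': JSON.stringify(data.metrics), // custom summary file
    stdout: `p95: ${p95.toFixed(1)} ms\n`,             // one-line console report
  };
}

// Abbreviated stand-in for the end-of-test summary object:
const sample = { metrics: { http_req_duration: { values: { 'p(95)': 412.3 } } } };
console.log(handleSummary(sample).stdout); // → p95: 412.3 ms
```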

4. Gatling for Java Engineers: Simulation DSL

Figure 2: k6 vs Gatling — key differentiators across language, runtime, reporting, and CI integration.

For Java teams, Gatling integrates directly into Maven/Gradle workflows. Add the dependency and plugin to your pom.xml:

<!-- pom.xml -->
<dependency>
  <groupId>io.gatling.highcharts</groupId>
  <artifactId>gatling-charts-highcharts</artifactId>
  <version>3.11.5</version>
  <scope>test</scope>
</dependency>

<plugin>
  <groupId>io.gatling</groupId>
  <artifactId>gatling-maven-plugin</artifactId>
  <version>4.9.6</version>
  <configuration>
    <simulationClass>simulations.ProductApiSimulation</simulationClass>
    <failOnError>true</failOnError>
  </configuration>
</plugin>

A complete Gatling Simulation in Java using the Java DSL (Gatling 3.9+):

// src/test/java/simulations/ProductApiSimulation.java
package simulations;

import io.gatling.javaapi.core.*;
import io.gatling.javaapi.http.*;
import static io.gatling.javaapi.core.CoreDsl.*;
import static io.gatling.javaapi.http.HttpDsl.*;

public class ProductApiSimulation extends Simulation {

    private static final String BASE_URL =
        System.getProperty("baseUrl", "http://localhost:8080");

    // CSV feeder: id,name columns
    FeederBuilder<String> productFeeder =
        csv("products.csv").circular();

    HttpProtocolBuilder httpProtocol = http
        .baseUrl(BASE_URL)
        .acceptHeader("application/json")
        .contentTypeHeader("application/json")
        .userAgentHeader("Gatling/3.11 LoadTest")
        .shareConnections();

    // Auth chain — obtain JWT, store in session
    ChainBuilder authenticate = exec(
        http("Login")
            .post("/api/auth/login")
            .body(StringBody("""
                {"username":"testuser","password":"Test@1234"}
                """))
            .check(status().is(200))
            .check(jsonPath("$.accessToken").saveAs("token"))
    );

    // Browse products scenario
    ScenarioBuilder browseProducts = scenario("Browse Products")
        .feed(productFeeder)
        .exec(
            http("List Products")
                .get("/api/v1/products?page=0&size=20")
                .check(status().is(200))
                .check(jsonPath("$.totalElements").ofInt().gt(0))
        )
        .pause(2)
        .exec(
            http("Get Product Detail")
                .get("/api/v1/products/#{id}")
                .check(status().is(200))
                .check(responseTimeInMillis().lte(300))
        )
        .pause(1);

    // Authenticated checkout scenario
    ScenarioBuilder checkout = scenario("Checkout Flow")
        .exec(authenticate)
        .pause(1)
        .exec(
            http("Create Order")
                .post("/api/v1/orders")
                .header("Authorization", "Bearer #{token}")
                .body(StringBody("""
                    {"productId":42,"quantity":1}
                    """))
                .check(status().is(201))
                .check(jsonPath("$.orderId").exists())
        )
        .pause(3);

    {
        setUp(
            browseProducts.injectOpen(
                nothingFor(5),
                rampUsersPerSec(1).to(50).during(60),
                constantUsersPerSec(50).during(120)
            ),
            checkout.injectOpen(
                nothingFor(15),
                rampUsers(20).during(30),
                constantUsersPerSec(5).during(120)
            )
        )
        .protocols(httpProtocol)
        .assertions(
            global().responseTime().percentile(95).lte(500),
            global().responseTime().percentile(99).lte(1000),
            global().failedRequests().percent().lte(1.0),
            forAll().responseTime().mean().lte(300),
            details("Create Order").responseTime().percentile(95).lte(800)
        );
    }
}

Run the simulation and fail the build on assertion failure:

# Run with Maven
mvn gatling:test -DbaseUrl=https://staging.myapp.com

# Gradle
./gradlew gatlingRun -DbaseUrl=https://staging.myapp.com

# Reports generated at:
# target/gatling/productapisimulation-<timestamp>/index.html

5. Interpreting Results: Bottleneck Identification

Raw numbers are meaningless without context. Here is how to interpret the key percentiles:

| Metric | Meaning | Green | Red Flag |
|---|---|---|---|
| p50 (median) | Typical user experience | < 100 ms | > 500 ms |
| p95 | 95% of users see this or better | < 500 ms | > 1 s |
| p99 | The "tail latency" SLO target | < 1 s | > 3 s |
| Error Rate | % of non-2xx responses | < 0.1% | > 1% |
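To build intuition for what these percentiles measure, here is the nearest-rank method applied to raw latency samples. This is one of several conventions (k6 and Gatling each use their own estimators), and the sample numbers are invented for illustration:

```javascript
// Nearest-rank percentile over raw latency samples (milliseconds).
function percentile(samples, p) {
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length); // nearest-rank convention
  return sorted[Math.max(rank - 1, 0)];
}

const latencies = [80, 85, 90, 95, 100, 105, 110, 120, 450, 1200];
console.log(percentile(latencies, 50)); // → 100
console.log(percentile(latencies, 99)); // → 1200, one outlier dominates the tail
```

Notice that a single slow request drives p99 to 1200 ms while the median stays at 100 ms. That is exactly why SLOs use tail percentiles rather than averages.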

Common Bottleneck Patterns

1. JVM GC pause spikes. You see periodic p99 latency spikes every 30–120 seconds. In Prometheus, jvm_gc_pause_seconds_max correlates exactly. The fix is switching to ZGC (-XX:+UseZGC) or tuning heap sizing. Diagnose with:

# Prometheus query — average GC pause over 100 ms
rate(jvm_gc_pause_seconds_sum[1m]) / rate(jvm_gc_pause_seconds_count[1m]) > 0.1

# Prometheus query — worst single minor-GC pause over 100 ms
jvm_gc_pause_seconds_max{action="end of minor GC"} > 0.1

2. Thread pool saturation. p50 stays healthy but p95 degrades as load increases — classic thread starvation. Diagnose with a thread dump:

# Take thread dump from running Spring Boot pod
kubectl exec -it <pod> -- jcmd 1 Thread.print | grep -A3 "WAITING\|BLOCKED" | head -60

# Or via Spring Boot Actuator
curl http://localhost:8080/actuator/threaddump | jq '.threads[] | select(.threadState=="BLOCKED")'

3. Database connection pool exhaustion. The error rate spikes and logs fill with "HikariPool-1 - Connection is not available, request timed out". In Prometheus, monitor:

# HikariCP pool exhaustion signal
hikaricp_connections_pending{pool="HikariPool-1"} > 0

# Connection acquisition time increasing
hikaricp_connections_acquire_seconds_max{pool="HikariPool-1"} > 0.5
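A back-of-the-envelope check for pool sizing follows from Little's law: concurrent connections needed ≈ throughput × average time a connection is held. The numbers below are illustrative, not measurements:

```javascript
// Little's law estimate for a HikariCP pool.
// rps: request rate hitting the DB; avgHoldSeconds: mean time a
// connection is checked out per request; headroom: safety multiplier.
function estimatePoolSize(rps, avgHoldSeconds, headroom = 1.5) {
  return Math.ceil(rps * avgHoldSeconds * headroom);
}

console.log(estimatePoolSize(1000, 0.01)); // → 15 (1000 RPS × 10 ms × 1.5)
```

If the estimate far exceeds your configured maximum-pool-size, a growing hikaricp_connections_pending count is the expected symptom under load.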

6. Integrating Load Tests in CI/CD Pipelines

The goal is a performance gate that blocks deployments when SLOs regress. Separate your test sizes: smoke (1 VU, 30s for fast CI feedback) from full load tests (500 VUs, 5 min, triggered on release branches only).
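One way to serve both sizes from a single script is to pick the profile from an environment variable. The profile names and the `pickProfile` helper below are my own convention, not a k6 feature:

```javascript
// Two load profiles in one script, selected at run time.
const profiles = {
  smoke: { vus: 1, duration: '30s' },   // fast CI feedback
  load:  {                              // release-branch gate
    stages: [
      { duration: '1m',  target: 500 },
      { duration: '5m',  target: 500 },
      { duration: '30s', target: 0 },
    ],
  },
};

function pickProfile(name) {
  return profiles[name] || profiles.smoke; // default to the cheap profile
}

console.log(pickProfile('smoke').vus); // → 1
// In a k6 script: export const options = pickProfile(__ENV.TEST_TYPE);
```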

GitHub Actions with k6

# .github/workflows/load-test.yml
name: Load Test — Spring Boot API

on:
  push:
    branches: [main, release/**]
  workflow_dispatch:
    inputs:
      environment:
        description: 'Target environment'
        required: true
        default: 'staging'

jobs:
  smoke-test:
    name: Smoke Test (1 VU)
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: grafana/setup-k6-action@v1
        with:
          k6-version: '0.54.0'
      - name: Run smoke test
        run: |
          k6 run \
            --vus 1 --duration 30s \
            --out json=smoke-results.json \
            tests/k6-springboot-baseline.js
        env:
          BASE_URL: ${{ secrets.STAGING_URL }}
      - uses: actions/upload-artifact@v4
        if: always()
        with:
          name: smoke-results
          path: smoke-results.json

  load-test:
    name: Full Load Test (500 VUs)
    needs: smoke-test
    runs-on: ubuntu-latest
    if: github.ref == 'refs/heads/main' || startsWith(github.ref, 'refs/heads/release/')
    steps:
      - uses: actions/checkout@v4
      - uses: grafana/setup-k6-action@v1
        with:
          k6-version: '0.54.0'
      - name: Run full load test
        run: |
          k6 run \
            --out json=load-results.json \
            --summary-export=load-summary.json \
            tests/k6-advanced-scenarios.js
        env:
          BASE_URL: ${{ secrets.STAGING_URL }}
          K6_CLOUD_TOKEN: ${{ secrets.K6_CLOUD_TOKEN }}
      - name: Check threshold breaches
        if: failure()
        run: |
          echo "::error::Load test thresholds breached — deployment blocked"
          jq '.metrics | to_entries[] | select(.value.thresholds != null)' load-summary.json
          exit 1
      - uses: actions/upload-artifact@v4
        if: always()
        with:
          name: load-test-results
          path: load-results.json
          retention-days: 30

Gatling CI Mode

# .github/workflows/gatling.yml
name: Gatling Load Test

on:
  push:
    branches: [main]

jobs:
  gatling:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-java@v4
        with:
          java-version: '21'
          distribution: 'temurin'
      - name: Cache Maven
        uses: actions/cache@v4
        with:
          path: ~/.m2/repository
          key: maven-${{ hashFiles('**/pom.xml') }}
      - name: Run Gatling simulation
        run: |
          mvn gatling:test \
            -DbaseUrl=${{ secrets.STAGING_URL }} \
            -Dgatling.simulationClass=simulations.ProductApiSimulation
        # failOnError=true in pom.xml breaks build on assertion failure
      - uses: actions/upload-artifact@v4
        if: always()
        with:
          name: gatling-report
          path: target/gatling/

7. Spring Boot Optimization Based on Load Test Results

Once load tests reveal bottlenecks, here are the highest-impact tunings to apply:

HikariCP Connection Pool Tuning

# application.yml — tuned for 1000 RPS, 4-core pod
spring:
  datasource:
    hikari:
      maximum-pool-size: 20          # sized from load tests; cores × 2 is only a starting point
      minimum-idle: 5
      connection-timeout: 3000       # fail fast: 3s not 30s default
      idle-timeout: 600000           # 10 min
      max-lifetime: 1800000          # 30 min — less than DB wait_timeout
      leak-detection-threshold: 5000 # detect connection leaks after 5s
      pool-name: HikariPool-orders
      data-source-properties:
        cachePrepStmts: true
        prepStmtCacheSize: 250
        prepStmtCacheSqlLimit: 2048

Tomcat Thread Pool & HTTP/2

# application.yml
server:
  port: 8080
  http2:
    enabled: true                    # multiplexing reduces connection overhead
  tomcat:
    threads:
      max: 200                       # default 200 — tune based on thread dump analysis
      min-spare: 20
    accept-count: 100                # queue depth before rejecting connections
    connection-timeout: 5000
    keep-alive-timeout: 20000
    max-keep-alive-requests: 100

JVM Flags for Container Deployments

# Dockerfile / Kubernetes pod spec JVM flags
JAVA_OPTS="\
  -XX:+UseContainerSupport \
  -XX:MaxRAMPercentage=75.0 \
  -XX:InitialRAMPercentage=50.0 \
  -XX:+UseZGC \
  -XX:+ZGenerational \
  -XX:+AlwaysPreTouch \
  -XX:+DisableExplicitGC \
  -Djava.security.egd=file:/dev/./urandom \
  -Dspring.jmx.enabled=false"
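To make the memory flags concrete: -XX:MaxRAMPercentage sizes the heap relative to the container's cgroup memory limit, not the host's RAM. A quick sanity check of what a given limit yields (illustrative arithmetic only):

```javascript
// Heap ceiling implied by -XX:MaxRAMPercentage inside a container.
function maxHeapBytes(containerLimitBytes, maxRamPercentage) {
  return Math.floor(containerLimitBytes * (maxRamPercentage / 100));
}

const GiB = 1024 ** 3;
console.log(maxHeapBytes(2 * GiB, 75.0) / GiB); // → 1.5 (2 GiB limit → 1.5 GiB heap)
```

The remaining 25% is deliberately left for metaspace, thread stacks, and off-heap buffers; size the heap to 100% and the container gets OOM-killed even while the heap itself looks healthy.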

@Async for Non-Blocking Operations

@Configuration
@EnableAsync
public class AsyncConfig {

    @Bean
    public Executor taskExecutor() {
        ThreadPoolTaskExecutor executor = new ThreadPoolTaskExecutor();
        executor.setCorePoolSize(4);
        executor.setMaxPoolSize(20);
        executor.setQueueCapacity(500);
        executor.setThreadNamePrefix("async-");
        executor.setRejectedExecutionHandler(new ThreadPoolExecutor.CallerRunsPolicy());
        executor.initialize();
        return executor;
    }
}

@Service
public class NotificationService {

    @Async
    public CompletableFuture<Void> sendOrderConfirmationEmail(Order order) {
        // heavy I/O — runs in async thread pool, not Tomcat threads
        emailClient.send(order.getEmail(), buildTemplate(order));
        return CompletableFuture.completedFuture(null);
    }
}

After applying all optimizations, re-run your k6/Gatling suite and compare the before/after p99 in Grafana. A well-tuned Spring Boot service running on 4 vCPUs should comfortably sustain 1,000 RPS with p99 under 200 ms — validate that claim with data, not assumptions.

Key Takeaways

  • Define SLOs as thresholds before writing load test scripts — measure purpose, not vanity numbers.
  • k6 excels for DevOps teams and cloud-native CI/CD; Gatling fits Java teams with Maven/Gradle workflows.
  • Separate smoke tests (1 VU, 30s) from full load tests (500+ VU) to keep CI pipelines fast.
  • HikariCP pool exhaustion is the #1 Spring Boot load failure — instrument it with Prometheus from day one.
  • Use -XX:+UseZGC -XX:+ZGenerational on Java 21 to eliminate GC-induced latency spikes at scale.


Last updated: April 5, 2026