Spring Boot Actuator in Production: Custom Health Checks, Metrics & Security Hardening

Ever deployed a microservice only to realize you're flying blind? No visibility into JVM health, no way to check if database connections are healthy, no metrics to debug that mysterious latency spike at 2 AM. That's the black-box service problem, and Spring Boot Actuator solves it.

Md Sanwar Hossain March 22, 2026 16 min read Technology

Actuator exposes production-ready endpoints for monitoring and managing your application. But here's the catch: many teams widen the endpoint exposure without securing it, leave /actuator/env reachable from the internet, and then wonder why they got breached. Others write custom health checks that block on slow dependencies and make the liveness probe time out.

This guide shows you how we use Actuator in production systems serving 10M+ requests/day—custom health indicators, secure endpoint exposure, Micrometer metrics, and Kubernetes integration.

Table of Contents

  1. Actuator Architecture: Endpoints, InfoContributor, HealthIndicator
  2. Building Custom Health Indicators
  3. Micrometer Metrics: Counters, Gauges, and Timers
  4. Securing Actuator Endpoints
  5. Kubernetes Liveness/Readiness Probe Integration
  6. Production Debugging with /threaddump and /heapdump
  7. Failure Scenarios and Troubleshooting
  8. Key Takeaways
  9. Conclusion

Actuator Architecture: Endpoints, InfoContributor, HealthIndicator

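Actuator is built on three core abstractions: endpoints, which expose operational data and operations over HTTP or JMX; InfoContributor beans, which feed /actuator/info; and HealthIndicator beans, which feed /actuator/health.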

Enable Actuator in pom.xml:

<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-actuator</artifactId>
</dependency>

By default, only /health is exposed over HTTP (before Spring Boot 2.5, /info was exposed as well). Enable more in application.yml:

management:
  endpoints:
    web:
      exposure:
        include: health,info,metrics,prometheus
  endpoint:
    health:
      show-details: when-authorized

Building Custom Health Indicators

Figure: Production Monitoring Stack (mdsanwarhossain.me)

Spring Boot auto-configures health checks for DataSource, Redis, Kafka, etc. But what about your critical external API dependency?

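// Imports use Spring Boot 2.x/3.x package names.
import org.springframework.boot.actuate.health.Health;
import org.springframework.boot.actuate.health.HealthIndicator;
import org.springframework.http.ResponseEntity;
import org.springframework.stereotype.Component;
import org.springframework.web.client.RestTemplate;

@Component
public class PaymentGatewayHealthIndicator implements HealthIndicator {

    private static final String HEALTH_URL = "https://payment-api.example.com/health";

    // Inject a RestTemplate configured with short connect/read timeouts.
    // Without explicit timeouts, a hung gateway can block the health endpoint.
    private final RestTemplate restTemplate;

    public PaymentGatewayHealthIndicator(RestTemplate restTemplate) {
        this.restTemplate = restTemplate;
    }

    @Override
    public Health health() {
        long start = System.nanoTime();
        try {
            ResponseEntity<String> response =
                restTemplate.getForEntity(HEALTH_URL, String.class);
            // Report the measured latency instead of a hard-coded value
            long latencyMs = (System.nanoTime() - start) / 1_000_000;

            if (response.getStatusCode().value() == 200) {
                return Health.up()
                    .withDetail("gateway", "payment-api")
                    .withDetail("latency", latencyMs + "ms")
                    .build();
            }

            return Health.down()
                .withDetail("reason", "Non-200 status: " + response.getStatusCode())
                .build();

        } catch (Exception e) {
            // RestTemplate throws on 4xx/5xx responses and on timeouts
            return Health.down(e).build();
        }
    }
}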

Health Check Timeout Trap

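Kubernetes probes default to a 1-second timeout (timeoutSeconds: 1). If your health check calls slow external APIs, cache the result or refresh it in the background; annotating health() with @Async does not help, because the method must still return a Health value to the caller. Never block the health endpoint.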
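To make the timeout trap concrete, here is a minimal plain-JDK sketch of a check that is given a fixed time budget and reports DOWN when the budget is exceeded. BoundedHealthCheck and its Status enum are hypothetical names invented for this illustration; they are not part of Spring Boot or Actuator, and the budget values are arbitrary.

```java
import java.time.Duration;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

// Hypothetical helper for illustration; not part of Spring Boot or Actuator.
final class BoundedHealthCheck {

    enum Status { UP, DOWN }

    private BoundedHealthCheck() {}

    /**
     * Runs the check on the given executor and waits at most {@code budget}.
     * A failed, unhealthy, or late check is reported as DOWN.
     * Note: a timed-out check keeps running on the executor; it is not cancelled.
     */
    static Status run(Supplier<Boolean> check, Duration budget, ExecutorService executor) {
        try {
            Boolean healthy = CompletableFuture.supplyAsync(check, executor)
                    .get(budget.toMillis(), TimeUnit.MILLISECONDS);
            return Boolean.TRUE.equals(healthy) ? Status.UP : Status.DOWN;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return Status.DOWN;
        } catch (Exception e) {
            // TimeoutException (budget exceeded) or ExecutionException (check threw)
            return Status.DOWN;
        }
    }

    public static void main(String[] args) {
        ExecutorService executor = Executors.newCachedThreadPool();
        try {
            Supplier<Boolean> fast = () -> true;
            Supplier<Boolean> slow = () -> {
                try {
                    Thread.sleep(2_000); // simulates a slow external API
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
                return true;
            };
            System.out.println(run(fast, Duration.ofSeconds(1), executor));   // UP
            System.out.println(run(slow, Duration.ofMillis(100), executor));  // DOWN
        } finally {
            executor.shutdownNow();
        }
    }
}
```

A budget comfortably below the probe's timeoutSeconds keeps the probe from timing out, at the cost of reporting DOWN for a dependency that is merely slow.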

Micrometer Metrics: Counters, Gauges, and Timers

Actuator uses Micrometer—a dimensional metrics facade that works with Prometheus, Datadog, CloudWatch, etc.

Figure: Spring Boot Actuator Endpoints & Metrics Pipeline (mdsanwarhossain.me)

Counter example — Increment on every payment attempt:

@Service
public class PaymentService {
    
    private final Counter paymentCounter;
    
    public PaymentService(MeterRegistry registry) {
        this.paymentCounter = Counter.builder("payments.attempted")
            .tag("service", "checkout")
            .description("Total payment attempts")
            .register(registry);
    }
    
    public void processPayment(Payment payment) {
        paymentCounter.increment();
        // business logic
    }
}

Timer example — Measure external API call duration:

@Service
public class ExternalApiClient {

    private final Timer apiTimer;
    private final RestTemplate restTemplate;

    public ExternalApiClient(MeterRegistry registry, RestTemplate restTemplate) {
        this.apiTimer = Timer.builder("external.api.calls")
            .tag("api", "payment-gateway")
            .description("Payment gateway API call duration")
            .register(registry);
        this.restTemplate = restTemplate;
    }

    public Response callApi() {
        // record(...) times the supplier and returns its result
        return apiTimer.record(() -> restTemplate.getForObject(...));
    }
}

Securing Actuator Endpoints

Once you expose more than the defaults, sensitive endpoints are reachable by anyone who can reach the port. Secure them with Spring Security:

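// Imports use Spring Boot 3.x / Spring Security 6.x package names.
import org.springframework.boot.actuate.autoconfigure.security.servlet.EndpointRequest;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.security.config.Customizer;
import org.springframework.security.config.annotation.web.builders.HttpSecurity;
import org.springframework.security.web.SecurityFilterChain;

@Configuration
public class ActuatorSecurityConfig {

    @Bean
    public SecurityFilterChain actuatorSecurity(HttpSecurity http) throws Exception {
        http
            // Apply this filter chain only to Actuator endpoints
            .securityMatcher(EndpointRequest.toAnyEndpoint())
            .authorizeHttpRequests(auth -> auth
                .requestMatchers(EndpointRequest.to("health", "info")).permitAll()
                .anyRequest().hasRole("ACTUATOR_ADMIN")
            )
            // The no-argument httpBasic() is deprecated since Spring Security 6.1
            .httpBasic(Customizer.withDefaults());
        return http.build();
    }
}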

Expose only health and info publicly. Everything else (env, beans, heapdump) requires authentication.

Kubernetes Liveness/Readiness Probe Integration

Spring Boot 2.3+ provides dedicated liveness and readiness states:

management:
  endpoint:
    health:
      probes:
        enabled: true
  health:
    livenessState:
      enabled: true
    readinessState:
      enabled: true

Kubernetes deployment:

livenessProbe:
  httpGet:
    path: /actuator/health/liveness
    port: 8080
  initialDelaySeconds: 30
  periodSeconds: 10
  
readinessProbe:
  httpGet:
    path: /actuator/health/readiness
    port: 8080
  initialDelaySeconds: 10
  periodSeconds: 5

Production Debugging with /threaddump and /heapdump

When your app is hanging in production, use /actuator/threaddump to see what threads are blocked:

curl -u admin:secret https://api.example.com/actuator/threaddump > threaddump.json

For memory leaks, trigger a heap dump:

# heapdump is a read operation, so use GET (curl's default), not POST
curl -u admin:secret https://api.example.com/actuator/heapdump -o heapdump.hprof

Analyze with VisualVM or Eclipse MAT.
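To get a feel for what a thread dump contains, the same kind of data can be read in-process through the JDK's ThreadMXBean. The sketch below is an illustration only: ThreadStateSummary is a hypothetical helper, not an Actuator class, and it reports just per-state counts and deadlocks rather than full stack traces.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.EnumMap;
import java.util.Map;

// Hypothetical helper for illustration; not part of Actuator.
final class ThreadStateSummary {

    private ThreadStateSummary() {}

    /** Counts live JVM threads per Thread.State. */
    static Map<Thread.State, Integer> countByState() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        Map<Thread.State, Integer> counts = new EnumMap<>(Thread.State.class);
        // false, false: skip monitor and synchronizer details to keep the dump cheap
        for (ThreadInfo info : threads.dumpAllThreads(false, false)) {
            if (info != null) {
                counts.merge(info.getThreadState(), 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        countByState().forEach((state, n) -> System.out.println(state + ": " + n));

        // findDeadlockedThreads() returns null when no deadlock is detected
        long[] deadlocked = ManagementFactory.getThreadMXBean().findDeadlockedThreads();
        System.out.println("deadlocked threads: " + (deadlocked == null ? 0 : deadlocked.length));
    }
}
```

A large and growing BLOCKED or WAITING count is usually the first hint of where to look in the full dump.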

Failure Scenarios and Troubleshooting

Scenario 1: Pod restarting every 30 seconds — Liveness probe fails because health check calls slow DB query. Fix: Move DB check to readiness only.

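Scenario 2: /actuator/prometheus returns 404. Cause: the micrometer-registry-prometheus dependency is missing, or prometheus is not listed in management.endpoints.web.exposure.include. Fix: add the registry dependency and expose the endpoint.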

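Scenario 3: Sensitive endpoints exposed to the internet. Cause: security misconfiguration, such as serving env or heapdump on the public port without authentication. Fix: use Spring Security to restrict everything except health and info.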

Key Takeaways

Conclusion

Spring Boot Actuator transforms your microservices from black boxes into observable systems. Combined with proper security, custom health checks, and Kubernetes integration, you get production-grade monitoring out of the box.

Customizing the /info Endpoint for Build and Git Metadata

The /actuator/info endpoint is underused. By adding the spring-boot-maven-plugin build-info goal and the git-commit-id plugin, you can expose build version, git commit SHA, and branch name, which is invaluable for debugging "which version is running in prod?":

<!-- pom.xml -->
<plugin>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-maven-plugin</artifactId>
    <executions>
        <execution>
            <goals><goal>build-info</goal></goals>
        </execution>
    </executions>
</plugin>

<!-- Also add git-commit-id plugin for git info -->
<plugin>
    <groupId>io.github.git-commit-id</groupId>
    <artifactId>git-commit-id-maven-plugin</artifactId>
    <version>7.0.0</version>
    <executions>
        <execution>
            <goals><goal>revision</goal></goals>
        </execution>
    </executions>
</plugin>

Then enable the info contributors in application.yml:

# application.yml
management:
  info:
    git:
      mode: full
    build:
      enabled: true
    env:
      enabled: true

info:
  app:
    name: payment-service
    team: core-backend
    contact: team-backend@brac-it.com.bd

Now GET /actuator/info returns something like this (abridged; with mode: full the git section contains many more fields):

{
  "build": {
    "version": "2.4.1",
    "artifact": "payment-service",
    "time": "2026-04-28T10:32:00Z"
  },
  "git": {
    "branch": "main",
    "commit": {
      "id": "a3f4b2c",
      "time": "2026-04-28T09:15:00Z",
      "message": "fix: null pointer in payment retry logic"
    }
  },
  "app": {
    "name": "payment-service",
    "team": "core-backend"
  }
}

At BRAC IT, we display this in our internal developer portal alongside Kubernetes pod status. When an incident fires, the first thing we check is git.commit.message — it immediately tells us if a recent deployment is the culprit.

Composite Health Checks with Groups

Spring Boot 2.2 introduced health groups, and 2.3 added built-in liveness and readiness groups. You can give each group a different set of checks:

management:
  endpoint:
    health:
      group:
        liveness:
          include: livenessState,diskSpace
          # Only checks app is alive — not external deps
        readiness:
          include: readinessState,db,redis,kafka,paymentGateway
          # All deps must be up to accept traffic

This is the correct Kubernetes pattern:

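| Probe     | Endpoint                   | Failure Action            | Include External Deps?                    |
|-----------|----------------------------|---------------------------|-------------------------------------------|
| Liveness  | /actuator/health/liveness  | Restart pod               | ❌ No (restarts won't help)               |
| Readiness | /actuator/health/readiness | Remove from load balancer | ✅ Yes (stops traffic until deps recover) |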

Production Insight from BRAC IT

We discovered the hard way that including payment gateway health in the liveness probe caused cascading pod restarts when the external API had a 30-second blip. Moving external deps to readiness-only stopped the restarts — pods stayed alive but removed themselves from load balancer rotation until the API recovered.
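The split between liveness and readiness can be modelled in a few lines of plain Java. This is a simplified sketch, not Spring Boot's implementation: ProbeGroups, its two-value Status enum, and the rule that a missing indicator counts as DOWN are assumptions made for illustration. Only the group members are taken from the YAML above.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Simplified model for illustration; Spring Boot's own aggregation handles more statuses.
final class ProbeGroups {

    enum Status { UP, DOWN }

    // Group membership copied from the health group configuration
    static final List<String> LIVENESS = List.of("livenessState", "diskSpace");
    static final List<String> READINESS =
            List.of("readinessState", "db", "redis", "kafka", "paymentGateway");

    private ProbeGroups() {}

    /** A group is UP only if every member is UP; a missing indicator counts as DOWN (assumption). */
    static Status evaluate(List<String> group, Map<String, Status> indicators) {
        for (String name : group) {
            if (indicators.get(name) != Status.UP) {
                return Status.DOWN;
            }
        }
        return Status.UP;
    }

    public static void main(String[] args) {
        Map<String, Status> indicators = new HashMap<>();
        for (String name : LIVENESS) indicators.put(name, Status.UP);
        for (String name : READINESS) indicators.put(name, Status.UP);

        // Simulate the external payment API going down
        indicators.put("paymentGateway", Status.DOWN);

        System.out.println("liveness:  " + evaluate(LIVENESS, indicators));  // UP, pod is not restarted
        System.out.println("readiness: " + evaluate(READINESS, indicators)); // DOWN, pod leaves rotation
    }
}
```

Because paymentGateway is not a member of the liveness group, its outage cannot trigger a restart in this model, which is the behaviour the incident above called for.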

Building Custom Actuator Endpoints

Sometimes built-in endpoints aren't enough. You can create fully custom endpoints exposed under /actuator:

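import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.boot.actuate.endpoint.annotation.Endpoint;
import org.springframework.boot.actuate.endpoint.annotation.ReadOperation;
import org.springframework.boot.actuate.endpoint.annotation.Selector;
import org.springframework.boot.actuate.endpoint.annotation.WriteOperation;
import org.springframework.stereotype.Component;

import java.util.Map;

@Component
@Endpoint(id = "feature-flags")
public class FeatureFlagsEndpoint {

    private static final Logger log = LoggerFactory.getLogger(FeatureFlagsEndpoint.class);

    private final FeatureFlagService featureFlagService;

    public FeatureFlagsEndpoint(FeatureFlagService featureFlagService) {
        this.featureFlagService = featureFlagService;
    }

    @ReadOperation
    public Map<String, Boolean> getFlags() {
        return featureFlagService.getAllFlags();
    }

    // "enabled" is bound by parameter name from the JSON request body
    // (compile with -parameters); Actuator has no @Param annotation.
    @WriteOperation
    public void toggleFlag(@Selector String flagName, boolean enabled) {
        featureFlagService.setFlag(flagName, enabled);
        log.info("Feature flag {} set to {} via Actuator", flagName, enabled);
    }
}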

// Now accessible at:
// GET  /actuator/feature-flags        → all flags
// POST /actuator/feature-flags/{name} → toggle a flag

We used this pattern at BRAC IT to create a /actuator/circuit-breakers endpoint that shows the state of all Resilience4j circuit breakers in real time — invaluable during incidents when you need to see at a glance which downstream dependencies are failing.

Prometheus + Grafana Integration Checklist

Add the Prometheus registry to expose metrics in the Prometheus scrape format:

<dependency>
    <groupId>io.micrometer</groupId>
    <artifactId>micrometer-registry-prometheus</artifactId>
</dependency>

Configure scraping in your prometheus.yml:

scrape_configs:
  - job_name: 'payment-service'
    metrics_path: '/actuator/prometheus'
    static_configs:
      - targets: ['payment-service:8080']
    # In Kubernetes, use kubernetes_sd_configs instead

Essential Grafana panels for every Spring Boot service:

| Panel              | Metric                                             | Alert Threshold  |
|--------------------|----------------------------------------------------|------------------|
| Request rate       | http_server_requests_seconds_count                 | (none)           |
| P99 latency        | http_server_requests_seconds{quantile="0.99"}      | > 2s             |
| JVM heap used      | jvm_memory_used_bytes{area="heap"}                 | > 85% of max     |
| DB connection pool | hikaricp_connections_pending                       | > 5 pending      |
| GC pause time      | jvm_gc_pause_seconds_max                           | > 500ms          |
| Error rate         | http_server_requests_seconds_count{status=~"5.."}  | > 1% of requests |

At BRAC IT: How We Use Actuator in 20+ Microservices

At BRAC IT we run a microfinance platform on Kubernetes with over 20 Spring Boot microservices. Actuator is the backbone of our operational visibility. Every service exposes a standard set of endpoints behind an internal-only management port (8081), and our Grafana dashboards are populated entirely from Prometheus scraping /actuator/prometheus. Before we standardised on Actuator, incident diagnosis meant SSHing into pods and scanning raw logs. Now the first step in any runbook is: check Actuator.

Three Actuator features that have saved us the most time in production incidents:

One governance rule we established early: run the management endpoints on port 8081, not on the same port as the application. As long as 8081 is never published through a Service, Ingress, or load balancer that faces outside the cluster, we can expose all endpoints internally:

management:
  server:
    port: 8081          # separate from app port 8080
  endpoints:
    web:
      exposure:
        include: "*"    # safe — port 8081 not exposed outside cluster
  endpoint:
    health:
      show-details: always
      probes:
        enabled: true
  info:
    git:
      mode: full
    build:
      enabled: true

Caching Health Results to Prevent Probe Overload

Kubernetes probes call /actuator/health every 5–10 seconds per pod. If your custom health indicators make external API calls on each invocation, a 10-pod deployment with 5-second intervals generates 120 health-check calls per minute to every checked dependency. This can trigger rate limiting on downstream services or create circular dependency failures where the health check itself causes the health check to fail.
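The arithmetic behind that figure is simply pods multiplied by checks per minute. A small sketch, assuming a hypothetical ProbeLoad helper and one dependency call per check, reproduces the 120 calls per minute from the paragraph above and shows how the load drops when the dependency is checked only every 30 seconds (an illustrative interval):

```java
// Hypothetical helper for illustration; assumes each check makes one call to the dependency.
final class ProbeLoad {

    private ProbeLoad() {}

    /** Dependency calls per minute for a deployment where every pod checks at a fixed interval. */
    static double callsPerMinute(int pods, int intervalSeconds) {
        if (pods < 0 || intervalSeconds <= 0) {
            throw new IllegalArgumentException("pods must be >= 0 and intervalSeconds > 0");
        }
        return pods * 60.0 / intervalSeconds;
    }

    public static void main(String[] args) {
        System.out.println(callsPerMinute(10, 5));   // 120.0: 10 pods probed every 5 seconds
        System.out.println(callsPerMinute(10, 30));  // 20.0: same pods, dependency checked every 30 seconds
    }
}
```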

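Spring Boot also supports caching endpoint responses natively. This cache applies only to read operations that take no parameters, so confirm that it covers the health paths your probes actually call: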

management:
  endpoint:
    health:
      cache:
        time-to-live: 10s   # cache health results for 10 seconds

For expensive indicators that call external APIs, use a background-refresh pattern — the indicator returns a cached result instantly while a scheduler refreshes it in the background:

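// Imports use Spring Boot 2.x/3.x package names.
import org.springframework.boot.actuate.health.Health;
import org.springframework.boot.actuate.health.HealthIndicator;
import org.springframework.http.ResponseEntity;
import org.springframework.scheduling.annotation.Scheduled;
import org.springframework.stereotype.Component;
import org.springframework.web.client.RestTemplate;

// Requires @EnableScheduling on a configuration class.
@Component
public class AsyncExternalApiHealthIndicator implements HealthIndicator {

    // Placeholder URL; point this at the dependency's health endpoint
    private static final String HEALTH_URL = "https://payment-api.example.com/health";

    private final RestTemplate restTemplate;

    // volatile: written by the scheduler thread, read by probe request threads
    private volatile Health cachedHealth = Health.unknown().build();

    public AsyncExternalApiHealthIndicator(RestTemplate restTemplate) {
        this.restTemplate = restTemplate;
    }

    @Scheduled(fixedDelay = 30_000)   // refresh every 30 seconds
    public void refreshHealth() {
        long start = System.nanoTime();
        try {
            ResponseEntity<Void> resp =
                restTemplate.getForEntity(HEALTH_URL, Void.class);
            long latencyMs = (System.nanoTime() - start) / 1_000_000;
            cachedHealth = resp.getStatusCode().is2xxSuccessful()
                ? Health.up().withDetail("latency", latencyMs + "ms").build()
                : Health.down().withDetail("status", resp.getStatusCode()).build();
        } catch (Exception e) {
            cachedHealth = Health.down(e).build();
        }
    }

    @Override
    public Health health() {
        return cachedHealth;   // returns the last known result without any I/O
    }
}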

With this pattern the indicator answers from memory, so probe response time no longer depends on external API latency, and dependency slowdowns cannot cause probe timeouts and unnecessary pod restarts. The trade-off is that the reported status can be up to 30 seconds stale.

Key Takeaways

Conclusion

Spring Boot Actuator transforms your microservices from black boxes into observable systems. The combination of custom health indicators, Micrometer business metrics, properly secured endpoints, and Kubernetes probe integration gives you production-grade observability with minimal boilerplate.

The patterns in this post—composite health groups, custom endpoints, git-enriched /info, and Prometheus integration—are what we use at BRAC IT across our 20+ microservices. They've saved us countless hours of incident investigation by making the state of every service immediately visible.

Next, dive deeper into distributed tracing with OpenTelemetry and JVM profiling with JFR to complete your observability stack.

Common Actuator Anti-Patterns to Avoid

After auditing many production Spring Boot services, these are the Actuator anti-patterns that appear most often:

Exposing all endpoints on the public port. The most dangerous anti-pattern. /actuator/heapdump returns the entire JVM heap (including passwords, tokens, and PII stored in memory) as a downloadable file. /actuator/env exposes configuration properties and environment variables; Spring Boot 3 masks the values by default, but older versions only sanitize keys that look like secrets, so credentials can still leak. Always run management on a separate internal port, or secure sensitive endpoints with Spring Security roles.

Using /health instead of dedicated liveness/readiness endpoints. The general /actuator/health endpoint aggregates all health indicators. If any one fails (even a non-critical one), the endpoint returns DOWN. Kubernetes will restart perfectly healthy pods because a non-critical indicator returned DOWN. Use the dedicated liveness and readiness group endpoints with explicitly scoped indicators.

Calling slow external APIs on every health check invocation. A health indicator that makes a synchronous HTTP call to an external service adds that service's latency (potentially hundreds of milliseconds) to every Kubernetes probe. Under load, this can cause probe timeouts. Use the async background-refresh pattern described in this post for any indicator that calls an external dependency.

No custom business metrics. The default Micrometer metrics cover JVM and HTTP. But "loan applications processed per minute" and "payment success rate" are the metrics your business stakeholders care about. If your Grafana dashboard only shows JVM heap and HTTP latency, you are running blind on business outcomes. Add at least 3–5 business-specific counters and timers to every service.

Ignoring the /actuator/info endpoint. Most teams expose health and metrics but never configure info. A few lines of Maven plugin configuration give you build version, git commit SHA, and branch in every service, which is priceless during incident triage. There is no good reason not to configure it.

Last updated: March 22, 2026