
Spring Boot Actuator in Production: Custom Health Checks, Metrics & Security Hardening

Ever deployed a microservice only to realize you're flying blind? No visibility into JVM health, no way to check if database connections are healthy, no metrics to debug that mysterious latency spike at 2 AM. That's the black-box service problem, and Spring Boot Actuator solves it.

Md Sanwar Hossain | March 22, 2026 | 16 min read


Actuator exposes production-ready endpoints for monitoring and managing your application. But here's the catch: many teams expose every endpoint with include: "*", put /actuator/env on the internet, and wonder why they got breached. Or they write custom health checks that block the liveness probe until it times out.

This guide shows you how we use Actuator in production systems serving 10M+ requests/day—custom health indicators, secure endpoint exposure, Micrometer metrics, and Kubernetes integration.

Actuator Architecture: Endpoints, InfoContributor, HealthIndicator
Building Custom Health Indicators
Micrometer Metrics: Counters, Gauges, and Timers
Securing Actuator Endpoints
Kubernetes Liveness/Readiness Probe Integration
Production Debugging with /threaddump and /heapdump
Failure Scenarios and Troubleshooting
Key Takeaways
Conclusion

Actuator Architecture: Endpoints, InfoContributor, HealthIndicator

Actuator is built on three core abstractions:

Endpoints — Expose application internals (health, metrics, beans, env)
HealthIndicator — Contribute to /actuator/health with custom checks
InfoContributor — Add metadata to /actuator/info

Enable Actuator in pom.xml:

<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-actuator</artifactId>
</dependency>

Over HTTP, only /health is exposed by default in Spring Boot 2.5 and later (earlier 2.x releases also exposed /info). Enable more in application.yml:

management:
  endpoints:
    web:
      exposure:
        include: health,info,metrics,prometheus
  endpoint:
    health:
      show-details: when-authorized

Building Custom Health Indicators

[Figure: Production Monitoring Stack]

Spring Boot auto-configures health checks for DataSource, Redis, Kafka, etc. But what about your critical external API dependency?

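import org.springframework.boot.actuate.health.Health;
import org.springframework.boot.actuate.health.HealthIndicator;
import org.springframework.http.ResponseEntity;
import org.springframework.stereotype.Component;
import org.springframework.web.client.RestTemplate;

@Component
public class PaymentGatewayHealthIndicator implements HealthIndicator {

    private final RestTemplate restTemplate;

    // Constructor injection. The RestTemplate bean should be configured with
    // short connect and read timeouts so this check cannot hang.
    public PaymentGatewayHealthIndicator(RestTemplate restTemplate) {
        this.restTemplate = restTemplate;
    }

    @Override
    public Health health() {
        long start = System.nanoTime();
        try {
            // Call the external service's health endpoint
            ResponseEntity<String> response = restTemplate.getForEntity(
                "https://payment-api.example.com/health", String.class);
            long latencyMs = (System.nanoTime() - start) / 1_000_000;

            if (response.getStatusCode().is2xxSuccessful()) {
                return Health.up()
                    .withDetail("gateway", "payment-api")
                    .withDetail("latency", latencyMs + "ms")  // measured, not hardcoded
                    .build();
            }

            return Health.down()
                .withDetail("reason", "Non-2xx status: " + response.getStatusCode())
                .build();

        } catch (Exception e) {
            // RestTemplate throws on 4xx/5xx responses and on timeouts by default;
            // down(e) records the exception under the "error" detail.
            return Health.down(e).build();
        }
    }
}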

Health Check Timeout Trap

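Kubernetes probes default to a 1-second timeout (timeoutSeconds: 1). If your health check calls a slow external API, set a short client timeout and serve a cached result that is refreshed in the background. Annotating health() with @Async does not help, because the method has to return a Health value synchronously. Never let a health indicator block the probe.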
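One way to keep a slow dependency from stalling the probe is to give the check a hard time budget. The sketch below is framework-free Java, not Actuator code: BoundedHealthCheck, its Result enum, and the budget value are hypothetical names chosen for illustration. It runs the check on another thread and stops waiting once the budget is spent.

```java
import java.time.Duration;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.Supplier;

/** Hypothetical helper: runs a dependency check but never waits longer than the budget. */
final class BoundedHealthCheck {

    enum Result { UP, DOWN, TIMED_OUT }

    static Result run(Supplier<Boolean> check, Duration budget) {
        // Run the check on another thread so the caller can stop waiting for it.
        CompletableFuture<Boolean> future = CompletableFuture.supplyAsync(check);
        try {
            Boolean healthy = future.get(budget.toMillis(), TimeUnit.MILLISECONDS);
            return Boolean.TRUE.equals(healthy) ? Result.UP : Result.DOWN;
        } catch (TimeoutException e) {
            // Stop waiting. Note that cancel() does not interrupt the running task,
            // so the underlying client still needs its own timeout.
            future.cancel(true);
            return Result.TIMED_OUT;
        } catch (ExecutionException e) {
            return Result.DOWN; // the check itself threw
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return Result.DOWN;
        }
    }
}
```

A health indicator could call BoundedHealthCheck.run(check, Duration.ofMillis(500)) and map TIMED_OUT to Health.down(), so the probe always gets an answer well inside the 1-second window.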

Micrometer Metrics: Counters, Gauges, and Timers

Actuator uses Micrometer—a dimensional metrics facade that works with Prometheus, Datadog, CloudWatch, etc.

[Figure: Spring Boot Actuator Endpoints & Metrics Pipeline]

Counter example — Increment on every payment attempt:

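import io.micrometer.core.instrument.Counter;
import io.micrometer.core.instrument.MeterRegistry;
import org.springframework.stereotype.Service;

@Service
public class PaymentService {

    private final Counter paymentCounter;

    public PaymentService(MeterRegistry registry) {
        // Register the counter once; Micrometer reuses it for the same name and tags
        this.paymentCounter = Counter.builder("payments.attempted")
            .tag("service", "checkout")
            .description("Total payment attempts")
            .register(registry);
    }

    // Payment is the application's own domain class
    public void processPayment(Payment payment) {
        paymentCounter.increment();
        // business logic
    }
}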

Timer example — Measure external API call duration:

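import io.micrometer.core.instrument.MeterRegistry;
import io.micrometer.core.instrument.Timer;
import org.springframework.stereotype.Service;
import org.springframework.web.client.RestTemplate;

@Service
public class ExternalApiClient {

    private final Timer apiTimer;
    private final RestTemplate restTemplate;

    public ExternalApiClient(MeterRegistry registry, RestTemplate restTemplate) {
        this.restTemplate = restTemplate;
        this.apiTimer = Timer.builder("external.api.calls")
            .tag("api", "payment-gateway")
            .description("Payment gateway API call duration")
            .register(registry);
    }

    // Response is the application's own type
    public Response callApi() {
        // record() times the lambda and returns its result
        return apiTimer.record(() -> restTemplate.getForObject(...));
    }
}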

Securing Actuator Endpoints

Once you expose more than the defaults, sensitive endpoints such as /env and /heapdump are reachable by anyone who can reach the port. Secure them with Spring Security:

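import org.springframework.boot.actuate.autoconfigure.security.servlet.EndpointRequest;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.security.config.Customizer;
import org.springframework.security.config.annotation.web.builders.HttpSecurity;
import org.springframework.security.web.SecurityFilterChain;

@Configuration
public class ActuatorSecurityConfig {

    @Bean
    public SecurityFilterChain actuatorSecurity(HttpSecurity http) throws Exception {
        http
            // Apply this filter chain to Actuator endpoints only
            .securityMatcher(EndpointRequest.toAnyEndpoint())
            .authorizeHttpRequests(auth -> auth
                .requestMatchers(EndpointRequest.to("health", "info")).permitAll()
                .anyRequest().hasRole("ACTUATOR_ADMIN")
            )
            // The no-argument httpBasic() is deprecated in Spring Security 6.1+
            .httpBasic(Customizer.withDefaults());
        return http.build();
    }
}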

Expose only health and info publicly. Everything else (env, beans, heapdump) requires authentication.

Kubernetes Liveness/Readiness Probe Integration

Spring Boot 2.3+ provides dedicated liveness and readiness states:

management:
  endpoint:
    health:
      probes:
        enabled: true
  health:
    livenessState:
      enabled: true
    readinessState:
      enabled: true

Kubernetes deployment:

livenessProbe:
  httpGet:
    path: /actuator/health/liveness
    port: 8080
  initialDelaySeconds: 30
  periodSeconds: 10
  
readinessProbe:
  httpGet:
    path: /actuator/health/readiness
    port: 8080
  initialDelaySeconds: 10
  periodSeconds: 5

Production Debugging with /threaddump and /heapdump

When your app is hanging in production, use /actuator/threaddump to see what threads are blocked:

curl -u admin:secret https://api.example.com/actuator/threaddump > threaddump.json

For memory leaks, trigger a heap dump:

curl -u admin:secret https://api.example.com/actuator/heapdump -o heapdump.hprof

Analyze with VisualVM or Eclipse MAT.
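A thread dump is essentially a snapshot of every live thread and its state. To get a feel for that data without calling the endpoint, the JDK's ThreadMXBean offers a comparable in-process view. The sketch below counts threads per state; ThreadStateSummary is a hypothetical name, and this is an illustration of the underlying data rather than a way to consume the endpoint's JSON.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.EnumMap;
import java.util.Map;

/** Hypothetical helper that summarises live threads by state. */
final class ThreadStateSummary {

    /** Counts live threads per Thread.State using the JDK's management API. */
    static Map<Thread.State, Integer> countByState() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        Map<Thread.State, Integer> counts = new EnumMap<>(Thread.State.class);
        // false, false: skip locked monitor and synchronizer details to keep the snapshot cheap
        for (ThreadInfo info : threads.dumpAllThreads(false, false)) {
            if (info != null) {
                counts.merge(info.getThreadState(), 1, Integer::sum);
            }
        }
        return counts;
    }
}
```

Printing ThreadStateSummary.countByState() a few seconds apart gives a quick trend: a BLOCKED count that keeps growing usually points at lock contention, which is the cue to pull a full dump and look at the stack traces.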

Failure Scenarios and Troubleshooting

Scenario 1: Pod restarting every 30 seconds. The liveness probe fails because the health check runs a slow DB query. Fix: Move the DB check to readiness only.

Scenario 2: /actuator/prometheus returns 404. The micrometer-registry-prometheus dependency is missing, or prometheus is not in the exposure list. Fix: Add the registry dependency and expose the endpoint.

Scenario 3: Sensitive endpoints (env, heapdump) exposed to the internet. Security misconfiguration. Fix: Use Spring Security to restrict access.

Key Takeaways

Use custom HealthIndicators for critical external dependencies
Secure sensitive endpoints — only expose health/info publicly
Separate liveness and readiness — liveness = "is the app alive?", readiness = "can it serve traffic?"
Instrument business metrics with Micrometer counters and timers
Set timeouts on health checks — never block Kubernetes probes

Conclusion

Spring Boot Actuator transforms your microservices from black boxes into observable systems. Combined with proper security, custom health checks, and Kubernetes integration, you get production-grade monitoring out of the box.

Customizing the /info Endpoint for Build and Git Metadata

The /actuator/info endpoint is underused. By adding the spring-boot-maven-plugin build-info goal and the git-commit-id plugin, you can expose build version, git commit SHA, and branch name, which is invaluable for answering "which version is running in prod?":

<!-- pom.xml -->
<plugin>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-maven-plugin</artifactId>
    <executions>
        <execution>
            <goals><goal>build-info</goal></goals>
        </execution>
    </executions>
</plugin>

<!-- Also add git-commit-id plugin for git info -->
<plugin>
    <groupId>io.github.git-commit-id</groupId>
    <artifactId>git-commit-id-maven-plugin</artifactId>
    <version>7.0.0</version>
    <executions>
        <execution>
            <goals><goal>revision</goal></goals>
        </execution>
    </executions>
</plugin>

# application.yml
management:
  info:
    git:
      mode: full
    build:
      enabled: true
    env:
      enabled: true

info:
  app:
    name: payment-service
    team: core-backend
    contact: team-backend@brac-it.com.bd

Now GET /actuator/info returns:

{
  "build": {
    "version": "2.4.1",
    "artifact": "payment-service",
    "time": "2026-04-28T10:32:00Z"
  },
  "git": {
    "branch": "main",
    "commit": {
      "id": "a3f4b2c",
      "time": "2026-04-28T09:15:00Z",
      "message": "fix: null pointer in payment retry logic"
    }
  },
  "app": {
    "name": "payment-service",
    "team": "core-backend"
  }
}

At BRAC IT, we display this in our internal developer portal alongside Kubernetes pod status. When an incident fires, the first thing we check is git.commit.message — it immediately tells us if a recent deployment is the culprit.

Composite Health Checks with Groups

Health groups, available since Spring Boot 2.2, let you define separate groups for liveness and readiness with different checks in each:

management:
  endpoint:
    health:
      group:
        liveness:
          include: livenessState,diskSpace
          # Only checks app is alive — not external deps
        readiness:
          include: readinessState,db,redis,kafka,paymentGateway
          # All deps must be up to accept traffic

This is the correct Kubernetes pattern:

Probe	Endpoint	Failure Action	Include External Deps?
Liveness	`/actuator/health/liveness`	Restart pod	❌ No — restarts won't help
Readiness	`/actuator/health/readiness`	Remove from load balancer	✅ Yes — stops traffic until deps recover

Production Insight from BRAC IT

We discovered the hard way that including payment gateway health in the liveness probe caused cascading pod restarts when the external API had a 30-second blip. Moving external deps to readiness-only stopped the restarts — pods stayed alive but removed themselves from load balancer rotation until the API recovered.
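The effect can be modelled in a few lines. The sketch below is a simplified, framework-free model of status aggregation rather than Actuator's implementation: HealthGroupModel is a hypothetical class, statuses are plain strings, and the severity order follows Actuator's documented default (DOWN, OUT_OF_SERVICE, UP, UNKNOWN).

```java
import java.util.List;
import java.util.Map;

/** Hypothetical, simplified model of how a health group derives its status. */
final class HealthGroupModel {

    // Worst first, following Actuator's documented default status order
    private static final List<String> SEVERITY =
        List.of("DOWN", "OUT_OF_SERVICE", "UP", "UNKNOWN");

    /** Returns the most severe status among the indicators the group includes. */
    static String aggregate(Map<String, String> indicatorStatus, List<String> include) {
        String worst = "UNKNOWN";
        for (String name : include) {
            String status = indicatorStatus.getOrDefault(name, "UNKNOWN");
            int rank = SEVERITY.indexOf(status);
            // Statuses outside the known list are ignored in this simplified model
            if (rank >= 0 && rank < SEVERITY.indexOf(worst)) {
                worst = status;
            }
        }
        return worst;
    }
}
```

With paymentGateway reporting DOWN and everything else UP, aggregating over the liveness list (livenessState, diskSpace) yields UP, while the readiness list (readinessState, db, redis, kafka, paymentGateway) yields DOWN. The pod keeps running and only stops receiving traffic.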

Building Custom Actuator Endpoints

Sometimes built-in endpoints aren't enough. You can create fully custom endpoints exposed under /actuator:

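import java.util.Map;

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.boot.actuate.endpoint.annotation.Endpoint;
import org.springframework.boot.actuate.endpoint.annotation.ReadOperation;
import org.springframework.boot.actuate.endpoint.annotation.Selector;
import org.springframework.boot.actuate.endpoint.annotation.WriteOperation;
import org.springframework.stereotype.Component;

@Component
@Endpoint(id = "feature-flags")
public class FeatureFlagsEndpoint {

    private static final Logger log = LoggerFactory.getLogger(FeatureFlagsEndpoint.class);

    // FeatureFlagService is the application's own service
    private final FeatureFlagService featureFlagService;

    public FeatureFlagsEndpoint(FeatureFlagService featureFlagService) {
        this.featureFlagService = featureFlagService;
    }

    @ReadOperation
    public Map<String, Boolean> getFlags() {
        return featureFlagService.getAllFlags();
    }

    // "enabled" is bound by parameter name from the JSON request body,
    // e.g. {"enabled": true}. Actuator has no @Param annotation.
    @WriteOperation
    public void toggleFlag(@Selector String flagName, boolean enabled) {
        featureFlagService.setFlag(flagName, enabled);
        log.info("Feature flag {} set to {} via Actuator", flagName, enabled);
    }
}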

// Now accessible at:
// GET  /actuator/feature-flags        → all flags
// POST /actuator/feature-flags/{name} → toggle a flag

We used this pattern at BRAC IT to create a /actuator/circuit-breakers endpoint that shows the state of all Resilience4j circuit breakers in real time — invaluable during incidents when you need to see at a glance which downstream dependencies are failing.

Prometheus + Grafana Integration Checklist

Add the Prometheus registry to expose metrics in the Prometheus scrape format:

<dependency>
    <groupId>io.micrometer</groupId>
    <artifactId>micrometer-registry-prometheus</artifactId>
</dependency>

Configure scraping in your prometheus.yml:

scrape_configs:
  - job_name: 'payment-service'
    metrics_path: '/actuator/prometheus'
    static_configs:
      - targets: ['payment-service:8080']
    # In Kubernetes, use kubernetes_sd_configs instead

Essential Grafana panels for every Spring Boot service:

Panel	Metric	Alert Threshold
Request rate	`http_server_requests_seconds_count`	—
P99 latency	`http_server_requests_seconds{quantile="0.99"}` (requires percentiles to be enabled for http.server.requests)	> 2s
JVM heap used	`jvm_memory_used_bytes{area="heap"}`	> 85% of max
DB connection pool	`hikaricp_connections_pending`	> 5 pending
GC pause time	`jvm_gc_pause_seconds_max`	> 500ms
Error rate	`http_server_requests_seconds_count{status=~"5.."}` as a share of all requests	> 1% of requests
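Several of these metrics are cumulative counters, so dashboards and alerts work on the change between scrapes rather than on the raw value. As a rough illustration of the arithmetic behind the request-rate and error-rate rows, here is a simplified sketch. CounterMath and its method names are hypothetical, and this is not how Prometheus implements rate().

```java
/** Hypothetical helper showing the arithmetic behind rate and error-ratio panels. */
final class CounterMath {

    /** Per-second rate between two scrapes of a cumulative counter. */
    static double ratePerSecond(double earlier, double later, double intervalSeconds) {
        if (intervalSeconds <= 0) {
            throw new IllegalArgumentException("intervalSeconds must be positive");
        }
        // A counter that went backwards was reset (for example by a restart),
        // so only the value accumulated since the reset is counted.
        double delta = later >= earlier ? later - earlier : later;
        return delta / intervalSeconds;
    }

    /** Share of requests that failed, as a fraction between 0 and 1. */
    static double errorRatio(double errorRate, double totalRate) {
        return totalRate == 0 ? 0 : errorRate / totalRate;
    }
}
```

If the total request counter moves from 1000 to 1600 over a 60-second window, the rate is 10 requests per second. If the 5xx counter moves from 10 to 16 over the same window, the error ratio is about 0.01, which sits right at the 1% alert threshold in the table.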

At BRAC IT: How We Use Actuator in 20+ Microservices

At BRAC IT we run a microfinance platform on Kubernetes with over 20 Spring Boot microservices. Actuator is the backbone of our operational visibility. Every service exposes a standard set of endpoints behind an internal-only management port (8081), and our Grafana dashboards are populated entirely from Prometheus scraping /actuator/prometheus. Before we standardised on Actuator, incident diagnosis meant SSHing into pods and scanning raw logs. Now the first step in any runbook is: check Actuator.

Three Actuator features that have saved us the most time in production incidents:

/actuator/info with git metadata — when a service behaves differently after a release, we check the git commit SHA and message in /info first. This immediately tells us whether the cause is a code change and who made it, cutting triage time from 30 minutes to under 2.
Custom PaymentGatewayHealthIndicator — our payment service calls a third-party BKASH API. When that API degrades, our custom health indicator marks the service as OUT_OF_SERVICE within 30 seconds, removing it from readiness and stopping traffic routing automatically.
/actuator/heapdump during memory incidents: we had a recurring memory leak in Q3 2025. Using a heap dump and Eclipse MAT we identified a ThreadLocal variable not being cleaned up in a custom filter. Without Actuator, that investigation would have required a maintenance window to attach a Java agent.

One governance rule we established early: expose the management port on 8081, not on the same port as the application. This lets us expose all endpoints freely within the cluster without any risk of external exposure:

management:
  server:
    port: 8081          # separate from app port 8080
  endpoints:
    web:
      exposure:
        include: "*"    # safe — port 8081 not exposed outside cluster
  endpoint:
    health:
      show-details: always
      probes:
        enabled: true
  info:
    git:
      mode: full
    build:
      enabled: true

Caching Health Results to Prevent Probe Overload

Kubernetes probes call /actuator/health every 5–10 seconds per pod. If your custom health indicators make external API calls on each invocation, a 10-pod deployment with 5-second intervals generates 120 health-check calls per minute to every checked dependency. This can trigger rate limiting on downstream services or create circular dependency failures where the health check itself causes the health check to fail.
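The arithmetic is worth making explicit, since the load scales with both pod count and the number of probes per pod. A small sketch follows. ProbeLoad is a hypothetical helper, and it assumes every probe triggers a live call to the dependency.

```java
/** Hypothetical helper estimating how often probes hit a checked dependency. */
final class ProbeLoad {

    /** Calls per minute reaching one dependency when every probe triggers a live check. */
    static long callsPerMinute(int pods, int periodSeconds, int probesPerPod) {
        if (periodSeconds <= 0) {
            throw new IllegalArgumentException("periodSeconds must be positive");
        }
        // Each probe fires 60 / periodSeconds times per minute on every pod
        return (long) pods * probesPerPod * 60 / periodSeconds;
    }
}
```

ProbeLoad.callsPerMinute(10, 5, 1) returns 120, matching the figure above. If both liveness and readiness probes ran the same check at that interval, the load would double to 240.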

Actuator can also cache an endpoint's response natively through its cache.time-to-live property:

management:
  endpoint:
    health:
      cache:
        time-to-live: 10s   # cache health results for 10 seconds

For expensive indicators that call external APIs, use a background-refresh pattern — the indicator returns a cached result instantly while a scheduler refreshes it in the background:

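import org.springframework.boot.actuate.health.Health;
import org.springframework.boot.actuate.health.HealthIndicator;
import org.springframework.http.ResponseEntity;
import org.springframework.scheduling.annotation.Scheduled;
import org.springframework.stereotype.Component;
import org.springframework.web.client.RestTemplate;

// Requires @EnableScheduling on a configuration class
@Component
public class AsyncExternalApiHealthIndicator implements HealthIndicator {

    // Same example URL as the PaymentGatewayHealthIndicator earlier in this post
    private static final String HEALTH_URL = "https://payment-api.example.com/health";

    private final RestTemplate restTemplate;
    private volatile Health cachedHealth = Health.unknown().build();

    public AsyncExternalApiHealthIndicator(RestTemplate restTemplate) {
        this.restTemplate = restTemplate;
    }

    @Scheduled(fixedDelay = 30_000)   // refresh every 30 seconds
    public void refreshHealth() {
        long start = System.nanoTime();
        try {
            ResponseEntity<Void> resp =
                restTemplate.getForEntity(HEALTH_URL, Void.class);
            long latencyMs = (System.nanoTime() - start) / 1_000_000;
            cachedHealth = resp.getStatusCode().is2xxSuccessful()
                ? Health.up().withDetail("latency", latencyMs + "ms").build()
                : Health.down().withDetail("status", resp.getStatusCode()).build();
        } catch (Exception e) {
            cachedHealth = Health.down(e).build();
        }
    }

    @Override
    public Health health() {
        return cachedHealth;   // returns instantly, never blocks Kubernetes
    }
}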

With this pattern, health() returns immediately regardless of external API latency, so a slow dependency cannot cause probe timeouts and unnecessary pod restarts. The trade-off is staleness: the reported status can lag the real one by up to the refresh interval (30 seconds here).

Key Takeaways

Use custom HealthIndicators for critical external dependencies
Separate liveness and readiness — liveness = "is the app alive?", readiness = "can it serve traffic?"
Never include slow external dependencies in liveness probes — cascading restarts will make your outage worse
Instrument business metrics with Micrometer counters and timers, not just technical metrics
Expose /info with build and git metadata — saves hours during incident triage
Secure sensitive endpoints — only expose health/info publicly, require ACTUATOR_ADMIN role for everything else
Build custom endpoints for operational tasks your team runs frequently (feature flags, circuit breakers)

Conclusion

Spring Boot Actuator transforms your microservices from black boxes into observable systems. The combination of custom health indicators, Micrometer business metrics, properly secured endpoints, and Kubernetes probe integration gives you production-grade observability with minimal boilerplate.

The patterns in this post—composite health groups, custom endpoints, git-enriched /info, and Prometheus integration—are what we use at BRAC IT across our 20+ microservices. They've saved us countless hours of incident investigation by making the state of every service immediately visible.

Next, dive deeper into distributed tracing with OpenTelemetry and JVM profiling with JFR to complete your observability stack.

Common Actuator Anti-Patterns to Avoid

After auditing many production Spring Boot services, these are the Actuator anti-patterns that appear most often:

Exposing all endpoints on the public port. The most dangerous anti-pattern. /actuator/heapdump returns the entire JVM heap — including passwords, tokens, and PII stored in memory — as a downloadable file. /actuator/env exposes all environment variables including credentials. Always run management on a separate internal port, or secure sensitive endpoints with Spring Security roles.

Using /health instead of dedicated liveness/readiness endpoints. The general /actuator/health endpoint aggregates all health indicators. If any one fails (even a non-critical one), the endpoint returns DOWN. Kubernetes will restart perfectly healthy pods because a non-critical indicator returned DOWN. Use the dedicated liveness and readiness group endpoints with explicitly scoped indicators.

Calling slow external APIs on every health check invocation. A health indicator that makes a synchronous HTTP call to an external service adds that service's latency (potentially hundreds of milliseconds) to every Kubernetes probe. Under load, this can cause probe timeouts. Use the async background-refresh pattern described in this post for any indicator that calls an external dependency.

No custom business metrics. The default Micrometer metrics cover JVM and HTTP. But "loan applications processed per minute" and "payment success rate" are the metrics your business stakeholders care about. If your Grafana dashboard only shows JVM heap and HTTP latency, you are running blind on business outcomes. Add at least 3–5 business-specific counters and timers to every service.

Ignoring the /actuator/info endpoint. Most teams expose health and metrics but never configure info. Two Maven plugin declarations give you build version, git commit SHA, and branch in every service, which is priceless during incident triage. There is no good reason not to configure it.
