Kubernetes Secrets Management Done Right: External Secrets Operator, AWS SSM & Vault Integration
A fintech startup was managing 47 microservices on Kubernetes. Their secrets were base64-encoded values committed directly to their GitOps repository. One developer accidentally pushed a Helm values file with real database credentials to a public fork. The secret sat in git history for 23 hours before anyone noticed. The fix was External Secrets Operator (ESO) with AWS SSM Parameter Store — secrets never live in git, never in Kubernetes etcd unencrypted, and rotate automatically. This guide shows you exactly how to build that system.
Table of Contents
- The Real Problem: Kubernetes Secrets Are Not Secrets
- The Kubernetes Secret Management Landscape
- External Secrets Operator Architecture
- Setting Up ESO with AWS SSM Parameter Store
- HashiCorp Vault Integration
- Secret Rotation Without Pod Restarts
- GitOps with ESO: The Complete Workflow
- Failure Scenarios
- Security Hardening
- Key Takeaways
- Conclusion
1. The Real Problem: Kubernetes Secrets Are Not Secrets
The name is dangerously misleading. Kubernetes Secrets are base64-encoded by default, not encrypted. Any user with kubectl get secret access in a namespace can decode every value in under a second:
# This is all it takes to expose a "secret"
kubectl get secret db-credentials -o jsonpath='{.data.password}' | base64 -d
# Output: MyProductionPassword123!
The situation gets worse when secrets leak into Git. In GitOps workflows, Helm values files, kustomization.yaml overlays, and plain YAML manifests all tempt developers to inline credentials. Even if the secret is removed in a later commit, git history retains it forever unless you force-push and rewrite history — a destructive operation most teams never complete before the window closes.
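If a credential does land in git history, deleting the file in a later commit is not enough: the old blob must be rewritten out of history, and the credential must be rotated regardless, since it was exposed. A minimal sketch using git-filter-repo (the file path is a placeholder for whichever file leaked):

```shell
# Rewrite history to drop the leaked file from every commit
# (destructive: coordinate with the whole team before doing this)
git filter-repo --invert-paths --path helm/values-prod.yaml

# git filter-repo removes the 'origin' remote as a safety measure; re-add it
git remote add origin git@github.com:myorg/k8s-manifests.git

# Force-push the rewritten history; every collaborator must re-clone
git push --force --all
git push --force --tags
```

Even after this, forks and clones made during the exposure window still hold the old history, which is why rotating the credential itself is the only reliable remediation.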
This is not an isolated incident. The OWASP Kubernetes Top 10 lists Secrets Management Failures (K08) among the most critical Kubernetes risks. The industry consensus is clear: secrets must live in dedicated secret stores — AWS SSM Parameter Store, HashiCorp Vault, GCP Secret Manager — with Kubernetes acting only as a consumer, never as the source of truth for sensitive values. The mechanism that enforces this boundary in a GitOps-native way is the External Secrets Operator.
2. The Kubernetes Secret Management Landscape
Before committing to ESO, understand the full landscape of options available and the trade-offs each introduces:
- Kubernetes native Secrets — the baseline. Base64-encoded, stored in etcd, accessible to anyone with RBAC permission. Acceptable only for non-sensitive configuration that happens to use the Secret API for consistency.
- Sealed Secrets (Bitnami) — encrypts secrets in Git using a cluster-side key. Better than plaintext, but the key pair lives in the cluster. Key rotation requires re-sealing every secret. Works well for small teams with a single cluster and no multi-cloud requirements.
- HashiCorp Vault + Agent Injector — a full-featured secret store with dynamic secrets, lease-based access, and fine-grained audit logging. High operational complexity; requires a production-grade Vault cluster. Best for enterprises with multi-cloud or compliance-heavy environments.
- External Secrets Operator (ESO) — CNCF project. Cloud-agnostic, GitOps-compatible, supports 30+ secret store backends. Secrets never touch Git. Syncs to Kubernetes Secrets automatically with configurable refresh intervals. The de facto standard in 2025.
- AWS Secrets Manager CSI Driver — mounts secrets directly as files in pod volumes via the Secrets Store CSI Driver. No Kubernetes Secret object created at all. Useful for compliance scenarios where you must avoid etcd entirely, but less flexible for environment variable injection.
Why ESO wins in 2025: It is cloud-agnostic (same CRD syntax whether you use AWS, GCP, Azure, or Vault), it is GitOps-compatible (only the ExternalSecret CRD lives in git — never the actual values), and its automatic refresh interval means secrets rotate in Kubernetes without any deployment or pod restart. With CNCF backing and 30+ supported backends, it has replaced bespoke operator solutions across the industry.
3. External Secrets Operator Architecture
ESO introduces two new custom resource definitions (CRDs) into your cluster and a controller that reconciles them continuously. The data flow is deliberately simple:
Secret Store (AWS SSM / HashiCorp Vault / GCP Secret Manager)
│
│ ESO Operator watches ExternalSecret CRD
▼
ExternalSecret CRD (lives in Git — no secret values)
│
│ ESO fetches secret values from the store
▼
Kubernetes Secret (created/updated by ESO — lives in etcd)
│
│ Pod mounts or env-injects the Kubernetes Secret
▼
Application Pod
The SecretStore CRD defines the connection and authentication parameters for your external secret backend — think of it as a configured client for AWS SSM or Vault. The ExternalSecret CRD declares which keys to fetch from that store and how to map them into a Kubernetes Secret. The ESO controller runs a reconciliation loop on each ExternalSecret, fetching values from the backend and writing them into the corresponding Kubernetes Secret. When the refreshInterval elapses, ESO fetches again — so updated values in AWS SSM appear in Kubernetes within minutes, without touching your deployments.
A ClusterSecretStore is a cluster-scoped variant of SecretStore, accessible from any namespace. Use ClusterSecretStore for shared infrastructure backends and namespace-scoped SecretStore when you need strict per-team isolation of AWS credentials or Vault tokens.
4. Setting Up ESO with AWS SSM Parameter Store
Install ESO with Helm first, then define your SecretStore and ExternalSecret resources:
helm repo add external-secrets https://charts.external-secrets.io
helm install external-secrets external-secrets/external-secrets \
--namespace external-secrets \
--create-namespace \
--set installCRDs=true
With ESO installed, create the SecretStore that connects to AWS SSM Parameter Store. For production, always prefer IRSA (IAM Roles for Service Accounts) over static access keys — see Section 9 for details. The example below uses secretRef-based auth for clarity:
# 1. SecretStore — connects to AWS SSM
apiVersion: external-secrets.io/v1beta1
kind: SecretStore
metadata:
  name: aws-ssm-store
  namespace: production
spec:
  provider:
    aws:
      service: ParameterStore
      region: us-east-1
      auth:
        secretRef:
          accessKeyIDSecretRef:
            name: aws-credentials
            key: access-key-id
          secretAccessKeySecretRef:
            name: aws-credentials
            key: secret-access-key
# 2. ExternalSecret — pull specific params
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: payment-service-secrets
  namespace: production
spec:
  refreshInterval: 1h  # Re-sync from SSM every hour
  secretStoreRef:
    name: aws-ssm-store
    kind: SecretStore
  target:
    name: payment-secrets  # Creates this Kubernetes Secret
    creationPolicy: Owner
  data:
    - secretKey: DB_PASSWORD
      remoteRef:
        key: /production/payment-service/db-password
    - secretKey: STRIPE_API_KEY
      remoteRef:
        key: /production/payment-service/stripe-key
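For reference, the parameters this ExternalSecret reads are created out-of-band, for example with the AWS CLI (a sketch; the value is a placeholder, and SecureString parameters are encrypted with your account's KMS key):

```shell
# Store the database password as an encrypted SecureString parameter
aws ssm put-parameter \
  --name /production/payment-service/db-password \
  --type SecureString \
  --value 'REPLACE_ME' \
  --overwrite

# Verify (requires ssm:GetParameter plus kms:Decrypt on the KMS key)
aws ssm get-parameter \
  --name /production/payment-service/db-password \
  --with-decryption \
  --query Parameter.Value \
  --output text
```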
ESO creates the payment-secrets Kubernetes Secret automatically and keeps it in sync with AWS SSM. Verify the sync status with:
kubectl get externalsecret payment-service-secrets -n production
# NAME STORE REFRESH INTERVAL STATUS
# payment-service-secrets aws-ssm-store 1h SecretSynced
You can also use dataFrom to bulk-import all parameters under a path prefix, which is useful for services with many secrets:
spec:
  dataFrom:
    - extract:
        key: /production/payment-service  # imports ALL params under this prefix
        decodingStrategy: None
5. HashiCorp Vault Integration
Choose Vault over AWS SSM when you need multi-cloud portability, dynamic database credentials with short TTLs, or lease-based access revocation. Vault's database secrets engine generates unique, time-limited credentials per service identity — when a credential expires, Vault refuses to renew it unless the lease is actively maintained, giving you automatic breach containment.
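As a rough sketch of what this looks like on the Vault side (the connection URL, role names, and database paths are illustrative), the database secrets engine is configured once and then mints short-lived credentials on demand:

```shell
# Enable the database secrets engine
vault secrets enable database

# Configure a PostgreSQL connection Vault can use to create users
vault write database/config/orders-db \
    plugin_name=postgresql-database-plugin \
    connection_url="postgresql://{{username}}:{{password}}@orders-db:5432/orders" \
    allowed_roles="order-service" \
    username="vault-admin" \
    password="REPLACE_ME"

# Define a role: each read mints a unique user with a 1-hour TTL
vault write database/roles/order-service \
    db_name=orders-db \
    creation_statements="CREATE ROLE \"{{name}}\" WITH LOGIN PASSWORD '{{password}}' VALID UNTIL '{{expiration}}';" \
    default_ttl=1h \
    max_ttl=24h

# Each read returns a fresh, unique username/password pair
vault read database/creds/order-service
```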
Configure a ClusterSecretStore for Vault with token authentication:
apiVersion: external-secrets.io/v1beta1
kind: ClusterSecretStore
metadata:
  name: vault-backend
spec:
  provider:
    vault:
      server: "https://vault.internal.example.com"
      path: "secret"
      version: "v2"
      auth:
        tokenSecretRef:
          name: vault-token
          namespace: external-secrets
          key: token
Then reference the ClusterSecretStore in an ExternalSecret to pull from a Vault KV path:
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: order-service-vault-secrets
  namespace: production
spec:
  refreshInterval: 30m
  secretStoreRef:
    name: vault-backend
    kind: ClusterSecretStore
  target:
    name: order-service-secrets
    creationPolicy: Owner
  data:
    - secretKey: DB_USERNAME
      remoteRef:
        key: production/order-service  # relative to the store's "secret" mount path
        property: db_username
    - secretKey: DB_PASSWORD
      remoteRef:
        key: production/order-service
        property: db_password
For dynamic database credentials, Vault's database secrets engine generates a new username and password pair on each lease request, with a configurable TTL (e.g., 1 hour). ESO fetches the current credential at each refresh interval. When Vault rotates the credential after TTL expiry, ESO fetches the new values and updates the Kubernetes Secret — without any manual intervention or pod restart required (see Section 6 for pod reload strategies).
6. Secret Rotation Without Pod Restarts
ESO updates the Kubernetes Secret when a rotation happens, but running pods do not automatically reload environment variables from updated Secrets. You have two main strategies for zero-downtime rotation:
Strategy 1 — Stakater Reloader annotation. Deploy Stakater/Reloader alongside ESO. Annotate your Deployment with reloader.stakater.com/auto: "true". Reloader watches for changes to Secrets and ConfigMaps and triggers a rolling restart of annotated Deployments automatically. The restart is rolling — zero downtime, old pods serve traffic until new pods are healthy:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: payment-service
  annotations:
    reloader.stakater.com/auto: "true"  # Reloader watches this deployment
spec:
  template:
    spec:
      containers:
        - name: payment-service
          envFrom:
            - secretRef:
                name: payment-secrets  # ESO-managed secret
Strategy 2 — Volume-mounted secrets with inotify. Mount the Kubernetes Secret as a volume file instead of environment variables. Modern applications using frameworks like Spring Boot can watch mounted secret files and reload their configuration at runtime without a pod restart. This is the preferred approach for high-availability services where even a rolling restart is costly.
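A sketch of the volume-mount approach, reusing names from the earlier examples (the mount path and the application's ability to watch files for changes are assumptions about your setup):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: payment-service
spec:
  template:
    spec:
      containers:
        - name: payment-service
          volumeMounts:
            - name: secrets
              mountPath: /etc/secrets  # app watches files here and reloads config
              readOnly: true
      volumes:
        - name: secrets
          secret:
            secretName: payment-secrets  # ESO-managed secret
```

The kubelet refreshes secret volume contents after the Secret object changes (subject to its sync period), so the files under /etc/secrets update without any restart.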
RefreshInterval tuning. Setting refreshInterval: 1m on a large cluster with many ExternalSecrets will burn through your AWS API rate limit quickly. AWS SSM has a default throughput limit of 40 GetParameter calls per second per region. A cluster with 200 ExternalSecrets refreshing every minute issues about 3.3 calls/sec — safe. At a 10-second interval that becomes 20 calls/sec, and you approach throttling during node scaling events, when many syncs fire at once. Use refreshInterval: 1h for stable secrets and refreshInterval: 5m only for credentials you rotate frequently.
ESO Prometheus metrics. ESO exposes metrics for rotation monitoring, including counters for total and failed sync calls (externalsecret_sync_calls_total and externalsecret_sync_calls_error in recent releases; verify the exact names against your installed version's /metrics endpoint). Alert when the error counter increases to catch sync failures before they cascade into application errors; the total counter is useful for SLO tracking.
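Alerting on the error counter can be wired up with a PrometheusRule (a sketch assuming the Prometheus Operator is installed; the metric name should be checked against your ESO version before deploying):

```yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: eso-sync-failures
  namespace: external-secrets
spec:
  groups:
    - name: external-secrets
      rules:
        - alert: ExternalSecretSyncFailing
          # Fires when any sync errors occurred in the last 15 minutes
          expr: increase(externalsecret_sync_calls_error[15m]) > 0
          for: 10m
          labels:
            severity: warning
          annotations:
            summary: "ESO sync calls are failing; secrets may be going stale"
```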
7. GitOps with ESO: The Complete Workflow
The key discipline in a GitOps + ESO workflow is a strict division of what lives where:
- Lives in Git: SecretStore YAML, ExternalSecret YAML, Deployment/Service manifests. Zero actual secret values. Any developer can read the repository without gaining access to credentials.
- Lives in AWS SSM / Vault: Actual secret values. Access controlled by IAM policies or Vault ACL policies. Changes audited by AWS CloudTrail or Vault audit logs.
- Lives in Kubernetes etcd (ephemeral): The Kubernetes Secret objects created by ESO. These are considered derived state — they can be deleted and ESO will recreate them. Never back them up; they are not the source of truth.
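The derived-state property is easy to demonstrate: delete an ESO-managed Secret and the controller recreates it on its next reconciliation (a sketch against the earlier payment-service example; how quickly it reappears depends on your refreshInterval and the controller's requeue behavior):

```shell
# Delete the ESO-managed Secret...
kubectl delete secret payment-secrets -n production

# ...then watch ESO recreate it from the values in AWS SSM
kubectl get secret payment-secrets -n production --watch
```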
With ArgoCD, configure your Application to sync the SecretStore and ExternalSecret resources but explicitly exclude the Kubernetes Secrets that ESO creates. Otherwise ArgoCD will detect drift (ESO updates the Secret value, ArgoCD sees the live object differs from git) and either revert it or alert incorrectly:
# ArgoCD Application — exclude ESO-managed Secrets from sync
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: payment-service
spec:
  source:
    repoURL: https://github.com/myorg/k8s-manifests
    path: services/payment-service
  syncPolicy:
    syncOptions:
      - RespectIgnoreDifferences=true
  ignoreDifferences:
    - group: ""
      kind: Secret
      name: payment-secrets  # ESO manages this — ignore drift on its data
      jsonPointers:
        - /data
Audit trail: Every secret access through AWS SSM is recorded in AWS CloudTrail with the IAM principal, timestamp, parameter name, and request source IP. For Vault, the audit log captures every read, write, and token renewal. This creates a complete, immutable audit trail for compliance — something impossible with etcd-stored Kubernetes Secrets where access logging requires additional tooling.
8. Failure Scenarios
Understanding the failure envelope prevents production surprises. Here are the four most common ESO operational failures:
| Failure | Symptom | Fix |
|---|---|---|
| IAM permission denied | ExternalSecret status: SecretSyncedError | Add ssm:GetParameter and ssm:GetParameters to the ESO IRSA role policy |
| Secret not found in SSM | Status: SecretNotFound | Verify the exact parameter path and region in the ExternalSecret remoteRef.key |
| Refresh throttled by AWS | Sync failures every refresh cycle | Increase refreshInterval (e.g. to 1h) and stagger intervals across ExternalSecrets to spread the API calls |
| ESO operator crash | All secrets stale; no new ExternalSecrets synced | Restore the ESO RBAC ClusterRole and restart the operator pod; existing Kubernetes Secrets remain intact |
A critical operational property: ESO operator crashes do not delete existing Kubernetes Secrets. Pods continue running with the last-synced values. Only new sync requests and value updates are blocked until the operator recovers. This means a short ESO outage has zero user impact — though you should alert on it immediately to prevent secrets from becoming stale during rotation windows.
9. Security Hardening
ESO dramatically improves your secrets posture, but a poorly configured ESO cluster can create new attack surfaces. Apply these hardening practices from day one:
Use IRSA, not access keys, for AWS authentication. IAM Roles for Service Accounts (IRSA) binds an IAM role to the ESO ServiceAccount via OIDC federation. No access key is stored anywhere — the temporary STS credential is injected by the AWS SDK automatically. This eliminates the credential-in-a-Secret bootstrap problem entirely:
# SecretStore with IRSA — no access keys stored
apiVersion: external-secrets.io/v1beta1
kind: SecretStore
metadata:
  name: aws-ssm-irsa-store
  namespace: production
spec:
  provider:
    aws:
      service: ParameterStore
      region: us-east-1
      auth:
        jwt:
          serviceAccountRef:
            name: external-secrets-sa  # ServiceAccount annotated with IRSA role ARN
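On EKS, the IRSA binding itself can be sketched with eksctl (the cluster name, account ID, and policy ARN are placeholders; the command creates the IAM role, attaches the policy, and annotates the ServiceAccount in one step):

```shell
# Create an IAM role bound to the ESO ServiceAccount via OIDC federation
eksctl create iamserviceaccount \
  --cluster my-cluster \
  --namespace production \
  --name external-secrets-sa \
  --attach-policy-arn arn:aws:iam::123456789012:policy/eso-ssm-read \
  --approve

# Inspect the resulting role-ARN annotation on the ServiceAccount
kubectl get sa external-secrets-sa -n production \
  -o jsonpath='{.metadata.annotations.eks\.amazonaws\.com/role-arn}'
```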
Namespace-scope your SecretStores. Use namespace-scoped SecretStore (not ClusterSecretStore) for team-specific secret namespaces. This ensures a misconfigured ExternalSecret in the staging namespace cannot access production secrets even if the IAM path allows it — the SecretStore itself is scoped.
Encrypt etcd at rest. Even with ESO, Kubernetes Secrets still land in etcd. Enable etcd encryption at rest using a KMS provider key (AWS KMS, GCP CMEK). This is defense in depth — if etcd backups are leaked, the data is useless without the KMS key.
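For self-managed control planes, this is done by passing an EncryptionConfiguration to the kube-apiserver via --encryption-provider-config (a sketch; the KMS plugin socket path is illustrative, and on managed offerings like EKS you enable envelope encryption with a KMS key through the cloud provider instead):

```yaml
apiVersion: apiserver.config.k8s.io/v1
kind: EncryptionConfiguration
resources:
  - resources:
      - secrets
    providers:
      - kms:
          apiVersion: v2
          name: aws-kms
          endpoint: unix:///var/run/kmsplugin/socket.sock  # KMS plugin socket
          timeout: 3s
      - identity: {}  # fallback so pre-existing unencrypted Secrets stay readable
```

After enabling, existing Secrets must be rewritten once (e.g. kubectl get secrets --all-namespaces -o json | kubectl replace -f -) so they are re-stored in encrypted form.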
Audit log every ExternalSecret sync. Configure your ESO Deployment to emit structured JSON logs and ship them to your SIEM. Each sync log entry includes the ExternalSecret name, namespace, store, and sync timestamp. Combined with AWS CloudTrail, this gives you a full chain-of-custody audit from secret update in SSM to pod environment variable.
"Secrets management is not a feature you add to a cluster — it is a discipline you build into every workflow from the first commit. The best time to adopt ESO was when you created the cluster. The second best time is today."
— Platform Security Engineering principle
Key Takeaways
- Kubernetes Secrets are base64, not encrypted — treat them as plaintext and never commit values to Git. The git history problem is permanent unless you force-push, which most teams never complete in time.
- External Secrets Operator is the GitOps-native solution — ExternalSecret and SecretStore CRDs live in Git; actual values live in AWS SSM or Vault. ESO bridges the gap without exposing credentials anywhere in your source control.
- IRSA eliminates the bootstrap credential problem — never store AWS access keys in a Kubernetes Secret to authenticate ESO. Use IRSA to bind an IAM role to the ESO ServiceAccount via OIDC instead.
- ESO operator crashes are non-fatal — existing Kubernetes Secrets remain intact when the operator is down. Alert immediately, but do not panic: pods continue running with last-synced values until the operator recovers.
- Combine ESO with Stakater Reloader and etcd encryption for defense in depth — ESO removes secrets from Git, Reloader handles zero-downtime rotation, and etcd encryption ensures stored Secrets are unreadable even if backup media is compromised.
Conclusion
The fintech incident — 47 services, one bad push, 23 hours of exposed credentials — is entirely preventable with External Secrets Operator. Once ESO is in place, there are no secret values in your Git repository, no manual rotation procedures, and no silent exposure windows. The ExternalSecret CRD gives developers a clean, GitOps-compatible interface to declare what secrets their service needs without ever seeing the actual values.
ESO is only one layer of your Kubernetes security posture. To ensure that service identities cannot abuse the secrets they receive, pair ESO with a robust RBAC model. Our Kubernetes RBAC Security guide covers role scoping, least-privilege service accounts, and audit logging to create a complete access control layer that complements your secrets management strategy.
Related Posts
Kubernetes RBAC Security
Lock down cluster access with least-privilege RBAC roles, service account hardening, and audit logging.
GitOps with ArgoCD
Implement declarative, Git-driven Kubernetes deployments with ArgoCD sync policies and app-of-apps patterns.
Microservices Security Patterns
Secure inter-service communication with mTLS, JWT validation, and zero-trust network policies.
Last updated: March 2026 — Written by Md Sanwar Hossain