DevOps · March 19, 2026 · 22 min read · DevOps Reliability Engineering Series

Kubernetes Operator Pattern: Building Custom Controllers for Stateful Applications

Kubernetes Deployments and StatefulSets handle stateless and basic stateful workloads well. But running databases, message brokers, and distributed datastores at production scale requires operational knowledge that can't be encoded in YAML — backup orchestration, primary election, rolling schema upgrades, disaster recovery. The Operator Pattern encodes this human operational knowledge as code.

Table of Contents

  1. The Problem with YAML-Driven StatefulSet Management
  2. The Operator Pattern: Controller + CRD
  3. The Reconciliation Loop
  4. Designing the Custom Resource (CRD)
  5. Implementing the Controller with Kubebuilder
  6. Real Stateful Application Scenarios
  7. Production Failure Scenarios
  8. Testing Operators: Unit, Integration, E2E
  9. Trade-offs and When NOT to Write an Operator
  10. Key Takeaways

1. The Problem with YAML-Driven StatefulSet Management

Consider deploying a Kafka cluster on Kubernetes. A StatefulSet can manage the pods, but it can't:

  • coordinate a version upgrade broker by broker, waiting for replication to stabilize between restarts
  • orchestrate scheduled backups, or take a backup before a destructive change like a scale-down
  • handle primary election or partition leadership rebalancing
  • drive disaster recovery from a backup

These operations require deep application-specific knowledge. Before Operators, teams encoded this knowledge in runbooks, manual scripts, and tribal knowledge. The Operator Pattern replaces the runbook with code — specifically, a Kubernetes controller that watches a Custom Resource and continuously reconciles the cluster state toward the desired state.

Real-world scale: Strimzi (Kafka Operator), PostgreSQL Operator by Zalando, MongoDB Community Operator, and Vitess Operator are all production examples managing thousands of clusters across organizations. These operators encode years of operational runbook knowledge into code.

2. The Operator Pattern: Controller + CRD

An Operator consists of two components:

  • a Custom Resource Definition (CRD) that extends the Kubernetes API with a new resource type (e.g. KafkaCluster), and
  • a custom controller that watches instances of that resource and reconciles the cluster toward the declared spec.

User applies: KafkaCluster YAML
                   ↓
Kubernetes API Server stores in etcd
                   ↓
Controller watches for KafkaCluster events (informer)
                   ↓
Reconcile loop runs: compare desired vs actual
                   ↓
Controller creates/updates/deletes: StatefulSets, Services,
ConfigMaps, PVCs, RBAC, NetworkPolicies
                   ↓
Updates KafkaCluster.Status (conditions, observedState)

3. The Reconciliation Loop

The reconciliation loop is the heart of every controller. It must be:

  • Idempotent: running it once or many times over the same state produces the same result.
  • Level-triggered: it compares desired vs. actual state on every run instead of reacting to individual event deltas, so missed events are harmless.
  • Non-blocking: long waits are expressed by requeueing (RequeueAfter) rather than sleeping inside the loop.
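To make idempotent, level-triggered reconciliation concrete, here is a minimal dependency-free sketch in plain Go. A hypothetical in-memory "cluster" stands in for the API server; the point is that the loop looks only at current state, so running it once or ten times converges to the same result:

```go
package main

import "fmt"

// Desired and observed state for a hypothetical broker cluster.
type Spec struct{ Replicas int }
type Cluster struct {
	Spec    Spec
	Brokers map[int]bool // broker ID -> exists
}

// reconcile compares desired vs. actual state and converges toward it.
// It inspects only the current state (level-triggered), never event
// deltas, so repeated runs are harmless (idempotent).
func reconcile(c *Cluster) {
	for id := 0; id < c.Spec.Replicas; id++ {
		if !c.Brokers[id] {
			c.Brokers[id] = true // create missing broker
		}
	}
	for id := range c.Brokers {
		if id >= c.Spec.Replicas {
			delete(c.Brokers, id) // remove excess broker
		}
	}
}

func main() {
	c := &Cluster{Spec: Spec{Replicas: 3}, Brokers: map[int]bool{0: true, 4: true}}
	reconcile(c)
	reconcile(c) // second run is a no-op
	fmt.Println(len(c.Brokers)) // 3
}
```

A real reconciler does the same thing with StatefulSets and Services instead of map entries, but the convergence logic is identical in shape.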

4. Designing the Custom Resource (CRD)

apiVersion: kafka.myorg.io/v1alpha1
kind: KafkaCluster
metadata:
  name: payments-kafka
  namespace: production
spec:
  version: "3.7.0"
  replicas: 3
  resources:
    requests:
      cpu: "2"
      memory: "8Gi"
    limits:
      cpu: "4"
      memory: "16Gi"
  storage:
    class: premium-ssd
    size: 500Gi
  config:
    defaultReplicationFactor: 3
    minInsyncReplicas: 2
    logRetentionHours: 168
  backup:
    enabled: true
    schedule: "0 3 * * *"          # Daily at 3 AM
    destination: s3://backups/kafka
  monitoring:
    prometheusEnabled: true
status:
  phase: Running                   # Pending | Initializing | Running | Degraded
  readyBrokers: 3
  conditions:
    - type: Available
      status: "True"
      lastTransitionTime: "2026-03-19T05:00:00Z"
    - type: BrokersDegraded
      status: "False"

CRD design principles: (1) Spec expresses desired state (the user's intent); Status expresses observed state. Never let users write to Status — it's owned by the controller. (2) Version your API from day one (v1alpha1 → v1beta1 → v1). Kubernetes CRD versioning with conversion webhooks handles schema evolution. (3) Validation: use CEL (Common Expression Language) validation rules in the CRD schema to reject invalid specs before they reach the controller.
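As a sketch of point (3), a CEL rule in the CRD's OpenAPI schema might look like this. The field names follow the KafkaCluster example above; the exact rule and message are illustrative:

```yaml
# Excerpt from the CRD's OpenAPI v3 schema (illustrative)
openAPIV3Schema:
  type: object
  properties:
    spec:
      type: object
      x-kubernetes-validations:
        # Reject specs where min ISR could never be satisfied
        - rule: "self.config.minInsyncReplicas <= self.replicas"
          message: "config.minInsyncReplicas must not exceed spec.replicas"
```

Because the API server evaluates the rule at admission time, an invalid spec is rejected before the controller ever sees it.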

5. Implementing the Controller with Kubebuilder

// Kubebuilder controller skeleton
//+kubebuilder:rbac:groups=kafka.myorg.io,resources=kafkaclusters,verbs=get;list;watch;create;update;patch;delete
//+kubebuilder:rbac:groups=apps,resources=statefulsets,verbs=get;list;watch;create;update;patch;delete

func (r *KafkaClusterReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error) {
    log := log.FromContext(ctx)

    // 1. Fetch the KafkaCluster instance
    cluster := &kafkav1alpha1.KafkaCluster{}
    if err := r.Get(ctx, req.NamespacedName, cluster); err != nil {
        return ctrl.Result{}, client.IgnoreNotFound(err)
    }

    // 2. Handle deletion via finalizer
    if !cluster.DeletionTimestamp.IsZero() {
        return r.handleDeletion(ctx, cluster)
    }

    // 3. Reconcile StatefulSet
    if result, err := r.reconcileStatefulSet(ctx, cluster); err != nil || result.Requeue {
        return result, err
    }

    // 4. Reconcile Services
    if result, err := r.reconcileServices(ctx, cluster); err != nil || result.Requeue {
        return result, err
    }

    // 5. Wait for brokers to be ready
    ready, err := r.checkBrokerReadiness(ctx, cluster)
    if err != nil {
        return ctrl.Result{}, err
    }
    if !ready {
        log.Info("Brokers not ready yet, requeueing")
        return ctrl.Result{RequeueAfter: 15 * time.Second}, nil
    }

    // 6. Update status
    return r.updateStatus(ctx, cluster)
}
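Step 2 above relies on a finalizer so the operator can clean up external resources before Kubernetes garbage-collects the object. controller-runtime ships helpers for this (`controllerutil.ContainsFinalizer`, `AddFinalizer`, `RemoveFinalizer`); the core of `handleDeletion` reduces to string-slice bookkeeping over `metadata.finalizers`, sketched here without any Kubernetes dependencies (the finalizer name is hypothetical):

```go
package main

import "fmt"

const kafkaFinalizer = "kafka.myorg.io/cleanup" // hypothetical finalizer name

// containsFinalizer mirrors what controllerutil.ContainsFinalizer
// does against ObjectMeta.Finalizers.
func containsFinalizer(finalizers []string, f string) bool {
	for _, x := range finalizers {
		if x == f {
			return true
		}
	}
	return false
}

// removeFinalizer returns the list with f removed; once the update is
// persisted, Kubernetes is free to delete the object.
func removeFinalizer(finalizers []string, f string) []string {
	var out []string
	for _, x := range finalizers {
		if x != f {
			out = append(out, x)
		}
	}
	return out
}

func main() {
	finalizers := []string{kafkaFinalizer, "other.io/keep"}
	if containsFinalizer(finalizers, kafkaFinalizer) {
		// In a real controller: deregister brokers, release external
		// resources (e.g. backups, DNS), then drop the finalizer.
		finalizers = removeFinalizer(finalizers, kafkaFinalizer)
	}
	fmt.Println(finalizers) // [other.io/keep]
}
```

The crucial ordering: perform cleanup first, remove the finalizer last, so a crash mid-cleanup leaves the finalizer in place and the next reconcile retries.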

6. Real Stateful Application Scenarios

Scenario: Safe Rolling Upgrade

When the user updates spec.version from 3.6 to 3.7, the operator can't just do a rolling restart like a Deployment would. It must: (1) Verify that all partitions have sufficient ISR replicas before touching any broker, (2) Upgrade one broker at a time, (3) Wait for the upgraded broker to re-join and ISR to stabilize before upgrading the next, (4) Roll back all upgraded brokers if any step fails. This multi-step coordinated upgrade is impossible with native StatefulSet rolling update logic.
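The steps above can be sketched as a sequential loop behind a small interface. This is a simplified illustration, not Strimzi's actual logic: the `BrokerAdmin` interface, its method names, and the fake implementation are all hypothetical stand-ins for calls a real operator would make to the Kafka admin API:

```go
package main

import (
	"errors"
	"fmt"
)

// BrokerAdmin abstracts the cluster operations the upgrade needs;
// a real implementation would call the Kafka admin API.
type BrokerAdmin interface {
	UnderReplicatedPartitions() (int, error) // ISR health across the cluster
	Version(broker int) (string, error)
	Upgrade(broker int, version string) error
}

// upgradeOneByOne upgrades brokers sequentially, checking ISR health
// before touching each broker, and returns on the first failure so the
// caller can decide whether to roll back.
func upgradeOneByOne(a BrokerAdmin, brokers []int, target string) error {
	for _, b := range brokers {
		v, err := a.Version(b)
		if err != nil {
			return err
		}
		if v == target {
			continue // already upgraded: a restarted loop skips finished work
		}
		urp, err := a.UnderReplicatedPartitions()
		if err != nil {
			return err
		}
		if urp > 0 {
			return errors.New("under-replicated partitions present; refusing to upgrade")
		}
		if err := a.Upgrade(b, target); err != nil {
			return err
		}
	}
	return nil
}

// fakeAdmin is an in-memory stand-in used for demonstration.
type fakeAdmin struct{ versions map[int]string }

func (f *fakeAdmin) UnderReplicatedPartitions() (int, error) { return 0, nil }
func (f *fakeAdmin) Version(b int) (string, error)           { return f.versions[b], nil }
func (f *fakeAdmin) Upgrade(b int, v string) error           { f.versions[b] = v; return nil }

func main() {
	a := &fakeAdmin{versions: map[int]string{0: "3.6", 1: "3.7", 2: "3.6"}}
	err := upgradeOneByOne(a, []int{0, 1, 2}, "3.7")
	fmt.Println(err, a.versions) // <nil> map[0:3.7 1:3.7 2:3.7]
}
```

Note the `continue` on already-upgraded brokers: it is what lets a restarted controller rerun the whole loop safely.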

Scenario: Automatic Backup Before Destructive Operation

When a user decreases spec.replicas from 5 to 3, the operator recognizes this as a destructive scale-down. Before proceeding, it triggers an immediate backup (creating a KafkaBackup CR which the operator also manages), waits for the backup to complete successfully, then proceeds with the scale-down. If the backup fails, it blocks the scale-down and sets a status condition explaining why.
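The decision logic in that paragraph can be written as a pure function, which keeps it unit-testable. A minimal sketch, with hypothetical names (`nextAction`, the `BackupState` values) standing in for whatever the real operator derives from its KafkaBackup CR:

```go
package main

import "fmt"

type BackupState string

const (
	BackupNone      BackupState = "None"
	BackupRunning   BackupState = "Running"
	BackupSucceeded BackupState = "Succeeded"
	BackupFailed    BackupState = "Failed"
)

// nextAction decides what the reconciler should do when desired
// replicas differ from current replicas.
func nextAction(desired, current int, backup BackupState) string {
	if desired >= current {
		return "scale" // scale-up or no-op is never destructive
	}
	switch backup {
	case BackupNone:
		return "trigger-backup" // create a KafkaBackup CR first
	case BackupRunning:
		return "wait" // requeue until the backup finishes
	case BackupSucceeded:
		return "scale" // safe to remove brokers now
	default: // BackupFailed
		return "block" // set a status condition and refuse to scale down
	}
}

func main() {
	fmt.Println(nextAction(3, 5, BackupNone))      // trigger-backup
	fmt.Println(nextAction(3, 5, BackupSucceeded)) // scale
	fmt.Println(nextAction(3, 5, BackupFailed))    // block
}
```

Each reconcile re-evaluates this function from scratch, so the multi-step gate needs no in-memory workflow state.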

7. Production Failure Scenarios

Failure: Controller Restart During Multi-Step Operation

If the controller pod restarts mid-upgrade (step 2 of 5), the reconciliation loop restarts from scratch. Idempotent reconciliation means it re-checks the current state and determines which brokers have been upgraded vs. which haven't — and continues from the correct point. Stateful multi-step operations must be encoded in the Custom Resource's Status (current step, checkpoints) so restarts can resume rather than restart from zero.
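A sketch of such a Status checkpoint and the resume logic built on it (the field names are illustrative, not a real operator's schema):

```go
package main

import "fmt"

// UpgradeCheckpoint is the kind of progress record a controller would
// persist in the CR's Status so a restart can resume.
type UpgradeCheckpoint struct {
	TargetVersion   string
	UpgradedBrokers []int
}

// resumeFrom returns the brokers still needing the upgrade, derived
// from the persisted checkpoint rather than from in-memory state, so a
// freshly restarted controller continues exactly where the previous
// process stopped.
func resumeFrom(all []int, cp UpgradeCheckpoint) []int {
	done := make(map[int]bool, len(cp.UpgradedBrokers))
	for _, b := range cp.UpgradedBrokers {
		done[b] = true
	}
	var remaining []int
	for _, b := range all {
		if !done[b] {
			remaining = append(remaining, b)
		}
	}
	return remaining
}

func main() {
	cp := UpgradeCheckpoint{TargetVersion: "3.7.0", UpgradedBrokers: []int{0, 1}}
	fmt.Println(resumeFrom([]int{0, 1, 2, 3, 4}, cp)) // [2 3 4]
}
```

Where possible, prefer deriving progress from observable state (e.g. each broker's reported version) over a stored checkpoint; the checkpoint is for steps whose completion cannot be observed after the fact.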

Failure: Status Desync After etcd Compaction

etcd compaction can cause controllers to miss watch events. The controller's informer cache re-syncs on a schedule, but there is a window in which Status may not reflect actual state. Implement a periodic status health check: every 5 minutes, regardless of events, verify that Status.ReadyBrokers matches the actual ready pod count, and correct any divergence immediately.
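The drift check itself is trivial once isolated; a real controller would run it on a timer (for example by returning RequeueAfter from every reconcile, or via the manager's cache sync period) in addition to event-driven reconciles. A dependency-free sketch, with a hypothetical function name:

```go
package main

import "fmt"

// detectDrift compares what Status claims with what is actually
// running, returning whether Status must be corrected and the value
// it should be corrected to.
func detectDrift(statusReadyBrokers, actualReadyPods int) (drift bool, corrected int) {
	if statusReadyBrokers == actualReadyPods {
		return false, statusReadyBrokers
	}
	return true, actualReadyPods
}

func main() {
	drift, corrected := detectDrift(3, 2)
	fmt.Println(drift, corrected) // true 2
}
```

The observed pod count, not the cached Status, is always the source of truth when they disagree.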

8. Testing Operators: Unit, Integration, E2E

Test at three levels. Unit tests cover pure reconciliation helpers with no cluster at all. Integration tests use envtest, which runs a real kube-apiserver and etcd (but no kubelet), to assert that a reconcile creates the expected StatefulSets, Services, and ConfigMaps. End-to-end tests run the operator against a kind cluster and exercise the full lifecycle: install, upgrade, scale, failure injection, deletion. Chaos tests that kill the controller mid-operation verify that restarts resume correctly.
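The cheapest of the three layers is plain table-driven unit tests of pure helpers factored out of the reconciler. A dependency-free sketch (the helper name and its clamping policy are hypothetical):

```go
package main

import "fmt"

// desiredReplicas is a hypothetical pure helper factored out of the
// reconciler; functions like this are testable without any cluster.
func desiredReplicas(specReplicas int, paused bool) int {
	if paused {
		return 0
	}
	if specReplicas < 1 {
		return 1 // never let a typo scale the cluster to zero brokers
	}
	return specReplicas
}

func main() {
	// Table-driven check, the same shape a `go test` function would use.
	cases := []struct {
		spec   int
		paused bool
		want   int
	}{
		{3, false, 3},
		{0, false, 1},
		{5, true, 0},
	}
	for _, c := range cases {
		if got := desiredReplicas(c.spec, c.paused); got != c.want {
			panic(fmt.Sprintf("desiredReplicas(%d,%v)=%d, want %d", c.spec, c.paused, got, c.want))
		}
	}
	fmt.Println("all cases pass")
}
```

The more logic you can push into pure functions like this, the less you depend on slow envtest and kind runs to catch regressions.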

9. Trade-offs and When NOT to Write an Operator

Operator Maturity Model: The Operator Framework defines 5 maturity levels: (1) Basic Install, (2) Seamless Upgrades, (3) Full Lifecycle, (4) Deep Insights, (5) Auto Pilot. Most custom operators stop at Level 3. Level 5 requires the operator to proactively recommend or implement optimizations based on workload patterns.

10. Key Takeaways

  • The Operator Pattern encodes operational runbook knowledge (backup, upgrade, failover) as code in a Kubernetes controller.
  • Reconciliation loops must be idempotent and level-triggered — they compare desired vs. actual state, not deltas between events.
  • CRD Spec = user-desired state (never written by controller); CRD Status = observed state (owned by controller).
  • Multi-step operations must checkpoint progress in Status so controller restarts resume correctly.
  • Test with envtest for unit/integration tests; kind-based E2E tests for full lifecycle validation; chaos testing for resilience.
  • Before writing a custom operator, evaluate existing community operators (Strimzi, Zalando, etc.) — they encode years of production expertise.

Conclusion

Kubernetes Operators represent the evolution from "infrastructure managed by YAML" to "infrastructure managed by code." For teams running stateful systems at production scale, the investment in a well-designed operator pays for itself within a few months in reduced operational incidents and faster onboarding of new cluster instances.

Start by understanding the reconciliation model deeply — it's the conceptual foundation for everything else. Then pick a scaffolding framework: Kubebuilder is a solid default for new Go projects (the Operator SDK's Go workflow is itself built on Kubebuilder). Write your first operator for the smallest, simplest use case to build intuition before tackling complex stateful workloads.

Md Sanwar Hossain

Software Engineer · Java · Spring Boot · Kubernetes · DevOps · Distributed Systems
