Kubernetes Event-Driven Autoscaling (KEDA)

Implementing advanced event-driven autoscaling solutions in Kubernetes to scale workloads based on external metrics and event sources

Introduction to Event-Driven Autoscaling

Traditional Kubernetes autoscaling mechanisms like Horizontal Pod Autoscaler (HPA) primarily rely on CPU and memory metrics to scale workloads. However, modern cloud-native applications often require more sophisticated scaling based on application-specific metrics and event patterns. Kubernetes Event-Driven Autoscaling (KEDA) addresses this need by enabling autoscaling based on event sources and custom metrics:

  • Event-based scaling: Scale based on the number of events or messages in queues and streams
  • Application-specific metrics: Use metrics that directly reflect application workload
  • Zero-to-many scaling: Scale workloads down to zero replicas when idle and back up as soon as events arrive
  • Diverse event sources: Support for a wide range of messaging systems and data sources
  • Custom metrics: Flexibility to define custom scaling metrics

This guide explores how to implement event-driven autoscaling in Kubernetes environments, enabling more responsive and efficient scaling for diverse workloads.

Understanding KEDA Architecture

Core Components and Concepts

KEDA consists of several key components that work together to provide event-driven autoscaling:

  1. Controller: The central component that monitors ScaledObjects and manages scaling operations
  2. Metrics Server: Exposes external metrics to the Kubernetes Metrics API
  3. ScaledObject: Custom resource that defines scaling rules and triggers
  4. ScaledJob: Custom resource for event-driven jobs (similar to CronJobs but event-triggered)
  5. Scalers: Adapters for different event sources (Kafka, RabbitMQ, Prometheus, etc.)

The architecture follows a Kubernetes-native approach:

+-----------------+      +------------------+      +----------------+
| Kubernetes API  |<---->| KEDA Controller  |<---->| Event Sources  |
+-----------------+      +------------------+      +----------------+
        ^                        |
        |                        v
        |                +------------------+
        +--------------->| Metrics Server   |
                         +------------------+

Installation and Setup

Installing KEDA using Helm:

# Add the KEDA Helm repository
helm repo add kedacore https://kedacore.github.io/charts

# Update your Helm chart repository
helm repo update

# Install KEDA in your cluster
helm install keda kedacore/keda --namespace keda --create-namespace

Alternatively, using YAML manifests:

# Apply the KEDA CRDs and components
kubectl apply -f https://github.com/kedacore/keda/releases/download/v2.10.1/keda-2.10.1.yaml

Verifying the installation:

kubectl get pods -n keda

Configuring Scalers and Triggers

ScaledObject Resource

The ScaledObject is the primary custom resource for defining how a deployment should scale based on event sources:

apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: rabbitmq-scaler
  namespace: default
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: consumer-app
  pollingInterval: 15
  cooldownPeriod: 30
  minReplicaCount: 0
  maxReplicaCount: 30
  triggers:
  - type: rabbitmq
    metadata:
      protocol: amqp
      queueName: orders
      host: rabbitmq.default.svc.cluster.local
      queueLength: '50'

Key fields in the ScaledObject:

  1. scaleTargetRef: References the deployment to scale
  2. pollingInterval: How frequently to check the event source (in seconds)
  3. cooldownPeriod: Time to wait before scaling down (in seconds)
  4. minReplicaCount/maxReplicaCount: Scaling boundaries
  5. triggers: Array of event sources that trigger scaling

ScaledJob Resource

For batch-oriented workloads, ScaledJob creates Kubernetes Jobs based on events:

apiVersion: keda.sh/v1alpha1
kind: ScaledJob
metadata:
  name: kafka-batch-processor
  namespace: default
spec:
  jobTargetRef:
    template:
      spec:
        containers:
        - name: processor
          image: my-processor:latest
          resources:
            requests:
              memory: "64Mi"
              cpu: "100m"
            limits:
              memory: "128Mi"
              cpu: "200m"
        restartPolicy: Never
  pollingInterval: 30
  maxReplicaCount: 50
  successfulJobsHistoryLimit: 10
  failedJobsHistoryLimit: 10
  triggers:
  - type: kafka
    metadata:
      bootstrapServers: kafka.default.svc.cluster.local:9092
      consumerGroup: batch-processor
      topic: batch-tasks
      lagThreshold: '100'

Common Scaling Triggers and Use Cases

Message Queue-Based Scaling

Scaling based on message queues is one of the most common KEDA use cases:
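For example, a trigger on an AWS SQS queue scales consumers with queue depth (the queue URL and account ID below are placeholders):

triggers:
- type: aws-sqs-queue
  metadata:
    queueURL: https://sqs.us-east-1.amazonaws.com/123456789012/orders
    queueLength: "50"
    awsRegion: us-east-1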

Database and Storage-Based Scaling

Scaling based on database metrics and storage systems:
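As a sketch, a PostgreSQL trigger can scale on the result of a SQL query (the field names follow the postgresql scaler used later in this guide; the query itself is illustrative):

triggers:
- type: postgresql
  metadata:
    connectionFromEnv: POSTGRESQL_CONN_STR
    query: "SELECT COUNT(*) FROM jobs WHERE state='queued'"
    targetQueryValue: "20"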

Prometheus Metrics-Based Scaling

Using custom Prometheus metrics for scaling:

triggers:
- type: prometheus
  metadata:
    serverAddress: http://prometheus.monitoring.svc.cluster.local:9090
    metricName: http_requests_total
    threshold: '100'
    query: sum(rate(http_requests_total{app="my-app"}[2m]))

Cron-Based Scaling

Combining event-driven scaling with time-based patterns:

triggers:
- type: cron
  metadata:
    timezone: UTC
    start: 30 * * * *
    end: 45 * * * *
    desiredReplicas: "10"

Advanced Configurations

Multiple Triggers and Scaling Rules

KEDA supports multiple triggers on a single ScaledObject. Each trigger is evaluated independently, and the underlying HPA scales to the highest replica count any trigger demands - effectively a logical OR for scaling up:

apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: multi-trigger-scaler
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: processing-service
  triggers:
  - type: kafka
    metadata:
      bootstrapServers: kafka.default.svc.cluster.local:9092
      consumerGroup: processor
      topic: high-priority
      lagThreshold: '10'
  - type: kafka
    metadata:
      bootstrapServers: kafka.default.svc.cluster.local:9092
      consumerGroup: processor
      topic: standard
      lagThreshold: '100'
  - type: prometheus
    metadata:
      serverAddress: http://prometheus.monitoring.svc.cluster.local:9090
      threshold: '50'
      query: sum(rate(processing_queue_length{service="processor"}[2m]))

Authentication Patterns

Secure authentication for various scalers:

Environment Variables

  • Reference connection details and secrets through environment variables on the target container, using scalers that expose *FromEnv fields (the Redis scaler, for example)
triggers:
- type: redis
  metadata:
    addressFromEnv: REDIS_ADDRESS
    listName: orders
    listLength: '50'
    passwordFromEnv: REDIS_PASSWORD

Kubernetes Secrets

  • Reference Kubernetes secrets through a TriggerAuthentication resource, which the trigger links to via authenticationRef
apiVersion: keda.sh/v1alpha1
kind: TriggerAuthentication
metadata:
  name: postgresql-auth
spec:
  secretTargetRef:
  - parameter: password
    name: postgresql-password
    key: password
---
triggers:
- type: postgresql
  metadata:
    connectionFromEnv: POSTGRESQL_CONN_STR
    query: "SELECT COUNT(*) FROM tasks WHERE status='pending'"
    targetQueryValue: "10"
  authenticationRef:
    name: postgresql-auth

TLS Configuration

  • Secure connections with TLS; certificate material is supplied through TriggerAuthentication parameters rather than file paths in trigger metadata
apiVersion: keda.sh/v1alpha1
kind: TriggerAuthentication
metadata:
  name: rabbitmq-tls-auth
spec:
  secretTargetRef:
  - parameter: tls
    name: rabbitmq-certs
    key: tls
  - parameter: ca
    name: rabbitmq-certs
    key: ca
---
triggers:
- type: rabbitmq
  metadata:
    protocol: amqps
    host: rabbitmq.svc:5671
    queueName: orders
    queueLength: '50'
  authenticationRef:
    name: rabbitmq-tls-auth

Scaling from Zero

One of KEDA's powerful features is scaling from zero, which requires special consideration:

apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: zero-to-scale
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: event-processor
  minReplicaCount: 0
  maxReplicaCount: 10
  advanced:
    restoreToOriginalReplicaCount: true
    horizontalPodAutoscalerConfig:
      behavior:
        scaleDown:
          stabilizationWindowSeconds: 300
          policies:
          - type: Percent
            value: 100
            periodSeconds: 15
  triggers:
  - type: kafka
    metadata:
      bootstrapServers: kafka.svc:9092
      consumerGroup: processor
      topic: events
      lagThreshold: '1'

Key considerations for scaling from zero:

  1. Activation triggers: Set appropriate threshold for activation
  2. Cold start time: Account for application startup time
  3. Resource provisioning: Ensure resources are available for rapid scaling
  4. State management: Handle stateful applications carefully
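For the zero-to-one transition specifically, recent KEDA versions let you set an activation threshold separate from the normal scaling target (a sketch; activationThreshold support varies by scaler and KEDA version):

triggers:
- type: prometheus
  metadata:
    serverAddress: http://prometheus.monitoring.svc.cluster.local:9090
    threshold: '100'          # target value for 1-to-N scaling
    activationThreshold: '5'  # value required to scale from 0 to 1
    query: sum(rate(http_requests_total{app="my-app"}[2m]))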

Integration with Kubernetes Ecosystem

HPA Compatibility and Interaction

KEDA works alongside the Horizontal Pod Autoscaler, extending its capabilities:

apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: hybrid-scaling
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web-api
  minReplicaCount: 1
  maxReplicaCount: 20
  advanced:
    horizontalPodAutoscalerConfig:
      name: web-api-hpa
  triggers:
  - type: cpu
    metricType: Utilization
    metadata:
      value: "50"
  - type: prometheus
    metadata:
      serverAddress: http://prometheus.monitoring.svc.cluster.local:9090
      threshold: '100'
      query: sum(rate(http_requests_total{service="web-api"}[2m]))

Operators and Custom Resources

KEDA integrates with various Kubernetes operators and custom resources:

apiVersion: keda.sh/v1alpha1
kind: TriggerAuthentication
metadata:
  name: kafka-trigger-auth
spec:
  secretTargetRef:
  - parameter: sasl
    name: kafka-secrets
    key: sasl
  - parameter: username
    name: kafka-secrets
    key: username
  - parameter: password
    name: kafka-secrets
    key: password
---
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: kafka-scaledobject
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: kafka-consumer
  triggers:
  - type: kafka
    metadata:
      bootstrapServers: kafka.svc:9092
      consumerGroup: order-processor
      topic: orders
      lagThreshold: '50'
    authenticationRef:
      name: kafka-trigger-auth

Monitoring and Observability

Monitoring KEDA operations with Prometheus and Grafana:

apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: keda-metrics
  namespace: monitoring
spec:
  selector:
    matchLabels:
      app: keda-operator
  endpoints:
  - port: metrics

Example Prometheus queries for KEDA monitoring (exact metric names vary by KEDA version; check the documentation for your release):

# Scalers currently marked active
keda_scaler_active

# Metric values reported per scaler
sum(keda_scaler_metrics_value) by (scaler)

# Scaler errors
keda_scaler_errors_total

Performance Optimization and Best Practices

Tuning Scaling Parameters

Fine-tuning KEDA for optimal performance:

Polling Intervals

  • Balance between responsiveness and resource consumption
  • Default is 30 seconds, reduce for latency-sensitive applications
  • Consider scaler-specific limitations (API rate limits)

Cooldown Periods

  • Prevent scaling thrashing
  • Typically 300 seconds for scale down
  • Shorter for dynamic workloads, longer for stable patterns

Scaling Thresholds

  • Set appropriate thresholds based on workload characteristics
  • Consider baseline and peak patterns
  • Implement gradual scaling with multiple trigger points

Example of optimized scaling parameters:

apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: optimized-scaler
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: consumer-service
  pollingInterval: 15  # Check every 15 seconds
  cooldownPeriod: 300  # 5 minutes cooldown before scaling down
  advanced:
    horizontalPodAutoscalerConfig:
      behavior:
        scaleUp:
          stabilizationWindowSeconds: 0
          policies:
          - type: Percent
            value: 100
            periodSeconds: 15
        scaleDown:
          stabilizationWindowSeconds: 300
          policies:
          - type: Percent
            value: 20
            periodSeconds: 60
  triggers:
  - type: prometheus
    metadata:
      serverAddress: http://prometheus.monitoring.svc.cluster.local:9090
      threshold: '100'
      query: sum(rate(processing_queue_length{service="consumer"}[2m]))

Resource Management

Ensuring appropriate resources for KEDA components:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: keda-operator
  namespace: keda
spec:
  # ... other fields
  template:
    spec:
      containers:
      - name: keda-operator
        image: ghcr.io/kedacore/keda:2.10.1
        resources:
          requests:
            cpu: 100m
            memory: 128Mi
          limits:
            cpu: 1000m
            memory: 1Gi

Scaling Limits and Quotas

Setting appropriate scaling boundaries with namespace quotas:

apiVersion: v1
kind: ResourceQuota
metadata:
  name: pods-high
  namespace: scaling-apps
spec:
  hard:
    pods: "100"
    requests.cpu: "10"
    requests.memory: 20Gi
    limits.cpu: "20"
    limits.memory: 40Gi

Real-world Use Cases and Patterns

Microservices Event Processing

Event-driven scaling for microservices architecture:

apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: order-processor
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: order-processing-service
  minReplicaCount: 0
  maxReplicaCount: 20
  triggers:
  - type: kafka
    metadata:
      bootstrapServers: kafka.svc:9092
      consumerGroup: order-processor
      topic: orders
      lagThreshold: '10'

Batch Processing and ETL Pipelines

Using ScaledJobs for batch processing:

apiVersion: keda.sh/v1alpha1
kind: ScaledJob
metadata:
  name: data-processor
spec:
  jobTargetRef:
    template:
      spec:
        containers:
        - name: data-processor
          image: data-processor:latest
          env:
          - name: BATCH_SIZE
            value: "100"
        restartPolicy: Never
  pollingInterval: 30
  maxReplicaCount: 50
  successfulJobsHistoryLimit: 5
  failedJobsHistoryLimit: 10
  triggers:
  - type: aws-sqs-queue
    metadata:
      queueURL: https://sqs.us-east-1.amazonaws.com/123456789012/data-processing-queue
      queueLength: "100"
      awsRegion: us-east-1
      identityOwner: pod

Serverless-like Workloads

Creating serverless-like experience with KEDA:

apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: api-service
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: api-service
  cooldownPeriod: 300
  minReplicaCount: 0
  maxReplicaCount: 10
  advanced:
    restoreToOriginalReplicaCount: true
  triggers:
  - type: prometheus
    metadata:
      serverAddress: http://prometheus.monitoring.svc.cluster.local:9090
      threshold: '1'
      query: sum(rate(http_requests_total{service="api-gateway",path=~"/api/.*"}[2m]))

Troubleshooting and Debugging

Common Issues and Solutions

Diagnosing and resolving KEDA issues:

Scaling Not Triggered

  • Check ScaledObject status
  • Verify trigger metrics and thresholds
  • Confirm connectivity to event sources
  • Check authentication credentials
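When a scaler cannot reach its event source at all, KEDA can fall back to a fixed replica count rather than leaving the workload stuck (a sketch using the ScaledObject fallback field, which applies to metrics-based triggers):

spec:
  fallback:
    failureThreshold: 3   # consecutive failed checks before fallback applies
    replicas: 5           # replica count to hold while the source is unreachable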

Scaling Delays

  • Review polling intervals
  • Check for resource constraints
  • Examine scaling behavior configuration
  • Verify event source latency

Over or Under Scaling

  • Adjust thresholds based on workload
  • Implement more granular scaling policies
  • Consider adding stabilization windows
  • Evaluate multiple triggers for complex scenarios

Example debugging commands:

# Check ScaledObject status
kubectl get scaledobject order-processor -o yaml

# Examine KEDA logs
kubectl logs -n keda -l app=keda-operator

# Check HPA created by KEDA
kubectl get hpa

# Inspect metrics
kubectl get --raw "/apis/external.metrics.k8s.io/v1beta1/namespaces/default/kafka-lag"

Logging and Diagnostics

Enhanced logging configuration for troubleshooting:

apiVersion: v1
kind: ConfigMap
metadata:
  name: keda-logging-config
  namespace: keda
data:
  KEDA_LOG_LEVEL: debug
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: keda-operator
  namespace: keda
spec:
  template:
    spec:
      containers:
      - name: keda-operator
        envFrom:
        - configMapRef:
            name: keda-logging-config

Emerging Patterns

The KEDA ecosystem continues to evolve with new capabilities:

  1. HTTP-based scaling: Direct scaling based on HTTP traffic patterns
  2. ML-based predictive scaling: Using machine learning to predict scaling needs
  3. Cross-cluster scaling: Coordinating scaling across multiple Kubernetes clusters
  4. Composite metrics: Combining multiple metrics with weighted importance
  5. Scaling profiles: Time or condition-based scaling profiles for different scenarios
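For instance, HTTP-based scaling is available today through the separate KEDA HTTP add-on, which introduces its own custom resource (a hypothetical sketch; the HTTPScaledObject schema varies across add-on versions):

apiVersion: http.keda.sh/v1alpha1
kind: HTTPScaledObject
metadata:
  name: api-http-scaler
spec:
  hosts:
  - api.example.com
  scaleTargetRef:
    deployment: api-service
    service: api-service
    port: 8080
  replicas:
    min: 0
    max: 10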

Integration with Cloud-Native Ecosystem

KEDA's integration with emerging cloud-native technologies:

# Conceptual example of pairing KEDA with a Knative Service (support depends on the target exposing a scale subresource)
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: knative-function
spec:
  scaleTargetRef:
    apiVersion: serving.knative.dev/v1
    kind: Service
    name: my-function
  triggers:
  - type: prometheus
    metadata:
      serverAddress: http://prometheus.monitoring.svc.cluster.local:9090
      threshold: '1'
      query: sum(rate(function_invocations_total{service="my-function"}[1m]))

Conclusion

Kubernetes Event-Driven Autoscaling represents a significant advancement in the way applications scale in cloud-native environments. By providing a Kubernetes-native way to scale workloads based on actual application demand signals rather than just resource utilization, KEDA enables more responsive, efficient, and cost-effective scaling strategies.

As organizations continue to adopt event-driven architectures and message-based systems, KEDA's ability to scale based on queue depths, custom metrics, and diverse event sources becomes increasingly valuable. The project's growing ecosystem of scalers and integrations ensures that it can adapt to a wide range of application patterns and infrastructures.

By implementing event-driven autoscaling with KEDA, teams can build more resilient applications that efficiently handle variable workloads, scale to zero when idle, and rapidly respond to demand spikes – ultimately delivering better performance and resource utilization in Kubernetes environments.