Kubernetes provides several resource types to manage containerized applications. These workload resources define how your applications should run, scale, update, and recover from failures, forming the foundation of application deployment in Kubernetes.
Workload resources create and manage sets of pods — the smallest deployable Kubernetes objects, each representing a single instance of a running process in your cluster. These higher-level abstractions let you deploy and manage applications without handling individual pods directly.
- Smallest deployable unit in Kubernetes
- Contains one or more containers that share resources
- Containers share a network namespace (same IP and port space)
- Containers can share storage volumes
- Containers are co-located and co-scheduled
- Represents a single instance of an application
- Typically managed by higher-level controllers rather than directly
apiVersion: v1
kind: Pod
metadata:
  name: nginx-pod
  labels:
    app: nginx
    environment: production
  annotations:
    description: "Web server pod"
spec:
  containers:
  - name: nginx
    image: nginx:1.14.2
    ports:
    - containerPort: 80
      name: http
    resources:
      requests:
        memory: "64Mi"
        cpu: "250m"
      limits:
        memory: "128Mi"
        cpu: "500m"
    livenessProbe:
      httpGet:
        path: /
        port: 80
      initialDelaySeconds: 30
      periodSeconds: 10
    volumeMounts:
    - name: nginx-config
      mountPath: /etc/nginx/conf.d
  volumes:
  - name: nginx-config
    configMap:
      name: nginx-configuration
  restartPolicy: Always
  terminationGracePeriodSeconds: 30
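Assuming the manifest above is saved as nginx-pod.yaml (the file name is illustrative), commands like these create the pod and inspect it:
# Create the pod and check its status
kubectl apply -f nginx-pod.yaml
kubectl get pod nginx-pod -o wide     # Node, pod IP, and readiness
kubectl describe pod nginx-pod        # Events, probe results, mounted volumes
kubectl logs nginx-pod -c nginx       # Container logs (-c optional for single-container pods)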
- Manages ReplicaSets for stateless applications
- Handles rolling updates with controlled progression
- Provides rollback capability to previous versions
- Ensures declarative updates to pods and ReplicaSets
- Supports manual or automatic scaling
- Manages the application lifecycle and self-healing
- The most common workload resource for stateless applications
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
  labels:
    app: nginx
  annotations:
    kubernetes.io/change-cause: "Update to nginx 1.14.2"
spec:
  replicas: 3
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.14.2
        ports:
        - containerPort: 80
        resources:
          requests:
            memory: "64Mi"
            cpu: "100m"
          limits:
            memory: "128Mi"
            cpu: "200m"
        readinessProbe:
          httpGet:
            path: /
            port: 80
          initialDelaySeconds: 5
          periodSeconds: 5
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 100
            podAffinityTerm:
              labelSelector:
                matchExpressions:
                - key: app
                  operator: In
                  values:
                  - nginx
              topologyKey: "kubernetes.io/hostname"
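Once applied, the Deployment creates a ReplicaSet, which in turn creates the pods. Commands along these lines show that hierarchy (the manifest file name is assumed):
kubectl apply -f nginx-deployment.yaml
kubectl get deployment nginx-deployment   # Desired/current/available replicas
kubectl get replicaset -l app=nginx       # ReplicaSet created by the Deployment
kubectl get pods -l app=nginx -o wide     # Pods created by the ReplicaSet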
- Designed for stateful applications (databases, distributed systems)
- Provides stable, unique network identities (predictable pod names)
- Offers stable, persistent storage for each pod
- Guarantees ordered, graceful deployment and scaling (sequential operations)
- Supports automated rolling updates with controlled termination
- Maintains a sticky identity for each pod even after rescheduling
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: postgres
spec:
  serviceName: "postgres"
  replicas: 3
  selector:
    matchLabels:
      app: postgres
  template:
    metadata:
      labels:
        app: postgres
    spec:
      containers:
      - name: postgres
        image: postgres:13
        ports:
        - containerPort: 5432
          name: postgres
        env:
        - name: POSTGRES_PASSWORD
          valueFrom:
            secretKeyRef:
              name: postgres-secrets
              key: password
        volumeMounts:
        - name: postgres-data
          mountPath: /var/lib/postgresql/data
  volumeClaimTemplates:
  - metadata:
      name: postgres-data
    spec:
      accessModes: ["ReadWriteOnce"]
      storageClassName: "standard"
      resources:
        requests:
          storage: 10Gi
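The serviceName field refers to a headless Service that must exist for the StatefulSet's stable network identities (postgres-0.postgres, postgres-1.postgres, and so on). A minimal sketch of such a Service, matching the names used above:
apiVersion: v1
kind: Service
metadata:
  name: postgres          # Must match spec.serviceName in the StatefulSet
spec:
  clusterIP: None         # Headless: gives each pod its own stable DNS record
  selector:
    app: postgres
  ports:
  - port: 5432
    name: postgres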
- Runs a pod on all nodes (or on selected nodes via nodeSelector)
- Used for cluster-wide services and monitoring
- Automatically schedules pods as nodes are added to the cluster
- Ensures every matching node runs exactly one copy of the pod
- Well suited to infrastructure-related tasks
- Commonly used for logging, monitoring, and networking agents
- Respects taints and tolerations for node selection
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: fluentd-elasticsearch
  namespace: kube-system
  labels:
    k8s-app: fluentd-logging
spec:
  selector:
    matchLabels:
      name: fluentd-elasticsearch
  template:
    metadata:
      labels:
        name: fluentd-elasticsearch
    spec:
      tolerations:
      # This toleration allows the DaemonSet pods to be scheduled on control plane nodes
      - key: node-role.kubernetes.io/control-plane
        operator: Exists
        effect: NoSchedule
      containers:
      - name: fluentd-elasticsearch
        image: quay.io/fluentd_elasticsearch/fluentd:v2.5.2
        resources:
          limits:
            memory: 200Mi
          requests:
            cpu: 100m
            memory: 200Mi
        volumeMounts:
        - name: varlog
          mountPath: /var/log
      volumes:
      - name: varlog
        hostPath:
          path: /var/log
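To confirm that the DaemonSet has placed one pod on each eligible node, commands like these can be used:
kubectl get daemonset fluentd-elasticsearch -n kube-system                # DESIRED should match the node count
kubectl get pods -n kube-system -l name=fluentd-elasticsearch -o wide    # One pod per node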
Jobs: run pods that execute a task to completion
- Batch processing and one-time tasks
- Ensures that a specified number of pods complete successfully
- Can run pods in parallel
- Handles pod failures and restarts
CronJobs: schedule Jobs to run at specific times
- Similar to cron on Linux systems
- Creates Jobs on a time-based schedule
- Well suited to automated tasks, backups, and reports
- Manages Job history and cleanup
# Job example
apiVersion: batch/v1
kind: Job
metadata:
  name: data-processor
spec:
  completions: 5
  parallelism: 2
  backoffLimit: 3
  template:
    spec:
      containers:
      - name: processor
        image: data-processor:v1
        command: ["./process-data.sh"]
      restartPolicy: OnFailure
# CronJob example
apiVersion: batch/v1
kind: CronJob
metadata:
  name: backup-database
spec:
  schedule: "0 2 * * *"  # Run at 2:00 AM every day
  concurrencyPolicy: Forbid
  successfulJobsHistoryLimit: 3
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: backup
            image: database-backup:v1
            command: ["./backup.sh"]
          restartPolicy: OnFailure
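Typical commands for working with these resources; creating a Job from a CronJob triggers a scheduled run immediately, which is handy for testing:
kubectl get jobs                                                    # Completion status of Jobs
kubectl get cronjobs                                                # Schedule and last run time
kubectl create job --from=cronjob/backup-database backup-manual    # Run the CronJob now
kubectl logs job/data-processor                                     # Logs from a Job's pods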
There are several approaches to scaling your workloads in Kubernetes:
# Manual scaling of deployment
kubectl scale deployment nginx-deployment --replicas=5
# Declarative scaling by editing the deployment
kubectl edit deployment nginx-deployment # Then change replicas value
# Imperative scaling with patching
kubectl patch deployment nginx-deployment -p '{"spec":{"replicas":5}}'
# Horizontal Pod Autoscaling (HPA)
kubectl autoscale deployment nginx-deployment --min=2 --max=5 --cpu-percent=80
# Creating an HPA with YAML
cat << EOF | kubectl apply -f -
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: nginx-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: nginx-deployment
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 80
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 80
EOF
# Vertical Pod Autoscaler (requires VPA operator)
cat << EOF | kubectl apply -f -
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: nginx-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: nginx-deployment
  updatePolicy:
    updateMode: "Auto"
EOF
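After creating the autoscalers, their current state and recommendations can be inspected with commands like:
kubectl get hpa nginx-hpa        # Current vs. target utilization and replica count
kubectl describe hpa nginx-hpa   # Scaling events and conditions
kubectl describe vpa nginx-vpa   # Resource recommendations (requires the VPA CRDs)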
Kubernetes provides various methods for updating applications:
# Rolling update by changing container image
kubectl set image deployment/nginx-deployment nginx=nginx:1.16.1
# Applying updated YAML file
kubectl apply -f updated-deployment.yaml
# Edit deployment directly
kubectl edit deployment/nginx-deployment
# Rollout status monitoring
kubectl rollout status deployment/nginx-deployment
# Pausing a rollout (for canary deployments)
kubectl rollout pause deployment/nginx-deployment
# Resuming a rollout
kubectl rollout resume deployment/nginx-deployment
# Rollback to previous version
kubectl rollout undo deployment/nginx-deployment
# Rollback to specific revision
kubectl rollout undo deployment/nginx-deployment --to-revision=2
# View rollout history
kubectl rollout history deployment/nginx-deployment
# View specific revision details
kubectl rollout history deployment/nginx-deployment --revision=3
# Restart all pods in a deployment (without changing config)
kubectl rollout restart deployment/nginx-deployment
Kubernetes supports several deployment strategies:
Rolling Update (default)
spec:
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 25%        # Max pods added above the desired count
      maxUnavailable: 25%  # Max pods unavailable during the update
Recreate (downtime strategy)
spec:
  strategy:
    type: Recreate  # Terminates all existing pods before creating new ones
Blue/Green (using labels and services)
# Deploy new version with different label
kubectl apply -f deployment-green.yaml
# Test the new deployment
# Then switch traffic by updating service selector
kubectl patch service my-app -p '{"spec":{"selector":{"version":"green"}}}'
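A sketch of the Service that the patch above targets, assuming the blue and green Deployments share the app: my-app label and differ only in their version label (names and ports are illustrative):
apiVersion: v1
kind: Service
metadata:
  name: my-app
spec:
  selector:
    app: my-app
    version: blue      # Switch to "green" to cut traffic over to the new version
  ports:
  - port: 80
    targetPort: 8080   # Assumed container port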
Canary (gradual traffic shifting)
# Main deployment (90% of traffic)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
spec:
  replicas: 9
  selector:
    matchLabels:
      app: my-app
      version: v1
  template:
    metadata:
      labels:
        app: my-app
        version: v1
    spec:
      containers:
      - name: my-app
        image: my-app:v1
---
# Canary deployment (10% of traffic)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app-canary
spec:
  replicas: 1
  selector:
    matchLabels:
      app: my-app
      version: v2
  template:
    metadata:
      labels:
        app: my-app
        version: v2
    spec:
      containers:
      - name: my-app
        image: my-app:v2
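For this replica-based split to work, both Deployments must sit behind a single Service that selects only on app (not version), so traffic is distributed roughly 9:1 across the ten pods. A minimal sketch, with assumed port values:
apiVersion: v1
kind: Service
metadata:
  name: my-app
spec:
  selector:
    app: my-app        # Matches both v1 and v2 pods; no version label here
  ports:
  - port: 80
    targetPort: 8080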
Kubernetes provides three types of health checks (probes) to ensure application reliability:
spec:
  containers:
  - name: app
    # Liveness probe: determines whether the container is still running.
    # If it fails, the container is restarted.
    livenessProbe:
      httpGet:
        path: /health
        port: 8080
        httpHeaders:
        - name: Custom-Header
          value: CheckValue
      initialDelaySeconds: 15  # Wait before the first probe
      periodSeconds: 10        # How often to probe
      timeoutSeconds: 5        # Timeout for each probe
      successThreshold: 1      # Consecutive successes needed
      failureThreshold: 3      # Consecutive failures before restart
    # Readiness probe: determines whether the container can serve traffic.
    # If it fails, the pod is removed from Service endpoints.
    readinessProbe:
      httpGet:
        path: /ready
        port: 8080
      initialDelaySeconds: 5
      periodSeconds: 5
    # Startup probe: determines whether the application has started.
    # Disables liveness and readiness checks until it succeeds.
    # Useful for slow-starting containers.
    startupProbe:
      httpGet:
        path: /startup
        port: 8080
      failureThreshold: 30  # Allow up to 5 minutes (30 * 10s) for startup
      periodSeconds: 10
Different probe types for various use cases:
# HTTP probe for web applications
livenessProbe:
  httpGet:
    path: /healthz
    port: http
    httpHeaders:
    - name: Authorization
      value: Bearer TOKEN

# TCP socket probe for databases or other non-HTTP services
livenessProbe:
  tcpSocket:
    port: 3306
  initialDelaySeconds: 30
  periodSeconds: 10

# Command probe for custom scripts
livenessProbe:
  exec:
    command:
    - sh
    - -c
    - "redis-cli ping | grep PONG"
  initialDelaySeconds: 15
  periodSeconds: 15

# gRPC probe for gRPC services (Kubernetes 1.24+)
livenessProbe:
  grpc:
    port: 9000
    service: health.v1.HealthService
  periodSeconds: 10
Resource requests specify the minimum amount of resources a container needs to run. Kubernetes uses these values for scheduling decisions.
resources:
  requests:
    memory: "64Mi"             # 64 mebibytes of memory
    cpu: "250m"                # 250 millicores (1/4 of a CPU core)
    ephemeral-storage: "1Gi"   # Local ephemeral storage
    hugepages-2Mi: "128Mi"     # Optional hugepages resource (specialized workloads)
Key points about requests:
- The scheduler guarantees at least these resources are available on the chosen node
- Used to determine which node can accommodate the pod
- Defines the minimum resources a container needs to operate
- Does not limit actual usage; the container can use more if resources are available
- Essential for proper scheduling and capacity planning
Resource limits specify the maximum amount of resources a container can use. Exceeding these limits triggers enforcement actions.
resources:
  limits:
    memory: "128Mi"            # The container is OOM-killed if it exceeds this
    cpu: "500m"                # CPU usage is throttled to this value
    ephemeral-storage: "2Gi"   # Limits local ephemeral storage usage
Key points about limits:
- Memory limits are enforced by the OOM (Out of Memory) killer
- CPU limits are enforced by throttling
- Limits protect node stability by preventing resource starvation
- Setting appropriate limits prevents "noisy neighbor" problems
- Overly restrictive limits can cause application performance issues
Resource units:
CPU:
- 1 or 1000m = 1 full CPU core
- 500m = 500 millicores = 0.5 CPU
- 0.1 = 100 millicores
Memory:
- 1Mi = 1 mebibyte (1024 KiB)
- 1M = 1 megabyte (1000 KB)
- 1Gi = 1 gibibyte (1024 MiB)
- 1G = 1 gigabyte (1000 MB)
apiVersion: v1
kind: Pod
metadata:
  name: resource-demo
spec:
  containers:
  - name: resource-demo-ctr
    image: nginx
    resources:
      requests:
        memory: "64Mi"
        cpu: "250m"
      limits:
        memory: "128Mi"
        cpu: "500m"
Kubernetes assigns QoS classes based on resource configuration:
Guaranteed
- Every container has both requests and limits set for CPU and memory
- Requests equal limits for all resources
- Highest priority during resource contention
- Last to be killed under node pressure
resources:
  requests:
    memory: "128Mi"
    cpu: "500m"
  limits:
    memory: "128Mi"
    cpu: "500m"
Burstable
- At least one container has a CPU or memory request or limit, but the pod does not meet the Guaranteed criteria
- Medium priority during resource contention
- Killed after BestEffort pods when the node is under pressure
resources:
  requests:
    memory: "64Mi"
    cpu: "250m"
  limits:
    memory: "128Mi"
    cpu: "500m"
BestEffort
- No resource requests or limits specified in any container
- Lowest priority during resource contention
- First to be killed under node pressure
# No resources section at all
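The QoS class Kubernetes assigned to a pod is recorded in its status and can be read directly, for example:
kubectl get pod resource-demo -o jsonpath='{.status.qosClass}'   # Guaranteed, Burstable, or BestEffort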
ResourceQuotas limit aggregate resource consumption within a namespace:
apiVersion: v1
kind: ResourceQuota
metadata:
  name: compute-quota
  namespace: development
spec:
  hard:
    pods: "10"
    requests.cpu: "4"
    requests.memory: 4Gi
    limits.cpu: "8"
    limits.memory: 8Gi
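Current consumption against the quota can be checked with:
kubectl describe resourcequota compute-quota -n development   # Shows used vs. hard limits
kubectl get resourcequota -n development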
LimitRanges define default requests and limits for containers that do not specify their own:
apiVersion: v1
kind: LimitRange
metadata:
  name: default-limits
  namespace: development
spec:
  limits:
  - default:
      memory: "256Mi"
      cpu: "500m"
    defaultRequest:
      memory: "128Mi"
      cpu: "100m"
    type: Container
Always set resource requests and limits
- Ensures proper scheduling and resource allocation
- Prevents resource starvation and "noisy neighbor" problems
- Match requests to actual application needs based on profiling
- Set reasonable limits to protect node stability
resources:
  requests:
    memory: "256Mi"
    cpu: "100m"
  limits:
    memory: "512Mi"
    cpu: "500m"
Implement comprehensive health checks
- Use the appropriate probe types (HTTP, TCP, exec) for your application
- Configure sensible timeouts and thresholds
- Expose meaningful health endpoints in your application
- Use all three probe types (startup, liveness, readiness) when appropriate
- Ensure health checks verify critical dependencies
Design for proper pod lifecycle handling
- Implement graceful shutdown to handle SIGTERM signals
- Set an appropriate terminationGracePeriodSeconds (default: 30s)
- Use preStop hooks for custom shutdown procedures
- Configure pod disruption budgets (PDBs) for controlled disruptions
spec:
  terminationGracePeriodSeconds: 60
  containers:
  - name: app
    lifecycle:
      preStop:
        exec:
          command: ["/bin/sh", "-c", "sleep 10; /shutdown.sh"]
Implement proper labels and selectors
- Use a consistent labeling strategy across all resources
- Include labels for app, environment, version, and component
- Use annotations for non-identifying metadata
- Leverage label selectors for service targeting
metadata:
  labels:
    app: myapp
    component: frontend
    environment: production
    version: v1.2.3
    tier: frontend
  annotations:
    description: "Frontend service for customer portal"
    team: "frontend-team"
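Consistent labels make it easy to query and operate on related resources; the version values below are illustrative:
kubectl get pods -l app=myapp,environment=production         # Equality-based selector
kubectl get pods -l 'app=myapp,version in (v1.2.3,v1.2.4)'   # Set-based selector
kubectl delete pods -l app=myapp,environment=staging         # Bulk operations by label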
Plan for high availability
- Use Deployments with multiple replicas
- Configure pod anti-affinity to spread pods across nodes
- Implement pod disruption budgets for controlled maintenance
- Set appropriate update strategies
# Pod Disruption Budget example
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: frontend-pdb
spec:
  minAvailable: 2  # or maxUnavailable: 1
  selector:
    matchLabels:
      app: frontend
Use namespaces for resource isolation
- Organize resources by project, team, or environment
- Apply resource quotas at the namespace level
- Implement network policies for namespace isolation
- Use RBAC roles bound to namespaces
# Create and use namespaces
kubectl create namespace team-frontend
kubectl config set-context --current --namespace=team-frontend
Implement a proper security context
- Run containers as non-root users
- Apply the principle of least privilege
- Use a read-only root filesystem where possible
- Set an appropriate security context at both the pod and container level
spec:
  securityContext:                    # Pod-level settings
    runAsUser: 1000
    runAsGroup: 3000
    fsGroup: 2000
  containers:
  - name: app
    securityContext:                  # Container-level settings
      readOnlyRootFilesystem: true
      allowPrivilegeEscalation: false
Configure appropriate update strategies
- Use rolling updates with suitable maxSurge/maxUnavailable values
- Consider canary deployments for critical applications
- Use blue/green deployments for zero-downtime requirements
- Test rollback procedures before relying on them in production
Implement proper monitoring and logging
- Export metrics for CPU, memory, and application-specific data
- Configure centralized logging
- Use structured logging formats (JSON)
- Implement distributed tracing for complex applications (a few starting-point commands follow)
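As a starting point, built-in commands like these surface resource usage and logs; full observability typically layers Prometheus, a log aggregator, and tracing on top. Note that kubectl top requires the metrics-server to be installed:
kubectl top pods --containers                                 # CPU/memory usage per container
kubectl top nodes                                             # Node-level resource usage
kubectl logs deployment/nginx-deployment --since=1h          # Recent logs from a workload
kubectl get events --sort-by=.metadata.creationTimestamp     # Cluster events, oldest first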