Kubernetes provides several resource types to manage containerized applications. These workload resources define how your applications should run, scale, update, and recover from failures, forming the foundation of application deployment in Kubernetes.
Workload resources create and manage sets of pods — the smallest deployable Kubernetes objects, each representing a single instance of a running process in your cluster. These higher-level abstractions let you deploy and manage applications without handling individual pods directly.
- Smallest deployable unit in Kubernetes
- Contains one or more containers that share resources
- Containers share a network namespace (same IP and port space)
- Containers can share storage volumes
- Containers are co-located and co-scheduled
- Represents a single instance of an application
- Typically managed by higher-level controllers rather than directly
apiVersion: v1
kind: Pod
metadata:
  name: nginx-pod
  labels:
    app: nginx
    environment: production
  annotations:
    description: "Web server pod"
spec:
  containers:
  - name: nginx
    image: nginx:1.14.2
    ports:
    - containerPort: 80
      name: http
    resources:
      requests:
        memory: "64Mi"
        cpu: "250m"
      limits:
        memory: "128Mi"
        cpu: "500m"
    livenessProbe:
      httpGet:
        path: /
        port: 80
      initialDelaySeconds: 30
      periodSeconds: 10
    volumeMounts:
    - name: nginx-config
      mountPath: /etc/nginx/conf.d
  volumes:
  - name: nginx-config
    configMap:
      name: nginx-configuration
  restartPolicy: Always
  terminationGracePeriodSeconds: 30
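Assuming the manifest above is saved as nginx-pod.yaml (the file name is illustrative), commands like these create the pod and inspect it:
# Create the pod and check its status
kubectl apply -f nginx-pod.yaml
kubectl get pod nginx-pod -o wide     # Node, pod IP, and readiness
kubectl describe pod nginx-pod        # Events, probe results, mounted volumes
kubectl logs nginx-pod -c nginx       # Container logs (-c optional for single-container pods)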
- Manages ReplicaSets for stateless applications
- Handles rolling updates with controlled progression
- Provides rollback capability to previous versions
- Ensures declarative updates to pods and ReplicaSets
- Supports manual or automatic scaling
- Manages the application lifecycle and self-healing
- The most common workload resource for stateless applications
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
  labels:
    app: nginx
  annotations:
    kubernetes.io/change-cause: "Update to nginx 1.14.2"
spec:
  replicas: 3
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.14.2
        ports:
        - containerPort: 80
        resources:
          requests:
            memory: "64Mi"
            cpu: "100m"
          limits:
            memory: "128Mi"
            cpu: "200m"
        readinessProbe:
          httpGet:
            path: /
            port: 80
          initialDelaySeconds: 5
          periodSeconds: 5
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 100
            podAffinityTerm:
              labelSelector:
                matchExpressions:
                - key: app
                  operator: In
                  values:
                  - nginx
              topologyKey: "kubernetes.io/hostname"
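Once applied, the Deployment creates a ReplicaSet, which in turn creates the pods. Commands along these lines show that hierarchy (the manifest file name is assumed):
kubectl apply -f nginx-deployment.yaml
kubectl get deployment nginx-deployment   # Desired/current/available replicas
kubectl get replicaset -l app=nginx       # ReplicaSet created by the Deployment
kubectl get pods -l app=nginx -o wide     # Pods created by the ReplicaSet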
- Designed for stateful applications (databases, distributed systems)
- Provides stable, unique network identities (predictable pod names)
- Offers stable, persistent storage for each pod
- Guarantees ordered, graceful deployment and scaling (sequential operations)
- Supports automated rolling updates with controlled termination
- Maintains a sticky identity for each pod even after rescheduling
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: postgres
spec:
  serviceName: "postgres"
  replicas: 3
  selector:
    matchLabels:
      app: postgres
  template:
    metadata:
      labels:
        app: postgres
    spec:
      containers:
      - name: postgres
        image: postgres:13
        ports:
        - containerPort: 5432
          name: postgres
        env:
        - name: POSTGRES_PASSWORD
          valueFrom:
            secretKeyRef:
              name: postgres-secrets
              key: password
        volumeMounts:
        - name: postgres-data
          mountPath: /var/lib/postgresql/data
  volumeClaimTemplates:
  - metadata:
      name: postgres-data
    spec:
      accessModes: ["ReadWriteOnce"]
      storageClassName: "standard"
      resources:
        requests:
          storage: 10Gi
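The serviceName field refers to a headless Service that must exist for the StatefulSet's stable network identities (postgres-0.postgres, postgres-1.postgres, and so on). A minimal sketch of such a Service, matching the names used above:
apiVersion: v1
kind: Service
metadata:
  name: postgres          # Must match spec.serviceName in the StatefulSet
spec:
  clusterIP: None         # Headless: gives each pod its own stable DNS record
  selector:
    app: postgres
  ports:
  - port: 5432
    name: postgres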
- Runs a pod on all nodes (or on selected nodes via nodeSelector)
- Used for cluster-wide services and monitoring
- Automatically schedules pods as nodes are added to the cluster
- Ensures every matching node runs exactly one copy of the pod
- Well suited to infrastructure-related tasks
- Commonly used for logging, monitoring, and networking agents
- Respects taints and tolerations for node selection
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: fluentd-elasticsearch
  namespace: kube-system
  labels:
    k8s-app: fluentd-logging
spec:
  selector:
    matchLabels:
      name: fluentd-elasticsearch
  template:
    metadata:
      labels:
        name: fluentd-elasticsearch
    spec:
      tolerations:
      # This toleration allows the DaemonSet pods to be scheduled on control plane nodes
      - key: node-role.kubernetes.io/control-plane
        operator: Exists
        effect: NoSchedule
      containers:
      - name: fluentd-elasticsearch
        image: quay.io/fluentd_elasticsearch/fluentd:v2.5.2
        resources:
          limits:
            memory: 200Mi
          requests:
            cpu: 100m
            memory: 200Mi
        volumeMounts:
        - name: varlog
          mountPath: /var/log
      volumes:
      - name: varlog
        hostPath:
          path: /var/log
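To confirm that the DaemonSet has placed one pod on each eligible node, commands like these can be used:
kubectl get daemonset fluentd-elasticsearch -n kube-system                # DESIRED should match the node count
kubectl get pods -n kube-system -l name=fluentd-elasticsearch -o wide    # One pod per node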
Jobs: run pods that execute a task to completion
- Batch processing and one-time tasks
- Ensures that a specified number of pods complete successfully
- Can run pods in parallel
- Handles pod failures and restarts
CronJobs: schedule Jobs to run at specific times
- Similar to cron on Linux systems
- Creates Jobs on a time-based schedule
- Well suited to automated tasks, backups, and reports
- Manages Job history and cleanup
# Job example
apiVersion: batch/v1
kind: Job
metadata:
  name: data-processor
spec:
  completions: 5
  parallelism: 2
  backoffLimit: 3
  template:
    spec:
      containers:
      - name: processor
        image: data-processor:v1
        command: ["./process-data.sh"]
      restartPolicy: OnFailure
# CronJob example
apiVersion: batch/v1
kind: CronJob
metadata:
  name: backup-database
spec:
  schedule: "0 2 * * *"  # Run at 2:00 AM every day
  concurrencyPolicy: Forbid
  successfulJobsHistoryLimit: 3
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: backup
            image: database-backup:v1
            command: ["./backup.sh"]
          restartPolicy: OnFailure
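Typical commands for working with these resources; creating a Job from a CronJob triggers a scheduled run immediately, which is handy for testing:
kubectl get jobs                                                    # Completion status of Jobs
kubectl get cronjobs                                                # Schedule and last run time
kubectl create job --from=cronjob/backup-database backup-manual    # Run the CronJob now
kubectl logs job/data-processor                                     # Logs from a Job's pods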
There are several approaches to scaling your workloads in Kubernetes:
# Manual scaling of deployment
kubectl scale deployment nginx-deployment --replicas=5
# Declarative scaling by editing the deployment
kubectl edit deployment nginx-deployment # Then change replicas value
# Imperative scaling with patching
kubectl patch deployment nginx-deployment -p '{"spec":{"replicas":5}}'
# Horizontal Pod Autoscaling (HPA)
kubectl autoscale deployment nginx-deployment --min=2 --max=5 --cpu-percent=80
# Creating an HPA with YAML
cat << EOF | kubectl apply -f -
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: nginx-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: nginx-deployment
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 80
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 80
EOF
# Vertical Pod Autoscaler (requires VPA operator)
cat << EOF | kubectl apply -f -
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: nginx-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: nginx-deployment
  updatePolicy:
    updateMode: "Auto"
EOF
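After creating the autoscalers, their current state and recommendations can be inspected with commands like:
kubectl get hpa nginx-hpa        # Current vs. target utilization and replica count
kubectl describe hpa nginx-hpa   # Scaling events and conditions
kubectl describe vpa nginx-vpa   # Resource recommendations (requires the VPA CRDs)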
Kubernetes provides various methods for updating applications:
# Rolling update by changing container image
kubectl set image deployment/nginx-deployment nginx=nginx:1.16.1
# Applying updated YAML file
kubectl apply -f updated-deployment.yaml
# Edit deployment directly
kubectl edit deployment/nginx-deployment
# Rollout status monitoring
kubectl rollout status deployment/nginx-deployment
# Pausing a rollout (for canary deployments)
kubectl rollout pause deployment/nginx-deployment
# Resuming a rollout
kubectl rollout resume deployment/nginx-deployment
# Rollback to previous version
kubectl rollout undo deployment/nginx-deployment
# Rollback to specific revision
kubectl rollout undo deployment/nginx-deployment --to-revision=2
# View rollout history
kubectl rollout history deployment/nginx-deployment
# View specific revision details
kubectl rollout history deployment/nginx-deployment --revision=3
# Restart all pods in a deployment (without changing config)
kubectl rollout restart deployment/nginx-deployment
Kubernetes supports several deployment strategies:
Rolling Update (default)
spec:
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 25%        # Max pods added above the desired count
      maxUnavailable: 25%  # Max pods unavailable during the update
Recreate (downtime strategy)
spec:
  strategy:
    type: Recreate  # Terminates all existing pods before creating new ones
Blue/Green (using labels and services)
# Deploy new version with different label
kubectl apply -f deployment-green.yaml
# Test the new deployment
# Then switch traffic by updating service selector
kubectl patch service my-app -p '{"spec":{"selector":{"version":"green"}}}'
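A sketch of the Service that the patch above targets, assuming the blue and green Deployments share the app: my-app label and differ only in their version label (names and ports are illustrative):
apiVersion: v1
kind: Service
metadata:
  name: my-app
spec:
  selector:
    app: my-app
    version: blue      # Switch to "green" to cut traffic over to the new version
  ports:
  - port: 80
    targetPort: 8080   # Assumed container port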
Canary (gradual traffic shifting)
# Main deployment (90% of traffic)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
spec:
  replicas: 9
  selector:
    matchLabels:
      app: my-app
      version: v1
  template:
    metadata:
      labels:
        app: my-app
        version: v1
    spec:
      containers:
      - name: my-app
        image: my-app:v1
---
# Canary deployment (10% of traffic)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app-canary
spec:
  replicas: 1
  selector:
    matchLabels:
      app: my-app
      version: v2
  template:
    metadata:
      labels:
        app: my-app
        version: v2
    spec:
      containers:
      - name: my-app
        image: my-app:v2
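For this replica-based split to work, both Deployments must sit behind a single Service that selects only on app (not version), so traffic is distributed roughly 9:1 across the ten pods. A minimal sketch, with assumed port values:
apiVersion: v1
kind: Service
metadata:
  name: my-app
spec:
  selector:
    app: my-app        # Matches both v1 and v2 pods; no version label here
  ports:
  - port: 80
    targetPort: 8080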
Kubernetes provides three types of health checks (probes) to ensure application reliability:
spec:
  containers:
  - name: app
    # Liveness probe: determines whether the container is still running.
    # If it fails, the container is restarted.
    livenessProbe:
      httpGet:
        path: /health
        port: 8080
        httpHeaders:
        - name: Custom-Header
          value: CheckValue
      initialDelaySeconds: 15  # Wait before the first probe
      periodSeconds: 10        # How often to probe
      timeoutSeconds: 5        # Timeout for each probe
      successThreshold: 1      # Consecutive successes needed
      failureThreshold: 3      # Consecutive failures before restart
    # Readiness probe: determines whether the container can serve traffic.
    # If it fails, the pod is removed from Service endpoints.
    readinessProbe:
      httpGet:
        path: /ready
        port: 8080
      initialDelaySeconds: 5
      periodSeconds: 5
    # Startup probe: determines whether the application has started.
    # Disables liveness and readiness checks until it succeeds.
    # Useful for slow-starting containers.
    startupProbe:
      httpGet:
        path: /startup
        port: 8080
      failureThreshold: 30  # Allow up to 5 minutes (30 * 10s) for startup
      periodSeconds: 10
Different probe types for various use cases:
# HTTP probe for web applications
livenessProbe:
  httpGet:
    path: /healthz
    port: http
    httpHeaders:
    - name: Authorization
      value: Bearer TOKEN

# TCP socket probe for databases or other non-HTTP services
livenessProbe:
  tcpSocket:
    port: 3306
  initialDelaySeconds: 30
  periodSeconds: 10

# Command probe for custom scripts
livenessProbe:
  exec:
    command:
    - sh
    - -c
    - "redis-cli ping | grep PONG"
  initialDelaySeconds: 15
  periodSeconds: 15

# gRPC probe for gRPC services (Kubernetes 1.24+)
livenessProbe:
  grpc:
    port: 9000
    service: health.v1.HealthService
  periodSeconds: 10
Resource requests specify the minimum amount of resources a container needs to run. Kubernetes uses these values for scheduling decisions.
resources:
  requests:
    memory: "64Mi"             # 64 mebibytes of memory
    cpu: "250m"                # 250 millicores (1/4 of a CPU core)
    ephemeral-storage: "1Gi"   # Local ephemeral storage
    hugepages-2Mi: "128Mi"     # Optional hugepages resource (specialized workloads)
Key points about requests:
- The scheduler guarantees at least these resources are available on the chosen node
- Used to determine which node can accommodate the pod
- Defines the minimum resources a container needs to operate
- Does not limit actual usage; the container can use more if resources are available
- Essential for proper scheduling and capacity planning
Resource limits specify the maximum amount of resources a container can use. Exceeding these limits triggers enforcement actions.
resources:
  limits:
    memory: "128Mi"            # The container is OOM-killed if it exceeds this
    cpu: "500m"                # CPU usage is throttled to this value
    ephemeral-storage: "2Gi"   # Limits local ephemeral storage usage
Key points about limits:
- Memory limits are enforced by the OOM (Out of Memory) killer
- CPU limits are enforced by throttling
- Limits protect node stability by preventing resource starvation
- Setting appropriate limits prevents "noisy neighbor" problems
- Overly restrictive limits can cause application performance issues
Resource units:
CPU:
- 1 or 1000m = 1 full CPU core
- 500m = 500 millicores = 0.5 CPU
- 0.1 = 100 millicores
Memory:
- 1Mi = 1 mebibyte (1024 KiB)
- 1M = 1 megabyte (1000 KB)
- 1Gi = 1 gibibyte (1024 MiB)
- 1G = 1 gigabyte (1000 MB)
apiVersion: v1
kind: Pod
metadata:
  name: resource-demo
spec:
  containers:
  - name: resource-demo-ctr
    image: nginx
    resources:
      requests:
        memory: "64Mi"
        cpu: "250m"
      limits:
        memory: "128Mi"
        cpu: "500m"
Kubernetes assigns QoS classes based on resource configuration:
Guaranteed
- Every container has both requests and limits set for CPU and memory
- Requests equal limits for all resources
- Highest priority during resource contention
- Last to be killed under node pressure
resources:
  requests:
    memory: "128Mi"
    cpu: "500m"
  limits:
    memory: "128Mi"
    cpu: "500m"
Burstable
- At least one container has a CPU or memory request or limit, but the pod does not meet the Guaranteed criteria
- Medium priority during resource contention
- Killed after BestEffort pods when the node is under pressure
resources:
  requests:
    memory: "64Mi"
    cpu: "250m"
  limits:
    memory: "128Mi"
    cpu: "500m"
BestEffort
- No resource requests or limits specified in any container
- Lowest priority during resource contention
- First to be killed under node pressure
# No resources section at all
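The QoS class Kubernetes assigned to a pod is recorded in its status and can be read directly, for example:
kubectl get pod resource-demo -o jsonpath='{.status.qosClass}'   # Guaranteed, Burstable, or BestEffort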
ResourceQuotas limit aggregate resource consumption within a namespace:
apiVersion: v1
kind: ResourceQuota
metadata:
  name: compute-quota
  namespace: development
spec:
  hard:
    pods: "10"
    requests.cpu: "4"
    requests.memory: 4Gi
    limits.cpu: "8"
    limits.memory: 8Gi
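Current consumption against the quota can be checked with:
kubectl describe resourcequota compute-quota -n development   # Shows used vs. hard limits
kubectl get resourcequota -n development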
LimitRanges define default requests and limits for containers that do not specify their own:
apiVersion: v1
kind: LimitRange
metadata:
  name: default-limits
  namespace: development
spec:
  limits:
  - default:
      memory: "256Mi"
      cpu: "500m"
    defaultRequest:
      memory: "128Mi"
      cpu: "100m"
    type: Container
Always set resource requests and limits
- Ensures proper scheduling and resource allocation
- Prevents resource starvation and "noisy neighbor" problems
- Match requests to actual application needs based on profiling
- Set reasonable limits to protect node stability
resources:
  requests:
    memory: "256Mi"
    cpu: "100m"
  limits:
    memory: "512Mi"
    cpu: "500m"
Implement comprehensive health checks
- Use the appropriate probe types (HTTP, TCP, exec) for your application
- Configure sensible timeouts and thresholds
- Expose meaningful health endpoints in your application
- Use all three probe types (startup, liveness, readiness) when appropriate
- Ensure health checks verify critical dependencies
Design for proper pod lifecycle handling
- Implement graceful shutdown to handle SIGTERM signals
- Set an appropriate terminationGracePeriodSeconds (default: 30s)
- Use preStop hooks for custom shutdown procedures
- Configure pod disruption budgets (PDBs) for controlled disruptions
spec:
  terminationGracePeriodSeconds: 60
  containers:
  - name: app
    lifecycle:
      preStop:
        exec:
          command: ["/bin/sh", "-c", "sleep 10; /shutdown.sh"]
Implement proper labels and selectors
- Use a consistent labeling strategy across all resources
- Include labels for app, environment, version, and component
- Use annotations for non-identifying metadata
- Leverage label selectors for service targeting
metadata:
  labels:
    app: myapp
    component: frontend
    environment: production
    version: v1.2.3
    tier: frontend
  annotations:
    description: "Frontend service for customer portal"
    team: "frontend-team"
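Consistent labels make it easy to query and operate on related resources; the version values below are illustrative:
kubectl get pods -l app=myapp,environment=production         # Equality-based selector
kubectl get pods -l 'app=myapp,version in (v1.2.3,v1.2.4)'   # Set-based selector
kubectl delete pods -l app=myapp,environment=staging         # Bulk operations by label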
Plan for high availability
- Use Deployments with multiple replicas
- Configure pod anti-affinity to spread pods across nodes
- Implement pod disruption budgets for controlled maintenance
- Set appropriate update strategies
# Pod Disruption Budget example
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: frontend-pdb
spec:
  minAvailable: 2  # or maxUnavailable: 1
  selector:
    matchLabels:
      app: frontend
Use namespaces for resource isolation
- Organize resources by project, team, or environment
- Apply resource quotas at the namespace level
- Implement network policies for namespace isolation
- Use RBAC roles bound to namespaces
# Create and use namespaces
kubectl create namespace team-frontend
kubectl config set-context --current --namespace=team-frontend
Implement a proper security context
- Run containers as non-root users
- Apply the principle of least privilege
- Use a read-only root filesystem where possible
- Set an appropriate security context at both the pod and container level
spec:
  securityContext:                    # Pod-level settings
    runAsUser: 1000
    runAsGroup: 3000
    fsGroup: 2000
  containers:
  - name: app
    securityContext:                  # Container-level settings
      readOnlyRootFilesystem: true
      allowPrivilegeEscalation: false
Configure appropriate update strategies
- Use rolling updates with suitable maxSurge/maxUnavailable values
- Consider canary deployments for critical applications
- Use blue/green deployments for zero-downtime requirements
- Test rollback procedures before relying on them in production
Implement proper monitoring and logging
- Export metrics for CPU, memory, and application-specific data
- Configure centralized logging
- Use structured logging formats (JSON)
- Implement distributed tracing for complex applications (a few starting-point commands follow)
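As a starting point, built-in commands like these surface resource usage and logs; full observability typically layers Prometheus, a log aggregator, and tracing on top. Note that kubectl top requires the metrics-server to be installed:
kubectl top pods --containers                                 # CPU/memory usage per container
kubectl top nodes                                             # Node-level resource usage
kubectl logs deployment/nginx-deployment --since=1h          # Recent logs from a workload
kubectl get events --sort-by=.metadata.creationTimestamp     # Cluster events, oldest first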