Docker Cache Management

Understanding and optimizing Docker's cache system for faster builds

Understanding Docker's Cache System

Docker's build cache is a powerful mechanism that significantly speeds up the image building process by reusing layers from previous builds. When Docker builds an image, it executes each instruction in the Dockerfile and creates a layer for each instruction. These layers are cached and can be reused in subsequent builds if the instruction hasn't changed.

The cache system works by comparing the instruction in the Dockerfile with the previous build and determining if the context for that instruction has changed. If there's no change, Docker reuses the existing layer rather than recreating it, which saves considerable time and resources.

At a fundamental level, Docker's caching works through content-addressable storage. Each layer is identified by a cryptographic hash of its contents, allowing Docker to track precisely whether anything has changed. This content-driven approach means that rebuilding without changing any inputs reuses the same cached layers and yields the same layer IDs, keeping builds largely reproducible.

When you build an image, Docker processes each instruction in the Dockerfile sequentially:

  1. It looks for an existing image in the cache that has the same parent and was built with the same instruction
  2. For commands like RUN, Docker checks if the instruction string matches exactly
  3. For ADD and COPY instructions, Docker also examines the contents of the files being copied
  4. If a matching layer is found in the cache, Docker reuses it; otherwise, it builds a new layer
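
You can watch this matching in action by building the same image twice; with BuildKit, reused steps are reported as CACHED (output abridged and illustrative):

docker build -t myapp .
# => [2/4] RUN apt-get update && apt-get install -y python3     38.2s
# => [3/4] COPY requirements.txt /app/                           0.1s

docker build -t myapp .
# => CACHED [2/4] RUN apt-get update && apt-get install -y python3
# => CACHED [3/4] COPY requirements.txt /app/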

This caching behavior is essential for maintaining build efficiency as projects grow. Consider a typical microservice with hundreds of dependencies - without proper caching, even minor code changes could trigger full rebuilds taking several minutes. With effective cache utilization, the same changes might rebuild in seconds.

Understanding how Docker's cache works is crucial for optimizing build times, especially in CI/CD pipelines and development workflows where image builds happen frequently. Teams that master Docker's cache system can achieve dramatically faster development cycles, more responsive CI/CD pipelines, and more efficient use of build infrastructure.

Cache Invalidation Mechanics

Docker's cache follows specific invalidation rules that determine when a layer can be reused and when it must be rebuilt:

Instruction-Based Invalidation

  • Each Dockerfile instruction is evaluated against the cached layers
  • If the instruction text has changed, the cache is invalidated
  • All subsequent layers must be rebuilt from that point
  • Simple text changes in the Dockerfile can trigger unnecessary rebuilds
  • Example: editing a standalone comment line doesn't invalidate the cache (the parser strips comments before comparison), but any edit to an instruction's own text does
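
To make the cascade concrete, here is a minimal sketch (file names illustrative): editing step 2 forces steps 3 and 4 to rebuild even though their text is unchanged, because each layer's cache entry depends on its parent:

# step 1 - cache hit
FROM ubuntu:22.04
# step 2 - edited, so rebuilt
RUN apt-get update && apt-get install -y curl
# steps 3 and 4 - rebuilt because their parent layer changed
COPY app.py /app/
RUN chmod +x /app/app.py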

Context-Based Invalidation

  • For ADD and COPY instructions, Docker examines file contents
  • If file contents have changed, the cache is invalidated
  • Docker uses checksums to detect file changes
  • Changes to file metadata such as permissions and ownership also invalidate cache (modification times are ignored)
  • Example: Modifying a copied configuration file will invalidate the cache for that layer and all subsequent layers
  • The exact algorithm involves:
    • Calculating a checksum for each file being copied
    • Comparing checksums with previous build files
    • Considering file path, ownership, and permissions
    • Creating a new layer if any differences are detected
  • Special considerations apply for .dockerignore:
    • Files excluded by .dockerignore aren't sent to the build context
    • Changes to excluded files won't invalidate the cache
    • Modifying .dockerignore itself can change which files are included
    • This can unexpectedly invalidate cache for COPY/ADD instructions
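
You can verify that the check is content-based rather than timestamp-based with a quick illustrative session (paths hypothetical):

touch src/app.py                    # metadata-only mtime change
docker build -t myapp .             # COPY layer still CACHED

echo "print('hi')" >> src/app.py    # content change
docker build -t myapp .             # COPY layer and all later layers rebuilt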

RUN Instruction Behavior

  • RUN instructions are cache-invalidated by instruction text, not outcomes
  • Changing the RUN command text invalidates the cache, even if the result would be identical
  • External resource changes (like apt repository updates) don't invalidate cache
  • This can lead to stale dependencies if not managed properly
  • Example: RUN apt-get update && apt-get install -y python3 may use outdated package lists if cached
  • The exact string matching mechanism means:
    • Appending an inline comment to a RUN command's shell string invalidates the cache
    • Changing whitespace in a RUN instruction invalidates the cache
    • Reordering commands within a RUN instruction invalidates the cache
    • Adding --no-install-recommends to an apt-get command invalidates the cache
  • Non-deterministic commands present challenges:
    • Commands that download content from the internet may get different results each time
    • Commands that use timestamps or random numbers produce different outcomes
    • Build date/time references create different outputs on each build
    • These variations don't automatically invalidate cache unless the command text changes
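
The classic consequence is the stale apt-get update problem: if update and install live in separate RUN instructions, the update layer can stay cached indefinitely, while combining them means any edit to the package list re-runs the update too:

# Stale-prone: the update layer is cached independently and may never re-run
RUN apt-get update
RUN apt-get install -y python3

# Safer: adding or removing a package changes the instruction text,
# busting the cache for the update as well
RUN apt-get update && apt-get install -y \
    python3 \
    python3-pip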

Base Image Changes

  • Changes to the base image (FROM instruction) invalidate all subsequent caches
  • Minor version updates or digest changes will trigger full rebuilds
  • Using image tags instead of digests can lead to unexpected cache invalidation
  • Example: Changing FROM ubuntu:20.04 to FROM ubuntu:22.04 invalidates all layers
  • The base image acts as the foundation for all subsequent layers:
    • Each instruction in a Dockerfile builds upon the previous layer
    • If the base layer changes, all dependent layers must be rebuilt
    • Docker checks the base image's digest, not just the tag name
    • Even if two images have the same tag (e.g., 'latest'), different digests invalidate cache
  • Tag stability considerations:
    • 'latest' tag can point to different image versions over time
    • Semantic version tags (e.g., v1.2.3) are generally more stable
    • SHA256 digests provide absolute consistency
    • Using digests guarantees the exact same base image:
      FROM ubuntu@sha256:e6173d4dc55f12012aa34d79abfb129a6d2c249947a3f3a512efaad31433f7e9
      
  • Multi-stage builds have separate cache chains:
    • Each FROM instruction starts a new cache sequence
    • Changing one stage doesn't necessarily invalidate other stages
    • This provides more granular cache control
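
To pin by digest, resolve the tag first and copy the digest into the FROM line (standard Docker CLI; output abridged):

docker pull ubuntu:22.04
docker images --digests ubuntu
# REPOSITORY   TAG     DIGEST
# ubuntu       22.04   sha256:e6173d4dc55f...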

Optimizing Dockerfile for Cache Efficiency

Properly structuring your Dockerfile can dramatically improve build times by maximizing cache usage:

# BAD EXAMPLE - Poor cache utilization
FROM ubuntu:22.04
WORKDIR /app
COPY . /app/
RUN apt-get update && apt-get install -y python3 python3-pip
RUN pip install -r requirements.txt

# GOOD EXAMPLE - Optimized for cache
FROM ubuntu:22.04
WORKDIR /app
RUN apt-get update && apt-get install -y python3 python3-pip
COPY requirements.txt /app/
RUN pip install -r requirements.txt
COPY . /app/

In the optimized example:

  1. Dependencies are installed before application code is copied
  2. Only the requirements file is copied before installing dependencies
  3. Full application code is copied only after dependencies are installed
  4. This ensures dependency layers are cached and reused even when application code changes
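
The same layering discipline extends to multi-stage builds, where each FROM starts its own cache chain, letting dependency installation be cached independently of the final image (a minimal sketch):

# Build stage: rebuilt only when requirements.txt changes
FROM ubuntu:22.04 AS builder
WORKDIR /app
RUN apt-get update && apt-get install -y python3 python3-pip
COPY requirements.txt /app/
RUN pip install --target=/app/deps -r requirements.txt

# Runtime stage: reuses the builder's cached output
FROM ubuntu:22.04
WORKDIR /app
RUN apt-get update && apt-get install -y python3 && rm -rf /var/lib/apt/lists/*
COPY --from=builder /app/deps /app/deps
ENV PYTHONPATH=/app/deps
COPY . /app/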

Advanced Cache Strategies

BuildKit Cache Features

Docker BuildKit, introduced in Docker 18.09, provides advanced caching capabilities beyond the traditional Docker build:

Inline Cache

  • Embeds cache information in the image itself
  • Allows cache sharing through registries
  • Enable with --cache-from and --build-arg BUILDKIT_INLINE_CACHE=1
  • Useful for CI/CD pipelines with distributed builders
  • Example:
    # First build
    docker build --tag myapp:latest --build-arg BUILDKIT_INLINE_CACHE=1 .
    docker push myapp:latest
    
    # Later build using the cache
    docker build --cache-from myapp:latest .
    
  • Detailed inline cache mechanics:
    • Cache metadata is stored in image manifest and config
    • Contains layer content hashes and build instruction details
    • Doesn't increase the actual image size meaningfully
    • Works across different build environments and machines
    • Particularly valuable for CI/CD environments where local cache isn't persistent
  • Advanced inline cache strategies:
    • Cache multiple versions to increase hit probability:
      # Use multiple cache sources
      docker build \
        --cache-from myapp:latest \
        --cache-from myapp:dev \
        --cache-from myapp:prev \
        -t myapp:latest .
      
    • Chain caches across feature branches:
      # Feature branch build using main branch cache
      docker build \
        --cache-from myapp:main \
        -t myapp:feature-x \
        --build-arg BUILDKIT_INLINE_CACHE=1 .
      
    • Separate cache-only images from runtime images:
      # Build cache-only image
      docker build -t myapp:cache --target build-stage \
        --build-arg BUILDKIT_INLINE_CACHE=1 .
        
      # Use cache for production image
      docker build -t myapp:prod --cache-from myapp:cache .
      

External Cache Storage

  • Stores cache in external locations
  • Supported backends include local directories, registries, S3, Azure Blob Storage, and the GitHub Actions cache
  • Configured with --cache-to and --cache-from flags
  • Enables centralized cache management
  • Example:
    # Export cache to registry
    docker build \
      --cache-to type=registry,ref=myregistry.com/myapp:cache \
      --tag myapp:latest .
    
    # Import cache from registry
    docker build \
      --cache-from type=registry,ref=myregistry.com/myapp:cache \
      --tag myapp:latest .
    
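The local backend is the simplest to try: it writes cache to a directory, which suits a single persistent CI runner (note that cache export requires a docker-container builder, created with docker buildx create --use):

# Export cache to a local directory
docker buildx build \
  --cache-to type=local,dest=/tmp/buildcache \
  --tag myapp:latest .

# Import that cache on a later build
docker buildx build \
  --cache-from type=local,src=/tmp/buildcache \
  --tag myapp:latest .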

Bind Mounts for Development

  • Mounts local source directories into the running container at runtime
  • Mounted files bypass the image entirely, so editing them never invalidates the build cache
  • Requires Docker Compose (or docker run -v); the image build itself can still use BuildKit cache sources
  • Useful for development environments where source files change constantly
  • Example in docker-compose.yml:
    services:
      myapp:
        build:
          context: .
          dockerfile: Dockerfile
          cache_from:
            - myapp:cache
        image: myapp:latest
        volumes:
          - ./src:/app/src
    

Content-Addressable Cache

  • Caches layers based on content rather than build steps
  • Automatically deduplicates identical outputs
  • More resilient to Dockerfile changes
  • No special configuration required with BuildKit
  • Example benefit: identical layer contents produced by different builds are stored once and shared, rather than duplicated

Cache Busting Techniques

Sometimes you need to intentionally invalidate the cache to ensure fresh content:

# Using build arguments for controlled cache busting
FROM ubuntu:22.04
ARG CACHEBUST=1
RUN apt-get update && apt-get install -y python3 python3-pip

# Using BuildKit cache mounts to persist package downloads across builds
# (the apt cache survives rebuilds without being baked into a layer)
FROM ubuntu:22.04
RUN --mount=type=cache,target=/var/cache/apt \
    --mount=type=cache,target=/var/lib/apt \
    apt-get update && apt-get install -y python3 python3-pip

# Using environment variables for targeted cache busting
FROM ubuntu:22.04
ENV REFRESHED_AT=2023-10-15
RUN apt-get update && apt-get install -y security-critical-package

# Using ADD with a URL to force cache invalidation
FROM ubuntu:22.04
ADD https://worldtimeapi.org/api/ip /tmp/bustcache
RUN apt-get update && apt-get install -y python3 python3-pip

# Using inline commands to generate dynamic values
# (this logic only executes when the layer itself is rebuilt, so pair it
# with one of the cache-busting techniques above)
FROM ubuntu:22.04
RUN apt-get update && \
    if [ "$(date +%Y%m%d)" != "$(cat /tmp/last_updated 2>/dev/null)" ]; then \
      apt-get upgrade -y && \
      date +%Y%m%d > /tmp/last_updated; \
    fi

Each of these techniques has specific use cases:

  1. Build arguments (CACHEBUST): Best for manual control, typically incremented when a fresh build is needed
  2. BuildKit cache mounts: Ideal for package managers, preserves the package cache without affecting layer caching
  3. Environment variables: Good for documenting when a layer was last refreshed
  4. Remote URL ADD: Forces a check of remote resources on every build
  5. Inline conditionals: Adds logic for deciding what to refresh, but it only runs when the layer is rebuilt, so combine it with another busting technique
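
Build-argument busting (technique 1) is driven entirely from the command line; the tag and argument follow the CACHEBUST example above:

# Normal build: full cache reuse
docker build -t myapp .

# Forced refresh: a new value invalidates every layer after ARG CACHEBUST
docker build --build-arg CACHEBUST=$(date +%s) -t myapp .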

Best Practices for Cache Management

  1. Use specific base image tags or digests
    • Prefer digest pinning for production builds
    • Example: FROM ubuntu:22.04@sha256:e6173d4dc55f12012aa34d79abfb129a6d2c249947a3f3a512efaad31433f7e9
    • Prevents unexpected base image changes and cache invalidation
  2. Create a dedicated builder image
    • Pre-install build dependencies in a separate image
    • Use as the base for build stages
    • Reduces rebuilding of stable dependencies
    • Example:
      # builder.Dockerfile
      FROM ubuntu:22.04
      RUN apt-get update && apt-get install -y build-essential python3-dev
      
      # main Dockerfile
      FROM myorg/builder:latest AS builder
      # Continue with application-specific build steps
      
  3. Implement proper layer granularity
    • Avoid combining unrelated operations in a single RUN
    • But consolidate related operations to reduce layer count
    • Balance between cache granularity and layer count
    • Example of good granularity:
      # System dependencies (change rarely)
      RUN apt-get update && apt-get install -y \
          curl \
          gnupg \
          ca-certificates \
          && rm -rf /var/lib/apt/lists/*
          
      # Application dependencies (change occasionally)
      COPY requirements.txt .
      RUN pip install --no-cache-dir -r requirements.txt
      
      # Application code (changes frequently)
      COPY . .
      
  4. Use CI cache warming
    • Build images regularly in CI even without code changes
    • Keeps cache fresh for dependencies
    • Reduces build time when actual changes are pushed
    • Implementation example with GitHub Actions:
      name: Cache Warming
      on:
        schedule:
          - cron: '0 2 * * *'  # Daily at 2 AM
      jobs:
        warm-cache:
          runs-on: ubuntu-latest
          steps:
            - uses: actions/checkout@v3
            - name: Set up Docker Buildx
              uses: docker/setup-buildx-action@v2
            - name: Build and push
              uses: docker/build-push-action@v3
              with:
                context: .
                push: true
                tags: myapp:cache
                cache-from: type=registry,ref=myapp:cache
                cache-to: type=registry,ref=myapp:cache,mode=max
      

Performance Monitoring and Optimization
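
Before optimizing, measure: Docker's built-in usage commands show how much disk the build cache occupies and which records dominate it (output abridged):

# Summarize disk usage, including build cache totals
docker system df

# Itemize individual BuildKit cache records by size and last use
docker buildx du --verbose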

Cache Management in Production Environments

Managing Docker's cache effectively in production CI/CD environments requires additional considerations:

  1. Implement cache retention policies
    • Set maximum cache age or size
    • Automatically prune old cache entries
    • Balance cache benefits against storage costs
    • Example:
      # Prune cache older than 30 days
      docker builder prune --filter until=720h
      
      # Prune aggressively but keep up to 20GB of the most recently used cache
      docker builder prune --all --keep-storage 20GB
      
      # Implement a scheduled cache maintenance job
      cat > /etc/cron.daily/docker-cache-prune << 'EOF'
      #!/bin/bash
      # Keep last 7 days of cache, but ensure at least 20GB free space
      FREE_SPACE=$(df -BG /var/lib/docker | tail -1 | awk '{print $4}' | sed 's/G//')
      if [ "$FREE_SPACE" -lt 20 ]; then
        # Low disk space - aggressive pruning
        docker builder prune -f --filter until=24h
      else
        # Normal maintenance
        docker builder prune -f --filter until=168h
      fi
      
      # Keep tagged images for active branches
      for BRANCH in main develop release-*; do
        # Skip if branch doesn't exist
        git show-ref --verify --quiet refs/heads/$BRANCH || continue
        # Pull images to ensure they're not pruned
        docker pull mycompany/myapp:$BRANCH || true
      done
      EOF
      chmod +x /etc/cron.daily/docker-cache-prune
      
    • Implementing tiered cache retention (build cache records aren't tied to
      image names, so tier by builder instance instead; the builder names
      below are illustrative):
      # Critical production builder - keep cache for 90 days
      docker buildx prune --builder prod-builder --filter until=2160h
      
      # Development builder - keep cache for 14 days
      docker buildx prune --builder dev-builder --filter until=336h
      
      # Feature branch builder - keep cache for 3 days
      docker buildx prune --builder feature-builder --filter until=72h
      
  2. Consider security implications
    • Cache can preserve security vulnerabilities
    • Force cache invalidation after security patches
    • Use security scanning in the build process
    • Example with build arguments:
      ARG SECURITY_PATCH_LEVEL=1
      RUN apt-get update && apt-get upgrade -y
      
    • Comprehensive security approach:
      # GitLab CI example with security scanning and cache invalidation
      variables:
        SECURITY_PATCH_LEVEL: ${CI_PIPELINE_IID}  # Auto-increment with each pipeline
      
      stages:
        - security_scan
        - build
        - test
        - deploy
      
      security_scan:
        stage: security_scan
        script:
          # capture trivy's exit code directly ('|| true' would discard it)
          - trivy image --exit-code 1 --severity HIGH,CRITICAL ${CI_REGISTRY_IMAGE}:latest && SCAN_RC=0 || SCAN_RC=$?
          - if [ "${SCAN_RC}" -ne 0 ]; then
              echo "SECURITY_ISSUES=true" >> security.env;
            else
              echo "SECURITY_ISSUES=false" >> security.env;
            fi
        artifacts:
          reports:
            dotenv: security.env
      
      build:
        stage: build
        script:
          # If security issues were found, don't use cache
          - if [ "$SECURITY_ISSUES" = "true" ]; then
              docker build --no-cache -t ${CI_REGISTRY_IMAGE}:${CI_COMMIT_SHA} 
                --build-arg SECURITY_PATCH_LEVEL=${SECURITY_PATCH_LEVEL} .;
            else
              docker build --cache-from ${CI_REGISTRY_IMAGE}:latest 
                -t ${CI_REGISTRY_IMAGE}:${CI_COMMIT_SHA}
                --build-arg SECURITY_PATCH_LEVEL=${SECURITY_PATCH_LEVEL} .;
            fi
          - docker push ${CI_REGISTRY_IMAGE}:${CI_COMMIT_SHA}
      
    • Security scanning integration:
      # Scan image for vulnerabilities before caching; with --exit-code 1,
      # trivy exits non-zero when HIGH/CRITICAL findings exist
      docker build -t myapp:latest .
      if ! trivy image --severity HIGH,CRITICAL --exit-code 1 myapp:latest; then
        echo "High/critical vulnerabilities found, rebuilding without cache"
        docker build --no-cache -t myapp:latest .
      fi
      
      # Now push to registry with inline cache
      docker build -t myapp:latest --build-arg BUILDKIT_INLINE_CACHE=1 .
      docker push myapp:latest
      
  3. Implement distributed cache in multi-node setups
    • Use registry-based cache for shared cache across builders
    • Configure BuildKit with S3 or other shared storage
    • Enable consistent builds across build agents
    • Example BuildKit configuration:
      # Create a docker-container builder (the default docker driver
      # cannot export external cache backends such as S3)
      docker buildx create --driver docker-container \
        --driver-opt network=host \
        --buildkitd-flags '--debug --allow-insecure-entitlement security.insecure' \
        --use
      
    • Advanced distributed cache setup:
      # Create a BuildKit builder with Azure Blob Storage cache
      docker buildx create --name azure-builder \
        --driver docker-container \
        --driver-opt image=moby/buildkit:latest \
        --driver-opt network=host \
        --buildkitd-flags '--debug --allow-insecure-entitlement security.insecure' \
        --bootstrap
      
      # Use the custom builder with Azure Blob Storage
      docker buildx use azure-builder
      
      # Build with Azure cache
      docker buildx build \
        --push \
        --cache-to type=azblob,account_url=https://mystorageaccount.blob.core.windows.net,container=buildcache,name=myproject \
        --cache-from type=azblob,account_url=https://mystorageaccount.blob.core.windows.net,container=buildcache,name=myproject \
        -t myregistry.com/myapp:latest .
      
    • Comprehensive multi-region build setup:
      # GitHub Actions workflow for multi-region builds with shared cache
      name: Multi-Region Build
      
      on:
        push:
          branches: [ main ]
      
      jobs:
        setup:
          runs-on: ubuntu-latest
          outputs:
            matrix: ${{ steps.set-matrix.outputs.matrix }}
          steps:
            - id: set-matrix
              run: |
                echo "matrix={\"region\":[\"us-east\",\"eu-west\",\"ap-southeast\"]}" >> $GITHUB_OUTPUT
      
        build:
          needs: setup
          runs-on: ubuntu-latest
          strategy:
            matrix: ${{fromJson(needs.setup.outputs.matrix)}}
          steps:
            - uses: actions/checkout@v3
            
            - name: Set up Docker Buildx
              uses: docker/setup-buildx-action@v2
              
            - name: Login to DockerHub
              uses: docker/login-action@v2
              with:
                username: ${{ secrets.DOCKERHUB_USERNAME }}
                password: ${{ secrets.DOCKERHUB_TOKEN }}
                
            - name: Configure AWS credentials
              uses: aws-actions/configure-aws-credentials@v2
              with:
                aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
                aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
                aws-region: ${{ matrix.region }}
                
            - name: Build and push
              uses: docker/build-push-action@v4
              with:
                context: .
                push: true
                tags: myorg/myapp:${{ matrix.region }}
                cache-from: |
                  type=s3,region=${{ matrix.region }},bucket=mybucket,name=myapp
                  type=registry,ref=myorg/myapp:cache
                cache-to: type=s3,region=${{ matrix.region }},bucket=mybucket,name=myapp,mode=max
      
  4. Enterprise-scale cache management strategies
    • Implement centralized cache storage with proper access controls:
      # Set up dedicated registry for caching with proper authentication
      docker run -d -p 5000:5000 \
        -v /path/to/certs:/certs \
        -e REGISTRY_HTTP_TLS_CERTIFICATE=/certs/domain.crt \
        -e REGISTRY_HTTP_TLS_KEY=/certs/domain.key \
        -e REGISTRY_AUTH=htpasswd \
        -e REGISTRY_AUTH_HTPASSWD_