Docker's GPU Acceleration Framework provides a comprehensive ecosystem for integrating and leveraging GPU hardware within containerized environments.
Hardware Acceleration: Direct access to physical GPU resources for performance-critical workloads
Vendor-Agnostic Support: Unified interface for NVIDIA, AMD, and Intel GPU technologies
Resource Optimization: Fine-grained allocation and monitoring of GPU resources
AI/ML Enablement: Streamlined deployment of compute-intensive machine learning workflows
This guide explores the architecture, configuration options, and best practices for implementing GPU-accelerated containers across development and production environments, enabling high-performance computing workloads within the Docker ecosystem.
Run a container with GPU access using the --gpus flag:
# Run NVIDIA GPU-enabled container
docker run --gpus all nvidia/cuda:11.6.2-base-ubuntu20.04 nvidia-smi
# Run with specific GPU devices (comma-separated device lists need inner quotes)
docker run --gpus '"device=0,1"' nvidia/cuda:11.6.2-base-ubuntu20.04 nvidia-smi
# Run with a specific driver capability (NVIDIA); note that the --gpus flag does not enforce GPU memory limits
docker run --gpus 'all,capabilities=utility' nvidia/cuda:11.6.2-base-ubuntu20.04 nvidia-smi
# Allocate specific GPU devices by index
docker run --gpus '"device=1,2"' tensorflow/tensorflow:latest-gpu python -c 'import tensorflow as tf; print(tf.config.list_physical_devices("GPU"))'
# Limit visible GPU devices with environment variables (framework-specific; the container still needs GPU access via --gpus)
docker run --gpus all -e CUDA_VISIBLE_DEVICES=0,1 tensorflow/tensorflow:latest-gpu python -c 'import tensorflow as tf; print(tf.config.list_physical_devices("GPU"))'
# Enable multiple driver capabilities (NVIDIA)
docker run --gpus 'all,"capabilities=compute,utility,graphics,video,display"' nvidia/cuda:11.6.2-base-ubuntu20.04 nvidia-smi
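The NVIDIA Container Toolkit also honors its own environment variables for device and capability selection. A minimal equivalent sketch, assuming the toolkit is installed and registered as the nvidia runtime:
# Select devices and driver capabilities through NVIDIA runtime environment variables
docker run --runtime=nvidia -e NVIDIA_VISIBLE_DEVICES=0,1 \
    -e NVIDIA_DRIVER_CAPABILITIES=compute,utility \
    nvidia/cuda:11.6.2-base-ubuntu20.04 nvidia-smi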
Accelerate data processing with GPU-enabled containers:
# Run RAPIDS for accelerated data science
docker run --gpus all -p 8888:8888 -p 8787:8787 -p 8786:8786 \
rapidsai/rapidsai:cuda11.0-runtime-ubuntu18.04-py3.8
# Image processing with OpenCV CUDA support
docker run --gpus all -v $(pwd)/images:/images custom/opencv-cuda:latest \
python process_images.py --input /images/input --output /images/processed
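Before wiring up a full pipeline, it is worth confirming that the framework inside the container actually sees the GPU. A quick check, assuming the official pytorch/pytorch image (its default tags ship CUDA runtime builds):
# Sanity-check GPU visibility from inside a framework container
docker run --gpus all pytorch/pytorch:latest \
    python -c 'import torch; print(torch.cuda.is_available(), torch.cuda.device_count())'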
Configure containers for optimal performance across multiple GPUs:
Data Parallelism: Duplicate model across GPUs, process different data batches
Model Parallelism: Split model layers across GPUs for large models (see the device-placement sketch after the example below)
Pipeline Parallelism: Process different model stages across GPUs
Hybrid Approaches: Combine strategies for optimal resource utilization
# TensorFlow distributed training with data parallelism
docker run --gpus all tensorflow/tensorflow:latest-gpu python -c '
import tensorflow as tf
strategy = tf.distribute.MirroredStrategy()
print("Number of devices:", strategy.num_replicas_in_sync)
with strategy.scope():
    model = tf.keras.Sequential([tf.keras.layers.Dense(1, input_shape=(1,))])
    model.compile(loss="mse", optimizer="sgd")
print("Model compiled with distribution strategy")
'
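For model parallelism, operations can be pinned to specific devices by hand. A minimal sketch using manual tf.device placement (assumes at least two visible GPUs; TensorFlow's default soft placement falls back to whatever devices exist):
# Manual device placement across two GPUs (model-parallel sketch)
docker run --gpus all tensorflow/tensorflow:latest-gpu python -c '
import tensorflow as tf
with tf.device("/GPU:0"):
    a = tf.random.normal((4, 8))  # first stage of the computation on GPU 0
with tf.device("/GPU:1"):
    b = tf.random.normal((8, 2))  # second stage on GPU 1
print(tf.matmul(a, b).shape)      # TensorFlow copies tensors between devices as needed
'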
Use these commands to troubleshoot GPU container issues:
# Check GPU visibility within container
docker run --gpus all nvidia/cuda:11.6.2-base-ubuntu20.04 nvidia-smi
# Verify CUDA installation and version (nvcc ships in the devel image, not the base image)
docker run --gpus all nvidia/cuda:11.6.2-devel-ubuntu20.04 nvcc --version
# Test GPU compute capability (deviceQuery is not bundled; build NVIDIA's cuda-samples into your own image first)
docker run --gpus all custom/cuda-samples:latest deviceQuery
# Debug GPU memory usage
docker run --gpus all nvidia/cuda:11.6.2-base-ubuntu20.04 nvidia-smi --query-gpu=memory.used,memory.total --format=csv
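If nvidia-smi fails inside the container, confirm on the host that the NVIDIA runtime is actually registered with the Docker daemon; one way to check:
# Host-side: list the runtimes the Docker daemon knows about
docker info --format '{{json .Runtimes}}' | grep -i nvidia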
Harden GPU containers with least-privilege settings:
# Run with minimal GPU capabilities
docker run --gpus '"device=0","capabilities=compute,utility"' --read-only --security-opt=no-new-privileges \
    --cap-drop=ALL -u 1000:1000 secure/gpu-app:latest
# Implement proper resource limits
docker run --gpus 1 --memory=4g --cpu-shares=1024 --pids-limit=100 secure/gpu-app:latest
# Use temporary filesystem for sensitive data
docker run --gpus all --tmpfs /tmp:rw,noexec,nosuid secure/gpu-app:latest
Docker's GPU Acceleration Framework transforms how organizations leverage high-performance computing resources in containerized environments. By providing a unified interface across GPU vendors, flexible resource allocation, and robust monitoring capabilities, the framework enables everything from AI/ML workloads to scientific computing and video processing at scale.
Simplified Deployment: Consistent GPU access across development and production
Efficient Resource Utilization: Fine-grained allocation and isolation
Workload Portability: Vendor-agnostic approach for heterogeneous environments
Performance Optimization: Advanced resource management for maximum throughput
Enterprise Readiness: Production-grade security and monitoring integration
As GPU acceleration becomes increasingly critical for modern applications, Docker's comprehensive framework provides the foundation for scalable, secure, and high-performance containerized computing.