Registry & Distribution
Learn about Docker registries, image distribution, and repository management
Docker Registry
A Docker registry is a specialized storage and distribution system for Docker images. It functions as a centralized repository where container images can be stored, versioned, and shared among different systems and users. A robust registry solution enables you to:
- Store Docker images securely with version control
- Distribute images efficiently across development, testing, and production environments
- Manage image versions with tagging and metadata
- Control access to images through authentication and authorization mechanisms
- Track image vulnerabilities and enforce security policies
- Optimize storage through deduplication of image layers
Types of Registries
Docker Hub
- Public registry service operated by Docker, Inc.
- Free tier for public repositories, with pull rate limits that have changed over time (originally 100 pulls per 6 hours for anonymous users and 200 for authenticated free accounts, so check the current policy)
- Subscription plans for private repositories and higher rate limits
- Official images maintained by Docker and verified publishers for trusted content
- Automated builds triggered by source code changes in connected repositories
- Webhooks for integration with CI/CD pipelines and other systems
- Vulnerability scanning for detecting security issues in images
- Team management for collaborative development
Private Registry
- Self-hosted solution within your own infrastructure
- Complete control over data location, retention, and security policies
- Network isolation capabilities for air-gapped or high-security environments
- Integration with internal systems like LDAP/Active Directory
- Customizable storage backends (filesystem, S3, Azure Blob, etc.)
- Fine-grained access control policies
- No external rate limits or subscription fees
- Audit logging for compliance and security tracking
- Ability to implement custom validation and security policies
Cloud Provider Registries
- AWS Elastic Container Registry (ECR)
  - Native integration with AWS IAM for access control
  - Lifecycle policies for automatic cleanup
  - Cross-region replication for improved availability
  - Basic image scanning in ECR, with enhanced scanning through Amazon Inspector
- Google Artifact Registry (successor to Container Registry, GCR)
  - Integration with Google Cloud IAM
  - Built-in vulnerability scanning
  - Global availability with regional storage
  - Integration with Google Cloud Build
- Azure Container Registry (ACR)
  - Geo-replication across Azure regions
  - Integration with Microsoft Entra ID (formerly Azure Active Directory)
  - Webhooks for build and deployment automation
  - Premium tier offers enhanced throughput and scalability
- DigitalOcean Container Registry
  - Integration with DigitalOcean Kubernetes
  - Regional storage options
  - Simple pricing model
- GitHub Container Registry (GHCR)
  - Tight integration with GitHub Actions
  - Fine-grained permissions aligned with GitHub's model
  - Anonymous access for public images
  - Support for OCI artifacts beyond container images
Working with Docker Hub
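Docker Hub is the default registry for Docker, offering a simple workflow for storing and sharing images: authenticate with docker login, tag the image under your Docker Hub namespace, and publish it with docker push.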
When working with Docker Hub, consider:
- Rate limits for pulls (especially in CI/CD environments)
- Image access control (public vs. private)
- Organization accounts for team collaboration
- Automated builds for maintaining up-to-date images
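Because Docker Hub is the default registry, short image names are expanded before they are pulled. The sketch below approximates that expansion; it is a simplified illustration rather than the Docker CLI's actual parser, and the function name `normalize_reference` is invented for this example.

```python
def normalize_reference(reference: str) -> str:
    """Expand a short image name roughly the way Docker resolves it."""
    name, _, digest = reference.partition("@")

    # A ':' after the last '/' separates the tag; an earlier one is a port.
    tag = ""
    if name.rfind(":") > name.rfind("/"):
        name, _, tag = name.rpartition(":")

    # The first path component is a registry host only if it looks like one.
    first, slash, rest = name.partition("/")
    if slash and ("." in first or ":" in first or first == "localhost"):
        registry, path = first, rest
    else:
        registry, path = "docker.io", name
        if "/" not in path:
            path = "library/" + path  # official images live under library/

    if digest:
        suffix = (f":{tag}" if tag else "") + f"@{digest}"
    else:
        suffix = f":{tag or 'latest'}"
    return f"{registry}/{path}{suffix}"


print(normalize_reference("nginx"))               # docker.io/library/nginx:latest
print(normalize_reference("localhost:5000/app"))  # localhost:5000/app:latest
```

Seeing the fully qualified name makes it clearer which pulls count against Docker Hub rate limits and which go to another registry.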
Setting Up a Private Registry
You can run your own registry using the official Docker registry image, which listens on port 5000 by default. A private registry gives you complete control over your image storage and distribution.
For non-localhost registries served without TLS, you need to configure the Docker daemon to trust the insecure registry by adding it to the insecure-registries list in /etc/docker/daemon.json and restarting the daemon.
For production use, always configure TLS certificates for secure communication.
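As a rough illustration of the daemon.json change, the following sketch adds a registry to the `insecure-registries` list without disturbing other settings. The hostname `registry.example.internal:5000` and the helper name are placeholders, and the script only prints the result; writing the file and restarting the daemon are left to you.

```python
import json


def with_insecure_registry(daemon_config: dict, registry: str) -> dict:
    """Return a copy of a daemon.json mapping that trusts `registry` without TLS."""
    updated = dict(daemon_config)
    registries = list(updated.get("insecure-registries", []))
    if registry not in registries:  # keep the operation idempotent
        registries.append(registry)
    updated["insecure-registries"] = registries
    return updated


# Hypothetical existing settings and registry address.
existing = {"log-driver": "json-file"}
config = with_insecure_registry(existing, "registry.example.internal:5000")
print(json.dumps(config, indent=2))
```

This is only appropriate for test environments; with TLS configured, the entry is unnecessary.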
Image Distribution Strategies
Consider these strategies for efficient image distribution:
- Use multi-stage builds for smaller images
  - Reduce image size by separating build-time and runtime dependencies
  - Only include production-necessary components in final image
  - Smaller images reduce network transfer time and storage costs
  - Example: Build in one stage, copy only compiled artifacts to runtime stage
- Implement layer caching
  - Order Dockerfile instructions from least to most frequently changed
  - Group related commands to optimize layer creation
  - Use BuildKit's improved caching capabilities
  - Cache dependencies separately from application code
- Consider using content-addressable storage
  - Deduplicate identical layers across different images
  - Reduce overall storage requirements
  - Improve pull efficiency for images sharing common layers
  - Enable more efficient image distribution
- Implement proper image tagging strategies
  - Use semantic versioning (major.minor.patch)
  - Include build metadata (git commit, build number)
  - Use immutable tags for production deployments
  - Consider specialized tags for different environments (dev, staging, prod)
  - Document your tagging convention for team consistency
- Set up registry mirrors for distributed teams
  - Reduce external bandwidth usage and improve pull speeds
  - Provide redundancy in case of registry outages
  - Place mirrors geographically close to consumers
  - Configure pull-through caching for frequently used images
  - Implement automated synchronization between registries
Securing Your Registry
Proper security is essential for any Docker registry, especially in production environments.
Basic Authentication
Basic authentication provides simple username/password protection for your registry. The registry reads credentials from an htpasswd file and accepts only bcrypt-hashed entries.
Authentication can also be integrated with LDAP, OAuth, or other external providers for enterprise environments.
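To make the client side of basic authentication concrete, this sketch shows how a username and password become the HTTP `Authorization` header and the `auths` entry that `docker login` stores in `~/.docker/config.json` when no credential helper is configured. The helper names and the registry host are illustrative.

```python
import base64


def basic_auth_token(username: str, password: str) -> str:
    """Base64-encode 'username:password' as used by HTTP Basic authentication."""
    raw = f"{username}:{password}".encode("utf-8")
    return base64.b64encode(raw).decode("ascii")


def authorization_header(username: str, password: str) -> dict:
    """Header a client would send to a registry protected by basic auth."""
    return {"Authorization": "Basic " + basic_auth_token(username, password)}


def docker_config_entry(registry: str, username: str, password: str) -> dict:
    """Shape of the credentials entry Docker keeps in config.json."""
    return {"auths": {registry: {"auth": basic_auth_token(username, password)}}}


# Hypothetical registry host and credentials.
print(docker_config_entry("registry.example.internal:5000", "user", "pass"))
```

Base64 is an encoding, not encryption, which is why basic authentication should only be used over TLS.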
TLS Configuration
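TLS encryption is critical for securing registry communications. Without it, credentials and image content cross the network in clear text, so the registry should be given a certificate and private key issued for its hostname.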
Storage Configuration
For production-grade deployments, configure a storage backend suited to your scale, such as a dedicated filesystem volume or an object store like S3 or Azure Blob.
Access Control
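For more advanced access control, the registry can delegate authorization to an external token server, which issues tokens scoped to specific repositories and actions.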
Registry API
The Docker Registry HTTP API provides programmatic access to registry operations. This RESTful API allows you to query, manage, and interact with registry data without using the Docker CLI.
The Registry API enables automation for:
- CI/CD pipelines that need to verify image existence
- Custom interfaces and management tools
- Registry migration scripts
- Automated cleanup and maintenance
- Integration with other systems
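The automation cases above mostly reduce to a handful of HTTP calls. The sketch below builds the relevant Registry HTTP API v2 URLs, follows the `Link` header used for pagination, and checks whether an image exists with a `HEAD` request. The function names are illustrative, and authentication is reduced to an optional headers argument.

```python
import re
import urllib.error
import urllib.request
from urllib.parse import quote, urlencode

_NEXT_LINK = re.compile(r'<([^>]+)>\s*;\s*rel="next"')
_MANIFEST_TYPES = ", ".join([
    "application/vnd.oci.image.manifest.v1+json",
    "application/vnd.docker.distribution.manifest.v2+json",
])


def catalog_url(base_url: str, page_size: int = 100, last: str = "") -> str:
    """URL for one page of the repository catalog."""
    params = {"n": page_size}
    if last:
        params["last"] = last
    return f"{base_url.rstrip('/')}/v2/_catalog?{urlencode(params)}"


def tags_url(base_url: str, repository: str) -> str:
    return f"{base_url.rstrip('/')}/v2/{quote(repository)}/tags/list"


def manifest_url(base_url: str, repository: str, reference: str) -> str:
    """`reference` may be a tag or a digest such as sha256:<hex>."""
    return f"{base_url.rstrip('/')}/v2/{quote(repository)}/manifests/{quote(reference, safe=':')}"


def next_page_path(link_header: str):
    """Extract the next-page path from a Link header, or None on the last page."""
    match = _NEXT_LINK.search(link_header or "")
    return match.group(1) if match else None


def image_exists(base_url: str, repository: str, reference: str, headers=None) -> bool:
    """HEAD the manifest: 200 means present, 404 means absent."""
    request = urllib.request.Request(
        manifest_url(base_url, repository, reference),
        headers={"Accept": _MANIFEST_TYPES, **(headers or {})},
        method="HEAD",
    )
    try:
        with urllib.request.urlopen(request, timeout=10) as response:
            return response.status == 200
    except urllib.error.HTTPError as error:
        if error.code == 404:
            return False
        raise  # auth failures and server errors should not read as "missing"
```

A CI job could call `image_exists` before building to skip work when the tag has already been published.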
Third-Party Registry Options
Beyond the basic Docker Registry, there are numerous specialized registry solutions available:
Cloud Provider Registries
- Amazon Elastic Container Registry (ECR)
  - Deeply integrated with AWS services (ECS, EKS, Lambda)
  - Private repositories with IAM authentication
  - Lifecycle policies for automated image cleanup
  - Vulnerability scanning with Amazon Inspector
  - Cross-region and cross-account replication
  - Pay-as-you-go pricing model
- Google Artifact Registry (successor to Container Registry, GCR)
  - Native integration with Google Cloud Build and GKE
  - Automatic vulnerability scanning
  - IAM access controls and audit logging
  - Regional storage with global access
  - Support for Docker, OCI, and language-specific packages
  - Storage pricing + data transfer costs
- Azure Container Registry (ACR)
  - Integrated with Azure DevOps and AKS
  - Premium tier with geo-replication
  - Automated builds with Tasks
  - Content trust for image signing
  - Tiered pricing (Basic, Standard, Premium)
  - Webhook support for build and deployment automation
- DigitalOcean Container Registry
  - Simple integration with DigitalOcean Kubernetes
  - Straightforward pricing by storage tiers
  - Global availability
  - Integrated vulnerability scanning
Self-Hosted Solutions
- Harbor
  - Open source, enterprise-focused registry
  - Role-based access control
  - Policy-based image replication
  - Vulnerability scanning integration
  - Image signing and verification
  - Support for multiple registries and Helm charts
  - Webhooks for event notification
  - Audit logging for compliance
- Nexus Repository
  - Multi-format artifact management (not just Docker)
  - Support for npm, Maven, NuGet, PyPI, etc.
  - Role-based access control
  - Component lifecycle management
  - Repository health check
  - Available in free OSS and commercial editions
  - Proxy and cache remote repositories
- JFrog Artifactory
  - Universal artifact management platform
  - High availability configuration
  - Advanced security features
  - Metadata-based search
  - Build integration
  - Extensive REST API
  - Commercial offering with enterprise support
  - Replication and federation capabilities
- GitLab Container Registry
  - Integrated with GitLab CI/CD
  - Built into GitLab installations
  - Project-based permissions
  - Vulnerability scanning
  - Image clean-up policies
  - No additional configuration needed with GitLab
Image Signing and Trust
Image signing ensures the authenticity and integrity of container images. Docker Content Trust (DCT) provides a way to verify both the publisher and the content of images.
For enterprise environments:
- Implement Notary for advanced signing workflows
- Set up a secure offline root key
- Establish a key rotation policy
- Configure CI/CD systems to sign images automatically
- Integrate with vulnerability scanning to only sign secure images
- Use admission controllers in Kubernetes to verify signatures
DCT creates two types of keys:
- Root key: Master key that should be kept securely offline
- Repository keys: Used for signing specific repositories
When enabled (by setting DOCKER_CONTENT_TRUST=1 in the client's environment), Docker will only pull images with verified signatures.
Registry Garbage Collection
Registries accumulate unreferenced layers over time as images are updated or deleted. Garbage collection reclaims this storage space by removing "dangling" blobs.
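Conceptually, garbage collection is a mark-and-sweep pass: every blob referenced by a remaining manifest is marked, and everything else is swept. The sketch below models that idea on in-memory data; the digests and repository names are made up, and it does not touch a real registry's storage.

```python
def unreferenced_blobs(manifests: dict, stored_blobs: set) -> set:
    """Return blobs that no manifest references (candidates for deletion).

    `manifests` maps an image reference to the list of blob digests it uses.
    """
    marked = set()
    for blob_digests in manifests.values():  # mark phase
        marked.update(blob_digests)
    return stored_blobs - marked             # sweep phase


# Hypothetical registry state after an old image version was deleted.
manifests = {
    "team/app:1.2.3": ["sha256:base", "sha256:deps", "sha256:app-v123"],
    "team/worker:2.0.0": ["sha256:base", "sha256:worker-v200"],
}
stored = {
    "sha256:base", "sha256:deps", "sha256:app-v123",
    "sha256:worker-v200", "sha256:app-v122",
}
print(unreferenced_blobs(manifests, stored))  # {'sha256:app-v122'}
```

The shared `sha256:base` layer survives because at least one manifest still references it, which is why a push racing with the sweep could lose data if the registry were writable.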
Important considerations for garbage collection:
- Registry should be in read-only mode during garbage collection to prevent corruption
- Deleted manifests may still appear in API results until the registry is restarted
- Consider setting up scheduled garbage collection for production registries
- Use registry storage quotas to prevent unexpected growth
- Monitor storage usage before and after garbage collection
- Some storage drivers have specific garbage collection considerations
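For example, a weekly cron job can switch the registry to read-only mode, run the registry's garbage-collect command against its config.yml, and then restore write access.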
Registry Configuration
The Docker Registry is highly configurable through its config.yml file, which covers storage, HTTP and TLS settings, authentication, notifications, and more.
Common configuration use cases:
- High Availability Setup: Configure multiple registry instances with shared storage
- Performance Optimization: Adjust cache settings and implement Redis
- Security Hardening: Configure TLS, authentication, and authorization
- Storage Management: Set up quotas, garbage collection, and storage drivers
- Integration: Configure webhooks for event notifications
- Compliance: Enable audit logging and access controls
Best Practices
Implementing these best practices will help you manage your container registry effectively and securely:
Tagging Strategy
- Use semantic versioning (SemVer)
  - Format: MAJOR.MINOR.PATCH (e.g., 1.2.3)
  - MAJOR: Breaking changes
  - MINOR: New features, backward compatible
  - PATCH: Bug fixes, backward compatible
- Include build information
  - Add build numbers or timestamps
  - Format examples: 1.2.3-build.456, 1.2.3-20230415.1, 1.2.3-alpha.1, 1.2.3-beta.2, 1.2.3-rc.1
- Never use "latest" in production
  - "latest" is mutable and unpredictable
  - Makes rollbacks difficult or impossible
  - Complicates auditing and versioning
  - Obscures which version is actually running
- Tag with git commit hashes
  - Provides direct traceability to source code
  - Format example: 1.2.3-a7ff23e
  - Helpful for debugging specific versions
  - Can automate in CI/CD pipelines
- Consider using digest references
  - Immutable and tamper-evident
  - Format: image@sha256:digest
  - Guarantees exact image content
  - Can be used with vulnerability scanners
  - Best for security-critical deployments
- Implement tag lifecycle policies
  - Automate cleanup of old tags
  - Preserve important historical versions
  - Document retention policies
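The tagging conventions above are easy to automate in a CI pipeline. As one possible sketch, the helper below validates a MAJOR.MINOR.PATCH version and derives the version tag, a commit-hash tag, and an optional build-number tag. The function name and the exact set of tags are assumptions, not a standard.

```python
import re

_SEMVER = re.compile(r"\d+\.\d+\.\d+")
_GIT_SHA = re.compile(r"[0-9a-f]{7,40}")


def release_tags(version: str, git_sha: str, build_number=None) -> list:
    """Derive immutable tags for one build from its version and commit."""
    if not _SEMVER.fullmatch(version):
        raise ValueError(f"not a MAJOR.MINOR.PATCH version: {version!r}")
    if not _GIT_SHA.fullmatch(git_sha):
        raise ValueError(f"not a git commit hash: {git_sha!r}")

    tags = [version, f"{version}-{git_sha[:7]}"]  # short hash for traceability
    if build_number is not None:
        tags.append(f"{version}-build.{build_number}")
    return tags


print(release_tags("1.2.3", "a7ff23e", build_number=456))
# ['1.2.3', '1.2.3-a7ff23e', '1.2.3-build.456']
```

Note that "latest" is deliberately absent, so every tag the helper produces identifies exactly one build.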
Security
- Regular vulnerability scanning
  - Integrate scanners like Trivy, Clair, or Anchore
  - Block deployment of vulnerable images
  - Configure scheduled rescans of existing images
  - Track CVEs and patch affected images
- Image signing
  - Implement Docker Content Trust
  - Use Notary for advanced signing workflows
  - Enforce signature verification on pull
  - Store root keys securely offline
  - Rotate signing keys periodically
- Access control
  - Implement principle of least privilege
  - Use namespaces for project isolation
  - Configure role-based access control
  - Audit user access regularly
  - Implement approval workflows for sensitive repositories
- Audit logging
  - Log all registry operations
  - Capture who, what, when information
  - Store logs securely and immutably
  - Set up alerts for suspicious activities
  - Retain logs for compliance requirements
  - Forward logs to SIEM systems
- Regular garbage collection
  - Schedule routine garbage collection
  - Monitor storage utilization
  - Configure retention policies
  - Implement image lifecycle management
  - Remove untagged and outdated images
- Network security
  - Use TLS 1.2+ for all registry traffic
  - Implement proper certificate management
  - Consider network segmentation
  - Use VPNs or private endpoints for access
Efficiency
- Layer caching
  - Optimize Dockerfiles for cache utilization
  - Use BuildKit's improved caching
  - Implement shared build caches in CI/CD
  - Configure appropriate cache lifetimes
  - Consider remote caching for distributed teams
- Optimized storage backends
  - Choose appropriate backend for scale (S3, Azure Blob, etc.)
  - Configure compression settings
  - Implement data deduplication where available
  - Monitor storage performance metrics
  - Implement lifecycle policies for automated management
- Pull-through caching
  - Set up proxy registries for frequently used images
  - Cache external images locally to reduce bandwidth
  - Configure appropriate cache TTL values
  - Schedule cache warming for critical images
  - Implement health checks for upstream registries
- Load balancing
  - Distribute registry load across multiple instances
  - Implement round-robin or other load balancing strategies
  - Configure connection draining for maintenance
  - Monitor per-instance performance metrics
  - Set up high availability clusters
- Geographic distribution
  - Deploy registries close to users/clusters
  - Implement registry replication across regions
  - Use content delivery networks where appropriate
  - Configure smart routing based on client location
  - Synchronize metadata between distributed instances
Distribution Strategies
Best practices for image distribution:
- Use semantic versioning for tags
  - Follow MAJOR.MINOR.PATCH convention
  - Include metadata like build number, git hash
  - Document tagging policies for all users
- Implement CI/CD pipelines for automatic pushes
  - Build and push on every commit or PR merge
  - Configure pipeline-specific credentials
  - Implement quality gates before pushing
  - Tag with both version and git hash for traceability
- Scan images for vulnerabilities before distribution
  - Integrate scanning tools in CI/CD pipeline
  - Block distribution of images with critical vulnerabilities
  - Generate bill of materials for compliance
  - Implement continuous rescanning of existing images
- Implement image promotion workflows across environments
  - Use separate repositories or tags for dev/staging/prod
  - Implement explicit promotion process
  - Never rebuild for promotion - use the same image
  - Maintain promotion audit trail
  - Implement approval gates for production promotion
- Automate cleanup of old images
  - Implement retention policies based on age or count
  - Preserve images deployed to production
  - Run garbage collection regularly
  - Maintain audit trail of deleted images
  - Consider legal/compliance requirements for retention
- Document image usage and requirements
  - Include README in image repositories
  - Document environment variables and volumes
  - Specify resource requirements
  - Include health check information
  - Document upgrade procedures
  - List compatible versions of dependencies
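The cleanup guidance above (retention by age or count, with production images preserved) can be expressed as a small selection function. This sketch uses one possible policy: a tag is deleted only if it falls outside the newest `keep_latest` tags, is older than `max_age_days`, and is not protected. All names and thresholds are illustrative.

```python
from datetime import datetime, timedelta, timezone


def tags_to_delete(tags, now, keep_latest=10, max_age_days=90, protected=frozenset()):
    """Select tags eligible for cleanup.

    `tags` is a list of (tag, pushed_at) pairs with timezone-aware timestamps.
    """
    cutoff = now - timedelta(days=max_age_days)
    newest_first = sorted(tags, key=lambda item: item[1], reverse=True)

    doomed = []
    for index, (tag, pushed_at) in enumerate(newest_first):
        if tag in protected:      # e.g. tags currently deployed to production
            continue
        if index < keep_latest:   # always keep the most recent tags
            continue
        if pushed_at < cutoff:    # only remove tags past the age limit
            doomed.append(tag)
    return doomed


now = datetime(2024, 1, 1, tzinfo=timezone.utc)
history = [
    ("1.0.0", now - timedelta(days=200)),
    ("1.1.0", now - timedelta(days=150)),
    ("1.2.0", now - timedelta(days=100)),
    ("1.3.0", now - timedelta(days=10)),
]
print(tags_to_delete(history, now, keep_latest=1, protected={"1.1.0"}))
# ['1.2.0', '1.0.0']
```

Deleting the selected tags only removes references; the storage is reclaimed when garbage collection next runs.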
Working with Image Layers
Understanding Layers
- Each filesystem-changing instruction in a Dockerfile creates a layer
  - RUN, COPY, and ADD each create a new layer
  - Other instructions (such as ENV, LABEL, or CMD) change only image metadata
  - Layers represent filesystem differences
  - Maximum of 127 layers per image (practical limit)
  - Older images may use different layering technology
- Layers are cached and reused
  - Unchanged layers use cache during builds
  - Cache invalidation occurs at first change
  - All subsequent layers must be rebuilt
  - Order instructions from least to most frequently changed
  - Use .dockerignore to prevent cache invalidation
- Efficient distribution relies on layer sharing
  - Common layers are stored once on disk
  - Images with same base share foundation layers
  - Registry transfers only missing layers
  - Improves pull performance and reduces storage
  - Enables efficient large-scale deployments
- Base images form common foundation layers
  - Choose appropriate base images for sharing
  - Consider using organization-specific base images
  - Standardizing on base images increases sharing
  - Update base images regularly for security patches
  - Track base image usage across organization
- Only changed layers are transferred during pulls
  - Docker client checks which layers it already has
  - Registry serves only missing layers
  - Layers are verified by content-addressable hashes
  - Network transfer is minimized
  - Consider squashing for production if sharing isn't beneficial
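Layer sharing works because a layer's identity is the hash of its content. The following sketch, with made-up layer contents, shows how a digest is derived and how a client can decide which layers of an image it still needs to download.

```python
import hashlib


def layer_digest(content: bytes) -> str:
    """Content-addressable identifier: identical bytes always get the same digest."""
    return "sha256:" + hashlib.sha256(content).hexdigest()


def layers_to_pull(image_layers: list, local_layers: set) -> list:
    """Layers of an image that are not yet present locally, in order."""
    return [digest for digest in image_layers if digest not in local_layers]


# Two hypothetical images built on the same base and dependency layers.
base = layer_digest(b"base os files")
deps = layer_digest(b"shared dependencies")
api = layer_digest(b"api service code")
worker = layer_digest(b"worker service code")

local = {base, deps, api}            # the api image was pulled earlier
worker_image = [base, deps, worker]
print(len(layers_to_pull(worker_image, local)))  # 1
```

Only the worker-specific layer is transferred, and the client can recompute the digest of what it received to detect corruption or tampering.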
Layer Management
Layer Optimization
- Group related commands to reduce layer count
- Use multi-stage builds to eliminate build-only layers
- Remove temporary files in the same layer they're created
- Consider squashing layers for production images
- Use appropriate base images to maximize layer sharing
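Instruction ordering affects how much of a build can be served from cache. The sketch below models the rule that the cache is invalidated at the first changed step and every later step is rebuilt. Steps are represented as hypothetical (instruction, input checksum) pairs, which simplifies how a real builder keys its cache.

```python
def steps_to_rebuild(previous: list, current: list) -> list:
    """Return the build steps that cannot be served from cache."""
    for index, step in enumerate(current):
        if index >= len(previous) or previous[index] != step:
            return current[index:]  # first miss invalidates everything after it
    return []


# Dependencies copied before application code: a code change rebuilds one step.
deps_first_old = [("FROM python:3.12-slim", ""), ("COPY requirements.txt", "r1"),
                  ("RUN pip install", ""), ("COPY src", "s1")]
deps_first_new = deps_first_old[:3] + [("COPY src", "s2")]

# Everything copied up front: the same code change also reruns the install step.
code_first_old = [("FROM python:3.12-slim", ""), ("COPY .", "all1"), ("RUN pip install", "")]
code_first_new = [("FROM python:3.12-slim", ""), ("COPY .", "all2"), ("RUN pip install", "")]

print(len(steps_to_rebuild(deps_first_old, deps_first_new)))  # 1
print(len(steps_to_rebuild(code_first_old, code_first_new)))  # 2
```

The second layout reinstalls dependencies on every source change, which is the cost that ordering instructions from least to most frequently changed avoids.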
Multi-Registry Operations
Working with multiple registries is common in enterprise environments, where you might have internal registries for development and external registries for distribution.
When working with multiple registries, consider:
- Implementing a registry of registries (federation)
- Setting up synchronization between registries
- Managing credentials securely
- Implementing consistent naming conventions across registries
- Tracking image provenance as images move between registries
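Moving an image between registries amounts to giving it a new name whose first component is the target registry, then copying it. As a sketch of the naming step only, the helper below rewrites references for a target registry while keeping the repository path and tag; the registry hosts are placeholders and the host-detection rule is a simplification.

```python
def retarget(reference: str, target_registry: str) -> str:
    """Rewrite an image reference so it points at `target_registry`."""
    first, slash, rest = reference.partition("/")
    # The first component is a registry host only if it looks like one.
    has_registry = bool(slash) and ("." in first or ":" in first or first == "localhost")
    path = rest if has_registry else reference
    return f"{target_registry.rstrip('/')}/{path}"


def mirror_plan(references: list, target_registry: str) -> list:
    """Pairs of (source, destination) names for a copy or retag-and-push job."""
    return [(ref, retarget(ref, target_registry)) for ref in references]


# Hypothetical internal and external registries.
plan = mirror_plan(
    ["registry.internal.example:5000/team/app:1.2.3", "team/worker:2.0.0"],
    "registry.public.example/acme",
)
for source, destination in plan:
    print(source, "->", destination)
```

Keeping the path and tag identical across registries is one way to satisfy the consistent-naming and provenance points above.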
Image Mirroring and Caching
Registry mirrors can improve pull performance, reduce external bandwidth usage, and provide redundancy. They work by caching images from upstream registries locally.
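To set up your own registry mirror with pull-through caching, run a registry whose configuration sets proxy.remoteurl to the upstream registry, then list the mirror under registry-mirrors in each client's daemon.json.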
Benefits of registry mirrors:
- Reduced external bandwidth consumption
- Faster image pulls for frequently used images
- Protection against upstream registry outages
- Rate limit avoidance (particularly for Docker Hub)
- Ability to audit and control image usage
For large organizations, consider:
- Hierarchical mirroring architecture
- Geographic distribution of mirrors
- Automatic health checking and failover
- Monitoring of cache hit/miss ratios
- Periodic purging of unused cached images
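The behaviour of a pull-through cache, including the hit/miss ratio worth monitoring, can be modelled in a few lines. This is a toy in-memory model with an invented class name and a simple TTL, not how a registry stores cached data.

```python
import time


class PullThroughCache:
    """Serve images from a local store, fetching from upstream on a miss."""

    def __init__(self, fetch_upstream, ttl_seconds=3600, clock=time.monotonic):
        self._fetch = fetch_upstream
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}  # reference -> (stored_at, content)
        self.hits = 0
        self.misses = 0

    def pull(self, reference: str) -> bytes:
        now = self._clock()
        entry = self._entries.get(reference)
        if entry is not None and now - entry[0] < self._ttl:
            self.hits += 1
            return entry[1]
        # Missing or expired: go upstream once and remember the result.
        self.misses += 1
        content = self._fetch(reference)
        self._entries[reference] = (now, content)
        return content

    def hit_ratio(self) -> float:
        total = self.hits + self.misses
        return self.hits / total if total else 0.0


upstream_calls = []

def fake_upstream(reference: str) -> bytes:
    """Stand-in for a pull from the upstream registry."""
    upstream_calls.append(reference)
    return b"image:" + reference.encode()


cache = PullThroughCache(fake_upstream, ttl_seconds=60)
cache.pull("alpine:3.20")
cache.pull("alpine:3.20")
print(len(upstream_calls), cache.hit_ratio())  # 1 0.5
```

Each hit is a pull that never reaches the upstream registry, which is where the bandwidth savings and rate limit relief come from.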
Advanced Topics
Content Trust
- Sign and verify images
  - Implement Notary for signature management
  - Configure separate keys for different environments
  - Document key management procedures
  - Implement hardware security modules for key storage
  - Set up automated signing in CI/CD pipelines
- Ensure image authenticity
  - Verify publisher identity
  - Confirm content hasn't been modified
  - Check signature freshness and expiry
  - Implement chain of trust validation
  - Integrate with secure software supply chain
- Prevent tampering
  - Store signatures separately from images
  - Use threshold signing for critical images
  - Implement key rotation procedures
  - Monitor for unauthorized signature attempts
  - Log all verification activities
  - Enable content trust on clients by setting DOCKER_CONTENT_TRUST=1
- Verify before deployment
  - Add verification steps in deployment pipelines
  - Block deployment of unsigned images
  - Implement policy controllers in Kubernetes
  - Document verification requirements
  - Add signature metadata to deployment artifacts
Registry API
- RESTful API
  - Well-documented endpoints
  - Standard HTTP status codes
  - JSON response format
  - Authentication with Bearer tokens
  - Rate limiting and throttling controls
- Image manipulation
  - Push and pull operations
  - Layer uploads and downloads
  - Cross-repository blob mounting
  - Image manifest management
  - Content addressable blob storage
- Repository management
  - List and search repositories
  - Tag management operations
  - Access control configuration
  - Repository metadata handling
  - Namespace organization
- Catalog operations
  - Paginated repository listing
  - Search and filtering capabilities
  - Metadata aggregation
  - Tag enumeration
  - Cross-repository operations
- Webhook integration
  - Event-based notifications
  - Customizable event filtering
  - Delivery retry mechanisms
  - Authentication for webhook endpoints
  - Webhook delivery logging and auditing
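On the receiving end of the webhook integration, a handler typically filters the notification envelope for the events it cares about. The sketch below extracts tagged image pushes from an envelope shaped like the registry's notification payload (an `events` list whose items carry an `action` and a `target`). The sample payload is invented and heavily trimmed, and treating "has a tag" as "is a manifest push" is an assumption of this example.

```python
import json


def pushed_images(envelope_json: str) -> list:
    """Return 'repository:tag@digest' for each tagged push in a notification."""
    envelope = json.loads(envelope_json)
    images = []
    for event in envelope.get("events", []):
        target = event.get("target", {})
        # Layer uploads also produce push events; only tagged pushes are kept.
        if event.get("action") == "push" and "tag" in target:
            images.append(f"{target['repository']}:{target['tag']}@{target['digest']}")
    return images


# Trimmed, hypothetical payload for illustration.
sample = json.dumps({"events": [
    {"action": "push", "target": {"repository": "team/app", "digest": "sha256:aaa", "tag": "1.2.3"}},
    {"action": "push", "target": {"repository": "team/app", "digest": "sha256:bbb"}},
    {"action": "pull", "target": {"repository": "team/app", "digest": "sha256:aaa", "tag": "1.2.3"}},
]})
print(pushed_images(sample))  # ['team/app:1.2.3@sha256:aaa']
```

A handler like this could trigger a vulnerability scan or a deployment whenever a new tag arrives.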
Garbage Collection
Advanced Storage Management
- Implement storage quotas per repository/namespace
- Configure automated pruning policies
- Set up storage analytics and monitoring
- Implement data compression for storage efficiency
- Configure cross-region replication for disaster recovery
Troubleshooting
Common registry issues and solutions:
- Authentication failures
  - Check credentials in ~/.docker/config.json
  - Verify token expiration and refresh if needed
  - Confirm user has appropriate permissions
  - Inspect registry logs for auth errors
  - Test authentication directly with curl against the registry's /v2/ endpoint (200 means the credentials were accepted, 401 means they were rejected)
- Network connectivity
  - Verify firewall and proxy settings
  - Check DNS resolution for registry hostname
  - Test basic connectivity with ping/telnet
  - Inspect registry logs for connection errors
  - Check Docker daemon network configuration
  - Verify network policies allow registry traffic
  - Test with curl while bypassing the proxy (curl's --noproxy option)
- Certificate issues
  - Ensure proper TLS configuration
  - Verify certificate validity period
  - Check certificate chain is complete
  - Confirm hostname matches certificate CN/SAN
  - Add CA certificates to trusted store
  - For self-signed certs, place the CA certificate at /etc/docker/certs.d/<registry-host:port>/ca.crt on the Docker client
- Storage problems
  - Check disk space and quotas
  - Verify filesystem permissions
  - Inspect storage driver logs
  - Run filesystem checks on storage volumes
  - Monitor I/O performance metrics
  - Consider storage backend scalability
  - Run diagnostic commands such as df -h and du -sh against the registry's storage volume
- Rate limiting
  - Be aware of registry pull limits (especially Docker Hub)
  - Implement authenticated pulls to increase limits
  - Configure registry mirrors to cache images
  - Use pull-through cache for frequently used images
  - Distribute pulls across multiple accounts if necessary
  - Monitor the rate limit headers (ratelimit-limit and ratelimit-remaining on Docker Hub responses)
- Layer availability
  - Ensure all layers are accessible
  - Check for incomplete uploads or transfers
  - Verify storage backend integrity
  - Run registry garbage collection with dry-run
  - Re-push images if layers are corrupted
  - Restore from backups if necessary
  - Test layer download directly by requesting /v2/<name>/blobs/<digest> from the registry
- Performance issues
  - Monitor registry resource usage
  - Configure appropriate cache settings
  - Implement efficient storage backends
  - Scale registry horizontally if needed
  - Use CDN for large-scale distribution
  - Analyze registry metrics, for example through the Prometheus endpoint the registry can expose on its debug address
- Registry corruption
  - Back up registry data before repairs
  - Verify filesystem integrity
  - Re-index corrupted repositories
  - Consider rebuilding registry from scratch
  - Implement regular integrity checks
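For the rate limiting item above, Docker Hub reports the current allowance in response headers whose values look like `100;w=21600` (a count followed by a window in seconds). The sketch below parses that format and computes how much of the allowance remains; the helper names are illustrative, and fetching the headers is left out.

```python
def parse_rate_limit(value: str):
    """Parse a header value such as '100;w=21600' into (count, window_seconds)."""
    count, _, params = value.partition(";")
    window = None
    for param in params.split(";"):
        key, _, raw = param.strip().partition("=")
        if key == "w" and raw.isdigit():
            window = int(raw)
    return int(count.strip()), window


def remaining_fraction(headers: dict) -> float:
    """Share of the pull allowance still available, between 0.0 and 1.0."""
    limit, _ = parse_rate_limit(headers["ratelimit-limit"])
    remaining, _ = parse_rate_limit(headers["ratelimit-remaining"])
    return remaining / limit if limit else 0.0


# Example header values in the documented format.
headers = {"ratelimit-limit": "100;w=21600", "ratelimit-remaining": "76;w=21600"}
print(parse_rate_limit(headers["ratelimit-limit"]))  # (100, 21600)
print(remaining_fraction(headers))                   # 0.76
```

A CI pipeline could alert or switch to a mirror when the remaining fraction drops below a chosen threshold.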