Docker Service Discovery & DNS

Comprehensive guide to implementing service discovery and DNS solutions in Docker environments for efficient container networking

Introduction to Service Discovery in Docker

Service discovery provides the critical foundation for container communication in dynamic Docker environments. As containers start, stop, and scale, their network locations change constantly, requiring automated mechanisms to locate services:

  • Dynamic infrastructure: Enable containers to find each other without hardcoded addresses
  • Automatic updates: Reflect container lifecycle changes in real-time
  • Distributed coordination: Maintain service registries across multi-host environments
  • Load balancing: Distribute traffic across multiple instances of the same service
  • Health checking: Route traffic only to healthy container instances

This guide explores the concepts, tools, and implementation patterns for effective service discovery and DNS management in Docker environments, with practical examples to help you build reliable container networking solutions.

Service Discovery Fundamentals

Service Registry Patterns

At the core of service discovery is the service registry—a database of available service instances and their locations:

  1. Self-registration pattern: Services register themselves with the registry
  2. Third-party registration: External agent detects services and registers them
  3. Client-side discovery: Clients query the registry and choose service instances
  4. Server-side discovery: Load balancer queries registry and routes client requests

The choice of pattern impacts system complexity, resilience, and performance characteristics; the sketch below illustrates the client-side variant.
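
As a concrete illustration of client-side discovery, the sketch below queries a registry's HTTP API and picks an instance to call. It assumes a Consul-style registry reachable at registry:8500 and a service named api; the catalog endpoint and its ServiceAddress/ServicePort fields are Consul's real API, but the hostnames are placeholders:

# Client-side discovery sketch: ask the registry, pick an instance, call it
# (assumes a Consul-style registry at registry:8500 and jq installed)
INSTANCE=$(curl -s http://registry:8500/v1/catalog/service/api \
  | jq -r '.[0] | "\(.ServiceAddress):\(.ServicePort)"')
curl "http://${INSTANCE}/"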

DNS-Based Discovery

DNS provides a familiar and standardized approach to service discovery in Docker environments:

# Run a container with a custom hostname
docker run -d --name web --hostname web.local nginx

# Verify the hostname resolves inside the container (the nginx image
# ships without ping, so use getent from its Debian base instead)
docker exec -it web getent hosts web.local

Docker's built-in DNS server automatically resolves container names within the same network, providing basic service discovery capabilities without additional components.

Docker Networking and Service Discovery

Docker DNS Resolution

Docker's embedded DNS resolver enables containers to locate each other by name:

Bridge Network Discovery

Default bridge networks have limited discovery capabilities:

# Default bridge - no DNS resolution between containers
# (alpine is used here because it includes ping, unlike nginx)
docker run -d --name container1 alpine sleep 3600
docker run -d --name container2 alpine sleep 3600

# This will fail - container names aren't resolved on the default bridge
docker exec -it container1 ping -c 2 container2

User-defined bridge networks enable automatic service discovery:

# Create user-defined bridge network
docker network create app-network

# Connect existing containers to the network
docker network connect app-network container1
docker network connect app-network container2

# Now DNS resolution works
docker exec -it container1 ping -c 2 container2

Multi-Host Networking

Docker Swarm mode provides built-in service discovery across multiple hosts:

# Initialize a swarm
docker swarm init --advertise-addr 192.168.1.10

# Create an overlay network (swarm services require the overlay driver)
docker network create --driver overlay --attachable app-network

# Deploy a service across the swarm
docker service create --name api --replicas 3 --network app-network myapp-api:latest

# Service is accessible by name from any container attached to app-network
docker exec -it client curl http://api:8080/

Docker Compose Service Discovery

Automatic DNS Resolution

Docker Compose configures DNS resolution between services defined in the same compose file:

# docker-compose.yml
version: '3'
services:
  web:
    image: nginx:latest
    depends_on:
      - api
  
  api:
    image: myapp-api:latest
    depends_on:
      - database
  
  database:
    image: postgres:14
    environment:
      POSTGRES_PASSWORD: example

Services can reach each other by service name:

# Inside the web container
curl http://api:8080/

# Inside the api container
psql -h database -U postgres

Network Configuration

Compose allows fine-tuning of service discovery through network configuration:

# docker-compose.yml with custom networking
version: '3'
services:
  web:
    image: nginx:latest
    networks:
      - frontend
      - backend
  
  api:
    image: myapp-api:latest
    networks:
      - backend
  
  database:
    image: postgres:14
    networks:
      - backend
    environment:
      POSTGRES_PASSWORD: example

networks:
  frontend:
  backend:
    internal: true  # No external connectivity

This configuration:

  1. Isolates the database from external access
  2. Allows the web service to communicate with both frontend and backend
  3. Enables the API service to access the database on the backend network (verified in the sketch below)
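
A quick way to confirm the resulting name resolution from inside the running containers (a sketch: docker compose exec targets services by name, and getent assumes a Debian-style image such as nginx):

# web shares the backend network with api, so the name resolves
docker compose exec web getent hosts api

# api shares the backend network with database
docker compose exec api getent hosts database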

External Service Discovery Tools

Consul

Consul provides a feature-rich service discovery solution with a distributed key-value store:

# docker-compose.yml for Consul
version: '3'
services:
  consul:
    image: hashicorp/consul:latest
    ports:
      - "8500:8500"
    command: agent -server -bootstrap-expect=1 -ui -client=0.0.0.0
    volumes:
      - consul-data:/consul/data

  service-registrator:
    image: gliderlabs/registrator:latest
    depends_on:
      - consul
    volumes:
      - /var/run/docker.sock:/tmp/docker.sock
    command: -internal consul://consul:8500

volumes:
  consul-data:

Service registration using Consul's HTTP API:

# Register a service manually
curl -X PUT -d '{
  "ID": "api-1",
  "Name": "api",
  "Address": "172.17.0.3",
  "Port": 8080,
  "Check": {
    "HTTP": "http://172.17.0.3:8080/health",
    "Interval": "10s"
  }
}' http://localhost:8500/v1/agent/service/register
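
Both of the following use real Consul endpoints and are useful companions to manual registration: one confirms which instances are passing their health checks, the other removes an instance when it shuts down:

# List only instances whose health checks are passing
curl "http://localhost:8500/v1/health/service/api?passing=true"

# Deregister the instance by its ID
curl -X PUT http://localhost:8500/v1/agent/service/deregister/api-1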

etcd

etcd provides a distributed key-value store suitable for service discovery:

# docker-compose.yml for etcd
version: '3'
services:
  etcd:
    image: bitnami/etcd:latest
    environment:
      - ALLOW_NONE_AUTHENTICATION=yes
      - ETCD_ADVERTISE_CLIENT_URLS=http://etcd:2379
    ports:
      - "2379:2379"
    volumes:
      - etcd-data:/bitnami/etcd

volumes:
  etcd-data:

Service registration with etcd:

# Register a service endpoint
docker exec -it etcd etcdctl put /services/api/node1 '{"host": "172.17.0.3", "port": 8080}'

# Query service endpoints
docker exec -it etcd etcdctl get --prefix /services/api
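
Registrations can also be tied to a lease so stale entries expire automatically when a service stops refreshing them, and clients can watch the prefix to react to changes; both are standard etcdctl commands (replace <lease-id> with the ID printed by lease grant):

# Grant a 30-second lease; the command prints a lease ID
docker exec -it etcd etcdctl lease grant 30

# Attach a registration to the lease so the key expires with it
docker exec -it etcd etcdctl put --lease=<lease-id> /services/api/node2 '{"host": "172.17.0.4", "port": 8080}'

# Watch for registrations, updates, and expirations in real time
docker exec -it etcd etcdctl watch --prefix /services/api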

CoreDNS

CoreDNS can extend Docker's DNS capabilities with custom plugins and configurations:

# docker-compose.yml for CoreDNS
version: '3'
services:
  coredns:
    image: coredns/coredns:latest
    ports:
      - "53:53/udp"
    volumes:
      - ./Corefile:/Corefile
    command: -conf /Corefile

With a corresponding Corefile:

.:53 {
    errors
    health
    ready
    hosts {
        172.17.0.3 api.service.local
        172.17.0.4 redis.service.local
        172.17.0.5 postgres.service.local
        fallthrough
    }
    forward . 8.8.8.8
    cache 30
    loop
    reload
}
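
With the container running, resolution can be tested from the host against the published port (the names come from the hosts block above):

# Resolve a custom service name through CoreDNS
dig @127.0.0.1 api.service.local +short

# Names outside the hosts block fall through to the forwarder
dig @127.0.0.1 example.com +short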

Docker Swarm Service Discovery

Mesh Networking

Docker Swarm implements a mesh network that automatically provides service discovery for swarm services:

# Create an overlay network
docker network create --driver overlay --attachable services

# Deploy a service on the network
docker service create \
  --name api \
  --network services \
  --replicas 3 \
  myapp-api:latest

All containers can reach the service by name (api), and Docker handles load balancing across the three replicas.
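
Swarm's DNS also exposes a special tasks.<service> name that bypasses the VIP and returns individual task IPs, which is handy for debugging. From any container attached to the services network (assuming its image includes nslookup):

# "api" resolves to the service's virtual IP
nslookup api

# "tasks.api" returns one A record per running replica
nslookup tasks.api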

Virtual IP (VIP) Mode

Swarm assigns a Virtual IP to each service for transparent load balancing:

# Inspect the service to see its VIP
docker service inspect --format '{{json .Endpoint.VirtualIPs}}' api

Client connections to the service name resolve to the VIP, which distributes traffic across all service instances. This requires no client-side configuration for load balancing.

DNS Round Robin Mode

Swarm supports DNS Round Robin as an alternative to VIP mode:

# Create a service with DNS Round Robin
docker service create \
  --name api \
  --network services \
  --endpoint-mode dnsrr \
  --replicas 3 \
  myapp-api:latest

With DNS Round Robin:

  1. DNS queries for api return multiple A records (shown in the lookup below)
  2. Clients must handle connection management across multiple IPs
  3. No extra network hop is required (unlike VIP mode)
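
You can observe the multiple records from any container attached to the same network (again assuming nslookup is available in its image):

# In dnsrr mode the service name itself resolves to every healthy replica
nslookup api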

Custom DNS Solutions

Service Meshes

Service meshes like Istio extend basic DNS-based discovery with advanced capabilities:

# Deploy Istio with integrated DNS
istioctl install --set profile=demo

# Deploy a service with service discovery enabled
kubectl apply -f - <<EOF
apiVersion: v1
kind: Service
metadata:
  name: api
  labels:
    app: api
spec:
  ports:
  - port: 80
    name: http
  selector:
    app: api
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api
spec:
  replicas: 3
  selector:
    matchLabels:
      app: api
  template:
    metadata:
      labels:
        app: api
    spec:
      containers:
      - name: api
        image: myapp-api:latest
        ports:
        - containerPort: 80
EOF

Service meshes enhance service discovery with:

  1. Advanced load balancing algorithms
  2. Circuit breaking and fault tolerance
  3. Traffic shifting and request routing (sketched after this list)
  4. End-to-end encryption
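
As a sketch of traffic shifting, an Istio VirtualService can split requests between two versions of the api service. VirtualService is a real Istio resource, but the v1/v2 subsets are assumptions that would need a matching DestinationRule to define them:

apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: api
spec:
  hosts:
  - api
  http:
  - route:
    - destination:
        host: api
        subset: v1
      weight: 90    # 90% of traffic to the stable version
    - destination:
        host: api
        subset: v2
      weight: 10    # 10% canary traffic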

Custom DNS Servers

In complex scenarios, dedicated DNS servers can enhance Docker's built-in discovery:

# docker-compose.yml for Dnsmasq
version: '3'
services:
  dnsmasq:
    image: jpillora/dnsmasq
    ports:
      - "53:53/udp"
    environment:
      - HTTP_USER=admin
      - HTTP_PASS=admin
    volumes:
      - ./dnsmasq.conf:/etc/dnsmasq.conf

With a corresponding dnsmasq.conf:

# dnsmasq.conf
address=/api.local/172.17.0.3
address=/database.local/172.17.0.4
address=/cache.local/172.17.0.5

Health Checking and Circuit Breaking

Service Health Checks

Effective service discovery requires health checking to avoid routing traffic to unhealthy instances:

# Docker Compose with health checks
version: '3'
services:
  api:
    image: myapp-api:latest
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8080/health"]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 40s
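
Once a health check is defined, Docker tracks the container's health state, which you can read back directly (replace api-container with your container's name or ID):

# Read the health status reported by the health check
docker inspect --format '{{.State.Health.Status}}' api-container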

Docker Swarm integrates health checks with service discovery:

# Create a service with health checking
docker service create \
  --name api \
  --health-cmd "curl -f http://localhost:8080/health || exit 1" \
  --health-interval 30s \
  --replicas 3 \
  myapp-api:latest

Client-Side Circuit Breaking

Circuit breakers prevent cascading failures when services become unavailable:

// Example of client-side circuit breaking in Go using the
// github.com/eapache/go-resiliency breaker package
package main

import (
    "fmt"
    "net/http"
    "time"

    "github.com/eapache/go-resiliency/breaker"
)

func main() {
    // Create a circuit breaker that trips after 5 consecutive failures,
    // closes after 1 success, and waits 10 seconds before retrying
    cb := breaker.New(5, 1, 10*time.Second)

    err := cb.Run(func() error {
        // Call the service by its discovered name; Docker DNS resolves "api"
        resp, err := http.Get("http://api:8080/endpoint")
        if err != nil {
            return err
        }
        defer resp.Body.Close()
        if resp.StatusCode >= 500 {
            return fmt.Errorf("server error: %d", resp.StatusCode)
        }
        return nil
    })

    if err == breaker.ErrBreakerOpen {
        // Circuit is open: fail fast without calling the service
    } else if err != nil {
        // Handle the request error
    }
}

Security Considerations

Network Segmentation

Service discovery should be implemented with security boundaries:

# docker-compose.yml with network segmentation
version: '3'
services:
  frontend:
    image: myapp-frontend:latest
    networks:
      - frontend
      - services
  
  api:
    image: myapp-api:latest
    networks:
      - services
      - data
  
  database:
    image: postgres:14
    networks:
      - data
    environment:
      POSTGRES_PASSWORD: example

networks:
  frontend:
    driver: bridge
  services:
    driver: bridge
    internal: false
  data:
    driver: bridge
    internal: true  # No external access

This configuration:

  1. Isolates the database on an internal network (checked in the sketch below)
  2. Prevents direct frontend access to the database
  3. Allows the API to mediate between frontend and database
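
To confirm the boundary, a name lookup from the frontend container should fail for the database while the api container succeeds (a sketch: docker compose exec targets services by name, and getent assumes Debian-style images):

# frontend shares no network with database, so the name does not resolve
docker compose exec frontend getent hosts database

# api sits on the data network, so resolution succeeds
docker compose exec api getent hosts database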

Service Discovery Authentication

Secure your service discovery system with authentication:

# docker-compose.yml for Consul with ACLs
version: '3'
services:
  consul:
    image: hashicorp/consul:latest
    ports:
      - "8500:8500"
    command: agent -server -bootstrap-expect=1 -ui -client=0.0.0.0
    environment:
      - CONSUL_HTTP_TOKEN=your-master-token
    volumes:
      - ./consul-acl.json:/consul/config/acl.json  # enables ACLs via this config file

With a corresponding ACL configuration in consul-acl.json:

{
  "acl": {
    "enabled": true,
    "default_policy": "deny",
    "down_policy": "extend-cache",
    "tokens": {
      "master": "your-master-token"
    }
  }
}

TLS for Service Communication

Secure service-to-service communication with TLS:

# docker-compose.yml with TLS configuration
version: '3'
services:
  api:
    image: myapp-api:latest
    volumes:
      - ./certs:/certs
    environment:
      - SSL_CERT_FILE=/certs/server.crt
      - SSL_KEY_FILE=/certs/server.key
      - SSL_CA_FILE=/certs/ca.crt
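
For local testing, the certificates referenced above can be generated with OpenSSL (a development-only sketch; production certificates should come from a real CA):

# Create a throwaway CA, then issue a server certificate signed by it
openssl req -x509 -newkey rsa:4096 -nodes -days 365 \
  -keyout certs/ca.key -out certs/ca.crt -subj "/CN=local-ca"
openssl req -newkey rsa:4096 -nodes \
  -keyout certs/server.key -out certs/server.csr -subj "/CN=api"
openssl x509 -req -in certs/server.csr -CA certs/ca.crt -CAkey certs/ca.key \
  -CAcreateserial -out certs/server.crt -days 365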

Monitoring and Troubleshooting

Service Discovery Monitoring

Monitor your service discovery system to ensure reliability:

# Check Consul cluster health
curl http://localhost:8500/v1/health/service/consul

# Verify service registration
curl http://localhost:8500/v1/catalog/service/api

# Check DNS resolution (requires publishing Consul's DNS port 8600)
dig @127.0.0.1 -p 8600 api.service.consul SRV

Common Troubleshooting Techniques

When service discovery issues arise:

  1. Verify network connectivity:
    docker exec -it container1 ping container2
    
  2. Check DNS resolution:
    docker exec -it container1 nslookup service-name
    
  3. Inspect network configuration:
    docker network inspect app-network
    
  4. View service discovery logs:
    docker logs consul
    
  5. Test with direct IP addressing:
    docker exec -it container1 curl http://172.17.0.3:8080/
    

Best Practices

Design Patterns

Effective service discovery implementations follow these patterns:

  1. Service abstraction: Clients should interact with service names, not instances
  2. Self-healing: Registration and deregistration should be automatic
  3. Environment-aware: Configuration should adapt to development, testing, and production
  4. Degradation tolerant: Systems should handle service discovery outages gracefully
  5. Cached lookups: Clients should cache discovery results to improve performance

Production Recommendations

For production environments:

  1. Distributed registry: Deploy service discovery with high availability
  2. Automatic sync: Keep service registries in sync across data centers
  3. TTL-based cleanup: Automatically remove stale service registrations
  4. Monitoring integration: Alert on service discovery issues
  5. Documentation: Maintain clear documentation of service endpoints and dependencies

Conclusion

Effective service discovery and DNS management form the backbone of reliable container networking in Docker environments. By implementing the patterns and tools described in this guide, you can build dynamic, scalable systems where containers can locate and communicate with each other seamlessly, even as they scale across multiple hosts and environments.

Whether you're using Docker's built-in DNS capabilities, Docker Compose networks, Swarm mode service discovery, or external tools like Consul and etcd, the principles remain the same: abstract service locations, automate registration and discovery, and build resilient systems that can adapt to the dynamic nature of containerized environments.