Docker Compose

Starting services

docker-compose up

Remote

docker -H=remote-docker-engine:2375
docker -H=10.123.2.1:2375 run nginx
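
The same target can be supplied through the DOCKER_HOST environment variable, which both docker and docker-compose read. A minimal sketch, assuming the engine above listens on plain TCP port 2375 (unencrypted, so only reasonable on a trusted network); with_remote_docker is a hypothetical helper name, not part of Docker:

```shell
# Hypothetical helper: run one command against a remote Docker engine.
# Usage: with_remote_docker HOST PORT COMMAND [ARGS...]
with_remote_docker() (
    host=$1
    port=$2
    shift 2
    # DOCKER_HOST is set only for this one command.
    DOCKER_HOST="tcp://${host}:${port}" "$@"
)

# Example (needs a reachable engine, so it is left commented out):
# with_remote_docker 10.123.2.1 2375 docker run nginx
```

Because the function body runs in a subshell, the variable does not leak into the calling shell.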

Compose file versions

Version 1

Services are defined at the top level of the file and connected with links:

redis:
    image: redis
db:
    image: postgres:9.4
vote:
    image: voting-app
    ports:
        - 5000:80
    links:
        - redis
result:
    image: result-app
    ports:
        - 5001:80
    links:
        - db
worker:
    image: worker
    links:
        - redis
        - db

Version 2

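Introduces the top-level services and networks keys. Services attached to the same network reach each other by service name, so links are no longer required: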

version: "2"
services:
    redis:
        image: redis
        networks:
            - back-end
    db:
        image: postgres:9.4
        networks:
            - back-end
    vote:
        image: voting-app
        ports:
            - 5000:80
        depends_on:
            - redis
        networks:
            - front-end
            - back-end
    result:
        image: result-app
        networks:
            - front-end
            - back-end

networks:
    front-end:
    back-end:

Version 3

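Keeps the version 2 layout and adds support for Docker Swarm. Services without a networks entry join the project's default network: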

version: "3"
services:
    redis:
        image: redis
    db:
        image: postgres:9.4
    vote:
        image: voting-app
        ports:
            - 5000:80

Deploy

The traditional way, one docker run per container:

docker run user/simple-webapp
docker run mongodb
docker run redis:alpine
docker run ansible

The same stack described in a Compose file:

services:
    web:
        image: "user/simple-webapp"
    database:
        image: "mongodb"
    messaging:
        image: "redis:alpine"
    orchestration:
        image: "ansible"

Connecting containers the old way (--link)

docker run -d --name=redis redis
docker run -d --name=db postgres
docker run -d --name=vote -p 5000:80 --link redis:redis voting-app
docker run -d --name=result -p 5001:80 --link db:db result-app
docker run -d --name=worker --link redis:redis --link db:db worker

Compose

redis:
    image: redis
db:
    image: postgres:9.4
vote:
    build: ./vote
    ports:
        - 5000:80
    links:
        - redis
result:
    build: ./result
    ports:
        - 5001:80
    links:
        - db
worker:
    build: ./worker
    links:
        - redis
        - db

Docker Compose Overview

Commands

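Start and stop

# Start all services in the background
docker-compose up -d

# Create and start one service (and its dependencies)
docker-compose up -d service_name

# Stop and remove all containers and the project network
docker-compose down

# Stop one service (the others keep running)
docker-compose stop service_name

# Start the existing, stopped container of one service
docker-compose start service_name

# Restart one service
docker-compose restart service_name

# Pause / unpause a service
docker-compose pause service_name
docker-compose unpause service_name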

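Container management

# Remove the stopped containers of a service (asks for confirmation)
docker-compose rm service_name

# Stop the service if it is running, then remove it without asking
# (-f alone only skips the confirmation prompt)
docker-compose rm -s -f service_name

# Rebuild the image and restart one service
docker-compose up -d --build service_name

# Recreate the container even if its configuration and image are unchanged
# (this does not affect the build cache)
docker-compose up -d --force-recreate service_name

# Remove all services and their volumes
docker-compose down -v

# Remove all services, volumes, and images
docker-compose down -v --rmi all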

Build & Image Management

# Build all services
docker-compose build

# Build one service
docker-compose build service_name

# Rebuild without the build cache
docker-compose build --no-cache

# Pull images
docker-compose pull

# Pull the image of one service
docker-compose pull service_name

# Push the image of one service
docker-compose push service_name

Scaling & Resource Management

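# Scale services to a given number of instances
docker-compose up -d --scale web=3 --scale worker=2

# Show the status of all services
docker-compose ps

# Show the status of one service
docker-compose ps service_name

# List the processes running in each container
docker-compose top

# Show live CPU, memory, and I/O usage per container
docker stats $(docker-compose ps -q)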

Logs & Debugging

# Show the logs of all services
docker-compose logs

# Show the logs of one service
docker-compose logs service_name

# Follow the logs in real time
docker-compose logs -f service_name

# Show the last 100 log lines
docker-compose logs --tail=100 service_name

# Show logs with timestamps
docker-compose logs -t service_name

Container Interaction

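# Open a shell in a running container
docker-compose exec service_name bash

# Run a one-off command in a new container
docker-compose run service_name python manage.py migrate

# Run a one-off command without starting the services it depends on
docker-compose run --no-deps service_name npm test

# Run a one-off command in the background
docker-compose run -d service_name python worker.py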

Configuration & Validation

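# Validate the configuration; prints nothing unless there is an error
docker-compose config -q

# Print the final configuration (files merged, variables substituted)
docker-compose config

# List all service names
docker-compose config --services

# Merge several Compose files (later files override earlier ones)
docker-compose -f docker-compose.yml -f docker-compose.prod.yml config

# Use a specific environment file
docker-compose --env-file .env.prod up -d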
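
Validation of this kind checks structure, but two services publishing the same host port typically only fail once the containers start. A rough pre-flight sketch, assuming ports are written in the short HOST:CONTAINER form; dup_host_ports is a hypothetical helper, not a Compose command:

```shell
# Hypothetical helper: print host ports that are published more than once.
# Reads a Compose file on stdin. Only the short HOST:CONTAINER form
# (unquoted or double-quoted) is recognized; IP-prefixed, single-quoted,
# and long-syntax entries are ignored.
dup_host_ports() {
    sed -n 's/^[[:space:]]*-[[:space:]]*"\{0,1\}\([0-9][0-9]*\):[0-9][0-9]*"\{0,1\}.*$/\1/p' |
        sort -n |
        uniq -d
}

# Example:
# dup_host_ports < docker-compose.yml
```

An empty result means no duplicate was found among the entries the pattern understands; it is not a guarantee that every port is free on the host.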

Environment Management

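# Development
docker-compose up -d

# Production
docker-compose -f docker-compose.yml -f docker-compose.prod.yml up -d

# Test: run in the foreground and stop everything as soon as one container exits
# (--abort-on-container-exit cannot be combined with -d)
docker-compose -f docker-compose.test.yml up --abort-on-container-exit

# Clean up the development environment
# (docker system prune affects all of Docker on this host, not only this project)
docker-compose down -v --remove-orphans
docker system prune -f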
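
The per-environment -f combinations can be kept in one place. A small sketch using the file names from the commands above; compose_file_args and the environment names dev, prod, and test are hypothetical:

```shell
# Hypothetical helper: map an environment name to the -f arguments.
compose_file_args() {
    case "$1" in
        dev)  echo "-f docker-compose.yml" ;;
        prod) echo "-f docker-compose.yml -f docker-compose.prod.yml" ;;
        test) echo "-f docker-compose.test.yml" ;;
        *)    echo "unknown environment: $1" >&2; return 1 ;;
    esac
}

# Example (the unquoted substitution is split into words on purpose):
# docker-compose $(compose_file_args prod) up -d
```

Note that naming docker-compose.yml explicitly for dev skips the automatic loading of docker-compose.override.yml, which a bare docker-compose up would pick up if that file exists.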

Health Check & Monitoring

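# List unhealthy containers
# (uses docker ps; the docker-compose ps filter only understands status)
docker ps --filter "health=unhealthy"

# Wait for the database before continuing
# (assumes wait-for-it.sh is present in the service image)
docker-compose up -d
docker-compose exec service_name wait-for-it.sh db:5432 -- echo "DB is ready"

# Show the network interfaces inside a container
docker-compose exec service_name ip addr show

# Check that a port is reachable (telnet must be installed in the image)
docker-compose exec service_name telnet redis 6379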
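
A plain depends_on list only orders container start; it does not wait until the dependency is ready. When the image has no wait script, the polling can be done from the host. A minimal sketch in POSIX sh; wait_for is a hypothetical helper, and pg_isready is the same probe used by the postgres healthcheck in the Demo section:

```shell
# Hypothetical helper: retry a command until it succeeds.
# Usage: wait_for ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Exits 0 on the first success, 1 after ATTEMPTS failures.
wait_for() (
    attempts=$1
    delay=$2
    shift 2
    i=1
    while :; do
        if "$@" >/dev/null 2>&1; then
            exit 0
        fi
        if [ "$i" -ge "$attempts" ]; then
            exit 1
        fi
        i=$((i + 1))
        sleep "$delay"
    done
)

# Example (needs the running stack, so it is left commented out):
# wait_for 30 2 docker-compose exec -T postgres pg_isready -U postgres
```

The probe's output is discarded, so only its exit status decides whether polling continues.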

Demo

version: '3.8'

services:
  # Frontend - React + Vite Development Server
  frontend:
    build:
      context: ./frontend
      dockerfile: Dockerfile.dev
    ports:
      - "3000:3000"
    volumes:
      - ./frontend:/app
      - /app/node_modules
    environment:
      - VITE_API_URL=http://localhost:8080
      - VITE_PYTHON_API_URL=http://localhost:8001
      - VITE_NODE_API_URL=http://localhost:8002
      - VITE_TRACING_ENDPOINT=http://tempo:3200
    depends_on:
      - go-server
      - python-server
      - nodejs-server
    networks:
      - app-network
    restart: unless-stopped
    logging:
      driver: "json-file"
      options:
        max-size: "10m"
        max-file: "3"
        labels: "service=frontend,environment=development"

  # Go Backend Server with OpenTelemetry
  go-server:
    build:
      context: ./go-server
      dockerfile: Dockerfile
    ports:
      - "8080:8080"
    environment:
      - GIN_MODE=debug
      - REDIS_URL=redis:6379
      - REDIS_PASSWORD=redis123
      - DB_HOST=postgres
      - DB_PORT=5432
      - DB_NAME=goapp
      - DB_USER=postgres
      - DB_PASSWORD=password123
      - OTEL_EXPORTER_OTLP_ENDPOINT=http://tempo:4317
      - OTEL_RESOURCE_ATTRIBUTES=service.name=go-server,service.version=1.0.0
      - JAEGER_ENDPOINT=http://tempo:14268/api/traces
    volumes:
      - ./go-server:/app
    depends_on:
      - redis
      - postgres
      - tempo
    networks:
      - app-network
    restart: unless-stopped
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8080/health"]
      interval: 30s
      timeout: 10s
      retries: 3
    logging:
      driver: "json-file"
      options:
        max-size: "10m"
        max-file: "3"
        labels: "service=go-server,environment=development"

  # Python FastAPI Server with OpenTelemetry
  python-server:
    build:
      context: ./python-server
      dockerfile: Dockerfile
    ports:
      - "8001:8001"
    environment:
      - PYTHONPATH=/app
      - REDIS_URL=redis://redis:6379
      - REDIS_PASSWORD=redis123
      - DATABASE_URL=postgresql://postgres:password123@postgres:5432/pythonapp
      - OTEL_EXPORTER_OTLP_ENDPOINT=http://tempo:4317
      - OTEL_RESOURCE_ATTRIBUTES=service.name=python-server,service.version=1.0.0
      - OTEL_PYTHON_LOGGING_AUTO_INSTRUMENTATION_ENABLED=true
    volumes:
      - ./python-server:/app
    depends_on:
      - redis
      - postgres
      - tempo
    networks:
      - app-network
    restart: unless-stopped
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8001/docs"]
      interval: 30s
      timeout: 10s
      retries: 3
    logging:
      driver: "json-file"
      options:
        max-size: "10m"
        max-file: "3"
        labels: "service=python-server,environment=development"

  # Node.js Express Server with OpenTelemetry
  nodejs-server:
    build:
      context: ./nodejs-server
      dockerfile: Dockerfile
    ports:
      - "8002:8002"
    environment:
      - NODE_ENV=development
      - REDIS_URL=redis://redis:6379
      - REDIS_PASSWORD=redis123
      - DATABASE_URL=postgresql://postgres:password123@postgres:5432/nodeapp
      - PORT=8002
      - OTEL_EXPORTER_OTLP_ENDPOINT=http://tempo:4317
      - OTEL_RESOURCE_ATTRIBUTES=service.name=nodejs-server,service.version=1.0.0
      - OTEL_NODE_ENABLED_INSTRUMENTATIONS=http,express,redis,pg
    volumes:
      - ./nodejs-server:/app
      - /app/node_modules
    depends_on:
      - redis
      - postgres
      - tempo
    networks:
      - app-network
    restart: unless-stopped
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8002/health"]
      interval: 30s
      timeout: 10s
      retries: 3
    logging:
      driver: "json-file"
      options:
        max-size: "10m"
        max-file: "3"
        labels: "service=nodejs-server,environment=development"

  # Nginx Reverse Proxy & Load Balancer
  nginx:
    image: nginx:alpine
    ports:
      - "80:80"
      - "443:443"
    volumes:
      - ./nginx/nginx.conf:/etc/nginx/nginx.conf:ro
      - ./nginx/ssl:/etc/nginx/ssl:ro
      - ./nginx/logs:/var/log/nginx
    depends_on:
      - frontend
      - go-server
      - python-server
      - nodejs-server
    networks:
      - app-network
    restart: always
    healthcheck:
      test: ["CMD", "nginx", "-t"]
      interval: 30s
      timeout: 10s
      retries: 3
    logging:
      driver: "json-file"
      options:
        max-size: "10m"
        max-file: "3"
        labels: "service=nginx,environment=development"

  # Redis Cache Server
  redis:
    image: redis:7-alpine
    ports:
      - "6379:6379"
    command: >
      redis-server
      --appendonly yes
      --requirepass redis123
      --maxmemory 256mb
      --maxmemory-policy allkeys-lru
    volumes:
      - redis_data:/data
      - ./redis/redis.conf:/usr/local/etc/redis/redis.conf
    networks:
      - app-network
    restart: always
    healthcheck:
      test: ["CMD", "redis-cli", "-a", "redis123", "ping"]
      interval: 10s
      timeout: 5s
      retries: 3
    logging:
      driver: "json-file"
      options:
        max-size: "10m"
        max-file: "3"
        labels: "service=redis,environment=development"

  # RedisInsight - Redis Management UI
  redisinsight:
    image: redis/redisinsight:latest
    ports:
      - "5540:5540"
    environment:
      - RITRUSTEDORIGINS=http://localhost:5540
    volumes:
      - redisinsight_data:/data
    depends_on:
      - redis
    networks:
      - app-network
    restart: unless-stopped

  # PostgreSQL Database
  postgres:
    image: postgres:15-alpine
    ports:
      - "5432:5432"
    environment:
      - POSTGRES_USER=postgres
      - POSTGRES_PASSWORD=password123
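      # Not interpreted by the official image itself; a script in
      # ./postgres/init-scripts has to read it and create these databases.
      - POSTGRES_MULTIPLE_DATABASES=goapp,pythonapp,nodeapp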
    volumes:
      - postgres_data:/var/lib/postgresql/data
      - ./postgres/init-scripts:/docker-entrypoint-initdb.d
      - ./postgres/postgresql.conf:/etc/postgresql/postgresql.conf
    networks:
      - app-network
    restart: always
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U postgres"]
      interval: 10s
      timeout: 5s
      retries: 5
    logging:
      driver: "json-file"
      options:
        max-size: "10m"
        max-file: "3"
        labels: "service=postgres,environment=development"

  # pgAdmin - PostgreSQL Management UI
  pgadmin:
    image: dpage/pgadmin4:latest
    ports:
      - "5050:80"
    environment:
      - PGADMIN_DEFAULT_EMAIL=admin@example.com
      - PGADMIN_DEFAULT_PASSWORD=admin123
      - PGADMIN_CONFIG_SERVER_MODE=False
    volumes:
      - pgadmin_data:/var/lib/pgadmin
    depends_on:
      - postgres
    networks:
      - app-network
    restart: unless-stopped

  # LGTM Stack - Loki (Log Aggregation)
  loki:
    image: grafana/loki:2.9.0
    ports:
      - "3100:3100"
    command: -config.file=/etc/loki/local-config.yaml
    volumes:
      - ./lgtm/loki:/etc/loki
      - loki_data:/loki
    networks:
      - app-network
    restart: unless-stopped
    healthcheck:
      test: ["CMD-SHELL", "wget --no-verbose --tries=1 --spider http://localhost:3100/ready || exit 1"]
      interval: 30s
      timeout: 10s
      retries: 3

  # LGTM Stack - Tempo (Distributed Tracing)
  tempo:
    image: grafana/tempo:2.2.0
    command: [ "-config.file=/etc/tempo.yaml" ]
    volumes:
      # The source must be a file, not a directory, because the target is a file.
      - ./lgtm/tempo:/etc/tempo.yaml:ro
      - tempo_data:/tmp/tempo
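    ports:
      - "3200:3200"   # tempo HTTP API
      - "14268:14268" # jaeger ingest
      # OTLP ports 4317/4318 are not published here: otel-collector already binds
      # them on the host, and other containers reach tempo:4317 over app-network.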
    networks:
      - app-network
    restart: unless-stopped
    healthcheck:
      test: ["CMD-SHELL", "wget --no-verbose --tries=1 --spider http://localhost:3200/ready || exit 1"]
      interval: 30s
      timeout: 10s
      retries: 3

  # LGTM Stack - Mimir (Long-term Metrics Storage)
  mimir:
    image: grafana/mimir:2.10.0
    command: ["-config.file=/etc/mimir.yaml"]
    ports:
      - "9009:9009"
    volumes:
      # The source must be a file, not a directory, because the target is a file.
      - ./lgtm/mimir:/etc/mimir.yaml:ro
      - mimir_data:/data
    networks:
      - app-network
    restart: unless-stopped
    healthcheck:
      test: ["CMD-SHELL", "wget --no-verbose --tries=1 --spider http://localhost:9009/ready || exit 1"]
      interval: 30s
      timeout: 10s
      retries: 3

  # LGTM Stack - Grafana (Visualization & Dashboards)
  grafana:
    image: grafana/grafana:10.1.0
    ports:
      - "3001:3000"
    environment:
      - GF_SECURITY_ADMIN_PASSWORD=admin123
      - GF_USERS_ALLOW_SIGN_UP=false
      - GF_FEATURE_TOGGLES_ENABLE=traceqlEditor
      - GF_INSTALL_PLUGINS=redis-datasource,postgresql-datasource
    volumes:
      - grafana_data:/var/lib/grafana
      - ./lgtm/grafana/provisioning:/etc/grafana/provisioning
      - ./lgtm/grafana/dashboards:/var/lib/grafana/dashboards
    depends_on:
      - loki
      - tempo
      - mimir
      - postgres
    networks:
      - app-network
    restart: unless-stopped
    healthcheck:
      test: ["CMD-SHELL", "wget --no-verbose --tries=1 --spider http://localhost:3000/api/health || exit 1"]
      interval: 30s
      timeout: 10s
      retries: 3

  # Promtail (Log Shipper to Loki)
  promtail:
    image: grafana/promtail:2.9.0
    volumes:
      - /var/log:/var/log:ro
      - /var/lib/docker/containers:/var/lib/docker/containers:ro
      - ./lgtm/promtail:/etc/promtail
    command: -config.file=/etc/promtail/config.yml
    depends_on:
      - loki
    networks:
      - app-network
    restart: unless-stopped

  # OTEL Collector (OpenTelemetry Collector)
  otel-collector:
    image: otel/opentelemetry-collector-contrib:0.87.0
    command: ["--config=/etc/otelcol-contrib/otel-collector.yaml"]
    volumes:
      - ./lgtm/otel-collector:/etc/otelcol-contrib
    ports:
      - "1888:1888"   # pprof extension
      - "8888:8888"   # Prometheus metrics exposed by the collector
      - "8889:8889"   # Prometheus exporter metrics
      - "13133:13133" # health_check extension
      - "4317:4317"   # OTLP gRPC receiver
      - "4318:4318"   # OTLP http receiver
      - "55679:55679" # zpages extension
    depends_on:
      - loki
      - tempo
      - mimir
    networks:
      - app-network
    restart: unless-stopped
    healthcheck:
      test: ["CMD", "wget", "--no-verbose", "--tries=1", "--spider", "http://localhost:13133"]
      interval: 30s
      timeout: 10s
      retries: 3

# Named Volumes for Data Persistence
volumes:
  postgres_data:
    driver: local
  redis_data:
    driver: local
  redisinsight_data:
    driver: local
  pgadmin_data:
    driver: local
  loki_data:
    driver: local
  tempo_data:
    driver: local
  mimir_data:
    driver: local
  grafana_data:
    driver: local

# Custom Networks
networks:
  app-network:
    driver: bridge
    ipam:
      config:
        - subnet: 172.20.0.0/16

LGTM Stack Microservices Architecture

Components Breakdown

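Loki (log aggregation)

Loki is a horizontally scalable, highly available log aggregation system developed by Grafana Labs. It indexes only the labels attached to each log stream, not the log content, which makes it lighter and cheaper to run than a full-text stack such as ELK. In this architecture Promtail ships the logs of every microservice to Loki, which provides a single place to query them.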

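Grafana (visualization and dashboards)

Grafana is the single visualization front end. It connects to Loki, Tempo, and Mimir and shows logs, traces, and metrics side by side, so developers can correlate the different kinds of observability data in one interface.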

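Tempo (distributed tracing)

Tempo stores distributed traces and can follow a request along its whole path through the microservices. When a user request passes through a call chain such as Nginx → Go Server → Redis → PostgreSQL, Tempo shows the time spent and any errors at each hop, provided each hop is instrumented.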

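Mimir (long-term metrics storage)

Mimir is Grafana Labs' horizontally scalable, long-term store for Prometheus metrics. It accepts Prometheus remote write and is queried with PromQL, so it works with existing Prometheus tooling while scaling further than a single Prometheus server.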

Practical Monitoring Workflows

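Scenario 1: troubleshooting a performance problem

When API response times look abnormal:

  1. Check the response-time metrics in Grafana (from Mimir).
  2. Select the abnormal point in time and open the related traces (from Tempo).
  3. Find the slow span in the trace, then read the logs of that service (from Loki).
  4. Use the PostgreSQL and Redis data sources to analyze database and cache performance.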
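
Steps 2 and 3 can also be done from the command line against the HTTP APIs that Tempo and Loki expose on the ports published in the Demo section. A sketch under several assumptions: the log streams carry a service label (this depends on the Promtail configuration, which is not shown), the services write the trace ID into their log lines, and curl is installed. The function names are hypothetical:

```shell
TEMPO_URL=${TEMPO_URL:-http://localhost:3200}
LOKI_URL=${LOKI_URL:-http://localhost:3100}

# URL of a single trace in Tempo's query API.
tempo_trace_url() {
    printf '%s/api/traces/%s\n' "$TEMPO_URL" "$1"
}

# LogQL query: select one service's stream by label, then keep only
# the lines that contain the trace ID.
logql_for_trace() {
    printf '{service="%s"} |= "%s"\n' "$1" "$2"
}

# Usage: show_trace_and_logs SERVICE TRACE_ID
# Fetches the trace from Tempo, then the matching log lines from Loki.
show_trace_and_logs() {
    curl -s "$(tempo_trace_url "$2")"
    curl -s -G "$LOKI_URL/loki/api/v1/query_range" \
        --data-urlencode "query=$(logql_for_trace "$1" "$2")"
}
```

The label selector is what Loki resolves through its index; the |= filter is applied to the log content of the selected streams afterwards.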

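Scenario 2: monitoring the error rate

Alerts can be defined on several data sources:

  • the error rate exceeds a threshold (metrics from Mimir)
  • a specific error keyword appears in the logs (from Loki)
  • the share of traces that contain errors is abnormal (from Tempo)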
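
The first condition is a ratio compared against a threshold. In a real deployment this would normally be an alert rule evaluated by the monitoring stack; the sketch below only illustrates the arithmetic, with a hypothetical function name and caller-supplied counts:

```shell
# Usage: error_rate_exceeds ERRORS TOTAL THRESHOLD_PERCENT
# Succeeds (exit 0) when ERRORS / TOTAL is strictly above the threshold.
# The comparison is cross-multiplied to avoid floating-point division.
error_rate_exceeds() {
    awk -v e="$1" -v t="$2" -v th="$3" \
        'BEGIN { if (t > 0 && e * 100 > th * t) exit 0; exit 1 }'
}

# Example: 6 errors in 100 requests against a 5 % threshold
# error_rate_exceeds 6 100 5 && echo "alert"
```

A total of zero never triggers, so an idle service does not raise an alert.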

Quick Start Commands

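# Start the complete environment, including the LGTM stack
docker-compose up -d

# Start only the core application services
# (services they depend on, such as tempo, are started as well)
docker-compose up -d frontend go-server python-server nodejs-server nginx redis postgres

# Start only the monitoring services
docker-compose up -d loki tempo mimir grafana promtail otel-collector

# Check the health status of the monitoring services
docker-compose ps loki tempo mimir grafana

# Follow the OTEL Collector logs (useful when debugging trace data)
docker-compose logs -f otel-collector

# Restart a single monitoring component
docker-compose restart grafana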
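
The status shown by docker-compose ps can be cross-checked from the host by calling the same readiness endpoints that the healthchecks in the Demo file use. A sketch, assuming curl is installed and the host ports are the ones published in that file; both function names are hypothetical:

```shell
# Readiness URL of each monitoring service as seen from the host.
# Paths and ports are taken from the healthchecks and port mappings
# in the Demo file (Grafana is published on host port 3001).
ready_url() {
    case "$1" in
        loki)    echo "http://localhost:3100/ready" ;;
        tempo)   echo "http://localhost:3200/ready" ;;
        mimir)   echo "http://localhost:9009/ready" ;;
        grafana) echo "http://localhost:3001/api/health" ;;
        *)       return 1 ;;
    esac
}

# Print "ok" or "FAIL" per service; needs a running stack.
check_monitoring() {
    for svc in loki tempo mimir grafana; do
        if curl -fsS -o /dev/null "$(ready_url "$svc")" 2>/dev/null; then
            echo "$svc ok"
        else
            echo "$svc FAIL"
        fi
    done
}
```

Running check_monitoring right after startup may report FAIL for a short while, because the services need some time before their readiness endpoints answer successfully.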

Access URLs

Service               URL                      Credentials
Grafana Dashboard     http://localhost:3001    admin/admin123
Frontend Application  http://localhost:3000    -
Go Server API         http://localhost:8080    -
Python Server API     http://localhost:8001    -
Node.js Server API    http://localhost:8002    -
RedisInsight          http://localhost:5540    -
pgAdmin               http://localhost:5050    admin@example.com/admin123
Loki API              http://localhost:3100    -
Tempo API             http://localhost:3200    -
Mimir API             http://localhost:9009    -