
wacyi/go-observability-stack

Go Observability Stack – Production-Ready Monitoring

Comprehensive observability solution for Go applications and microservices


OpenTelemetry • Grafana Tempo • Loki • Prometheus • Grafana • DragonflyDB

License: MIT · Docker · Go · OpenTelemetry

πŸ“ ARCHITECTURE OVERVIEW

┌─────────────────────────────────────────────────────────────────────┐
│                     Your Go Proxy Service (port 8089)               │
│                    /var/log/proxy-services/*.log                    │
└────────────┬───────────────────────────────────────┬────────────────┘
             │                                       │
             │ OTLP (traces/metrics)                 │ Log files
             ▼                                       ▼
    ┌────────────────────┐                  ┌─────────────────┐
    │ OTel Collector     │                  │   Promtail      │
    │ :4317 (gRPC)       │                  │   :9080         │
    │ :4318 (HTTP)       │                  └────────┬────────┘
    │ :8889 (Prom)       │                           │
    └──┬──────┬──────┬───┘                           │
       │      │      │                               │
       │      │      └──────────┐                    │
       │      │                 │                    │
       ▼      ▼                 ▼                    ▼
    ┌─────┐ ┌──────────┐   ┌─────────┐          ┌─────────┐
    │Tempo│ │Prometheus│   │ Metrics │          │  Loki   │
    │:3200│ │  :9090   │   │Generator│          │  :3100  │
    └──┬──┘ └────┬─────┘   └────┬────┘          └────┬────┘
       │         │              │                    │
       └─────────┴──────┬───────┴────────────────────┘
                        │
                        ▼
                 ┌──────────────┐
                 │   Grafana    │
                 │   :3000      │
                 └──────────────┘

    ┌───────────────────────────────────┐
    │  DragonflyDB (Redis Cache)        │
    │  :6382 (host) → :6379 (container) │
    └───────────────────────────────────┘

Data Flow:

  1. Go Proxy sends traces/metrics via OTLP → OTel Collector
  2. Go Proxy writes logs to /var/log/proxy-services/ → Promtail reads them
  3. OTel Collector forwards traces → Tempo
  4. OTel Collector exposes metrics → Prometheus scrapes them
  5. Promtail ships logs → Loki
  6. Tempo generates span metrics → Prometheus
  7. Grafana queries all three backends for unified observability

Separate Service:

  • DragonflyDB provides Redis-compatible caching for the Go application

=============================================================================

🚀 QUICK START GUIDE

📦 What's in the Package

observability/
├── 📄 README.md                  ✅ Updated with new name
├── 📝 CONTRIBUTING.md            ✅ Updated with new name
├── 📜 LICENSE                    ✅ MIT License
├── 📋 CODE_OF_CONDUCT.md         ✅ Community standards
├── 🔒 .env.example               ✅ Safe template
├── 🚫 .gitignore                 ✅ Protects secrets
├── 📘 OPENSOURCE_GUIDE.md        ✅ Publishing steps
├── 🎯 GITHUB_SETUP.md            ✅ Repository setup guide (NEW!)
├── 🐳 docker-compose.yaml        ✅ Service definitions
└── ⚙️  Configuration files        ✅ All YAML configs

STEP 1: Create .env File

Copy .env.example to .env and set your secure passwords:

cp .env.example .env
# Edit .env and set secure passwords

Or create .env manually with:

DFLY_PASSWORD=your_secure_dragonfly_password
GRAFANA_ADMIN_PASSWORD=your_secure_grafana_password

STEP 2: Start the Stack

docker-compose up -d

STEP 3: Verify Services

docker-compose ps

All services should show "Up (healthy)" status:

  ✓ dragonfly  ✓ otel-collector  ✓ tempo  ✓ loki  ✓ promtail  ✓ prometheus  ✓ grafana

STEP 4: Configure Your Go Application

Update your main .env file with:

# DragonflyDB (Redis-compatible cache)
REDIS_HOST=localhost
REDIS_PORT=6382
REDIS_PASSWORD=<copy from observability/.env DFLY_PASSWORD>
REDIS_DB=0

# OpenTelemetry Configuration
OTEL_ENABLED=true
OTEL_ENDPOINT=localhost:4318
OTEL_SERVICE_NAME=proxy-services
OTEL_ENVIRONMENT=production

STEP 5: Ensure Log Directory Exists

Promtail reads logs from /var/log/proxy-services/

sudo mkdir -p /var/log/proxy-services
sudo chown "$USER" /var/log/proxy-services   # on macOS, the user group is typically staff: $USER:staff

Configure your Go app to write logs to: /var/log/proxy-services/app.log

STEP 6: Start Your Application

cd proxy-services
go run main.go

STEP 7: Go Application Integration Examples

Connecting to DragonflyDB (Redis)

Add the following to your Go application to connect to DragonflyDB:

package main

import (
    "context"
    "log"
    "os"
    "strconv"

    "github.com/redis/go-redis/v9"
)

func initRedis() *redis.Client {
    // Honor REDIS_DB from the environment, falling back to the default DB.
    db, err := strconv.Atoi(os.Getenv("REDIS_DB"))
    if err != nil {
        db = 0
    }

    rdb := redis.NewClient(&redis.Options{
        Addr:     os.Getenv("REDIS_HOST") + ":" + os.Getenv("REDIS_PORT"),
        Password: os.Getenv("REDIS_PASSWORD"),
        DB:       db,
    })

    // Verify the connection before returning the client
    pong, err := rdb.Ping(context.Background()).Result()
    if err != nil {
        log.Fatal("Failed to connect to Redis:", err)
    }
    log.Println("Redis connected:", pong)

    return rdb
}

Initializing OpenTelemetry with OTLP Exporter

Add OpenTelemetry initialization to your main function:

package main

import (
    "context"
    "log"
    "os"

    "go.opentelemetry.io/otel"
    "go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracehttp"
    "go.opentelemetry.io/otel/sdk/resource"
    "go.opentelemetry.io/otel/sdk/trace"
    semconv "go.opentelemetry.io/otel/semconv/v1.17.0"
)

func initOpenTelemetry(ctx context.Context) func() {
    // Create OTLP HTTP exporter
    exporter, err := otlptracehttp.New(ctx,
        otlptracehttp.WithEndpoint(os.Getenv("OTEL_ENDPOINT")),
        otlptracehttp.WithInsecure(), // Use WithTLSConfig for production
    )
    if err != nil {
        log.Fatal("Failed to create OTLP exporter:", err)
    }

    // Create resource
    res, err := resource.New(ctx,
        resource.WithAttributes(
            semconv.ServiceNameKey.String(os.Getenv("OTEL_SERVICE_NAME")),
            semconv.ServiceVersionKey.String("1.0.0"),
            semconv.DeploymentEnvironmentKey.String(os.Getenv("OTEL_ENVIRONMENT")),
        ),
    )
    if err != nil {
        log.Fatal("Failed to create resource:", err)
    }

    // Create tracer provider
    tracerProvider := trace.NewTracerProvider(
        trace.WithBatcher(exporter),
        trace.WithResource(res),
    )

    // Set global tracer provider
    otel.SetTracerProvider(tracerProvider)

    // Return cleanup function
    return func() {
        if err := tracerProvider.Shutdown(ctx); err != nil {
            log.Printf("Error shutting down tracer provider: %v", err)
        }
    }
}

Use it in your main function:

func main() {
    ctx := context.Background()

    // Initialize OpenTelemetry
    shutdown := initOpenTelemetry(ctx)
    defer shutdown()

    // Initialize Redis
    rdb := initRedis()
    defer rdb.Close()

    // Your application logic here
    tracer := otel.Tracer("proxy-services")
    ctx, span := tracer.Start(ctx, "main")
    defer span.End()

    // Example Redis usage
    err := rdb.Set(ctx, "key", "value", 0).Err()
    if err != nil {
        span.RecordError(err)
    }

    log.Println("Application started successfully")
}

=============================================================================

🌐 ACCESS URLS

Service          URL                      Credentials
---------------  -----------------------  --------------------------------
Grafana          http://localhost:3000    admin / <GRAFANA_ADMIN_PASSWORD>
Prometheus       http://localhost:9090    (no auth)
Tempo            http://localhost:3200    (no auth)
Loki             http://localhost:3100    (no auth)
OTel Collector   http://localhost:8888    (metrics endpoint)
OTel Prometheus  http://localhost:8889    (metrics exporter)
Promtail         http://localhost:9080    (no auth)
DragonflyDB      localhost:6382           Password: <DFLY_PASSWORD>

All ports are bound to 127.0.0.1 (localhost only) for security.

Remote Access via SSH Tunnel

If running on a remote server, forward ports to your local machine:

ssh -L 3000:localhost:3000 -L 9090:localhost:9090 user@your-server-ip

This allows you to access:

  • Grafana at http://localhost:3000 on your local machine
  • Prometheus at http://localhost:9090 on your local machine

For additional services, add more -L flags:

ssh -L 3000:localhost:3000 \
    -L 9090:localhost:9090 \
    -L 3200:localhost:3200 \
    -L 3100:localhost:3100 \
    user@your-server-ip

=============================================================================

📊 GRAFANA CONFIGURATION

AUTO-PROVISIONED DATA SOURCES:

✅ Prometheus (default) - Metrics from OTel & scrape targets
✅ Tempo - Distributed traces with service map
✅ Loki - Application logs with trace correlation
✅ OTel Collector - Direct collector metrics

INTEGRATED FEATURES:

• Traces → Logs: Click trace ID to see related logs
• Traces → Metrics: See service metrics for trace spans
• Logs → Traces: Click traceID field to view full trace
• Service Map: Visualize service dependencies
• Node Graph: Interactive trace visualization

EXAMPLE QUERIES:

Logs (Loki):

  {job="proxy-services"} |= "error"
  {job="proxy-services", level="error"}
  {job="proxy-services"} | json | level="warn"
  {job="proxy-services"} |= "traceID" | json

Traces (Tempo):

  • Search by service name: proxy-services
  • Filter by duration: > 100ms
  • Filter by status: error
  • Use trace ID from logs

Metrics (Prometheus):

  rate(http_requests_total[5m])
  histogram_quantile(0.95, rate(http_request_duration_seconds_bucket[5m]))
  proxy_services_cache_hits_total
  up{job="otel-collector"}

=============================================================================

🔧 COMPONENT DETAILS

DRAGONFLY DB

• Redis-compatible in-memory cache
• Port: 6382 (host) → 6379 (container)
• Max memory: 2GB with eviction
• Password protected
• Persistent storage: dfly_data volume

OTEL COLLECTOR

• Receives: OTLP gRPC (4317) and HTTP (4318)
• Exports traces to: Tempo via OTLP
• Exports metrics to: Prometheus exporter (8889)
• Scrapes: Optional Prometheus /metrics from your app
• Batch processing: 1024 records, 10s timeout

TEMPO

• Distributed tracing backend
• Receives: OTLP (gRPC 4317, HTTP 4318) from collector
• HTTP API: Port 3200
• Storage: Local filesystem (/var/tempo)
• Trace retention: 48 hours
• Metrics generator: Sends span metrics to Prometheus

LOKI

• Log aggregation system
• Push API: Port 3100
• Schema: TSDB (modern, v13)
• Storage: Local filesystem (/loki)
• Log retention: 30 days (720h)
• Max line size: 256KB
• Ingestion rate: 10MB/s (burst 20MB/s)

PROMTAIL

• Log collector and shipper
• Watches: /var/log/proxy-services/*.log
• Pipeline: JSON parsing (zerolog format)
• Extracts: level, timestamp, message, caller
• Ships to: Loki (port 3100)
• Position tracking: /tmp/positions

PROMETHEUS

• Time-series metrics database
• HTTP API: Port 9090
• Scrape interval: 15s
• Remote write: Enabled (for Tempo metrics)
• Scrape targets:

  • otel-collector:8888 (collector metrics)
  • otel-collector:8889 (application metrics)
  • prometheus:9090 (self-monitoring)
  • tempo:3200 (tempo metrics)
  • loki:3100 (loki metrics)

GRAFANA

• Visualization and dashboards
• Web UI: Port 3000
• Anonymous access: Disabled
• Sign-up: Disabled
• Data sources: Auto-provisioned with correlation
• Persistent storage: grafana_data volume

=============================================================================

πŸ› TROUBLESHOOTING

CHECK SERVICE HEALTH:

docker-compose ps
docker-compose logs -f <service-name>

SERVICE-SPECIFIC DEBUGGING:

OTel Collector not receiving traces:

  docker-compose logs -f otel-collector
  # Check your app connects to localhost:4318 or :4317
  curl http://localhost:8888/metrics

Tempo not showing traces:

  docker-compose logs -f tempo
  curl http://localhost:3200/api/search
  # Verify OTel collector can reach tempo:4317

Loki not receiving logs:

  docker-compose logs -f loki
  docker-compose logs -f promtail
  # Verify /var/log/proxy-services/*.log exists and has content
  curl http://localhost:3100/ready

Promtail not reading logs:

  docker-compose logs -f promtail
  # Check log file permissions
  ls -la /var/log/proxy-services/
  # Verify JSON format in logs

Prometheus not scraping:

  docker-compose logs -f prometheus
  curl http://localhost:9090/api/v1/targets
  # Check if targets are UP

DragonflyDB connection issues:

  docker-compose logs -f dragonfly
  # Test connection (should return: PONG)
  docker exec -it observability-dragonfly-1 redis-cli -p 6379 -a "$DFLY_PASSWORD" ping

Grafana data source issues:

  docker-compose logs -f grafana
  # Check data sources in Grafana UI: Configuration → Data Sources
  # Test each data source

NETWORK DEBUGGING:

# Check if services can communicate
docker-compose exec otel-collector wget -O- http://tempo:3200/ready
docker-compose exec otel-collector wget -O- http://loki:3100/ready
docker-compose exec promtail wget -O- http://loki:3100/ready

RESET EVERYTHING:

cd observability
docker-compose down -v  # ⚠️  Removes all data!
docker-compose up -d

RESTART SINGLE SERVICE:

docker-compose restart <service-name>

=============================================================================

πŸ“ CONFIGURATION FILES

docker-compose.yaml       - Service definitions and networking
otel-config.yaml          - OpenTelemetry Collector pipeline
tempo.yaml                - Tempo tracing configuration
loki.yaml                 - Loki log aggregation config
promtail-config.yaml      - Promtail log collection rules
prometheus.yaml           - Prometheus scrape configs
grafana-datasources.yaml  - Auto-provisioned data sources
.env                      - Secrets (not in git)

=============================================================================

🔒 SECURITY NOTES

✅ IMPLEMENTED SECURITY:

• All ports bound to 127.0.0.1 (localhost only)
• DragonflyDB password protected
• Grafana password protected
• Grafana sign-up disabled
• Grafana anonymous access disabled
• Docker network isolation (observability network)
• No ports exposed to external network

⚠️ PRODUCTION RECOMMENDATIONS:

• Use Docker secrets instead of .env files
• Add TLS/SSL certificates for all services
• Implement authentication for all endpoints
• Set up reverse proxy (nginx/traefik) with SSL
• Configure firewall rules
• Enable audit logging
• Implement log retention policies
• Set up automated backups
• Use separate networks for production
• Rotate passwords regularly
• Monitor failed login attempts

=============================================================================

📦 DOCKER VOLUMES

dfly_data          - DragonflyDB persistent cache
tempo_data         - Tempo trace storage
loki_data          - Loki log storage
prom_data          - Prometheus time-series data
grafana_data       - Grafana dashboards and settings
promtail_positions - Promtail read positions

To back up data:

docker run --rm -v observability_grafana_data:/data -v $(pwd):/backup \
    alpine tar czf /backup/grafana-backup.tar.gz -C /data .

=============================================================================

🎯 FEATURE HIGHLIGHTS

✅ Complete observability stack in one command
✅ Auto-configured with best practices
✅ Trace-to-log correlation out of the box
✅ Service dependency mapping
✅ High-performance Redis-compatible cache
✅ Modern TSDB-based log storage
✅ Metrics generation from traces
✅ Health checks for all services
✅ Proper service startup ordering
✅ Localhost-only binding for security
✅ Persistent storage for all data
✅ JSON log parsing (zerolog compatible)
✅ 30-day log retention
✅ 48-hour trace retention
✅ Distributed tracing support
✅ Prometheus remote write enabled

=============================================================================

📚 ADDITIONAL RESOURCES

OpenTelemetry: https://opentelemetry.io/docs/
Grafana Tempo: https://grafana.com/docs/tempo/
Grafana Loki:  https://grafana.com/docs/loki/
Prometheus:    https://prometheus.io/docs/
Promtail:      https://grafana.com/docs/loki/latest/clients/promtail/
DragonflyDB:   https://www.dragonflydb.io/docs
Grafana:       https://grafana.com/docs/grafana/

=============================================================================

🤝 CONTRIBUTING

We welcome contributions! Please see CONTRIBUTING.md for guidelines.

By participating in this project, you agree to abide by our Code of Conduct.

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

=============================================================================

Created: November 2025 by Abdorizak Abdalla Hassan ⚙️
Version: 2.0
Repository: masjidAbuhureyra/go-observability-stack

=============================================================================

About

🔭 Production-ready observability stack with OpenTelemetry, Grafana, Tempo, Loki, Prometheus & DragonflyDB. One-command setup with full trace-to-log correlation.
