This project runs a reusable observability platform for all of your projects.
It uses:
- OpenTelemetry Collector as the single ingestion endpoint for every app.
- OpenObserve for logs, metrics, traces, dashboards, alerts, and general APM.
- Phoenix for LLM, RAG, agent, prompt, dataset, and evaluation observability.
- PostgreSQL for durable Phoenix storage.
- Caddy as the reverse proxy and TLS endpoint for browser access and OTLP HTTP ingestion.
- Local and S3 backup scripts for disaster recovery.
The important design decision is that your applications talk to OpenTelemetry, not to OpenObserve or Phoenix directly. This keeps your projects portable if you later replace or add backends.
Application services
-> OpenTelemetry SDK / auto-instrumentation
-> OTLP HTTP or gRPC
-> OpenTelemetry Collector
-> OpenObserve for logs, metrics, traces
-> Phoenix for LLM traces
Browser users
-> Caddy HTTPS reverse proxy
-> OpenObserve UI
-> Phoenix UI
Default exposed endpoints:
| Purpose | Endpoint |
|---|---|
| OpenObserve UI | https://openobserve.observability.duckdns.org |
| Phoenix UI | https://phoenix.observability.duckdns.org |
| OTLP HTTP ingest | https://otel.observability.duckdns.org |
| Raw OTLP HTTP ingest | http://host:4318 |
| Raw OTLP gRPC ingest | http://host:4317, bound to localhost by default |
For production and EC2, prefer https://otel.observability.duckdns.org over raw 4318.
| File | Purpose |
|---|---|
| docker-compose.yml | Runs OpenObserve, Phoenix, PostgreSQL, Collector, and Caddy. |
| config/otel-collector.yaml | Receives OTLP and exports telemetry to OpenObserve and Phoenix. |
| caddy/Caddyfile | Routes public HTTPS hostnames to the internal services. |
| .env.example | Template for credentials, domains, stream names, and backup settings. |
| scripts/backup-local.sh | Creates local backups of OpenObserve data, Phoenix PostgreSQL, config, and .env. |
| scripts/backup-s3.sh | Creates a local backup, then syncs it to S3 with server-side encryption. |
| scripts/restore-local.sh | Restores OpenObserve data and Phoenix PostgreSQL from a local backup folder. |
| USER_MANAGEMENT.md | Guide for creating and managing users in OpenObserve and Phoenix. |
| AWS_SETUP.md | Infrastructure and configuration steps for AWS deployment. |
1. Copy the environment template:
cp .env.example .env
2. Edit .env and replace every "change-me" and "replace" placeholder value.
3. Generate the OpenObserve auth header:
printf '%s' 'admin@example.com:your-openobserve-password' | base64
Put the result into .env like this:
OPENOBSERVE_AUTH_HEADER=Basic pasted_base64_value
4. Generate the Collector ingest password hash:
docker run --rm httpd:2.4-alpine htpasswd -nbB otel_ingest 'your-ingest-password'
Put the entire output into .env, but escape every $ as $$ because Docker Compose treats $ as interpolation syntax:
OTEL_COLLECTOR_HTPASSWD=otel_ingest:$$2y$$05$$...
5. Generate a Phoenix secret:
openssl rand -hex 32
Put it into .env:
PHOENIX_SECRET=generated_value
6. Start the stack:
docker compose up -d
7. Open Phoenix, log in as the initial admin, then create a system API key:
Email: admin@localhost
Password: PHOENIX_DEFAULT_ADMIN_INITIAL_PASSWORD from .env
Phoenix UI -> Settings -> API Keys -> System API Key
8. Add that key to .env:
PHOENIX_API_KEY=your_system_api_key
9. Restart the Collector:
docker compose up -d otel-collector
Until PHOENIX_API_KEY is set, OpenObserve can receive traces, logs, and metrics, but Phoenix export will fail authentication.
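The $ escaping in the hash step is easy to get wrong by hand. As a minimal sketch, the doubling can be done with sed before the value is pasted into .env. The function name escape_for_compose is hypothetical and not part of this repo, and the hash in the example is made up and truncated.

```shell
# Hypothetical helper, not part of this repo: double every "$" so that
# Docker Compose does not treat parts of the bcrypt hash as variables.
escape_for_compose() {
  printf '%s\n' "$1" | sed 's/\$/$$/g'
}

# Example with a made-up, truncated hash (not a real credential):
escape_for_compose 'otel_ingest:$2y$05$abcdefghijklmnopqrstuv'
# prints otel_ingest:$$2y$$05$$abcdefghijklmnopqrstuv
```

In practice the argument would be the full htpasswd output from the hash step, and the printed line becomes the value of OTEL_COLLECTOR_HTPASSWD.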
For local-only development without DNS:
1. Set these in .env:
OPENOBSERVE_SITE=http://openobserve.local
PHOENIX_SITE=http://phoenix.local
OTEL_SITE=http://otel.local
2. Add these to /etc/hosts on this machine:
127.0.0.1 openobserve.local
127.0.0.1 phoenix.local
127.0.0.1 otel.local
3. For other devices on the LAN, add the same names to their hosts files, but point them to this machine's LAN IP:
192.168.x.x openobserve.local
192.168.x.x phoenix.local
192.168.x.x otel.local
4. Use these local endpoints:
OpenObserve: http://openobserve.local:8080
Phoenix: http://phoenix.local:8080
Collector HTTP (via Caddy): http://otel.local:8080
Collector HTTP (raw): http://localhost:4318
Collector gRPC (raw): http://localhost:4317
For EC2 production, set HTTP_BIND=80 and HTTPS_BIND=443.
The Compose file intentionally does not expose OpenObserve or Phoenix directly. Caddy is the browser entry point. If you want raw local UI ports during development, temporarily add ports to the relevant services.
Recommended EC2 shape for a small team:
- Ubuntu LTS.
- Docker Engine and Docker Compose plugin.
- At least 2 vCPU, 4 GB RAM for light use.
- 8 GB RAM or more if you ingest many logs/traces.
- EBS gp3 volume mounted where this repo lives.
- Security group allowing:
  - 22/tcp from your IP only.
  - 80/tcp and 443/tcp from users/apps that need access.
- Do not expose 5080, 6006, 5432, or raw 4317 publicly.
- Expose raw 4318 only if you cannot use Caddy/TLS.
DNS should point these names to the EC2 public IP or load balancer:
openobserve.observability.duckdns.org
phoenix.observability.duckdns.org
otel.observability.duckdns.org
Caddy will request TLS certificates automatically when the domains resolve correctly and ports 80/443 are reachable.
Every app should send to the Collector, not directly to OpenObserve or Phoenix.
For OTLP HTTP:
OTEL_SERVICE_NAME=my-service
OTEL_EXPORTER_OTLP_PROTOCOL=http/protobuf
OTEL_EXPORTER_OTLP_ENDPOINT=https://otel.observability.duckdns.org
OTEL_EXPORTER_OTLP_HEADERS=Authorization=Basic base64_of_otel_ingest_colon_password
OTEL_TRACES_EXPORTER=otlp
OTEL_METRICS_EXPORTER=otlp
OTEL_LOGS_EXPORTER=otlp
For local raw HTTP:
OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4318
OTEL_EXPORTER_OTLP_HEADERS=Authorization=Basic base64_of_otel_ingest_colon_password
The Basic header for applications is not the htpasswd hash. It is the base64 value of:
otel_ingest:your-ingest-password
Generate it with:
printf '%s' 'otel_ingest:your-ingest-password' | base64
The Collector forwards the same trace to both backends. That means a trace ID created by your app should be searchable in both OpenObserve and Phoenix.
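To make the endpoint convention concrete: with http/protobuf, OpenTelemetry SDKs treat OTEL_EXPORTER_OTLP_ENDPOINT as a base URL and append /v1/traces, /v1/metrics, or /v1/logs per signal. The sketch below derives those URLs and wraps a smoke test you can run by hand. Both function names are hypothetical. It assumes curl is installed and that the Collector's OTLP HTTP receiver accepts JSON bodies, which is worth confirming against config/otel-collector.yaml.

```shell
# Hypothetical helpers, not part of this repo.

# Build the per-signal URL the way SDKs do for OTEL_EXPORTER_OTLP_ENDPOINT:
# strip one trailing slash from the base, then append /v1/<signal>.
otlp_signal_url() {
  base=${1%/}
  printf '%s/v1/%s\n' "$base" "$2"
}

# Send an empty trace export and print only the HTTP status code.
# $1 = base endpoint, $2 = ingest user, $3 = plain ingest password.
# Assumption: a 2xx status means the credentials were accepted and
# 401 means they were rejected.
otlp_smoke_test() {
  curl -sS -o /dev/null -w '%{http_code}\n' \
    -u "$2:$3" \
    -H 'Content-Type: application/json' \
    -d '{"resourceSpans":[]}' \
    "$(otlp_signal_url "$1" traces)"
}

otlp_signal_url 'http://localhost:4318/' traces
# prints http://localhost:4318/v1/traces
```

A manual check would then look like: otlp_smoke_test http://localhost:4318 otel_ingest 'your-ingest-password'.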
Use the same OpenTelemetry trace context across normal application spans and LLM spans:
- Do not create separate root traces for LLM calls if they are part of a request.
- Let OpenInference, LangChain, LlamaIndex, OpenAI SDK instrumentation, or your manual spans run under the active request context.
- Include service.name, deployment.environment, service.version, and important domain identifiers like tenant.id, user.id, request.id, or conversation.id where allowed by your privacy policy.
In OpenObserve, use the trace view for request-level debugging and the logs view for log-to-trace correlation. In Phoenix, use the same trace ID to inspect LLM call inputs, outputs, tool calls, retrieval steps, and evaluations.
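When you need the trace ID to paste into both UIs, it can usually be read from a request's W3C traceparent header, the default propagation format in OpenTelemetry SDKs. This is a minimal sketch that assumes the default propagator is in use. The function name is hypothetical, and the sample value in the usage note comes from the W3C Trace Context specification.

```shell
# Hypothetical helper: print the 32-hex-character trace ID from a W3C
# traceparent value (version-traceid-parentid-flags).
# Prints nothing and returns non-zero if the value is malformed.
trace_id_from_traceparent() {
  printf '%s\n' "$1" |
    grep -E '^[0-9a-f]{2}-[0-9a-f]{32}-[0-9a-f]{16}-[0-9a-f]{2}$' |
    cut -d- -f2 |
    grep .
}
```

For example, trace_id_from_traceparent '00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01' prints 4bf92f3577b34da6a3ce929d0e0e4736, which is the value to search for in both backends.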
OpenObserve has its own login controlled by:
OPENOBSERVE_ROOT_EMAIL
OPENOBSERVE_ROOT_PASSWORD
OPENOBSERVE_RETENTION_DAYS
Phoenix auth is enabled with:
PHOENIX_ENABLE_AUTH=True
PHOENIX_SECRET
PHOENIX_DEFAULT_ADMIN_INITIAL_PASSWORD
PHOENIX_RETENTION_DAYS
For detailed instructions on adding more users, see USER_MANAGEMENT.md.
After the first Phoenix startup, changing PHOENIX_DEFAULT_ADMIN_INITIAL_PASSWORD does not reset the admin password. Change it inside Phoenix or follow Phoenix admin password reset procedures.
The Collector requires Basic auth before accepting OTLP data:
OTEL_COLLECTOR_HTPASSWD=otel_ingest:hashed_password
Applications send:
Authorization: Basic base64(otel_ingest:plain_password)
Do not expose OpenObserve ingestion credentials to applications. Only the Collector should know OPENOBSERVE_AUTH_HEADER.
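As a sketch of the client side, the header value can be derived from the plain credentials rather than pasted by hand. The function name below is hypothetical. It assumes a base64 command is available, and it strips newlines because some implementations wrap long output.

```shell
# Hypothetical helper: build the Basic auth value from the PLAIN ingest
# credentials (never from the htpasswd hash in OTEL_COLLECTOR_HTPASSWD).
# $1 = ingest user, $2 = plain ingest password.
basic_auth_value() {
  printf 'Basic %s\n' "$(printf '%s' "$1:$2" | base64 | tr -d '\n')"
}

# Made-up credentials for illustration only.
basic_auth_value 'user' 'pass'
# prints Basic dXNlcjpwYXNz
```

An application environment could then set OTEL_EXPORTER_OTLP_HEADERS to "Authorization=" followed by the output of basic_auth_value otel_ingest 'your-ingest-password'. The OpenTelemetry specification defines this variable in a format modeled on the W3C Baggage header, so if an SDK rejects the space after Basic, writing it as Basic%20 may help.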
Use these rules in production:
- Expose only Caddy on ports 80 and 443.
- Keep PostgreSQL internal to Docker.
- Keep OpenObserve and Phoenix internal to Docker.
- Keep raw Collector gRPC bound to localhost unless you explicitly need it remotely.
- Prefer OTLP HTTP over HTTPS through Caddy for remote services.
- Restrict SSH to your IP address.
- Use AWS Security Groups, OS firewall rules, or both.
Do not commit .env. It contains:
- OpenObserve admin password.
- OpenObserve ingestion auth header.
- Phoenix PostgreSQL password.
- Phoenix JWT signing secret.
- Phoenix admin bootstrap password.
- Phoenix API key.
- Collector ingest hash.
For EC2, store a second encrypted copy in AWS Systems Manager Parameter Store, Secrets Manager, or an encrypted S3 location.
OpenObserve stores local data under:
./data/openobserve
Phoenix stores durable data in PostgreSQL under:
./data/phoenix-postgres
Caddy stores TLS certificates under:
./data/caddy
Do not delete ./data unless you are intentionally resetting the stack.
The stack defaults to a 30-day retention cap for both systems:
OPENOBSERVE_RETENTION_DAYS=30
PHOENIX_RETENTION_DAYS=30
OpenObserve uses its compaction retention policy to delete older stream data. Phoenix uses its default project retention policy so new projects inherit the 30-day cap unless you override them in the UI.
Backups include:
- OpenObserve local data directory.
- Phoenix PostgreSQL dump.
- Runtime config and .env.
Run a local backup:
./scripts/backup-local.sh
The backup is written to:
./backups/YYYYMMDDTHHMMSSZ
Run an S3 backup:
./scripts/backup-s3.sh
Configure S3 in .env:
AWS_S3_BACKUP_URI=s3://your-bucket/observability
AWS_PROFILE=default
The S3 script uses:
aws s3 sync ... --sse AES256
For stronger control, enable bucket versioning, default encryption with KMS, lifecycle retention, and restricted IAM permissions.
Minimum IAM permissions for backup upload:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["s3:PutObject", "s3:ListBucket", "s3:GetBucketLocation"],
      "Resource": [
        "arn:aws:s3:::your-bucket",
        "arn:aws:s3:::your-bucket/observability/*"
      ]
    }
  ]
}
For restore from S3, add s3:GetObject.
For a local server:
15 2 * * * cd /path/to/observability && ./scripts/backup-local.sh >> ./backups/backup.log 2>&1
For EC2 with S3:
15 2 * * * cd /path/to/observability && ./scripts/backup-s3.sh >> ./backups/backup.log 2>&1
Recommended retention:
- Local: 7 to 14 daily backups.
- S3: 30 to 90 daily backups.
- Monthly archive: 6 to 12 months if logs are business-critical.
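The local retention window can be enforced with a small pruning step after each backup. This is only a sketch: prune_backups is a hypothetical helper, not one of the repo's scripts. It assumes backup folders use the YYYYMMDDTHHMMSSZ naming shown earlier, so that sorting by name also sorts by time.

```shell
# Hypothetical helper: keep the newest N timestamped backup folders under a
# backup root and delete the rest. Entries that do not match the timestamp
# pattern, such as backup.log, are left alone.
# $1 = backup root, $2 = number of backups to keep.
prune_backups() {
  ls -1 "$1" |
    grep -E '^[0-9]{8}T[0-9]{6}Z$' |
    sort -r |
    tail -n +"$(($2 + 1))" |
    while IFS= read -r name; do
      rm -rf -- "$1/$name"
    done
}
```

For a 14-day local window, the cron entry could run prune_backups ./backups 14 after the backup script succeeds. Try it on a copy of the backup directory first, since it deletes data.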
Restore from a local backup:
./scripts/restore-local.sh ./backups/YYYYMMDDTHHMMSSZ
Restore the latest backup from S3 back onto the EBS-backed local directories:
./scripts/restore-s3.sh
Restore a specific backup folder from S3:
./scripts/restore-s3.sh YYYYMMDDTHHMMSSZ
The restore process:
- Stops Collector, Phoenix, and OpenObserve.
- Replaces data/openobserve from backup.
- Recreates the Phoenix PostgreSQL database.
- Restores the Phoenix dump.
- Starts the stack again.
Always test restore on a separate machine or directory before trusting backups.
Start:
docker compose up -d
Stop:
docker compose down
Restart one service:
docker compose restart otel-collector
View logs:
docker compose logs -f otel-collector
docker compose logs -f openobserve
docker compose logs -f phoenix
Upgrade images:
./scripts/backup-local.sh
docker compose pull
docker compose up -d
Do not upgrade without a fresh backup.
Start conservatively:
- Use sampling in applications for very high-volume traces.
- Avoid putting secrets, full prompts, full completions, access tokens, or private user data into spans.
- Use stable attributes with bounded cardinality.
- Avoid high-cardinality labels like raw URLs, full SQL queries, email addresses, or unbounded user text.
- Keep verbose debug logs out of production unless you are actively investigating.
Recommended attributes:
service.name
service.version
deployment.environment
http.route
http.method
http.status_code
db.system
messaging.system
llm.model_name
llm.provider
openinference.span.kind
Use OpenObserve dashboards and alerts for service health. Use Phoenix projects and datasets for LLM workflows.
Phoenix is most useful when your LLM spans include:
- Model provider and model name.
- Prompt template or prompt identifier.
- Input/output token counts.
- Latency.
- Tool calls.
- Retrieval query.
- Retrieved document IDs and scores.
- Evaluation results.
- Error messages and retry counts.
Be careful with prompt and response payloads. If they may contain secrets, customer data, or personal data, redact them before export or only export metadata.
Before exposing this stack:
- Replace all placeholder secrets in .env.
- Confirm OpenObserve login works.
- Confirm Phoenix login works.
- Create Phoenix system API key and restart Collector.
- Confirm app telemetry reaches OpenObserve.
- Confirm LLM traces reach Phoenix.
- Confirm Caddy HTTPS certificates are issued.
- Confirm Security Group exposes only 80/443 and restricted SSH.
- Run ./scripts/backup-local.sh.
- Run ./scripts/backup-s3.sh on EC2.
- Test restore in a separate directory or host.
- Document who has admin access.
- Rotate ingestion credentials if a developer leaves or a project is compromised.
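The first checklist item can be checked mechanically. This is a minimal sketch that assumes placeholders in .env.example contain the literal strings change-me or replace, as described in the setup steps. The function name is hypothetical.

```shell
# Hypothetical helper: print the names of variables in an env file whose
# values still contain a placeholder marker. Prints nothing if none remain.
# Comment lines are ignored because they do not start with NAME=.
list_placeholder_vars() {
  grep -E '^[A-Za-z_][A-Za-z0-9_]*=.*(change-me|replace)' "$1" | cut -d= -f1
}
```

Running list_placeholder_vars .env before the first docker compose up -d should print nothing. A real value that happens to contain the word replace would be reported as a false positive, so treat the output as a prompt to look, not as proof.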
- OpenObserve OTLP ingestion: https://openobserve.ai/docs/ingestion/logs/otlp/
- OpenObserve traces: https://openobserve.ai/docs/user-guide/data-exploration/traces/traces/
- Phoenix self-hosting: https://arize.com/docs/phoenix/self-hosting/deploying-phoenix
- Phoenix configuration: https://arize.com/docs/phoenix/self-hosting/configuration
- Phoenix authentication: https://arize.com/docs/phoenix/self-hosting/authentication
- OpenTelemetry Collector configuration: https://opentelemetry.io/docs/collector/configuration/