Common issues and solutions for itsup infrastructure.
itsup status # Infrastructure status
docker ps -a # All containers
docker network ls # Networks
itsup proxy logs traefik # Traefik logs
itsup svc {project} logs # Project logs
tail -f logs/*.log # System logs (access, api, monitor)
itsup validate # All projects
itsup validate {project} # Specific project
itsup svc {project} restart # Restart project
itsup proxy restart traefik # Restart Traefik
itsup run # Restart everything
Symptom: itsup dns up fails with network error.
Possible Causes:
- Port 53 already in use
- Network conflict
- Docker daemon not running
Solutions:
Check port 53:
sudo netstat -tlnp | grep :53
# Or
sudo lsof -i :53
If systemd-resolved is using port 53:
# Disable stub resolver
sudo sed -i 's/#DNSStubListener=yes/DNSStubListener=no/' /etc/systemd/resolved.conf
sudo systemctl restart systemd-resolved
Check Docker daemon:
sudo systemctl status docker
sudo systemctl start docker # If not running
Check for conflicting networks:
docker network ls | grep proxynet
docker network rm proxynet # If exists but corrupted
itsup dns up # Will recreate
Symptom: itsup proxy up fails or Traefik not responding.
Check Traefik logs:
itsup proxy logs traefik
Common errors and fixes:
"Error while creating certificate":
# Let's Encrypt rate limit hit
# Wait 1 hour or use staging environment
vim projects/traefik.yml
# Add:
certificatesResolvers:
  letsencrypt:
    acme:
      caServer: https://acme-staging-v02.api.letsencrypt.org/directory
"Cannot connect to Docker daemon":
# dockerproxy not running or misconfigured
itsup proxy logs dockerproxy
# Check dockerproxy is accessible
curl http://localhost:2375/version # Should return JSON
# Restart dockerproxy
itsup proxy restart dockerproxy
"Address already in use :80":
# Another service using port 80
sudo netstat -tlnp | grep :80
# Stop conflicting service
sudo systemctl stop nginx # Or apache2, etc.
Symptom: bin/start-api.sh fails or API not responding.
Check API logs:
tail -f logs/api.log
Common issues:
"ModuleNotFoundError":
# Missing dependencies
source .venv/bin/activate
pip install -r requirements.txt
"Port 8080 already in use":
# Find process using port
sudo lsof -i :8080
# Kill or change API port
"Permission denied":
# Check file permissions
ls -l bin/start-api.sh
chmod +x bin/start-api.sh
Symptom: itsup monitor start fails.
Common causes:
"Permission denied (eBPF)":
# Monitor requires root for eBPF
sudo itsup monitor start
"OpenSnitch database not found":
# Check OpenSnitch is installed
sudo systemctl status opensnitch
# Check database exists
ls -l /var/lib/opensnitch/opensnitch.sqlite3
# Start without OpenSnitch integration
itsup monitor start # Without --use-opensnitch flag
Symptom: itsup apply {project} fails with error.
Check deployment logs:
itsup apply {project} --verbose
Common errors:
"Service '{service}' failed to build":
# Build context issue or Dockerfile error
# Check Dockerfile syntax
docker build projects/{project}/
# Check build context
ls projects/{project}/
"Cannot start service: port already allocated":
# Port conflict with another container
docker ps | grep {port}
# Change port in docker-compose.yml or stop conflicting container
"Network 'proxynet' not found":
# DNS stack not running
itsup dns up
# Verify network exists
docker network ls | grep proxynet
"Error while fetching server API version":
# Docker daemon not running or not accessible
sudo systemctl status docker
sudo systemctl start docker
Symptom: Container starts but immediately exits and restarts.
Check container logs:
itsup svc {project} logs {service}
Common causes:
Application crash on startup:
- Check logs for error messages
- Verify environment variables are set correctly
- Test image manually:
docker run -it {image} sh
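The exit code of a crashing container often narrows down the cause before you read the full logs. The following is a minimal sketch; the helper name explain_exit_code is hypothetical, and the interpretations are common conventions (codes above 128 mean "killed by signal code minus 128"), not guarantees for every image.

```shell
# Map a container exit code to a likely cause.
# Codes above 128 mean the process was killed by signal (code - 128).
explain_exit_code() {
  case "$1" in
    0)   echo "exited cleanly; the command may be short-lived" ;;
    126) echo "command found but not executable; check file permissions" ;;
    127) echo "command not found; check the entrypoint or command path" ;;
    137) echo "killed by SIGKILL; often the OOM killer or a forced stop" ;;
    139) echo "segmentation fault (SIGSEGV)" ;;
    143) echo "terminated by SIGTERM; usually a normal stop request" ;;
    *)   echo "application-specific error; check the container logs" ;;
  esac
}

# Usage (needs a running Docker daemon):
#   code=$(docker inspect --format '{{.State.ExitCode}}' {container})
#   explain_exit_code "$code"
```

An exit code of 137 combined with a true value from docker inspect --format '{{.State.OOMKilled}}' {container} points at memory limits rather than an application bug.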
Health check failing:
# Check health check status
docker inspect {container} | jq '.[0].State.Health'
# Disable health check temporarily (for debugging)
vim projects/{project}/docker-compose.yml
# Comment out healthcheck section
itsup apply {project}
Missing volume or file:
# Check volume mounts
docker inspect {container} | jq '.[0].Mounts'
# Verify host paths exist
ls -l /path/to/volume
Symptom: Container running but domain returns 404 or connection refused.
Check step-by-step:
1. Verify container is running:
itsup svc {project} ps
docker ps | grep {project}
2. Check container network:
docker inspect {container} | jq '.[0].NetworkSettings.Networks'
# Should show connection to proxynet
3. Check Traefik sees the service:
itsup proxy logs traefik | grep {project}
# Should show "Adding route" or "Server added"
4. Check Traefik labels:
docker inspect {container} | jq '.[0].Config.Labels' | grep traefik
# Should show traefik.enable=true and routing labels
5. Test direct access (bypass Traefik):
# Find container IP
docker inspect {container} | jq -r '.[0].NetworkSettings.Networks.proxynet.IPAddress'
# Test directly
curl http://{container-ip}:{port}
6. Test via Traefik:
# Test with Host header
curl -H "Host: {domain}" http://localhost/
# Should return service response
Common fixes:
Missing Traefik labels:
# Regenerate config
itsup apply {project}
# Verify labels in generated file
grep "traefik.enable" upstream/{project}/docker-compose.yml
Wrong domain in ingress.yml:
vim projects/{project}/ingress.yml
# Verify domain matches DNS/host file
itsup apply {project}
Service not listening on configured port:
# Check what port service actually uses
docker exec {container} netstat -tlnp
# Update ingress.yml to match actual port
Symptom: Container can't reach internet or external APIs.
Check container connectivity:
# Test DNS resolution
docker exec {container} nslookup google.com
# Test internet connectivity
docker exec {container} ping -c 3 8.8.8.8
# Test HTTPS
docker exec {container} curl https://www.google.com
Common causes:
DNS not working:
# Check container's DNS config
docker inspect {container} | jq '.[0].HostConfig.Dns'
# Use Docker's default DNS
vim projects/{project}/docker-compose.yml
# Remove any custom DNS settings
Firewall blocking:
# Check iptables rules
sudo iptables -L DOCKER-USER -n -v
# Check if monitor blocked the connection
itsup monitor logs | grep {container}
# Whitelist destination
echo "destination-ip-or-domain" >> config/monitor-whitelist.txt
itsup monitor restart
Network isolation:
# Verify container has internet access
docker run --rm --network {network} alpine ping -c 3 8.8.8.8
# If fails, check Docker network configuration
docker network inspect {network}
Symptom: Container A can't reach container B.
Check both containers are on same network:
docker network inspect proxynet
# Should show both containers
Test connectivity:
# From container A (-c 3 keeps ping from running indefinitely)
docker exec {container-a} ping -c 3 {container-b}
docker exec {container-a} curl http://{container-b}:{port}
Common fixes:
Not on same network:
# In docker-compose.yml
services:
  app:
    networks:
      - proxynet
      - backend
  db:
    networks:
      - backend # Add proxynet if needed
Wrong hostname:
# Use service name as hostname (not container name)
# Correct: http://db:5432
# Wrong: http://project-db-1:5432
Symptom: HTTPS returns "certificate not valid" or "NET::ERR_CERT_AUTHORITY_INVALID".
Check certificate status:
itsup proxy logs traefik | grep -i certificate
Common causes:
Rate limit hit:
Error while obtaining certificate: too many certificates already issued
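Fix: Wait for the rate limit to reset or use the staging server (see Proxy Stack Won't Start). Failed-validation limits reset within about an hour; certificate issuance limits are counted over a 7-day window, so this particular error can take longer to clear.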
Challenge failed:
Error while obtaining certificate: challenge failed
Fix:
# Verify domain DNS points to server
nslookup {domain}
# Verify port 80 is accessible from internet
curl http://{domain}
# Check Traefik logs for specific challenge error
itsup proxy logs traefik | grep -i challenge
Fix by forcing renewal:
# Remove stored certificates (forces re-issue for every domain in acme.json)
rm proxy/traefik/acme.json
itsup proxy restart traefik
# Watch certificate issuance
itsup proxy logs traefik | grep -i certificate
Symptom: HTTPS works but browser shows "certificate expired".
Check certificate expiry:
echo | openssl s_client -connect {domain}:443 2>/dev/null | openssl x509 -noout -dates
Auto-renewal should handle this. If not:
Force renewal:
rm proxy/traefik/acme.json
itsup proxy restart traefik
Check renewal is working:
# Traefik should log renewal attempts
itsup proxy logs traefik | grep -i renew
Symptom: HTTPS site loads but browser shows "mixed content" warnings.
Cause: Site serving HTTP resources on HTTPS page.
Fix in application:
- Use protocol-relative URLs: //cdn.example.com/script.js
- Or force HTTPS: https://cdn.example.com/script.js
- Add middleware to Traefik to enforce HTTPS headers
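Before changing the application, it helps to locate the hard-coded HTTP references. The sketch below assumes the site's HTML or templates are available in a local directory; the helper name find_insecure_refs is hypothetical. It matches src= and href= attributes, so plain navigation links (which are not mixed content) will also show up and need a manual look.

```shell
# Print file:line:match for every http:// URL used in a src= or href= attribute.
# Returns non-zero when nothing is found (grep's normal behaviour).
find_insecure_refs() {
  grep -rnoE "(src|href)=[\"']http://[^\"']+" "$1"
}

# Usage:
#   find_insecure_refs path/to/site
```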
Add security headers:
# In projects/traefik.yml
http:
  middlewares:
    security-headers:
      headers:
        forceSTSHeader: true
        stsSeconds: 31536000
        stsIncludeSubdomains: true
        contentSecurityPolicy: "upgrade-insecure-requests"
# In ingress.yml
ingress:
  - service: web
    middleware: [security-headers]
Symptom: Container starts but environment variables are empty or undefined.
Check secrets file exists:
ls -l secrets/{project}.txt
cat secrets/{project}.txt | grep {VAR}
Decrypt if encrypted:
itsup decrypt {project}
Verify variable in compose file:
grep {VAR} projects/{project}/docker-compose.yml
# Should show: - VAR=${VAR}
Check container environment:
docker exec {container} env | grep {VAR}
Force reload:
itsup svc {project} down
itsup apply {project}
Symptom: itsup encrypt or itsup decrypt fails.
Check SOPS is installed:
sops --version
Check SOPS configuration:
cat .sops.yaml
Check GPG/age keys:
# For GPG
gpg --list-secret-keys
# For age
ls -l ~/.config/sops/age/keys.txt
Manual decryption (debug):
sops -d secrets/{project}.enc.txt
If corrupt, restore from git:
git checkout HEAD -- secrets/{project}.enc.txt
itsup decrypt {project}
Symptom: Server CPU constantly high.
Check which container:
docker stats --no-stream
# Shows CPU usage per container
Inspect container:
# Check process list
docker exec {container} ps aux
# Check logs for errors
itsup svc {project} logs {service}
Common causes:
Restart loop: Container crashing and restarting constantly
- Fix: Check logs, fix application error
Infinite loop: Application bug causing CPU spin
- Fix: Stop container, fix bug, redeploy
Resource exhaustion: Container needs more CPU
- Fix: Add resource limits or increase host capacity
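To single out the offending container without reading the whole docker stats table, the output can be filtered by a threshold. This is a minimal sketch: the helper name flag_high_cpu is hypothetical, and the 80 percent threshold in the usage line is an arbitrary example.

```shell
# Read "name cpu%" pairs on stdin and print the names whose CPU usage
# exceeds the given threshold (in percent).
flag_high_cpu() {
  awk -v limit="$1" '{ cpu = $2; sub(/%/, "", cpu); if (cpu + 0 > limit + 0) print $1 }'
}

# Usage (needs a running Docker daemon):
#   docker stats --no-stream --format '{{.Name}} {{.CPUPerc}}' | flag_high_cpu 80
```

For a suspected restart loop, docker inspect --format '{{.Name}} {{.RestartCount}}' {container} shows how often Docker has restarted the container.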
Symptom: Server memory constantly high or OOM errors.
Check which container:
docker stats --no-stream
# Shows memory usage per container
Add memory limits (prevent one container from hogging all memory):
# In docker-compose.yml
services:
  app:
    deploy:
      resources:
        limits:
          memory: 512M
        reservations:
          memory: 256M
Check for memory leaks:
# Monitor over time
watch -n 5 "docker stats --no-stream | grep {container}"
Symptom: Application responds slowly or times out.
Check Traefik logs:
tail -f logs/access.log
# Look for response times (last column in CLF format)
Test direct vs through Traefik:
# Direct (should be fast)
time curl http://{container-ip}:{port}
# Through Traefik (compare)
time curl https://{domain}
If Traefik is slow:
- Check middleware (auth, rate limiting can slow requests)
- Check Traefik logs for errors
- Check Traefik resource usage
If application is slow:
- Check application logs
- Check database connection
- Profile application
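Scanning the access log by eye for slow requests gets tedious on a busy server. The sketch below assumes each line of logs/access.log ends with the request duration in milliseconds (for example 12ms), as in Traefik's CLF output; the helper name slow_requests and the 500 ms threshold are hypothetical choices.

```shell
# Read access-log lines on stdin and print those whose last field
# (request duration, e.g. "1200ms") exceeds the given limit in milliseconds.
slow_requests() {
  awk -v limit="$1" '{ d = $NF; sub(/ms$/, "", d); if (d + 0 > limit + 0) print }'
}

# Usage:
#   slow_requests 500 < logs/access.log
```

If the slow lines cluster on one router or backend, the problem is more likely in that application than in Traefik itself.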
Symptom: Any docker command hangs or fails.
Check daemon status:
sudo systemctl status docker
Restart daemon:
sudo systemctl restart docker
Check logs:
sudo journalctl -u docker -n 100
Symptom: "no space left on device" errors.
Check disk usage:
df -h
docker system df # Docker-specific disk usage
Clean up Docker:
# Remove stopped containers
docker container prune -f
# Remove unused images
docker image prune -a -f
# Remove unused volumes
docker volume prune -f
# Remove unused networks
docker network prune -f
# All-in-one cleanup
docker system prune -a --volumes -f
For itsup containers specifically:
itsup down --clean # Removes stopped itsup containers
Symptom: docker rm fails with "container is running" or "device or resource busy".
Force stop and remove:
docker stop -t 1 {container} # Stop with 1s timeout
docker rm -f {container} # Force remove
If it still fails:
# Check if container is being recreated
docker events | grep {container}
# Restart Docker daemon
sudo systemctl restart docker
Before asking for help, collect:
- System information:
uname -a
docker --version
docker compose version
- itsup version:
itsup --version
git log -1
- Status:
itsup status
docker ps -a
docker network ls
- Logs (with verbose output):
itsup apply {project} --verbose > debug.log 2>&1
- Configuration (redact secrets):
cat projects/{project}/docker-compose.yml
cat projects/{project}/ingress.yml
- GitHub Issues: https://github.com/user/srv/issues
- Project README: Check for troubleshooting section
- Docker Docs: https://docs.docker.com/
- Traefik Docs: https://doc.traefik.io/traefik/
- Editing upstream/ instead of projects/
  - Always edit source (projects/), not generated artifacts
- Forgetting to decrypt secrets
  - Run itsup decrypt {project} after cloning repo
- Not loading secrets at deployment
  - itsup apply loads secrets automatically
  - Manual docker compose commands need env parameter
- Committing plaintext secrets
  - Only commit .enc.txt files
  - .txt files are gitignored
- Not restarting after config changes
  - itsup apply regenerates and restarts
  - Manual changes to upstream/ are lost on next apply
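The plaintext-secrets mistake can be caught mechanically before a commit. The following is a sketch only, assuming secrets live under secrets/ with the .txt and .enc.txt naming used above; the helper name plaintext_secrets is hypothetical and this is not part of itsup itself.

```shell
# Read file paths on stdin and print those under secrets/ that end in .txt
# but not in .enc.txt (that is, plaintext secrets).
# Exits 0 when at least one plaintext secret is listed, non-zero otherwise.
plaintext_secrets() {
  grep -E '^secrets/.*\.txt$' | grep -v '\.enc\.txt$'
}

# Usage as a pre-commit check:
#   if git diff --cached --name-only | plaintext_secrets; then
#     echo "Refusing to commit plaintext secrets" >&2
#     exit 1
#   fi
```

Because .txt files are gitignored, this check should normally print nothing; it only fires if a file was force-added.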