This document covers the full deployment and teardown lifecycle on AWS EKS.
For local development setup, see the Quick Start section in README.
| Tool | Install |
|---|---|
| AWS CLI v2 | brew install awscli |
| Terraform >= 1.0 | brew install terraform |
| kubectl | brew install kubectl |
| Helm | brew install helm |
| Docker | Docker Desktop / OrbStack |
| jq | brew install jq |
deploy-all.sh will verify all tools are installed before proceeding.
- An AWS account with permissions to create VPC, EKS, RDS, IAM, ACM, Route 53, and ECR resources.
- AWS CLI configured with valid credentials (`aws configure`, SSO, or environment variables).
- A registered domain managed via Route 53 (default: `kolya.fun`).
# Option 1: SSO profile
aws sso login --profile <your-profile>
export AWS_PROFILE=<your-profile>
# Option 2: environment variables
export AWS_ACCESS_KEY_ID=...
export AWS_SECRET_ACCESS_KEY=...
# Set target region
export AWS_REGION=us-west-1 # or whatever region you want

One certificate covers both domains (kbp.kolya.fun and api.kbp.kolya.fun) using Subject Alternative Names (SAN). The cert must be in the same region as the ALBs.
CERT_ARN=$(aws acm request-certificate \
--region $AWS_REGION \
--domain-name kbp.kolya.fun \
--subject-alternative-names "api.kbp.kolya.fun" \
--validation-method DNS \
--query 'CertificateArn' \
--output text)
echo "Certificate ARN: $CERT_ARN"

ACM generates one CNAME record per domain. Retrieve them:
aws acm describe-certificate \
--region $AWS_REGION \
--certificate-arn $CERT_ARN \
--query 'Certificate.DomainValidationOptions[*].{Domain:DomainName,Name:ResourceRecord.Name,Value:ResourceRecord.Value}' \
--output table

You will see two records (one for each domain):
-----------------------------------------------------------------------
| DescribeCertificate |
+---------------------+----------------------+------------------------+
| Domain | Name | Value |
+---------------------+----------------------+------------------------+
| kbp.kolya.fun | _abc123.kbp... | _def456.acm-... |
| api.kbp.kolya.fun | _abc123.api.kbp... | _ghi789.acm-... |
+---------------------+----------------------+------------------------+
Note: Both domains may share the same CNAME record if they share the same root domain. In that case only one record needs to be added.
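Since jq is already a prerequisite, one way to list only the unique validation records is shown below (a sketch, not part of deploy-all.sh):

```shell
# print each distinct Name/Value pair once, even if both domains share a record
aws acm describe-certificate \
  --region $AWS_REGION \
  --certificate-arn $CERT_ARN \
  --query 'Certificate.DomainValidationOptions[*].ResourceRecord' \
  --output json | jq -r 'unique[] | "\(.Name) CNAME \(.Value)"'
```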
Get your hosted zone ID first:
ZONE_ID=$(aws route53 list-hosted-zones-by-name \
--dns-name kolya.fun \
--query 'HostedZones[0].Id' \
--output text | cut -d'/' -f3)
echo "Hosted Zone ID: $ZONE_ID"

Add each CNAME record returned in the previous step. Repeat for each unique record:
aws route53 change-resource-record-sets \
--hosted-zone-id $ZONE_ID \
--change-batch '{
"Changes": [
{
"Action": "UPSERT",
"ResourceRecordSet": {
"Name": "<Name-from-table-above>",
"Type": "CNAME",
"TTL": 300,
"ResourceRecords": [{"Value": "<Value-from-table-above>"}]
}
}
]
}'

DNS propagation usually takes 1–5 minutes. Poll until the status is ISSUED:
while true; do
STATUS=$(aws acm describe-certificate \
--region $AWS_REGION \
--certificate-arn $CERT_ARN \
--query 'Certificate.Status' \
--output text)
echo "$(date '+%H:%M:%S') Status: $STATUS"
[[ "$STATUS" == "ISSUED" ]] && break
sleep 15
done
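As an alternative to the polling loop, recent AWS CLI versions ship a built-in ACM waiter that blocks until validation succeeds (less feedback, simpler script):

```shell
# returns once the certificate reaches ISSUED, or fails after the waiter's retry budget
aws acm wait certificate-validated \
  --region $AWS_REGION \
  --certificate-arn $CERT_ARN
```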
echo "Certificate issued: $CERT_ARN"

Record $CERT_ARN — deploy-all.sh will prompt for it during Step 4 (K8s app deployment).
echo $CERT_ARN
# arn:aws:acm:us-west-1:612674025488:certificate/xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx

aws s3 mb s3://tf-state-<account-id>-${AWS_REGION}-kolya --region $AWS_REGION

./deploy-all.sh --region $AWS_REGION

The script will interactively:
- Validate AWS credentials and region
- Configure `terraform.tfvars` (Step 0) — auto-detects account/region/feature flags, prompts for domains
- Prompt for S3 backend config (bucket name) — generates `iac/providers.tf` from template
- Select or create a Terraform workspace
- Run `terraform init` + `plan` + `apply` (VPC, EKS, RDS, etc.)
- Deploy Helm charts (ALB Controller, Karpenter, Metrics Server)
- Build and push Docker images to ECR (domains read from `terraform.tfvars`)
- Deploy K8s application (config generated from `terraform.tfvars`, secrets via ESO)
- Auto-enable WAF after ALBs are ready
./deploy-all.sh --step 5

Each region uses its own Terraform state — no conflicts. Resource names include the region, so resources don't overlap.
Key step: reset `providers.tf`

If `iac/providers.tf` already exists (pointing to the old region's state bucket), delete it first:

rm iac/providers.tf

deploy-all.sh will then prompt you to configure the new region's S3 backend.
Then follow Section A from Step 1.
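Put together, a second-region rollout looks roughly like this (the region name is only an example):

```shell
export AWS_REGION=eu-central-1        # example target region
rm -f iac/providers.tf                # drop the old region's backend config
./deploy-all.sh --region $AWS_REGION  # prompts for the new S3 backend, then runs the steps
```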
# Run a specific step only
./deploy-all.sh --step 0 # Configure terraform.tfvars
./deploy-all.sh --step 1 # Terraform only
./deploy-all.sh --step 2 # Helm only
./deploy-all.sh --step 3 # Docker build only
./deploy-all.sh --step 4 # K8s app deploy only
./deploy-all.sh --step 5 # Toggle Global Accelerator
# Skip confirmations (CI/CD)
./deploy-all.sh --yes
# K8s management
cd k8s && ./deploy.sh status # View app status
cd k8s && ./deploy.sh logs # View logs
cd k8s && ./deploy.sh update # Update app config

`iac/providers.tf` determines which region's state Terraform operates on. To switch:
# 1. Delete current providers.tf
rm iac/providers.tf
# 2. Re-run deploy-all.sh for the target region
./deploy-all.sh --region <target-region>
# It will prompt for the new S3 backend config

Or manually:
# 1. Delete current providers.tf
rm iac/providers.tf
# 2. Regenerate providers.tf from the template
cd iac
cp providers.tf.template providers.tf
# Edit it: enter the target region's S3 bucket and region
# 3. Re-init Terraform
terraform init -reconfigure

| Concern | How it's handled |
|---|---|
| Account / region variables | Written to terraform.tfvars by Step 0 (single source of truth) |
| Domain names | Written to terraform.tfvars by Step 0; read by all subsequent steps |
| Terraform backend | Generated from providers.tf.template on first run |
| Terraform workspace | Interactive selection with confirmation at each step |
| WAF / GA / Cognito toggles | Persisted in terraform.tfvars; auto-detected from state by Step 0 |
| WAF ordering | Auto-enabled in terraform.tfvars after ALBs are ready in Step 4 |
| Global Accelerator | Disabled by default, toggle via --step 5 (updates terraform.tfvars) |
| Cognito callback URLs | Auto-derived from frontend_domain in terraform.tfvars |
| ESO credentials | Pod Identity → ESO controller in external-secrets namespace |
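After a deploy, the ESO setup described in the table above can be spot-checked like this (namespace names as listed):

```shell
# controller pods should be Running in the external-secrets namespace
kubectl get pods -n external-secrets
# ExternalSecrets in the app namespace should report a synced status
kubectl get externalsecrets -n kbp
```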
Use destroy.sh to safely tear down all resources for a specific account, region, and workspace.
# Interactive mode (prompts for everything)
./destroy.sh
# Specify account and region
./destroy.sh --account 123456789012 --region us-west-1
# Specify all parameters
./destroy.sh --account 123456789012 --region us-west-1 --workspace kolya

- Verify AWS identity — validates credentials, confirms account ID, region, and workspace
- Check EKS cluster — if the cluster exists, connects and lists all resources in namespace `kbp`
- Clean up K8s resources — deletes Ingress first (triggers ALB cleanup), waits 30s, then deletes ExternalSecrets, remaining resources, and namespace
- Configure Terraform backend — prompts for S3 bucket if `providers.tf` is missing
- Verify workspace — ensures the workspace exists in the backend
- Run `terraform plan -destroy` — shows what will be destroyed
- Final confirmation — requires typing `destroy` to proceed
- Run `terraform destroy` — destroys all infrastructure
- K8s resources (especially Ingress/ALB) must be deleted before Terraform destroy, otherwise ALBs and target groups will block Terraform
- The script handles this automatically by cleaning up K8s resources first
- If the EKS cluster doesn't exist (already destroyed), K8s cleanup is skipped
`--account` and `--region` have the highest priority; if provided, they skip auto-detection but are still verified against the current credentials.
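If you ever need to do the K8s cleanup by hand (for example, if destroy.sh aborted midway), the rough equivalent of its cleanup step is:

```shell
# delete Ingress first so the ALB controller tears down ALBs/target groups
kubectl delete ingress --all -n kbp
sleep 30  # give the controller time to deregister the ALBs
kubectl delete externalsecrets --all -n kbp
kubectl delete namespace kbp
```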
# 1. Ensure AWS credentials are configured for the target account
export AWS_PROFILE=my-profile
aws sso login --profile my-profile
# 2. Run destroy
./destroy.sh --account 123456789012 --region us-west-1 --workspace kolya
# 3. (Optional) Delete the S3 state bucket if no longer needed
aws s3 rb s3://tf-state-123456789012-us-west-1-kolya --force --region us-west-1

The Terraform workspace (prod vs anything else) determines resource sizing:
| Category | Setting | Non-Prod | Prod |
|---|---|---|---|
| Backend Pod | CPU request / limit | 100m / 500m | 200m / 1000m |
| | Memory request / limit | 256Mi / 512Mi | 512Mi / 1024Mi |
| | HPA min replicas | 1 | 2 |
| Frontend Pod | CPU request / limit | 50m / 200m | 100m / 500m |
| | Memory request / limit | 128Mi / 256Mi | 256Mi / 512Mi |
| | HPA min replicas | 1 | 2 |
| EKS Core Nodes | Instance type | `t4g.small` | `t4g.medium` |
| | EBS volume size | 30 GB | 100 GB |
| Karpenter Nodes | Instance category | `t` (t4g) | `m` (m7g) |
| | EBS volume size | 30 GB | 100 GB |
| | CPU limit | 100 | 1000 |
| | Memory limit | 100 Gi | 1000 Gi |
| RDS Aurora | Deletion protection | Disabled | Enabled |
| | Backup retention (days) | 1 | 7 |
| | Preferred backup window | Not set | 03:00-04:00 UTC |
| | Copy tags to snapshot | No | Yes |
| | Skip final snapshot | Yes | No |
| | Apply immediately | Yes | No |
| | CloudWatch log exports | None | `["postgresql"]` |
| | Monitoring interval (sec) | 0 (disabled) | 60 |
| | Performance Insights | Disabled | Enabled |
| Cognito | Advanced security mode | `AUDIT` | `ENFORCED` |
| | Deletion protection | Disabled | Enabled |
| Global Accelerator | Flow logs | Disabled | Enabled |
The following environment variables control backend runtime behavior. They can be set in the K8s ConfigMap or via ESO-managed secrets.
| Variable | Type | Default | Description |
|---|---|---|---|
| `KBR_STREAM_FIRST_CONTENT_TIMEOUT` | int | `600` | Seconds to wait for the first content chunk after a stream starts. If exceeded, the request fails over to the next region/model. Set to `0` to disable failover. |
| `KBR_STREAM_MODEL_FALLBACK_CHAIN` | string | `""` | Comma-separated model fallback chain for Level 2 degradation. Example: `anthropic.claude-opus-4-0-20250514-v1:0,anthropic.claude-sonnet-4-20250514-v1:0`. Empty string disables model degradation. |
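For illustration, the same two settings in shell form (values copied from the table; in the cluster they belong in the ConfigMap or ESO-managed secrets, not a shell profile):

```shell
export KBR_STREAM_FIRST_CONTENT_TIMEOUT=600   # fail over if no content within 600s
export KBR_STREAM_MODEL_FALLBACK_CHAIN="anthropic.claude-opus-4-0-20250514-v1:0,anthropic.claude-sonnet-4-20250514-v1:0"
```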
Logs include a [token_name] field that identifies which API key produced each log line, enabling per-key filtering.
%(asctime)s - %(name)s - %(levelname)s - [%(token_name)s] %(message)s
When no token context is available the field shows [-].
Example output:
2026-04-11 08:23:01,234 - kolya_br_proxy.router - INFO - [my-team-key] streaming request to us-west-2
2026-04-11 08:23:05,678 - kolya_br_proxy.router - WARNING - [-] health check from unknown caller
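That bracketed field makes per-key filtering a one-liner (the deployment name backend is assumed, as elsewhere in this doc):

```shell
# follow only one key's log lines; [-] lines (no token context) are excluded
kubectl logs -f deployment/backend -n kbp | grep -F '[my-team-key]'
```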
AWS Global Accelerator routes traffic over the AWS backbone network, which can reduce latency for geographically distant users by 40-60%.
Important: Global Accelerator requires ALBs created in Step 4. Always run Steps 1-4 first.
./deploy-all.sh --step 5

The script detects the current GA state and offers the appropriate action (enable or disable).
| Service | GA Port | ALB Port | Protocol |
|---|---|---|---|
| Frontend | 443 | 443 | HTTPS |
| Frontend | 80 | 80 | HTTP |
| API | 8443 | 443 | HTTPS |
| API | 8080 | 80 | HTTP |
GA_DNS=$(terraform output -raw global_accelerator_dns_name)
# kbp.kolya.fun CNAME $GA_DNS
# ga-api.kbp.kolya.fun CNAME $GA_DNS

| Component | Monthly Cost |
|---|---|
| Fixed fee | $18.00 |
| Data transfer (100 GB) | $1.50 |
| Total (typical) | ~$19.50 |
After Ingress resources create ALBs, configure DNS records:
# Get ALB addresses
kubectl get ingress -n kbp

| Record | Type | Value |
|---|---|---|
| `kbp.kolya.fun` | CNAME | Frontend ALB hostname |
| `api.kbp.kolya.fun` | CNAME | API ALB hostname |
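A sketch for wiring one record from the CLI. The ingress name frontend is an assumption; check `kubectl get ingress -n kbp` for the real names, and ZONE_ID comes from the earlier ACM validation step:

```shell
# read the ALB hostname off the Ingress status, then upsert the CNAME
ALB_HOST=$(kubectl get ingress frontend -n kbp \
  -o jsonpath='{.status.loadBalancer.ingress[0].hostname}')
aws route53 change-resource-record-sets \
  --hosted-zone-id $ZONE_ID \
  --change-batch "{\"Changes\":[{\"Action\":\"UPSERT\",\"ResourceRecordSet\":{\"Name\":\"kbp.kolya.fun\",\"Type\":\"CNAME\",\"TTL\":300,\"ResourceRecords\":[{\"Value\":\"$ALB_HOST\"}]}}]}"
```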
# Local
cd backend && uv run alembic upgrade head
# On EKS (uv is not in the production image, use python directly)
kubectl exec -it deployment/backend -n kbp -- alembic upgrade head
# Create a new migration (local only)
cd backend && uv run alembic revision --autogenerate -m "describe your change"

# Roll back a bad deploy
kubectl rollout undo deployment/backend -n kbp
kubectl rollout undo deployment/frontend -n kbp

# Disable Global Accelerator
./deploy-all.sh --step 5 # detects GA is enabled, offers to disable

# ALB not appearing? Check the controller
kubectl get pods -n kube-system | grep aws-load-balancer
kubectl logs -n kube-system -l app.kubernetes.io/name=aws-load-balancer-controller
kubectl describe ingress -n kbp

# Pods failing? Inspect events and logs
kubectl describe pod <pod-name> -n kbp
kubectl logs <pod-name> -n kbp

Common causes: image pull failure (ECR permissions), config error (check ESO sync), insufficient resources (check Karpenter).
kubectl get secret backend-secrets -n kbp -o yaml
kubectl run -it --rm debug --image=postgres:15 --restart=Never -- \
psql "postgresql://postgres:PASSWORD@RDS_ENDPOINT:5432/DATABASE" # pragma: allowlist secret

kubectl top nodes
kubectl top pods -n kbp
kubectl rollout restart deployment metrics-server -n kube-system
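If `kubectl top` still fails after the restart, check that the metrics API itself is registered and available:

```shell
# the AVAILABLE column should read True once metrics-server is serving
kubectl get apiservice v1beta1.metrics.k8s.io
```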