cloudon-one/kubelaunch-essentials

KubeLaunch Essentials

Production-ready Kubernetes platform on AWS EKS with integrated security controls, GitOps automation, service mesh, and observability — deployed entirely via Infrastructure as Code.



Architecture

graph TB
    subgraph AWS["AWS Infrastructure"]
        OIDC["GitHub OIDC"] --> S3["S3 State Backend<br/>KMS + DynamoDB"]
        Lambda["Secrets Rotation<br/>Lambda"] --> SM["Secrets Manager"]
        Audit["Security Audit<br/>CloudWatch"]
    end

    subgraph Core["1. Core Platform"]
        Karpenter["Karpenter"] & ExDNS["External DNS"] & CertMgr["Cert Manager"] & ExtSec["External Secrets"]
    end

    subgraph Mesh["2. Service Mesh"]
        Istio["Istio mTLS"] & Kong["Kong Gateway"] & Jaeger["Jaeger"]
    end

    subgraph Sec["3. Security"]
        Kyverno["Kyverno"] & Falco["Falco"] & Velero["Velero"]
    end

    subgraph Obs["4. Observability"]
        Loki["Loki Stack"] & Kubecost["Kubecost"] & Compliance["CIS Scanner"]
    end

    subgraph Tools["5. Platform Tools"]
        ArgoCD["ArgoCD"] & Atlantis["Atlantis"] & Vault["Vault"] & Airflow["Airflow"]
    end

    SM --> ExtSec
    ExtSec --> ArgoCD & Vault
    Kyverno -.->|Policy| Tools & Mesh
    Falco -.->|Monitor| Core
    Velero -.->|Backup| Tools

Deployment order: Core Platform -> Service Mesh -> Security -> Observability -> Platform Tools. Destroy in reverse.


Component Matrix

| Layer | Component | Version | Purpose |
|---|---|---|---|
| Core Platform | Karpenter | v1.10.0 | Node auto-provisioning |
| | External DNS | - | DNS automation |
| | Cert Manager | v1.20.0 | Certificate lifecycle |
| | External Secrets | v2.2.0 | AWS Secrets sync |
| Service Mesh | Istio | v1.29.1 | mTLS, traffic management |
| | Kong Gateway | v3.9.1 | API gateway |
| | Jaeger | v2.16.0 | Distributed tracing |
| Security | Kyverno | v3.7.1 | Admission control (4 policies) |
| | Falco | v8.0.1 | Runtime threat detection (eBPF) |
| | Velero | v12.0.0 | Backup & disaster recovery |
| Observability | Loki Stack | v3.7.1 | Log aggregation |
| | Kubecost | v3.0.3 | FinOps / cost monitoring |
| | Compliance Scanner | v1.2.0 | CIS 1.8 benchmark scanning |
| Platform Tools | ArgoCD | v3.3.6 | GitOps deployment |
| | Atlantis | - | Terraform PR automation |
| | Vault | v1.21.4 | Secrets management |
| | Airflow | v3.1.8 | Workflow orchestration |
| AWS Infra | State Backend | - | S3 + DynamoDB + KMS |
| | GitHub OIDC | - | Federated CI/CD auth |
| | Secrets Rotation | - | Lambda auto-rotation |
| | Security Audit | - | CloudWatch monitoring |

Repository Structure

.
├── aws-infrastructure/              # AWS foundation (Terraform, local modules)
│   ├── state-backend/               # S3 + DynamoDB + KMS for state
│   ├── github-oidc/                 # GitHub Actions OIDC federation
│   ├── external-secrets-iam/        # IRSA roles for External Secrets
│   ├── secrets-rotation-lambda/     # Automated secrets rotation
│   └── security-audit-automation/   # CloudWatch security monitoring
│
├── k8s-platform-tools/              # Kubernetes platform (Terragrunt, remote modules)
│   ├── core-platform/               # Karpenter, External DNS, Cert Manager, External Secrets
│   ├── service-mesh/                # Istio, Kong, Jaeger
│   ├── security/                    # Kyverno, Falco, Velero
│   ├── observability/               # Loki, Kubecost, Compliance Scanner
│   ├── platform-tools/              # ArgoCD, Atlantis, Vault, Airflow
│   ├── ci-cd-templates/             # Reusable GitHub Actions workflows
│   ├── github-actions-templates/    # Language-specific test coverage actions
│   ├── common.hcl                   # Shared Terragrunt config (state, provider, versions)
│   └── platform_vars.yaml           # Single source of truth for all config
│
└── .github/workflows/               # OIDC-based CI/CD pipeline

Quick Start

Prerequisites

| Tool | Version | Purpose |
|---|---|---|
| Terraform | >= 1.12.2 | Infrastructure provisioning |
| Terragrunt | >= 1.0.0 | Configuration orchestration |
| AWS CLI | v2 | AWS authentication |
| kubectl | >= 1.28 | Cluster access |
| Helm | v3.x | Chart management |
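
The minimum versions above can be checked with a small helper before deploying. The sketch below is illustrative and not part of the repository: `version_ge` and `check_tool` are hypothetical names, and the comparison assumes a `sort` that supports `-V` (version sort, as in GNU coreutils). Extracting the installed version string from each tool is left out.

```shell
# version_ge ACTUAL REQUIRED: succeeds when ACTUAL >= REQUIRED.
# Assumes `sort -V` (version sort) is available.
version_ge() {
    # The smaller of the two versions sorts first; the check passes
    # when that smallest version is the required one.
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# check_tool NAME ACTUAL REQUIRED: prints a one-line verdict and
# returns non-zero when the installed version is too old.
check_tool() {
    if version_ge "$2" "$3"; then
        echo "ok: $1 $2 (>= $3)"
    else
        echo "too old: $1 $2 (need >= $3)"
        return 1
    fi
}
```

For example, `check_tool terraform 1.12.2 1.12.2` passes, while `check_tool terraform 1.9.0 1.12.2` reports the version as too old.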

Deploy

# 1. Configure platform variables
cd k8s-platform-tools
cp platform_vars.yaml.example platform_vars.yaml  # Edit with your values

# 2. Bootstrap AWS infrastructure
cd ../aws-infrastructure/state-backend && terraform init && terraform apply
cd ../github-oidc && terragrunt apply
cd ../external-secrets-iam && terragrunt apply

# 3. Deploy platform layers (in order)
cd ../../k8s-platform-tools/core-platform && terragrunt run -a -- apply
cd ../service-mesh && terragrunt run -a -- apply
cd ../security && terragrunt run -a -- apply
cd ../observability && terragrunt run -a -- apply
cd ../platform-tools && terragrunt run -a -- apply

# 4. Deploy operational security
cd ../../aws-infrastructure/security-audit-automation && terragrunt apply
cd ../secrets-rotation-lambda && terragrunt apply

Destroy (reverse order)

cd k8s-platform-tools
terragrunt run -a --working-dir platform-tools -- destroy
terragrunt run -a --working-dir observability -- destroy
terragrunt run -a --working-dir security -- destroy
terragrunt run -a --working-dir service-mesh -- destroy
terragrunt run -a --working-dir core-platform -- destroy
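
Because apply and destroy must walk the layers in opposite directions, it can help to generate both sequences from one list. The following is a minimal sketch under that assumption; `LAYERS`, `reverse_words`, and `layer_commands` are hypothetical names, and the function only prints commands (meant to be run from k8s-platform-tools) rather than executing them.

```shell
# Layers in deployment order, as listed in this README.
LAYERS="core-platform service-mesh security observability platform-tools"

# reverse_words WORD...: prints its arguments in reverse order.
reverse_words() {
    out=""
    for w in "$@"; do
        out="$w${out:+ $out}"
    done
    echo "$out"
}

# layer_commands apply|destroy: prints one terragrunt command per layer,
# in forward order for apply and reverse order for destroy.
# Nothing is executed; review the output before running it.
layer_commands() {
    case "$1" in
        apply)   order="$LAYERS" ;;
        destroy) order="$(reverse_words $LAYERS)" ;;
        *)       echo "usage: layer_commands apply|destroy" >&2; return 2 ;;
    esac
    for layer in $order; do
        echo "terragrunt run -a --working-dir $layer -- $1"
    done
}
```

Running `layer_commands destroy` lists platform-tools first and core-platform last, matching the sequence above.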

Configuration

All platform configuration lives in k8s-platform-tools/platform_vars.yaml, organized under the following YAML paths:

| YAML Path | Components |
|---|---|
| Platform.Tools.<name>.inputs | Core platform, service mesh, platform tools |
| Platform.Security.<name>.inputs | Kyverno, Falco, Velero |
| Platform.Observability.<name>.inputs | Compliance Scanner |
| common.* | Shared values (region, VPC, EKS, tags) |

Key convention: Component directory name must match the YAML key exactly (resolved via basename(get_terragrunt_dir())).
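
A mismatch between a directory name and its YAML key is easy to miss, so a quick pre-flight check may be useful. The sketch below is an assumption-laden illustration, not repository code: `component_has_key` and `check_components` are hypothetical helpers, and the match is plain text rather than a real YAML parse, so it cannot confirm which section the key sits under.

```shell
# component_has_key DIR VARS_FILE: succeeds when the basename of DIR
# appears as an indented mapping key in VARS_FILE.
# This is a plain-text match, not a YAML parse, so it does not verify
# which section (Tools, Security, Observability) the key belongs to.
component_has_key() {
    name="$(basename "$1")"
    grep -Eq "^[[:space:]]+${name}:[[:space:]]*(#.*)?$" "$2"
}

# check_components VARS_FILE DIR...: reports every directory whose
# name has no matching key; returns non-zero if any are missing.
check_components() {
    vars_file="$1"; shift
    missing=0
    for dir in "$@"; do
        if ! component_has_key "$dir" "$vars_file"; then
            echo "no key for: $(basename "$dir")"
            missing=1
        fi
    done
    return "$missing"
}
```

A call such as `check_components platform_vars.yaml platform-tools/*/` would then list any component directory that lacks a key of the same name.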

Environment Selection

ENV=dev terragrunt apply    # default
ENV=prod terragrunt apply   # production
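
A wrapper script can make the default explicit and reject typos before Terragrunt runs. This is a minimal sketch: `resolve_env` is a hypothetical helper, and the accepted set (dev, qa, prod) is taken from the environments listed under CI/CD Integration. How common.hcl itself reads ENV is not shown here.

```shell
# resolve_env: prints the target environment, defaulting to dev when
# ENV is unset or empty, and rejects values outside the known set.
resolve_env() {
    env_name="${ENV:-dev}"
    case "$env_name" in
        dev|qa|prod) echo "$env_name" ;;
        *) echo "unknown ENV: $env_name" >&2; return 1 ;;
    esac
}
```

With ENV unset the function prints dev; with ENV=staging it fails instead of silently targeting an unexpected environment.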

Secrets Management

All sensitive values are stored in AWS Secrets Manager and referenced as:

admin_password: "aws-secretsmanager:///dev/argocd/admin-password"

Secrets are synced to Kubernetes via External Secrets Operator with IRSA.
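
To illustrate the reference format, the sketch below separates the scheme from the path portion of such a value. `secret_ref_path` is a hypothetical helper; whether the leading slash is part of the secret's actual name in Secrets Manager depends on how the secret was created and is not something this example can confirm.

```shell
# secret_ref_path REF: prints the part of REF that follows the
# "aws-secretsmanager://" scheme, or fails if REF has another shape.
secret_ref_path() {
    case "$1" in
        aws-secretsmanager://?*) echo "${1#aws-secretsmanager://}" ;;
        *) echo "not a Secrets Manager reference: $1" >&2; return 1 ;;
    esac
}
```

For the value shown above, the function prints /dev/argocd/admin-password, which could then be passed to `aws secretsmanager describe-secret --secret-id` to confirm the secret exists.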


Security Hardening

Phase 1: Foundation

| Control | Implementation |
|---|---|
| State encryption | S3 KMS + DynamoDB KMS with key rotation |
| State locking | DynamoDB with prevent_destroy lifecycle |
| CI/CD auth | GitHub OIDC federation (no long-lived keys) |
| Secrets access | IRSA least-privilege per component |

Phase 2: Runtime

| Control | Implementation |
|---|---|
| Admission control | Kyverno: approved registries, no latest tag, resource limits, security contexts |
| Threat detection | Falco eBPF: privileged containers, sensitive file access, C2 connections |
| Backup & DR | Velero: daily full, hourly critical, weekly maintenance with S3+KMS |
| Network policies | Default-deny with explicit allow for DNS, k8s API |

Phase 3: Operational

| Control | Implementation |
|---|---|
| Secrets rotation | Lambda-based monthly rotation with SNS notifications |
| Compliance scanning | Weekly CIS 1.8 benchmarks with S3 reports |
| Security monitoring | CloudWatch alarms: failed auth (>10/5min), privilege escalation, policy violations |
| Security dashboard | Centralized CloudWatch dashboard |

CI/CD Integration

Main Workflow (.github/workflows/terragrunt-plan-apply-oidc.yaml)

OIDC-authenticated pipeline with manual dispatch:

Workflow Dispatch → OIDC Auth → Init → Validate → Plan → [Approval] → Apply

| Feature | Detail |
|---|---|
| Authentication | AWS OIDC (no static credentials) |
| Environments | dev, qa, prod (selectable) |
| Approval | Required before apply via GitHub Issues |
| Artifacts | Plan output stored 30 days |
| Tools | Terraform 1.12.2, Terragrunt 1.0.0 |

Reusable Templates (k8s-platform-tools/ci-cd-templates/)

| Template | Purpose |
|---|---|
| terragrunt-plan-apply.yaml | Full pipeline: TFSEC, Checkov, Infracost, drift detection |
| reusable-docker-build.yaml | Multi-platform Docker builds with Trivy scanning |
| terragrunt-fmt-commit.yaml | Auto-format with TFLint, PR creation |
| get-env-func.yaml | Branch-to-environment mapping |
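
Branch-to-environment mapping usually reduces to a small pattern match. The sketch below shows the general idea only: `branch_to_env` is a hypothetical function, and the branch patterns are invented for illustration. The actual rules in get-env-func.yaml are not documented here and may differ.

```shell
# branch_to_env BRANCH: prints the environment for a branch name.
# The mapping below is hypothetical and chosen only for illustration;
# it is not taken from get-env-func.yaml.
branch_to_env() {
    case "$1" in
        main|master) echo "prod" ;;
        release/*)   echo "qa" ;;
        *)           echo "dev" ;;
    esac
}
```

Under these assumed rules, any branch that is neither a main nor a release branch falls through to dev.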

Operations

Monitoring

# Security dashboard
cd aws-infrastructure/security-audit-automation && terragrunt output dashboard_url

# Compliance reports
aws s3 ls s3://<owner>-<env>-compliance-reports/

# Falco alerts
kubectl logs -n falco -l app.kubernetes.io/name=falco | grep -i critical

# Kyverno policy reports
kubectl get clusterpolicyreport -o yaml

# Velero backup status
velero backup get

State Management

terragrunt state list                     # List resources
terragrunt state pull > backup.tfstate    # Backup state
terragrunt force-unlock <LOCK_ID>         # Unlock stuck state

Troubleshooting

| Problem | Solution |
|---|---|
| State locked | `terragrunt force-unlock <LOCK_ID>` |
| Config not applied | Verify directory name matches platform_vars.yaml key |
| Module fetch fails | Check git access to github.com/cloudon-one/k8s-platform-modules, verify ref=dev exists |
| OIDC auth fails | Run `aws iam list-open-id-connect-providers` and check the role trust policy |
| Policy blocks deploy | Set Kyverno to Audit: `kubectl patch clusterpolicy <name> --type merge -p '{"spec":{"validationFailureAction":"Audit"}}'` |
| Secrets not syncing | `kubectl describe externalsecret <name> -n <namespace>` |
| Dependency errors | Verify parent layer is deployed; check deployment order |

Documentation

| Document | Description |
|---|---|
| Security Review | Initial security audit findings |
| Security Implementation Plan | Complete security roadmap |
| Phase 1 Deployment | Foundation security guide |
| Phase 2 Deployment | Runtime security guide |
| Phase 3 Deployment | Operational security guide |
| IaC Summary | Infrastructure as Code overview |

Contributing

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/my-feature)
  3. Follow existing Terragrunt/Terraform patterns
  4. Update platform_vars.yaml for configuration changes
  5. Open a Pull Request

License

MIT License - see LICENSE for details.


Built for production Kubernetes deployments
