jmjava
diff --git a/.gitattributes b/.gitattributes
Lines changed: 1 addition & 0 deletions
diff --git a/.gitignore b/.gitignore
Lines changed: 7 additions & 0 deletions
diff --git a/AGENTS.md b/AGENTS.md
Lines changed: 22 additions & 0 deletions
diff --git a/README.md b/README.md
Lines changed: 29 additions & 13 deletions
diff --git a/docs/AGENT-REGRESSION.md b/docs/AGENT-REGRESSION.md
Lines changed: 56 additions & 0 deletions
diff --git a/docs/CUSTOMIZATION.md b/docs/CUSTOMIZATION.md
Lines changed: 4 additions & 1 deletion
diff --git a/docs/DAG-AND-PROPAGATION.md b/docs/DAG-AND-PROPAGATION.md
Lines changed: 2 additions & 0 deletions
diff --git a/docs/ENVIRONMENTS-AND-CLUSTERS.md b/docs/ENVIRONMENTS-AND-CLUSTERS.md
Lines changed: 23 additions & 0 deletions
diff --git a/docs/GITHUB-PAGES.md b/docs/GITHUB-PAGES.md
Lines changed: 2 additions & 0 deletions
@@ -1 +1,2 @@
 docs/demos/recordings/*.mp4 filter=lfs diff=lfs merge=lfs -text
+docs/assets/**/*.png filter=lfs diff=lfs merge=lfs -text
@@ -9,6 +9,13 @@ credentials.json
 
 # IDE
 .idea/
+.cursor/
+
+# Demo toolchain — regenerable outputs (Manim, VHS, local test runs)
+docs/demos/.venv/
+docs/demos/animations/media/
+docs/demos/terminal/rendered/
+docs/demos/test-results/
 # .vscode/ checked in for launch configs (intercept step-debug)
 *.swp
 *.swo
 
@@ -0,0 +1,22 @@
+# Agent instructions (tekton-dag)
+
+## Regression — iterate until complete
+
+Do **not** stop after a single partial test run. Follow **[docs/AGENT-REGRESSION.md](docs/AGENT-REGRESSION.md)**:
+
+- Run **`bash scripts/run-regression-agent.sh`** (or **`run-regression-agent-full.sh`** if Results + DB must pass).
+- **Loop**: fix failures → re-run until **`regression exit code: 0`** and done criteria in the doc are met.
+
+## Quick commands
+
+| Intent | Command |
+|--------|---------|
+| Best effort for current env | `bash scripts/run-regression-agent.sh` |
+| Strict + Tekton Results | `bash scripts/run-regression-agent-full.sh` |
+| Timestamped log + correct exit code | `bash scripts/run-regression-stream.sh …` |
+
+Python bootstrap: see [docs/REGRESSION.md](docs/REGRESSION.md).
+
+## Cursor
+
+Rule **regression-iterate** (always on) reinforces the same behavior in `.cursor/rules/regression-iterate.mdc`.
@@ -4,17 +4,31 @@ Standalone Tekton pipeline system for **local development and proof-of-concept**
 
 ## Demo Videos
 
-🎬 **[Watch all videos on GitHub Pages →](https://jmjava.github.io/tekton-dag/)**  
+🎬 **GitHub Pages (all segments + players):** [jmjava.github.io/tekton-dag/](https://jmjava.github.io/tekton-dag/)  
 *Publishing & 404 troubleshooting: [docs/GITHUB-PAGES.md](docs/GITHUB-PAGES.md).*
 
-| Video | Description | Duration |
-|-------|-------------|----------|
-| 📹 Architecture Overview | System architecture, DAG model, polyglot support, pipelines | 2m 40s |
-| 📹 Intercept Routing | PR vs normal traffic routing, header-based interception | 2m 4s |
-| 📹 Local Debugging | mirrord integration, IDE breakpoints, live cluster debugging | 2m 0s |
-| 📹 Multi-Team Helm | Helm chart deployment, team isolation, custom hooks | 2m 4s |
-
-*All videos are generated programmatically from source files. See [Milestone 8](milestones/milestone-8.md) for details.*
+Each row links to the **in-browser player** on Pages (`#seg-…`) and to the **composed MP4** in the repo. Anchors match [docs/index.html](docs/index.html).
+
+| # | Video | Description | Duration | Watch (Pages) | MP4 in repo |
+|---|-------|-------------|----------|---------------|-------------|
+| 01 | Architecture overview | System architecture, DAG model, polyglot support, pipelines | ~2m 46s | [▶ Pages](https://jmjava.github.io/tekton-dag/#seg-01) | [`01-architecture.mp4`](docs/demos/recordings/01-architecture.mp4) |
+| 02 | Quick start | Kind, Tekton, images, tasks (VHS) | ~1m 50s | [▶ Pages](https://jmjava.github.io/tekton-dag/#seg-02) | [`02-quickstart.mp4`](docs/demos/recordings/02-quickstart.mp4) |
+| 03 | Bootstrap dataflow | Stack bootstrap PipelineRun walkthrough | ~2m 30s | [▶ Pages](https://jmjava.github.io/tekton-dag/#seg-03) | [`03-bootstrap-dataflow.mp4`](docs/demos/recordings/03-bootstrap-dataflow.mp4) |
+| 04 | PR pipeline | PR flow, intercepts, tests (VHS) | ~2m 30s | [▶ Pages](https://jmjava.github.io/tekton-dag/#seg-04) | [`04-pr-pipeline.mp4`](docs/demos/recordings/04-pr-pipeline.mp4) |
+| 05 | Intercept routing | PR vs normal traffic routing, header-based interception | ~2m 6s | [▶ Pages](https://jmjava.github.io/tekton-dag/#seg-05) | [`05-intercept-routing.mp4`](docs/demos/recordings/05-intercept-routing.mp4) |
+| 06 | Local debugging | mirrord integration, IDE breakpoints, live cluster debugging | ~1m 57s | [▶ Pages](https://jmjava.github.io/tekton-dag/#seg-06) | [`06-local-debug.mp4`](docs/demos/recordings/06-local-debug.mp4) |
+| 07 | Orchestrator API | REST API, stacks, test plan, graph (VHS) | ~2m 14s | [▶ Pages](https://jmjava.github.io/tekton-dag/#seg-07) | [`07-orchestrator.mp4`](docs/demos/recordings/07-orchestrator.mp4) |
+| 08 | Multi-team Helm | Helm chart deployment, team isolation, custom hooks | ~1m 59s | [▶ Pages](https://jmjava.github.io/tekton-dag/#seg-08) | [`08-multi-team-helm.mp4`](docs/demos/recordings/08-multi-team-helm.mp4) |
+| 09 | Tekton Results | Results API and persisted history (VHS) | ~1m 40s | [▶ Pages](https://jmjava.github.io/tekton-dag/#seg-09) | [`09-results-db.mp4`](docs/demos/recordings/09-results-db.mp4) |
+| 10 | Newman / regression | API tests and local test tiers (VHS) | ~2m 0s | [▶ Pages](https://jmjava.github.io/tekton-dag/#seg-10) | [`10-newman-tests.mp4`](docs/demos/recordings/10-newman-tests.mp4) |
+| 11 | Test-trace graph | Blast radius, graph query, focused tests (mixed) | ~2m 0s | [▶ Pages](https://jmjava.github.io/tekton-dag/#seg-11) | [`11-test-trace-graph.mp4`](docs/demos/recordings/11-test-trace-graph.mp4) |
+| 12 | Regression suite (M12.2) | Full regression story + agent workflows | varies | [▶ Pages](https://jmjava.github.io/tekton-dag/#seg-12) | [`12-regression-suite.mp4`](docs/demos/recordings/12-regression-suite.mp4) |
+| 13 | Management GUI architecture (M12.2) | Vue, Flask, orchestrator | varies | [▶ Pages](https://jmjava.github.io/tekton-dag/#seg-13) | [`13-management-gui-architecture.mp4`](docs/demos/recordings/13-management-gui-architecture.mp4) |
+| 14 | GUI Tekton extension (M12.2) | Extending the GUI for Tekton | varies | [▶ Pages](https://jmjava.github.io/tekton-dag/#seg-14) | [`14-gui-tekton-extension.mp4`](docs/demos/recordings/14-gui-tekton-extension.mp4) |
+
+**Full concat files:** [01–11 on Pages](https://jmjava.github.io/tekton-dag/#full-demo) → [`full-demo.mp4`](docs/demos/recordings/full-demo.mp4) · [01–14 on Pages](https://jmjava.github.io/tekton-dag/#full-demo-m12-2) → [`full-demo-with-m12-2.mp4`](docs/demos/recordings/full-demo-with-m12-2.mp4)
+
+*All videos are generated programmatically — run [`docs/demos/generate-all.sh`](docs/demos/generate-all.sh). See [Milestone 8](milestones/milestone-8.md), [M12.2](milestones/milestone-12.2.md), and [docs/demos/README.md](docs/demos/README.md).*
 
 ---
 
@@ -33,12 +47,14 @@ Standalone Tekton pipeline system for **local development and proof-of-concept**
 | [M10](milestones/milestone-10.md) | **Completed** | Multi-team scaling: orchestration service, Helm chart, ArgoCD, batched builds |
 | [M10.1](milestones/milestone-10-1.md) | **Completed** | Orchestration service testing: Postman/Newman (15 requests, 30 assertions), integration validation |
 | [M11](milestones/milestone-11.md) | **Completed** | Vue 3 Management GUI + Python/Flask backend (replaces `reporting-gui/`). Multi-team, multi-cluster, DAG visualization. 69 Playwright E2E tests, 56 pytest unit tests, Postman collection. |
-| [M12](milestones/milestone-12.md) | **Completed** | Architecture customization: shared Python package, Helm ConfigMap/PVC templates, parameterized pipelines (no hardcoded `localhost:5000`), `scripts/common.sh`, build image variants (Java 11/17/21, Node 18/20/22, Python 3.10–3.12, PHP 8.1–8.3), custom pipeline hook tasks (pre/post build/test), stack JSON schema, 62 orchestrator pytest tests, 14 shared-package tests. Full docs: CUSTOMIZATION.md, MAINTENANCE.md, Helm README. |
-| [M12.2](milestones/milestone-12.2.md) | **Completed** | Documentation sync and archive: README + docs index, obsolete session plan archived |
+| [M12](milestones/milestone-12.md) | **Completed** | Architecture customization: shared Python package, Helm ConfigMap/PVC templates, parameterized pipelines (no hardcoded `localhost:5000`), `scripts/common.sh`, build image variants (Java 11/17/21, Node 18/20/22, Python 3.10–3.12, PHP 8.1–8.3), custom pipeline hook tasks (pre/post build/test), stack JSON schema, 62 orchestrator pytest tests, 14 shared-package tests. Full docs: [CUSTOMIZATION.md](docs/CUSTOMIZATION.md), [TEAM-ONBOARDING-STACKS-AND-BAGGAGE.md](docs/TEAM-ONBOARDING-STACKS-AND-BAGGAGE.md), MAINTENANCE.md, Helm README. |
+| [M12.2](milestones/milestone-12.2.md) | **Partial** | **Part A done:** doc sync + archive. **Part B open:** regression + Management GUI [docs & demo plan](docs/TESTING-AND-REGRESSION-OVERVIEW.md) / [GUI extension](docs/MANAGEMENT-GUI-EXTENSION.md) / [video segments](docs/demos/segments-m12-2-regression-gui.md) |
 
 Older milestones (M2, M3) are in [milestones/completed/](milestones/completed/).
 
-**Next up:** Finish remaining [M8](milestones/milestone-8.md) items (VHS recordings, Slidev export, full demo concat) as needed; ongoing maintenance via [CUSTOMIZATION.md](docs/CUSTOMIZATION.md) and [MAINTENANCE.md](docs/MAINTENANCE.md).
+**Next up:** Finish remaining [M8](milestones/milestone-8.md) items (VHS recordings, Slidev export, full demo concat) as needed; **[M12.2 Part B](milestones/milestone-12.2.md)** regression + GUI [docs](docs/TESTING-AND-REGRESSION-OVERVIEW.md) and [demo segments](docs/demos/segments-m12-2-regression-gui.md); ongoing maintenance via [CUSTOMIZATION.md](docs/CUSTOMIZATION.md) and [MAINTENANCE.md](docs/MAINTENANCE.md).
+
+**Regression (humans & Cursor agents):** run **`scripts/run-regression-agent.sh`** and iterate with fixes until green — see [AGENTS.md](AGENTS.md) and [docs/AGENT-REGRESSION.md](docs/AGENT-REGRESSION.md). Full tier list: [docs/REGRESSION.md](docs/REGRESSION.md).
 
 ---
 
@@ -218,7 +234,7 @@ C4Container
 
 Full diagram set: [docs/c4-diagrams.md](docs/c4-diagrams.md).
 
-> **ArgoCD** is optional for local dev. In production, ArgoCD syncs the Helm chart per team via ApplicationSet. See [ArgoCD + Tekton architecture guide](docs/argocd-architecture-guide.md) and [argocd/applicationset.yaml](argocd/applicationset.yaml).
+> **ArgoCD** is optional for local dev. In a **production deployment**, ArgoCD syncs the Helm chart per team via ApplicationSet (separate from the validation cluster where pipelines run). See [ArgoCD + Tekton architecture guide](docs/argocd-architecture-guide.md), [Environments and clusters](docs/ENVIRONMENTS-AND-CLUSTERS.md), and [argocd/applicationset.yaml](argocd/applicationset.yaml).
 
 ---
 
 
@@ -0,0 +1,56 @@
+# Agent playbook: regression until fully done
+
+Human or **Cursor agent**: do not treat testing as complete after a single partial command (e.g. pytest only, or Newman only). **Iterate**: run → read output → fix → re-run **until** the target command exits **0** and the log matches the done criteria below.
+
+## Target commands (pick one)
+
+| Goal | Command (repo root) |
+|------|---------------------|
+| **Strict cluster** (orchestrator + `newman` + **required** `stack-dag-verify` PipelineRun) | `bash scripts/run-regression-stream.sh --cluster --require-dag-verify` |
+| **+ Tekton Results DB** (fails if Results API missing) | `bash scripts/run-regression-stream.sh --cluster --require-dag-verify --with-results-verify` |
+| **No cluster** (explicitly limited) | `bash scripts/run-regression-stream.sh --local-only` — then **state** that Tekton/Newman/Results were not run. |
+
+Convenience wrapper (detects `kubectl`; runs the strict cluster tier if the current context works, otherwise falls back to local-only and prints a banner to stderr):
+
+```bash
+bash scripts/run-regression-agent.sh
+```
+
+Bootstrap Python once if needed:
+
+```bash
+python3 -m venv .venv && . .venv/bin/activate
+pip install -r orchestrator/requirements.txt -r management-gui/backend/requirements-dev.txt
+pip install -e 'libs/tekton-dag-common[test]' -e 'libs/baggage-python[test]'
+```
+
+## Done criteria (all must hold for “cluster regression done”)
+
+1. Final line: **`regression exit code: 0`** (use `run-regression-stream.sh`; piping the run into a `while read` loop reports the loop's status instead of the real exit code).
+2. Log contains **either**:
+   - **`Phase 2 PASSED`** (standalone `verify-dag-phase2`), **or**
+   - **`Tekton DAG verify: will run inside run-full-test-and-verify-results.sh`** and that script completes successfully, **or**
+   - You documented **why** the cluster cannot run Phase 2 (and you did not claim full cluster verification).
+3. **No unexplained `SKIP Tekton DAG verify`** when `stack-dag-verify` should exist — if you see it, fix install/namespace or pass `--require-dag-verify` and resolve the failure.
+4. Newman/orchestrator: the port-forward binds without error; if the bind fails, free the port or set `ORCHESTRATOR_TEST_PORT`.
+
+## Iteration loop (required agent behavior)
+
+```
+repeat {
+  run the chosen target command; capture full log
+  if exit 0 and done criteria met → STOP (success)
+  identify failing tier (pytest / Playwright / Phase 2 / Newman / Results)
+  apply minimal fix (code, script, manifest, env, cluster)
+} until success or human abort
+```
+
+Do **not** stop after only:
+
+- Phase 1 + unit tests, or  
+- Newman without a **Succeeded** `stack-dag-verify` path (see [REGRESSION.md](REGRESSION.md) “What actually runs Tekton PipelineRuns?”).
+
+## References
+
+- [REGRESSION.md](REGRESSION.md) — tiers, flags, env vars  
+- [GITHUB-PAGES.md](GITHUB-PAGES.md) — if work touches Pages  
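
Review note (a sketch, not part of this PR): done criteria 1 to 3 above could be checked mechanically against a captured log. The function name `check_regression_log` is hypothetical, and the sketch assumes the marker strings quoted in the playbook appear verbatim in the log with the exit-code line last; the real scripts may format their output differently.

```bash
#!/usr/bin/env bash
# Hypothetical helper: checks a captured regression log against done
# criteria 1-3 of AGENT-REGRESSION.md. The marker strings come from that
# doc; their exact placement in real logs is an assumption.
check_regression_log() {
  local log="$1"

  # Criterion 1: the final line reports exit code 0.
  if ! tail -n 1 "$log" | grep -q 'regression exit code: 0$'; then
    echo "FAIL: final line is not 'regression exit code: 0'"
    return 1
  fi

  # Criterion 2: one of the two DAG-verify markers is present.
  if ! grep -q -e 'Phase 2 PASSED' \
               -e 'Tekton DAG verify: will run inside run-full-test-and-verify-results.sh' \
               "$log"; then
    echo "FAIL: no Tekton DAG verify marker found"
    return 1
  fi

  # Criterion 3: a skipped DAG verify needs an explanation, so flag it.
  if grep -q 'SKIP Tekton DAG verify' "$log"; then
    echo "FAIL: log contains 'SKIP Tekton DAG verify'"
    return 1
  fi

  echo "OK: done criteria 1-3 hold"
}
```

Criterion 4 (port-forward bind) and the "documented why" alternative in criterion 2 still need a human or agent to read the log; this sketch only automates the string checks.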
@@ -2,6 +2,8 @@
 
 Practical recipes for extending teams, stacks, build images, registries, pipelines, and intercept behavior. Examples use paths from the main **tekton-dag** platform repository.
 
+**New team or new stack from scratch?** See [TEAM-ONBOARDING-STACKS-AND-BAGGAGE.md](TEAM-ONBOARDING-STACKS-AND-BAGGAGE.md) for which **baggage / header-forwarding library** to use per language and an end-to-end **stack creation checklist** (YAML, Helm, orchestrator).
+
 ---
 
 ## 1. Add a new team
@@ -57,7 +59,8 @@ cp teams/squad-b/team.yaml helm/tekton-dag/raw/teams/squad-b/
 
 ## 2. Add an app to a stack
 
-**Goal:** Register another service in the DAG (build, image, tests, downstream edges).
+**Goal:** Register another service in the DAG (build, image, tests, downstream edges).  
+**Baggage:** add the matching library from `libs/` and set `propagation-role`—see [TEAM-ONBOARDING-STACKS-AND-BAGGAGE.md](TEAM-ONBOARDING-STACKS-AND-BAGGAGE.md).
 
 Edit or create a stack file under `stacks/`. Each entry under `apps` needs a unique `name`, `repo`, `role`, `build` tool settings, and optional `downstream` / `tests`:
 
 
@@ -94,3 +94,5 @@ So:
 | **“Which app has changes”** | The app whose PR triggered the run = the single **changed-app** (and thus in **build-apps**); that’s the app being intercepted for that run. |
 
 For sequence diagrams that show intercept scenarios (originator only, forwarder only, terminal only, or multiple intercepted), see [docs/c4-diagrams.md](c4-diagrams.md) (“Dynamic Diagram: PR Intercept Scenarios”).
+
+To implement propagation in application code and onboard a new stack, see [TEAM-ONBOARDING-STACKS-AND-BAGGAGE.md](TEAM-ONBOARDING-STACKS-AND-BAGGAGE.md).
@@ -0,0 +1,23 @@
+# Environments and clusters
+
+This project’s docs and demos often talk about **two traffic paths** (PR vs non-PR) in a cluster. That is **not** the same as “this cluster is production.”
+
+## How we name things
+
+| Term | Meaning |
+|------|--------|
+| **Validation / pre-production cluster** | A **dedicated** Kubernetes cluster (or namespace slice) used to **build, deploy, and test** changes. It is usually **similar in shape** to production (same kinds of services, ingress, policies) so tests are realistic. **Pipelines and intercept demos run here.** |
+| **Production cluster** | Where **released** workloads run for real users. Code and images are **promoted** here **after** validation — it is a **separate** cluster (or strictly separated environment), not the same place PR pipelines mutate. |
+| **Baseline deployment** (or **mainline**) | The **non-PR** revision of a service **in the validation cluster** — the “steady” line already merged and deployed there. Demos that used to say “production” for this path mean **baseline**, not customer production. |
+| **PR deployment** | An **ephemeral** revision deployed **alongside** the baseline **in the validation cluster**, reachable when requests carry the dev-session header. |
+
+So: **“normal traffic”** in intercept diagrams = traffic to the **baseline** pods in the **validation** environment, **unless** the document explicitly says it is about the production cluster.
+
+## Scripts and promotion
+
+- `scripts/promote-pipelines.sh` moves Tekton YAML between namespaces (e.g. test → prod-facing namespace). That is an **operator-controlled promotion step**; it does not imply that PR validation runs in the production cluster.
+- Regression and E2E scripts target **whatever cluster your kubeconfig points at** — treat that as validation unless you intentionally point at production (not recommended for destructive tests).
+
+## Demos and narration
+
+Demo videos and Manim scenes use **baseline / mainline** wording for the non-PR path so they stay accurate for teams that **only** run pipelines in a validation cluster and ship artifacts onward to production separately.
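
Review note (a sketch, not part of this PR): the naming table above boils down to one routing rule inside the validation cluster. The function `route_target` is hypothetical and only models the rule as the doc states it (a request carrying the dev-session header reaches the PR deployment, everything else reaches the baseline); the real header name and matching logic are not shown in this diff.

```bash
#!/usr/bin/env bash
# Hypothetical helper, not from the repo: pick the deployment that serves a
# request inside the validation cluster.
#   $1 = value of the dev-session header (empty string when the header is absent)
route_target() {
  local dev_session="${1:-}"
  if [ -n "$dev_session" ]; then
    echo "pr"        # ephemeral PR revision, still in the validation cluster
  else
    echo "baseline"  # non-PR (mainline) revision in the validation cluster
  fi
}
```

Neither branch ever returns a production target, which is the point of the terminology: both paths live in the validation environment.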
@@ -6,6 +6,8 @@
 
 - Workflow: [.github/workflows/pages.yml](../.github/workflows/pages.yml)
 - **Artifact root** is the [`docs/`](../docs/) folder (so `docs/index.html` becomes the site homepage).
+- Demo page [`index.html`](index.html) embeds **all 14 segments** plus **full-demo** / **full-demo-with-m12-2**; deep-link with fragments e.g. `…/tekton-dag/#seg-07` (see root [README](../README.md) demo table).
+- Checkout uses **`lfs: true`** so large media tracked with Git LFS (e.g. demo MP4s under [`docs/demos/recordings/`](demos/recordings/)) are present in the deployed site. If videos are missing on Pages, confirm those files are committed and that LFS objects are pushed (`git lfs push --all origin main`).
 - Repository **Settings → Pages**: source should be **GitHub Actions** (not “Deploy from a branch”).
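
Review note (a sketch, not part of this PR): when a video is missing on Pages, one quick check is whether the checked-out file is still a Git LFS pointer stub rather than real media. Pointer files are small text files whose first line is `version https://git-lfs.github.com/spec/v1`. The function name `is_lfs_pointer` is hypothetical.

```bash
#!/usr/bin/env bash
# Hypothetical helper: succeeds (exit 0) when the given file is a Git LFS
# pointer stub, i.e. the real object was not fetched at checkout.
is_lfs_pointer() {
  local file="$1"
  # Only the first bytes matter; real MP4/PNG data never starts with this text.
  head -c 64 "$file" 2>/dev/null \
    | grep -q '^version https://git-lfs.github.com/spec/v1'
}
```

For example, `is_lfs_pointer docs/demos/recordings/01-architecture.mp4 && echo "pointer only"` would suggest the LFS objects were not pulled or pushed.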
 
 ## If the site returns 404