tools/ai-sandbox: add containerized agent sandbox + wp-verify Playwright stack by dognose24 · Pull Request #49082 · Automattic/jetpack

dognose24 · 2026-05-22T02:15:15Z

Proposed changes

Adds a self-contained containerized sandbox under tools/ai-sandbox/ for running AI-coding-agent flows against a real WordPress + Gutenberg instance, plus a wp-verify Playwright test harness that runs against the sandbox WP.

Container stack (Docker Compose)

docker-compose.yml — agent container (jetpack-ai-sandbox) with the monorepo bind-mounted; mysql + wordpress + wpcli services behind a wp-verify Compose profile so they only start when explicitly requested.
docker-compose.worktree.yml — co-mounts a host git worktree for parallel agent sessions on the same sandbox.
docker-compose.wp-verify.yml — opt-in port binding (127.0.0.1) + Docker socket access for host-runnable wp-verify flows.
Dockerfile — Node 24 + pnpm 10 + PHP 8.4 + Composer 2.9 base image.
entrypoint.sh, hooks/pre-push — sandbox-side scope-gate enforcing what an agent can push.
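
As an illustration of the scope-gate idea only: the PR text does not show what hooks/pre-push actually checks, so everything below is a hypothetical sketch. It assumes the gate boils down to "every changed path must live under an allowed directory"; the names `checkPushScope`, `normalizePath`, and `ScopeDecision` are invented for this example and are not taken from the PR.

```typescript
// Hypothetical sketch of a path-based scope gate. Not the PR's hooks/pre-push logic.
type ScopeDecision = { allowed: boolean; violations: string[] };

// Normalize a repo-relative path: forward slashes, no leading "./".
function normalizePath(p: string): string {
  return p.replace(/\\/g, "/").replace(/^\.\//, "");
}

// Returns which changed paths fall outside the allowed directory prefixes.
function checkPushScope(changedPaths: string[], allowedPrefixes: string[]): ScopeDecision {
  // Force a trailing slash so "tools/ai-sandbox" does not match "tools/ai-sandbox-other".
  const prefixes = allowedPrefixes.map((p) => {
    const n = normalizePath(p);
    return n.endsWith("/") ? n : n + "/";
  });

  const violations = changedPaths.map(normalizePath).filter((path) => {
    // Reject any ".." segment outright rather than trying to resolve it.
    const escapes = path.split("/").includes("..");
    const inScope = prefixes.some((prefix) => path.startsWith(prefix));
    return escapes || !inScope;
  });

  return { allowed: violations.length === 0, violations };
}
```

A real hook would feed this from the list of files in the commits being pushed and exit non-zero when `allowed` is false; how the actual hook gathers that list is not described here.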

wp-verify subsystem

Playwright suite that runs against the sandbox WP:

wp-verify.sh — orchestrator: brings up the WP stack, waits for WordPress + Gutenberg readiness, runs Playwright.
wp-verify/playwright.config.ts — host- or container-runnable; uses WP_BASE for the target URL.
wp-verify/global-setup.ts — authenticated storage state for /wp-admin tests.
wp-verify/check.cjs — non-Playwright sanity check.
wp-verify/eslint.config.mjs — local ESLint config.
wp-verify/mu-loader.php — MU-plugin loader for staged plugin code.
wp-verify/tests/ — dashboard-mount.spec.ts (active) and pie-chart-tooltip.spec.ts (skipped until a chart lands on the dashboard).
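
The "waits for WordPress + Gutenberg readiness" step in wp-verify.sh can be pictured as a bounded polling loop. The sketch below is an assumption about the general shape, written in TypeScript for illustration; the real orchestrator is a shell script whose probes and timeouts are not shown in this PR text. `waitUntilReady`, `Probe`, and `WaitOptions` are hypothetical names.

```typescript
// Hypothetical sketch of readiness polling. Not the logic in wp-verify.sh.
type Probe = () => Promise<boolean>;

interface WaitOptions {
  attempts: number; // maximum number of polling rounds
  delayMs: number; // pause between rounds
  sleep?: (ms: number) => Promise<void>; // injectable so tests need not wait
}

// Polls every named probe until all pass or attempts run out.
// Resolves to the names of probes that never passed (empty array means ready).
async function waitUntilReady(
  probes: Record<string, Probe>,
  opts: WaitOptions
): Promise<string[]> {
  const sleep =
    opts.sleep ?? ((ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms)));
  let pending = Object.keys(probes);

  for (let attempt = 1; attempt <= opts.attempts && pending.length > 0; attempt++) {
    const results = await Promise.all(
      pending.map(async (name) => {
        try {
          // A probe that throws (e.g. connection refused) counts as "not ready yet".
          return (await probes[name]()) ? null : name;
        } catch {
          return name;
        }
      })
    );
    pending = results.filter((n): n is string => n !== null);
    if (pending.length > 0 && attempt < opts.attempts) {
      await sleep(opts.delayMs);
    }
  }
  return pending;
}
```

In practice each probe would be something like an HTTP request against the sandbox site, with one probe for WordPress being installed and another for Gutenberg being active; the concrete checks are left out because the PR does not state them.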

Documentation

README.md — quick-start, env vars (WP_BASE, WP_VERIFY_HOST_PORT, WP_VERIFY_INSTANCE), and the relationship to the rest of the harness.
docs/{agent-boundaries,agent-metrics,build-runtime-contract,route-contract,ui-scope-contract}.md — contract documents that the rest of the harness (skills, review-cycle workflow) reference.
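
For the env vars listed under README.md above, a config file could resolve its target URL along these lines. This is a sketch under stated assumptions, not the contents of playwright.config.ts: it assumes WP_BASE holds a full http(s) URL and that WP_VERIFY_HOST_PORT is the host port bound on 127.0.0.1, used as a fallback when WP_BASE is unset. The fallback rule and the `resolveWpBase` name are hypothetical, and WP_VERIFY_INSTANCE is left out because its meaning is not stated in this PR text; README.md is the authority on all three.

```typescript
// Hypothetical sketch of target-URL resolution. README.md defines the real semantics.
type Env = Record<string, string | undefined>;

function resolveWpBase(env: Env): string {
  // Assumption: an explicit WP_BASE always wins.
  const explicit = env["WP_BASE"]?.trim();
  if (explicit) {
    if (!/^https?:\/\/\S+$/.test(explicit)) {
      throw new Error(`WP_BASE must be an http(s) URL, got "${explicit}"`);
    }
    // Drop trailing slashes so callers can append "/wp-admin/" safely.
    return explicit.replace(/\/+$/, "");
  }

  // Assumption: otherwise derive the URL from the opt-in 127.0.0.1 port binding.
  const rawPort = env["WP_VERIFY_HOST_PORT"]?.trim();
  if (rawPort) {
    const port = Number(rawPort);
    if (!Number.isInteger(port) || port < 1 || port > 65535) {
      throw new Error(`WP_VERIFY_HOST_PORT must be a TCP port, got "${rawPort}"`);
    }
    return `http://127.0.0.1:${port}`;
  }

  // No invented default: fail loudly and point at the README.
  throw new Error("Set WP_BASE or WP_VERIFY_HOST_PORT (see README.md)");
}
```

In a Node-based config this would be called as `resolveWpBase(process.env)`; taking the environment as a parameter keeps the function easy to test.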

What this PR is / isn't

Is: an opt-in tooling subsystem under tools/ai-sandbox/. Nothing else in the repo references this directory; behavior of every existing package is unaffected.
Isn't: a runtime dependency for any Jetpack plugin or package. The sandbox runs as a separate Docker stack; it's developer/agent tooling.

Why parallel batch

This is one of 4 PRs upstreaming accumulated work from dognose24/jetpack:

premium-analytics: add AGENTS.md, package docs, and research entries #49081 — premium-analytics package docs
tools/ai-sandbox: add containerized agent sandbox + wp-verify Playwright stack #49082 (this PR) — tools/ai-sandbox/ subsystem
agents: add premium-analytics implement/verify/regression skills + governance #49083 — .agents/skills/ + .claude/commands/ agent skills + governance
ci: add pr-review-cycle workflow #49084 — .github/workflows/pr-review-cycle.yml auto-triggered review workflow

This PR's README.md references the skills in PR #C; that reference is cosmetic — the subsystem itself works as standalone tooling without any skill files present.

Does this pull request change what data or activity we track or use?

No. The sandbox is opt-in developer/agent tooling; nothing is enqueued, persisted, or sent outside a developer's local Docker.

Testing instructions

cd tools/ai-sandbox && docker compose up -d --build — verify the agent container builds and starts.
tools/ai-sandbox/wp-verify.sh up — verify the WP + MySQL + WPCLI services come up healthy and WordPress finishes installation.
pnpm install && pnpm exec playwright test --config tools/ai-sandbox/wp-verify/playwright.config.ts from the repo root (with WP_BASE set per README.md) — verify the dashboard-mount spec passes; pie-chart-tooltip is intentionally skipped (no chart in the dashboard yet).
tools/ai-sandbox/wp-verify.sh down — verify clean teardown.

🤖 Generated with Claude Code

…ght stack

Adds a self-contained subsystem under tools/ai-sandbox/ for running AI-coding-agent flows against a real WordPress instance:

Container stack (Docker Compose):

* docker-compose.yml — agent container (jetpack-ai-sandbox) with the monorepo bind-mounted; mysql + wordpress + wpcli services live behind a `wp-verify` Compose profile so they only start when explicitly requested via wp-verify.sh.
* docker-compose.worktree.yml — co-mounts a host git worktree so multiple parallel agent sessions can share the same sandbox.
* docker-compose.wp-verify.yml — opt-in port binding (127.0.0.1) + Docker socket access for host-runnable wp-verify flows.
* Dockerfile — Node 24 + pnpm 10 + PHP 8.4 + Composer 2.9 base image.
* entrypoint.sh, hooks/pre-push — sandbox-side scope-gate enforcing what an agent can push.

wp-verify subsystem (Playwright against a real WP + Gutenberg):

* wp-verify.sh — orchestrator that brings up the WP stack, waits for WordPress + Gutenberg readiness, and runs Playwright tests against the running site.
* wp-verify/playwright.config.ts — host- or container-runnable test config; uses WP_BASE for the target URL.
* wp-verify/global-setup.ts — authenticated storage state for /wp-admin tests.
* wp-verify/check.cjs — non-Playwright sanity check.
* wp-verify/eslint.config.mjs — local ESLint config.
* wp-verify/mu-loader.php — MU-plugin loader for staged plugin code.
* wp-verify/tests/ — two example specs (dashboard-mount, pie-chart-tooltip — second one skipped until a chart lands on the dashboard).

docs/ — five contract documents that the rest of the harness (skills, review-cycle workflow) reference:

* agent-boundaries.md
* agent-metrics.md
* build-runtime-contract.md
* route-contract.md
* ui-scope-contract.md

README.md — quick-start, env vars (WP_BASE, WP_VERIFY_HOST_PORT, WP_VERIFY_INSTANCE), and the relationship to the rest of the harness.

This PR is one of a parallel batch upstreaming the dognose24/jetpack fork's accumulated tooling work. See "Related PRs" in the PR body.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

github-actions · 2026-05-22T02:16:17Z

Thank you for your PR!

When contributing to Jetpack, we have a few suggestions that can help us test and review your patch:

✅ Include a description of your PR changes.
✅ Add testing instructions.
✅ Specify whether this PR includes any changes to data or privacy.
✅ Add changelog entries to affected projects

This comment will be updated as you work on your PR and make changes. If you think that some of those checks are not needed for your PR, please explain why you think so. Thanks for cooperation 🤖

anomiex

As was previously requested on #48208, I suggest you add a line to .github/CODEOWNERS so you get pinged on changes to tools/ai-sandbox/; the jetpack-monorepo team doesn't have the bandwidth to maintain this at this time.

dognose24 requested a review from a team as a code owner May 22, 2026 02:15

This was referenced May 22, 2026

premium-analytics: add AGENTS.md, package docs, and research entries #49081

Open

agents: add premium-analytics implement/verify/regression skills + governance #49083

Open

ci: add pr-review-cycle workflow #49084

Open

github-actions Bot added the [Tests] Includes Tests and Docs labels May 22, 2026

github-actions Bot added the OSS Citizen This Pull Request was opened by an Open Source contributor. label May 22, 2026

dognose24 self-assigned this May 22, 2026

anomiex requested changes May 22, 2026

dognose24 wants to merge 1 commit into Automattic:trunk from dognose24:upstream/ai-sandbox-subsystem

2 participants
