tools/ai-sandbox: add containerized agent sandbox + wp-verify Playwright stack #49082

Open
dognose24 wants to merge 1 commit into Automattic:trunk from dognose24:upstream/ai-sandbox-subsystem

Contributor

@dognose24 dognose24 commented May 22, 2026

Proposed changes

Adds a self-contained containerized sandbox under tools/ai-sandbox/ for running AI-coding-agent flows against a real WordPress + Gutenberg instance, plus a wp-verify Playwright test harness that runs against the sandbox WP.

Container stack (Docker Compose)

  • docker-compose.yml — agent container (jetpack-ai-sandbox) with the monorepo bind-mounted; mysql + wordpress + wpcli services behind a wp-verify Compose profile so they only start when explicitly requested.
  • docker-compose.worktree.yml — co-mounts a host git worktree for parallel agent sessions on the same sandbox.
  • docker-compose.wp-verify.yml — opt-in port binding (127.0.0.1) + Docker socket access for host-runnable wp-verify flows.
  • Dockerfile — Node 24 + pnpm 10 + PHP 8.4 + Composer 2.9 base image.
  • entrypoint.sh, hooks/pre-push — sandbox-side scope-gate enforcing what an agent can push.
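The scope-gate idea in the last bullet can be illustrated with a small path check. This is a minimal sketch, not the contents of hooks/pre-push: the function names, the allow-list, and the example paths are all hypothetical, and the real hook may use different rules.

```typescript
// Hypothetical scope gate: names, allow-list, and paths are illustrative
// assumptions, not taken from hooks/pre-push.

/** Normalize a repo-relative path: forward slashes, no leading "./". */
function normalizePath(path: string): string {
  return path.replace(/\\/g, "/").replace(/^\.\//, "");
}

/** True when `path` is the allowed directory itself or lies beneath it. */
function isWithin(path: string, allowedDir: string): boolean {
  const p = normalizePath(path);
  const dir = normalizePath(allowedDir).replace(/\/+$/, "");
  // Compare against "dir/" so "tools/ai-sandbox-other" does not match "tools/ai-sandbox".
  return p === dir || p.startsWith(dir + "/");
}

/** Return the changed paths that fall outside every allowed directory. */
function outOfScopePaths(changed: string[], allowedDirs: string[]): string[] {
  return changed.filter((path) => {
    // Reject traversal segments outright rather than trying to resolve them.
    if (normalizePath(path).split("/").includes("..")) return true;
    return !allowedDirs.some((dir) => isWithin(path, dir));
  });
}

const violations = outOfScopePaths(
  ["tools/ai-sandbox/README.md", "projects/example/file.php"],
  ["tools/ai-sandbox"],
);
console.log(violations); // [ 'projects/example/file.php' ]
```

A hook built along these lines would refuse the push whenever the returned list is non-empty.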

wp-verify subsystem

Playwright suite that runs against the sandbox WP:

  • wp-verify.sh — orchestrator: brings up the WP stack, waits for WordPress + Gutenberg readiness, runs Playwright.
  • wp-verify/playwright.config.ts — host- or container-runnable; uses WP_BASE for the target URL.
  • wp-verify/global-setup.ts — authenticated storage state for /wp-admin tests.
  • wp-verify/check.cjs — non-Playwright sanity check.
  • wp-verify/eslint.config.mjs — local ESLint config.
  • wp-verify/mu-loader.php — MU-plugin loader for staged plugin code.
  • wp-verify/tests/dashboard-mount.spec.ts (active) and pie-chart-tooltip.spec.ts (skipped until a chart lands on the dashboard).
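The "waits for WordPress + Gutenberg readiness" step amounts to polling until the site answers. The following is a rough sketch of that pattern only; the probe, the attempt count, the delay, and the choice of `/wp-login.php` as the endpoint are assumptions for illustration and are not taken from wp-verify.sh.

```typescript
// Sketch only: probe, attempt count, and delay are illustrative
// assumptions, not values taken from wp-verify.sh.

type Probe = () => Promise<boolean>;

const sleep = (ms: number) =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

/** Poll `probe` until it reports ready or the attempts run out. */
async function waitUntilReady(
  probe: Probe,
  attempts: number,
  delayMs: number,
): Promise<boolean> {
  for (let i = 0; i < attempts; i++) {
    try {
      if (await probe()) return true;
    } catch {
      // A failed request is treated as "not up yet", not as a fatal error.
    }
    // Skip the delay after the final attempt.
    if (i < attempts - 1) await sleep(delayMs);
  }
  return false;
}

/** Example probe: a 2xx response from `url` counts as ready. */
function httpProbe(url: string): Probe {
  return async () => (await fetch(url)).ok;
}
```

A caller might run `await waitUntilReady(httpProbe(base + "/wp-login.php"), 30, 2000)` before starting Playwright, where `base` is the WP_BASE value; a Gutenberg-specific check would need its own probe.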

Documentation

  • README.md — quick-start, env vars (WP_BASE, WP_VERIFY_HOST_PORT, WP_VERIFY_INSTANCE), and the relationship to the rest of the harness.
  • docs/{agent-boundaries,agent-metrics,build-runtime-contract,route-contract,ui-scope-contract}.md — contract documents that the rest of the harness (skills, review-cycle workflow) reference.
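As an illustration of how the documented env vars could feed the Playwright target URL, here is a hedged sketch. WP_BASE and WP_VERIFY_HOST_PORT are named in the README entry above, but the fallback order, the `resolveBaseUrl` helper, and the port value in the example are assumptions; README.md remains the authority on the actual behavior.

```typescript
// Hypothetical helper: the fallback from WP_BASE to
// http://127.0.0.1:<WP_VERIFY_HOST_PORT> is an assumption for illustration.

type Env = Record<string, string | undefined>;

/** Resolve the site URL the tests should target, without a trailing slash. */
function resolveBaseUrl(env: Env): string {
  const explicit = env.WP_BASE?.trim();
  if (explicit) {
    // The URL constructor throws on malformed input, which surfaces typos early.
    return new URL(explicit).href.replace(/\/+$/, "");
  }
  const port = env.WP_VERIFY_HOST_PORT?.trim();
  if (port && /^\d+$/.test(port)) {
    // Loopback only, matching the 127.0.0.1 port binding described above.
    return `http://127.0.0.1:${port}`;
  }
  throw new Error("Set WP_BASE or WP_VERIFY_HOST_PORT");
}

console.log(resolveBaseUrl({ WP_BASE: "http://localhost:8888/" })); // http://localhost:8888
```

In a Playwright config, the result would typically be passed as `use.baseURL`, for example `use: { baseURL: resolveBaseUrl(process.env) }`.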

What this PR is / isn't

  • Is: an opt-in tooling subsystem under tools/ai-sandbox/. Nothing else in the repo references this directory; behavior of every existing package is unaffected.
  • Isn't: a runtime dependency for any Jetpack plugin or package. The sandbox runs as a separate Docker stack; it's developer/agent tooling.

Why parallel batch

This is one of 4 PRs upstreaming accumulated work from dognose24/jetpack:

This PR's README.md references the skills in PR #C; that reference is cosmetic — the subsystem itself works as standalone tooling without any skill files present.

Does this pull request change what data or activity we track or use?

No. The sandbox is opt-in developer/agent tooling; nothing is enqueued, persisted, or sent outside a developer's local Docker.

Testing instructions

  1. cd tools/ai-sandbox && docker compose up -d --build — verify the agent container builds and starts.
  2. tools/ai-sandbox/wp-verify.sh up — verify the WP + MySQL + WPCLI services come up healthy and WordPress finishes installation.
  3. pnpm install && pnpm exec playwright test --config tools/ai-sandbox/wp-verify/playwright.config.ts from the repo root (with WP_BASE set per README.md) — verify the dashboard-mount spec passes; pie-chart-tooltip is intentionally skipped (no chart in the dashboard yet).
  4. tools/ai-sandbox/wp-verify.sh down — verify clean teardown.

🤖 Generated with Claude Code

…ght stack

Adds a self-contained subsystem under tools/ai-sandbox/ for running
AI-coding-agent flows against a real WordPress instance:

Container stack (Docker Compose):
* docker-compose.yml — agent container (jetpack-ai-sandbox) with the
  monorepo bind-mounted; mysql + wordpress + wpcli services live
  behind a `wp-verify` Compose profile so they only start when
  explicitly requested via wp-verify.sh.
* docker-compose.worktree.yml — co-mounts a host git worktree so
  multiple parallel agent sessions can share the same sandbox.
* docker-compose.wp-verify.yml — opt-in port binding (127.0.0.1) +
  Docker socket access for host-runnable wp-verify flows.
* Dockerfile — Node 24 + pnpm 10 + PHP 8.4 + Composer 2.9 base image.
* entrypoint.sh, hooks/pre-push — sandbox-side scope-gate enforcing
  what an agent can push.

wp-verify subsystem (Playwright against a real WP + Gutenberg):
* wp-verify.sh — orchestrator that brings up the WP stack, waits for
  WordPress + Gutenberg readiness, and runs Playwright tests against
  the running site.
* wp-verify/playwright.config.ts — host- or container-runnable test
  config; uses WP_BASE for the target URL.
* wp-verify/global-setup.ts — authenticated storage state for
  /wp-admin tests.
* wp-verify/check.cjs — non-Playwright sanity check.
* wp-verify/eslint.config.mjs — local ESLint config.
* wp-verify/mu-loader.php — MU-plugin loader for staged plugin code.
* wp-verify/tests/ — two example specs (dashboard-mount,
  pie-chart-tooltip — second one skipped until a chart lands on the
  dashboard).

docs/ — five contract documents that the rest of the harness (skills,
review-cycle workflow) reference:
* agent-boundaries.md
* agent-metrics.md
* build-runtime-contract.md
* route-contract.md
* ui-scope-contract.md

README.md — quick-start, env vars (WP_BASE, WP_VERIFY_HOST_PORT,
WP_VERIFY_INSTANCE), and the relationship to the rest of the harness.

This PR is one of a parallel batch upstreaming the dognose24/jetpack
fork's accumulated tooling work. See "Related PRs" in the PR body.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
@github-actions
Contributor

Thank you for your PR!

When contributing to Jetpack, we have a few suggestions that can help us test and review your patch:

  • ✅ Include a description of your PR changes.
  • ✅ Add testing instructions.
  • ✅ Specify whether this PR includes any changes to data or privacy.
  • ✅ Add changelog entries to affected projects

This comment will be updated as you work on your PR and make changes. If you think that some of those checks are not needed for your PR, please explain why you think so. Thanks for cooperation 🤖


@github-actions github-actions bot added the OSS Citizen label May 22, 2026
@dognose24 dognose24 self-assigned this May 22, 2026
Contributor

@anomiex anomiex left a comment

As was previously requested on #48208, I suggest you add a line to .github/CODEOWNERS so you get pinged on changes to tools/ai-sandbox/; the jetpack-monorepo team doesn't have the bandwidth to maintain this at this time.

Labels

Docs, OSS Citizen, [Tests] Includes Tests

2 participants