11 changes: 6 additions & 5 deletions docs/AGENTS.md
@@ -59,6 +59,7 @@ description: Agent instructions for AI assistants working on the Mux codebase
Use `agent-browser` for web automation. Run `agent-browser --help` for all commands.

Core workflow:

1. `agent-browser open <url>` - Navigate to page
2. `agent-browser snapshot -i` - Get interactive elements with refs (@e1, @e2)
3. `agent-browser click @e1` / `fill @e2 "text"` - Interact using refs
@@ -68,8 +69,8 @@ Core workflow:

- If a PR has Codex review comments, address + resolve them, then re-request review by commenting `@codex review` on the PR.
- Prefer `gh` CLI for GitHub interactions over manual web/curl flows.
- In Orchestrator mode, delegate implementation/verification commands to `exec` or `explore` sub-agents and integrate their patches; do not bypass delegation with direct local edits.
- In Orchestrator mode, route higher-complexity implementation tasks to `plan` sub-agents so they can research and produce a precise plan before auto-handoff to implementation.
- When delegation is required by the active mode, use `exec` or `explore` sub-agents as directed and integrate their patches; do not bypass delegation with direct local edits.
- Keep implementation tasks on `exec` sub-agents; use a top-level plan workspace when you need a separate planning phase before delegation.

- User preference: when work is already on an open PR, push branch updates at the end of each completed change set so the PR stays current.
- **PR creation gate:** Do **not** open/create a pull request unless the user explicitly asks (e.g., "open a PR", "create PR", "submit this"). By default, complete local validation, commit/push branch updates as requested, and let the user review before deciding whether to open a PR.
@@ -81,11 +82,11 @@ Core workflow:
When a PR exists, you MUST remain in this loop until the PR is fully ready:

1. Push your latest fixes.
2. Run local validation (`make static-check` and targeted tests as needed); in Orchestrator mode, delegate command execution to sub-agents.
2. Run local validation (`make static-check` and targeted tests as needed); delegate command execution to sub-agents when the active mode requires it.
3. Request review with `@codex review`.
4. Run `./scripts/wait_pr_ready.sh <pr_number>` (which must execute `./scripts/wait_pr_checks.sh <pr_number> --once` while checks are pending).
5. If Codex leaves comments, address them (delegate fixes in Orchestrator mode), resolve threads with `./scripts/resolve_pr_comment.sh <thread_id>`, push, and repeat.
6. If checks/mergeability fail, fix issues locally (delegate fixes in Orchestrator mode), push, and repeat.
5. If Codex leaves comments, address them (delegating fixes when required by the active mode), resolve threads with `./scripts/resolve_pr_comment.sh <thread_id>`, push, and repeat.
6. If checks/mergeability fail, fix issues locally (delegating fixes when required by the active mode), push, and repeat.

The only early-stop exception is when the reviewer is clearly misunderstanding the intended change and further churn would be counterproductive. In that case, leave a clarifying PR comment and pause for human direction.
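The decision points in the loop above (steps 5 and 6 plus the early-stop exception) can be sketched as a small pure function. This is only an illustration: `PrStatus`, `NextAction`, and `nextAction` are hypothetical names that do not appear in the repository, and the order in which comments and failing checks are handled is an assumption, since the document does not rank them.

```typescript
// Hypothetical snapshot of a PR's state; none of these names come from the repo.
interface PrStatus {
  unresolvedCodexComments: number;
  checksPassing: boolean;
  mergeable: boolean;
  reviewerMisunderstandsChange: boolean;
}

type NextAction = "pause-for-human" | "address-comments" | "fix-checks" | "ready";

// Decide what the agent should do next while it stays in the PR loop.
function nextAction(status: PrStatus): NextAction {
  // Early-stop exception: leave a clarifying comment and wait for a human.
  if (status.reviewerMisunderstandsChange) return "pause-for-human";
  // Step 5: address and resolve review comments, then push and repeat.
  // (Handling comments before checks is an assumption of this sketch.)
  if (status.unresolvedCodexComments > 0) return "address-comments";
  // Step 6: fix failing checks or mergeability problems, then push and repeat.
  if (!status.checksPassing || !status.mergeable) return "fix-checks";
  // Nothing left to fix: the PR is fully ready and the loop can end.
  return "ready";
}
```

Every result other than `"ready"` and `"pause-for-human"` implies another pass through push, validation, and `@codex review`.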

26 changes: 12 additions & 14 deletions docs/agents/index.mdx
@@ -436,7 +436,7 @@ When a plan is present (default):
- Treat the accepted plan as the source of truth. Its file paths, symbols, and structure were validated during planning — do not routinely spawn `explore` to re-confirm them. Exception: if the plan references stale paths or appears to have been authored/edited by the user without planner validation, a single targeted `explore` to sanity-check critical paths is acceptable.
- Spawning `explore` to gather _additional_ context beyond what the plan provides is encouraged (e.g., checking whether a helper already exists, locating test files not mentioned in the plan, discovering existing patterns to match). This produces better implementation task briefs.
- Do not spawn `explore` just to verify that a planner-generated plan is correct — that is the planner's job, and the plan was accepted by the user.
- Convert the plan into concrete implementation subtasks and start delegation (`exec` for low complexity, `plan` for higher complexity).
- Convert the plan into concrete implementation subtasks and start delegation with `exec` sub-agents.

What you are allowed to do directly in this workspace:

@@ -452,8 +452,8 @@ Hard rules (delegate-first):
- Trust `explore` sub-agent reports as authoritative for repo facts (paths/symbols/callsites). Do not redo the same investigation yourself; only re-check if the report is ambiguous or contradicts other evidence.
- For correctness claims, an `explore` sub-agent report counts as having read the referenced files.
- **Do not do broad repo investigation here.** If you need context, spawn an `explore` sub-agent with a narrow prompt (keeps this agent focused on coordination).
- **Do not implement features/bugfixes directly here.** Spawn `exec` (simple) or `plan` (complex) sub-agents and have them complete the work end-to-end.
- **Do not use `bash` for file reads/writes, manual code editing, or broad repo exploration.** `bash` in this workspace is for orchestration-only operations: `git`/`gh` repo management, targeted post-apply verification checks, and waiting for PR review/CI outcomes. If direct checks fail due to code issues, delegate fixes to `exec`/`plan` sub-agents instead of implementing changes here.
- **Do not implement features/bugfixes directly here.** Spawn `exec` sub-agents and have them complete the work end-to-end.
- **Do not use `bash` for file reads/writes, manual code editing, or broad repo exploration.** `bash` in this workspace is for orchestration-only operations: `git`/`gh` repo management, targeted post-apply verification checks, and waiting for PR review/CI outcomes. If direct checks fail due to code issues, delegate fixes to `exec` sub-agents instead of implementing changes here.
- **Never read or scan session storage.** This includes `~/.mux/sessions/**` and `~/.mux/sessions/subagent-patches/**`. Treat session storage as an internal implementation detail; do not shell out to locate patch artifacts on disk. Only use `task_apply_git_patch` to access patches.

Delegation guide:
@@ -474,12 +474,10 @@ Delegation guide:
Trust Explore reports as authoritative; do not re-verify unless ambiguous/contradictory.
If starting points + acceptance are already clear, skip initial explore and only explore when blocked.
- Create one or more git commits before `agent_report`.
- Use `plan` for higher-complexity subtasks that touch multiple files/locations, require non-trivial investigation, or have an unclear implementation approach.
- Default to `plan` when a subtask needs coordinated updates across multiple locations, unless the edits are mechanical and already fully specified.
- For higher-complexity implementation work, prefer `plan` over `exec` so the sub-agent can do targeted research and produce a precise plan before implementation begins.
- Use `exec` for implementation subtasks, including higher-complexity work.
- For higher-complexity work, do a small amount of parent-side framing first so the `exec` brief includes the goal, constraints, sequencing, and key files.
- Good fit: multi-file refactors, cross-module behavior changes, unfamiliar subsystems, or work where sequencing/dependencies need discovery.
- Plan subtasks automatically hand off to implementation after a successful `propose_plan`; expect the usual task completion output once implementation finishes.
- For `plan` briefs, prioritize goal + constraints + acceptance criteria over file-by-file diff instructions.
- If the implementation approach is still unclear after targeted exploration, switch to a top-level plan workspace before continuing delegation instead of spawning a plan sub-agent.
- Use `desktop` for GUI-heavy desktop automation that requires repeated screenshot → act → verify loops (for example, interacting with application windows, clicking through UI flows, or visual verification). The desktop agent enforces a grounding discipline that keeps visual context local.

Recommended Orchestrator → Exec task brief template:
@@ -505,7 +503,7 @@ Recommended Orchestrator → Exec task brief template:
If starting points + acceptance are already clear, skip initial explore and only explore when blocked.
- Create one or more git commits before `agent_report`.

Dependency analysis (required before spawning implementation tasks — `exec` or `plan`):
Dependency analysis (required before spawning implementation tasks):

- For each candidate subtask, write:
- Outputs: files/targets/artifacts introduced/renamed/generated
@@ -526,9 +524,9 @@ Example dependency chain (schema download → generation):
Patch integration loop (default):

1. Identify a batch of independent subtasks.
2. Spawn one implementation sub-agent task per subtask with `run_in_background: true` (`exec` for low complexity, `plan` for higher complexity).
2. Spawn one `exec` implementation sub-agent task per subtask with `run_in_background: true`.
3. Await the batch via `task_await`.
4. For each successful implementation task (`exec` directly, or `plan` after auto-handoff to implementation), integrate patches one at a time:
4. For each successful implementation task, integrate patches one at a time:
- Treat every successful child task with a `taskId` as pending patch integration, whether the completion arrived inline from `task` or later from `task_await`.
- Complete each dry-run + real-apply pair before starting the next patch. Applying one patch changes `HEAD`, which can invalidate later dry-run results.
- Dry-run apply: `task_apply_git_patch` with `dry_run: true`.
@@ -544,11 +542,11 @@ Patch integration loop (default):
- Run focused verification directly with `bash` when practical (for example: targeted tests or the repo's standard full-validation command), or delegate verification to `explore`/`exec` when investigation/fixes are likely.
- Use `git`/`gh` directly for PR orchestration when a PR already exists (pushes, review-request comments, replies to review remarks, and CI/check-status waiting loops). Create a new PR only when the user explicitly asks.
- PASS: summary-only (no long logs).
- FAIL: include the failing command + key error lines; then delegate a fix to `exec`/`plan` and re-verify.
- FAIL: include the failing command + key error lines; then delegate a fix to `exec` and re-verify.

Sequential protocol (only for dependency chains):

1. Spawn the prerequisite implementation task (`exec` or `plan`, based on complexity) with `run_in_background: false`.
1. Spawn the prerequisite implementation task with `agentId: "exec"` and `run_in_background: false`.
2. If step 1 returns `queued`/`running` without a completed report, call `task_await` with the returned `taskId` before attempting any patch apply. If step 1 returns `status: completed` inline, that same `taskId` still requires patch application.
3. Dry-run apply its patch (`dry_run: true`); then apply for real (`dry_run: false`). If either step fails, follow the conflict playbook above (including `git am --abort` only when a real apply leaves a git-am session in progress).
4. Only after the patch is applied, spawn the dependent implementation task.
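The one-patch-at-a-time rule shared by the patch integration loop and the sequential protocol can be sketched as follows. This is a minimal illustration under stated assumptions: the `ApplyPatch` callback stands in for `task_apply_git_patch`, and its signature and result shape are hypothetical. Only the `dry_run` flag is taken from the document.

```typescript
// Hypothetical stand-in for the task_apply_git_patch tool; only `dry_run` mirrors the doc.
type ApplyResult = { ok: boolean };
type ApplyPatch = (taskId: string, opts: { dry_run: boolean }) => Promise<ApplyResult>;

interface IntegrationOutcome {
  applied: string[];
  failed?: { taskId: string; phase: "dry-run" | "apply" };
}

// Integrate patches strictly in sequence: each dry-run + real-apply pair
// completes before the next patch is touched, because a real apply moves HEAD
// and would invalidate any dry-run performed earlier for a later patch.
async function integratePatches(
  taskIds: string[],
  applyPatch: ApplyPatch
): Promise<IntegrationOutcome> {
  const applied: string[] = [];
  for (const taskId of taskIds) {
    const dry = await applyPatch(taskId, { dry_run: true });
    if (!dry.ok) return { applied, failed: { taskId, phase: "dry-run" } };

    const real = await applyPatch(taskId, { dry_run: false });
    if (!real.ok) return { applied, failed: { taskId, phase: "apply" } };

    applied.push(taskId);
  }
  return { applied };
}
```

The sketch stops at the first failure so the caller can follow the conflict playbook before any dependent task is spawned.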
@@ -579,7 +577,7 @@ description: Create a plan before coding
ui:
color: var(--color-plan-mode)
subagent:
runnable: true
runnable: false
tools:
add:
# Allow all tools by default (includes MCP tools which have dynamic names)
1 change: 0 additions & 1 deletion src/browser/components/icons/EmojiIcon/EmojiIcon.tsx
@@ -48,7 +48,6 @@ const EMOJI_TO_ICON: Record<string, LucideIcon> = {
"🔗": Link,
"🔄": RefreshCw,
"🧪": Beaker,
// Used by auto-handoff routing status while selecting the executor.
"🤔": CircleHelp,

// Directions
41 changes: 0 additions & 41 deletions src/browser/features/Settings/Sections/TasksSection.tsx
@@ -34,9 +34,7 @@ import {
import {
  DEFAULT_TASK_SETTINGS,
  TASK_SETTINGS_LIMITS,
  isPlanSubagentExecutorRouting,
  normalizeTaskSettings,
  type PlanSubagentExecutorRouting,
  type TaskSettings,
} from "@/common/types/tasks";
import { getThinkingOptionLabel, type ThinkingLevel } from "@/common/types/thinking";
@@ -173,8 +171,6 @@ function areTaskSettingsEqual(a: TaskSettings, b: TaskSettings): boolean {
a.maxParallelAgentTasks === b.maxParallelAgentTasks &&
a.maxTaskNestingDepth === b.maxTaskNestingDepth &&
a.proposePlanImplementReplacesChatHistory === b.proposePlanImplementReplacesChatHistory &&
a.planSubagentExecutorRouting === b.planSubagentExecutorRouting &&
a.planSubagentDefaultsToOrchestrator === b.planSubagentDefaultsToOrchestrator &&
a.bashOutputCompactionMinLines === b.bashOutputCompactionMinLines &&
a.bashOutputCompactionMinTotalBytes === b.bashOutputCompactionMinTotalBytes &&
a.bashOutputCompactionMaxKeptLines === b.bashOutputCompactionMaxKeptLines &&
@@ -499,25 +495,10 @@ export function TasksSection() {
);
};

const setPlanSubagentExecutorRouting = (value: string) => {
  if (!isPlanSubagentExecutorRouting(value)) {
    return;
  }

  setTaskSettings((prev) =>
    normalizeTaskSettings({
      ...prev,
      planSubagentExecutorRouting: value,
    })
  );
};
const setNewWorkspaceDefaultAgentId = (agentId: string) => {
  setGlobalDefaultAgentIdRaw(coerceAgentId(agentId));
};

const planSubagentExecutorRouting: PlanSubagentExecutorRouting =
  taskSettings.planSubagentExecutorRouting ?? "exec";

const setAgentModel = (agentId: string, value: string) => {
setAgentAiDefaults((prev) =>
updateAgentDefaultEntry(prev, agentId, (updated) => {
@@ -917,28 +898,6 @@ export function TasksSection() {
      aria-label="Toggle plan Implement replaces conversation with plan"
    />
  </div>

  <div className="flex items-center justify-between gap-4">
    <div className="flex-1">
      <div className="text-foreground text-sm">Plan sub-agents: executor routing</div>
      <div className="text-muted text-xs">
        Choose how plan sub-agent tasks route after propose_plan.
      </div>
    </div>
    <Select
      value={planSubagentExecutorRouting}
      onValueChange={setPlanSubagentExecutorRouting}
    >
      <SelectTrigger className="border-border-medium bg-background-secondary h-9 w-44">
        <SelectValue />
      </SelectTrigger>
      <SelectContent>
        <SelectItem value="exec">Exec</SelectItem>
        <SelectItem value="orchestrator">Orchestrator</SelectItem>
        <SelectItem value="auto">Auto (Agent chooses)</SelectItem>
      </SelectContent>
    </Select>
  </div>
</div>

{saveError ? <div className="text-danger-light mt-4 text-xs">{saveError}</div> : null}
@@ -6,12 +6,7 @@ import {
  createUserMessage,
  createAssistantMessage,
  createProposePlanTool,
  createStatusTool,
} from "@/browser/stories/mockFactory";
import {
  PLAN_AUTO_ROUTING_STATUS_EMOJI,
  PLAN_AUTO_ROUTING_STATUS_MESSAGE,
} from "@/common/constants/planAutoRoutingStatus";

const meta = { ...appMeta, title: "App/Chat/Tools/ProposePlan" };
export default meta;
@@ -167,84 +162,6 @@ graph TD
},
};

/**
* Captures the handoff pause after a plan is presented and before the executor stream starts.
*
* This reproduces the visual state where the sidebar shows "Deciding execution strategy…"
* while the proposed plan remains visible in the conversation.
*/
export const ProposePlanAutoRoutingDecisionGap: AppStory = {
render: () => (
<AppWithMocks
setup={() =>
setupSimpleChatStory({
workspaceId: "ws-plan-auto-routing-gap",
workspaceName: "feature/plan-auto-routing",
messages: [
createUserMessage(
"msg-1",
"Plan and implement a safe migration rollout for auth tokens.",
{
historySequence: 1,
timestamp: STABLE_TIMESTAMP - 240000,
}
),
createAssistantMessage("msg-2", "Here is the implementation plan.", {
historySequence: 2,
timestamp: STABLE_TIMESTAMP - 230000,
toolCalls: [
createProposePlanTool(
"call-plan-1",
`# Auth Token Migration Rollout

## Goals

- Migrate token validation to the new signing service.
- Maintain compatibility during rollout.
- Keep rollback simple and low risk.

## Steps

1. Add dual-read token validation behind a feature flag.
2. Ship telemetry for token verification outcomes.
3. Enable new validator for 10% of traffic.
4. Ramp to 100% after stability checks.
5. Remove legacy validator once metrics stay healthy.

## Rollback

- Disable the rollout flag to return to legacy validation immediately.
- Keep telemetry running to confirm recovery.`
),
],
}),
createAssistantMessage("msg-3", "Selecting the right executor for this plan.", {
historySequence: 3,
timestamp: STABLE_TIMESTAMP - 220000,
toolCalls: [
createStatusTool(
"call-status-1",
PLAN_AUTO_ROUTING_STATUS_EMOJI,
PLAN_AUTO_ROUTING_STATUS_MESSAGE
),
],
}),
],
})
}
/>
),
parameters: {
docs: {
description: {
story:
"Chromatic regression story for the plan auto-routing gap: after `propose_plan` succeeds, " +
"the sidebar stays in a working state with a 'Deciding execution strategy…' status before executor kickoff.",
},
},
},
};

/**
* Mobile viewport version of ProposePlan.
*
4 changes: 2 additions & 2 deletions src/common/config/schemas/appConfigOnDisk.ts
@@ -8,8 +8,8 @@ import { TaskSettingsSchema } from "./taskSettings";

export { RuntimeEnablementOverridesSchema } from "../../schemas/runtimeEnablement";
export type { RuntimeEnablementOverrides } from "../../schemas/runtimeEnablement";
export { PlanSubagentExecutorRoutingSchema, TaskSettingsSchema } from "./taskSettings";
export type { PlanSubagentExecutorRouting, TaskSettings } from "./taskSettings";
export { TaskSettingsSchema } from "./taskSettings";
export type { TaskSettings } from "./taskSettings";

export const AgentAiDefaultsEntrySchema = z.object({
modelString: z.string().optional(),
6 changes: 0 additions & 6 deletions src/common/config/schemas/taskSettings.ts
@@ -12,10 +12,6 @@ export const SYSTEM1_BASH_OUTPUT_COMPACTION_LIMITS = {
bashOutputCompactionTimeoutMs: { min: 1_000, max: 120_000, default: 5_000 },
} as const;

export const PlanSubagentExecutorRoutingSchema = z.enum(["exec", "orchestrator", "auto"]);

export type PlanSubagentExecutorRouting = z.infer<typeof PlanSubagentExecutorRoutingSchema>;

export const TaskSettingsSchema = z.object({
maxParallelAgentTasks: z
.number()
@@ -30,8 +26,6 @@ export const TaskSettingsSchema = z.object({
.max(TASK_SETTINGS_LIMITS.maxTaskNestingDepth.max)
.optional(),
proposePlanImplementReplacesChatHistory: z.boolean().optional(),
planSubagentExecutorRouting: PlanSubagentExecutorRoutingSchema.optional(),
planSubagentDefaultsToOrchestrator: z.boolean().optional(),
bashOutputCompactionMinLines: z
.number()
.int()
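Because `planSubagentExecutorRouting` and `planSubagentDefaultsToOrchestrator` are removed from `TaskSettingsSchema`, config files written by earlier versions may still contain them. Zod object schemas strip unrecognized keys by default when parsing, but whether this schema relies on that default is not visible in the diff. The sketch below shows one way such keys could be dropped explicitly before validation. The helper `dropRemovedTaskSettings` and the constant `REMOVED_TASK_SETTING_KEYS` are hypothetical and not part of this PR.

```typescript
// Keys this PR removes from TaskSettingsSchema (names taken from the diff).
const REMOVED_TASK_SETTING_KEYS = [
  "planSubagentExecutorRouting",
  "planSubagentDefaultsToOrchestrator",
] as const;

// Hypothetical helper: return a shallow copy of a raw task-settings object
// without the removed keys, leaving every other field untouched.
function dropRemovedTaskSettings(raw: Record<string, unknown>): Record<string, unknown> {
  const cleaned: Record<string, unknown> = { ...raw };
  for (const key of REMOVED_TASK_SETTING_KEYS) {
    delete cleaned[key];
  }
  return cleaned;
}

// Example: a legacy on-disk settings object that still carries the old routing field.
const legacySettings = { maxParallelAgentTasks: 3, planSubagentExecutorRouting: "auto" };
console.log(dropRemovedTaskSettings(legacySettings)); // prints { maxParallelAgentTasks: 3 }
```

The helper copies its input rather than mutating it, so the original on-disk object stays available for logging or migration diagnostics.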
4 changes: 0 additions & 4 deletions src/common/constants/planAutoRoutingStatus.ts

This file was deleted.
