7 changes: 4 additions & 3 deletions docs/AGENTS.md
Original file line number Diff line number Diff line change
@@ -59,6 +59,7 @@ description: Agent instructions for AI assistants working on the Mux codebase
Use `agent-browser` for web automation. Run `agent-browser --help` for all commands.

Core workflow:

1. `agent-browser open <url>` - Navigate to page
2. `agent-browser snapshot -i` - Get interactive elements with refs (@e1, @e2)
3. `agent-browser click @e1` / `fill @e2 "text"` - Interact using refs
@@ -128,7 +129,7 @@ Mobile app tests live in `mobile/src/**/*.test.ts` and use Bun's built-in test r
- Never use emoji characters as UI icons or status indicators; emoji rendering varies across platforms and fonts.
- Prefer SVG icons (usually from `lucide-react`) or shared icon components under `src/browser/components/icons/`.
- For tool call headers, use `ToolIcon` from `src/browser/components/tools/shared/ToolPrimitives.tsx`.
- If a tool/agent provides an emoji string (e.g., `status_set` or `displayStatus`), render via `EmojiIcon` (`src/browser/components/icons/EmojiIcon.tsx`) instead of rendering the emoji.
- If a tool/agent provides an emoji string (e.g., todo-derived status or `displayStatus`), render via `EmojiIcon` (`src/browser/components/icons/EmojiIcon.tsx`) instead of rendering the emoji.
- If a new emoji appears in tool output, extend `EmojiIcon` to map it to an SVG icon.
- Colors defined in `src/browser/styles/globals.css` (`:root @theme` block). Reference via CSS variables (e.g., `var(--color-plan-mode)`), never hardcode hex values.
- For incrementing numeric UI (costs, timers, token counts, percentages), use semantic numeric typography utilities (`counter-nums` / `counter-nums-mono`) to prevent width jitter.
@@ -229,9 +230,9 @@ Freely make breaking changes, and reorganize / cleanup IPC as needed.
- E2E tests (tests/e2e) work with Radix but are slow (~2min startup); reserve for scenarios that truly need real Electron.
- Only use `validateApiKeys()` in tests that actually make AI API calls.

## Tool: status_set
## Tool: todo_write

- Set status url to the Pull Request once opened
- Keep the TODO list current during multi-step work; sidebar progress is derived from it.

## GitHub

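The `EmojiIcon` guidance in the `docs/AGENTS.md` hunk above (map emoji strings from tools to SVG icons, and extend the mapping when a new emoji appears) can be pictured as a lookup table. The following is a minimal sketch only: the real `EmojiIcon` component is not shown in this diff, so `EMOJI_TO_ICON_NAME`, `FALLBACK_ICON_NAME`, and `iconNameForEmoji` are hypothetical names, and the icon names are illustrative choices.

```typescript
// Hypothetical lookup table; the real EmojiIcon component may be organised differently.
// Keys are emoji that appear in this PR's tests; values are illustrative icon names.
const EMOJI_TO_ICON_NAME: Record<string, string> = {
  "🔄": "RefreshCw",
  "🔧": "Wrench",
  "🔍": "Search",
};

// Assumed fallback for emoji that have not been mapped yet.
const FALLBACK_ICON_NAME = "Circle";

// Resolve an emoji string to an icon name instead of rendering the emoji itself.
function iconNameForEmoji(emoji: string): string {
  return EMOJI_TO_ICON_NAME[emoji] ?? FALLBACK_ICON_NAME;
}
```

Under this sketch, supporting a new emoji from tool output means adding one entry to the table rather than touching call sites.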
1 change: 0 additions & 1 deletion docs/agents/index.mdx
Original file line number Diff line number Diff line change
@@ -632,7 +632,6 @@ tools:
- ask_user_question
- todo_read
- todo_write
- status_set
- notify
- analytics_query
---
8 changes: 4 additions & 4 deletions docs/agents/instruction-files.mdx
Original file line number Diff line number Diff line change
@@ -69,7 +69,7 @@ Be terse and to the point.

## Model: openai:.\*codex

Use status reporting tools every few minutes.
Keep the todo list current every few minutes while a task is in flight.
```

### Tool Prompts
@@ -92,12 +92,12 @@ Customize how the AI uses specific tools by appending instructions to their desc

- Run `prettier --write` after editing files

## Tool: status_set
## Tool: todo_write

- Set status URL to the Pull Request once opened
- Keep the TODO list current during multi-step work; sidebar progress is derived from it.
```

**Common tools** (varies by model/provider): `bash`, `file_read`, `file_edit_replace_string`, `file_edit_insert`, `propose_plan`, `ask_user_question`, `todo_write`, `todo_read`, `status_set`, `web_fetch`, `web_search`.
**Common tools** (varies by model/provider): `bash`, `file_read`, `file_edit_replace_string`, `file_edit_insert`, `propose_plan`, `ask_user_question`, `todo_write`, `todo_read`, `web_fetch`, `web_search`.

## Practical layout

4 changes: 2 additions & 2 deletions docs/config/notifications.mdx
Original file line number Diff line number Diff line change
@@ -43,7 +43,7 @@ The recommended way to configure the `notify` tool is via a `Tool: notify` scope
- Notify on CI failures or deployment issues
- Notify when waiting for user input longer than 30 seconds
- Do not notify for routine status updates
- Use status_set for progress updates instead
- Use `todo_write` for routine progress updates instead
```

See [Instruction Files](/agents/instruction-files) for more on scoped instructions.
@@ -94,7 +94,7 @@ notify: {
description:
"Send a system notification to the user. Use this to alert the user about important events that require their attention, such as long-running task completion, errors requiring intervention, or questions. " +
"Notifications appear as OS-native notifications (macOS Notification Center, Windows Toast, Linux). " +
"Infer whether to send notifications from user instructions. If no instructions provided, reserve notifications for major wins or blocking issues. Do not use for routine status updates (use status_set instead).",
"Infer whether to send notifications from user instructions. If no instructions provided, reserve notifications for major wins or blocking issues. Do not use for routine progress updates — keep the todo list current instead.",
schema: z
.object({
title: z
11 changes: 0 additions & 11 deletions docs/hooks/tools.mdx
Original file line number Diff line number Diff line change
@@ -574,17 +574,6 @@ If a value is too large for the environment, it may be omitted (not set). Mux al

</details>

<details>
<summary>status_set (3)</summary>

| Env var | JSON path | Type | Description |
| ------------------------ | --------- | ------ | -------------------------------------------------------------------------------------------------------------------------------------------- |
| `MUX_TOOL_INPUT_EMOJI` | `emoji` | string | A single emoji character representing the current activity |
| `MUX_TOOL_INPUT_MESSAGE` | `message` | string | A brief description of the current activity (auto-truncated to 60 chars with ellipsis if needed) |
| `MUX_TOOL_INPUT_URL` | `url` | string | Optional URL to external resource with more details (e.g., Pull Request URL). The URL persists and is displayed to the user for easy access. |

</details>

<details>
<summary>switch_agent (3)</summary>

Original file line number Diff line number Diff line change
@@ -6,12 +6,9 @@ import {
createUserMessage,
createAssistantMessage,
createProposePlanTool,
createStatusTool,
createTodoWriteTool,
} from "@/browser/stories/mockFactory";
import {
PLAN_AUTO_ROUTING_STATUS_EMOJI,
PLAN_AUTO_ROUTING_STATUS_MESSAGE,
} from "@/common/constants/planAutoRoutingStatus";
import { PLAN_AUTO_ROUTING_STATUS_MESSAGE } from "@/common/constants/planAutoRoutingStatus";

const meta = { ...appMeta, title: "App/Chat/Tools/ProposePlan" };
export default meta;
@@ -221,13 +218,7 @@ export const ProposePlanAutoRoutingDecisionGap: AppStory = {
createAssistantMessage("msg-3", "Selecting the right executor for this plan.", {
historySequence: 3,
timestamp: STABLE_TIMESTAMP - 220000,
toolCalls: [
createStatusTool(
"call-status-1",
PLAN_AUTO_ROUTING_STATUS_EMOJI,
PLAN_AUTO_ROUTING_STATUS_MESSAGE
),
],
toolCalls: [createTodoWriteTool("call-status-1", PLAN_AUTO_ROUTING_STATUS_MESSAGE)],
}),
],
})
11 changes: 9 additions & 2 deletions src/browser/features/Tools/Shared/getToolComponent.ts
Original file line number Diff line number Diff line change
@@ -56,11 +56,17 @@ interface ToolRegistryEntry {
* Registry mapping tool names to their components and validation schemas.
* Adding a new tool: add one line here.
*
* Note: Some tools (ask_user_question, propose_plan, todo_write, status_set) require
* Note: Some tools (ask_user_question, propose_plan, todo_write) require
* props like workspaceId/toolCallId that aren't available in nested context. This is
* fine because the backend excludes these from code_execution sandbox (see EXCLUDED_TOOLS
* in src/node/services/ptc/toolBridge.ts). They can never appear in nested tool calls.
*/
const legacyStatusSetSchema = z.object({
emoji: z.string(),
message: z.string(),
url: z.string().url().optional().nullable(),
});

const TOOL_REGISTRY: Record<string, ToolRegistryEntry> = {
bash: { component: BashToolCall, schema: TOOL_DEFINITIONS.bash.schema },
file_read: { component: FileReadToolCall, schema: TOOL_DEFINITIONS.file_read.schema },
@@ -120,7 +126,8 @@ const TOOL_REGISTRY: Record<string, ToolRegistryEntry> = {
schema: TOOL_DEFINITIONS.propose_plan.schema,
},
todo_write: { component: TodoToolCall, schema: TOOL_DEFINITIONS.todo_write.schema },
status_set: { component: StatusSetToolCall, schema: TOOL_DEFINITIONS.status_set.schema },
// Legacy-only transcript renderer for historical status_set calls.
status_set: { component: StatusSetToolCall, schema: legacyStatusSetSchema },
switch_agent: {
component: SwitchAgentToolCall,
schema: TOOL_DEFINITIONS.switch_agent.schema,
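The `legacyStatusSetSchema` added in the `getToolComponent.ts` hunk above keeps historical `status_set` calls renderable after the tool definition is removed. As a rough, dependency-free illustration of what that schema accepts, the type guard below approximates it in plain TypeScript. This is a sketch, not the project's code: `LegacyStatusSetArgs`, `isValidUrl`, and `isLegacyStatusSetArgs` are hypothetical names, and the `new URL()` check only approximates zod's `.url()` validation.

```typescript
// Hypothetical type mirroring the fields of legacyStatusSetSchema in the diff.
interface LegacyStatusSetArgs {
  emoji: string;
  message: string;
  url?: string | null;
}

// Approximation of a URL check; zod's own rules may differ in edge cases.
function isValidUrl(value: string): boolean {
  try {
    new URL(value);
    return true;
  } catch {
    return false;
  }
}

// Accepts objects with string emoji/message and an optional, nullable URL string.
function isLegacyStatusSetArgs(value: unknown): value is LegacyStatusSetArgs {
  if (typeof value !== "object" || value === null) return false;
  const record = value as Record<string, unknown>;
  if (typeof record.emoji !== "string" || typeof record.message !== "string") return false;
  const url = record.url;
  if (url === undefined || url === null) return true;
  return typeof url === "string" && isValidUrl(url);
}
```

Keeping the schema local to the registry means old transcripts still validate without re-exporting `status_set` from `TOOL_DEFINITIONS`.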
31 changes: 0 additions & 31 deletions src/browser/features/Tools/StatusSet/StatusSetToolCall.stories.tsx

This file was deleted.

19 changes: 18 additions & 1 deletion src/browser/features/Tools/TodoToolCall.tsx
Original file line number Diff line number Diff line change
@@ -1,5 +1,8 @@
import React from "react";
import { EmojiIcon } from "@/browser/components/icons/EmojiIcon/EmojiIcon";
import { TodoList } from "@/browser/components/TodoList/TodoList";
import type { TodoWriteToolArgs, TodoWriteToolResult } from "@/common/types/tools";
import { deriveTodoStatus } from "@/common/utils/todoList";
import {
ToolContainer,
ToolHeader,
@@ -9,7 +12,6 @@ import {
ToolIcon,
} from "./Shared/ToolPrimitives";
import { useToolExpansion, getStatusDisplay, type ToolStatus } from "./Shared/toolUtils";
import { TodoList } from "@/browser/components/TodoList/TodoList";

interface TodoToolCallProps {
args: TodoWriteToolArgs;
@@ -24,12 +26,27 @@ export const TodoToolCall: React.FC<TodoToolCallProps> = ({
}) => {
const { expanded, toggleExpanded } = useToolExpansion(false); // Collapsed by default
const statusDisplay = getStatusDisplay(status);
const todoStatusPreview = deriveTodoStatus(args.todos);
const fallbackPreview =
args.todos.length === 0
? "Cleared todo list"
: `${args.todos.length} item${args.todos.length === 1 ? "" : "s"}`;

return (
<ToolContainer expanded={expanded}>
<ToolHeader onClick={toggleExpanded}>
<ExpandIcon expanded={expanded}>▶</ExpandIcon>
<ToolIcon toolName="todo_write" />
<span className="text-muted-foreground flex min-w-0 flex-1 items-center gap-1 italic">
{todoStatusPreview ? (
<>
<EmojiIcon emoji={todoStatusPreview.emoji} className="h-3 w-3 shrink-0" />
<span className="truncate">{todoStatusPreview.message}</span>
</>
) : (
<span className="truncate">{fallbackPreview}</span>
)}
</span>
<StatusIndicator status={status}>{statusDisplay}</StatusIndicator>
</ToolHeader>

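The header preview added to `TodoToolCall.tsx` above picks between a todo-derived status and a count-based fallback. The sketch below restates that selection as a pure function so the branches are easy to see. It rests on assumptions: `deriveTodoStatus` itself is not shown in this diff, so `deriveTodoStatusSketch` simply surfaces the first `in_progress` item with a 🔄 emoji (which is what the `WorkspaceStore` tests expect), and the `TodoItem` shape, including the `completed` status, is a guess at the real type.

```typescript
// Hypothetical shapes; the real types live in "@/common/types/tools".
type TodoState = "pending" | "in_progress" | "completed";

interface TodoItem {
  content: string;
  status: TodoState;
}

interface StatusPreview {
  emoji: string;
  message: string;
}

// Assumed behaviour: surface the first in-progress item, otherwise nothing.
function deriveTodoStatusSketch(todos: TodoItem[]): StatusPreview | undefined {
  const active = todos.find((todo) => todo.status === "in_progress");
  return active ? { emoji: "🔄", message: active.content } : undefined;
}

// Mirrors the header logic in the diff: derived status first, then the fallback text.
function headerPreview(todos: TodoItem[]): string {
  const derived = deriveTodoStatusSketch(todos);
  if (derived) return `${derived.emoji} ${derived.message}`;
  if (todos.length === 0) return "Cleared todo list";
  return `${todos.length} item${todos.length === 1 ? "" : "s"}`;
}
```

For example, a list with one `in_progress` item titled "Run typecheck" previews as that item, while a list of only pending items falls back to the item count.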
118 changes: 116 additions & 2 deletions src/browser/stores/WorkspaceStore.test.ts
Original file line number Diff line number Diff line change
@@ -8,6 +8,7 @@ import {
getAutoCompactionThresholdKey,
getAutoRetryKey,
getPinnedTodoExpandedKey,
getStatusStateKey,
} from "@/common/constants/storage";
import type { TodoItem } from "@/common/types/tools";
import { WorkspaceStore } from "./WorkspaceStore";
@@ -1783,7 +1784,8 @@ describe("WorkspaceStore", () => {
streaming: true,
lastModel: "claude-sonnet-4",
lastThinkingLevel: "high",
agentStatus: { emoji: "🔧", message: "Running checks", url: "https://example.com" },
todoStatus: { emoji: "🔄", message: "Run checks" },
hasTodos: true,
};

// Recreate the store so the first activity.list call uses this test snapshot.
@@ -1809,10 +1811,122 @@ describe("WorkspaceStore", () => {
expect(state.canInterrupt).toBe(true);
expect(state.currentModel).toBe(activitySnapshot.lastModel);
expect(state.currentThinkingLevel).toBe(activitySnapshot.lastThinkingLevel);
expect(state.agentStatus).toEqual(activitySnapshot.agentStatus ?? undefined);
expect(state.agentStatus).toEqual(activitySnapshot.todoStatus ?? undefined);
expect(state.recencyTimestamp).toBe(activitySnapshot.recency);
});

it("falls back to persisted activity todoStatus for active workspaces when replayed todos are absent", async () => {
const workspaceId = "active-activity-todo-fallback";
const activitySnapshot: WorkspaceActivitySnapshot = {
recency: new Date("2024-01-04T09:00:00.000Z").getTime(),
streaming: true,
lastModel: "claude-sonnet-4",
lastThinkingLevel: null,
todoStatus: { emoji: "🔄", message: "Persisted todo snapshot" },
hasTodos: true,
};

store.dispose();
store = new WorkspaceStore(mockOnModelUsed);
mockActivityList.mockResolvedValue({ [workspaceId]: activitySnapshot });
// eslint-disable-next-line @typescript-eslint/no-unsafe-argument, @typescript-eslint/no-explicit-any
store.setClient(mockClient as any);
await new Promise((resolve) => setTimeout(resolve, 0));

createAndAddWorkspace(store, workspaceId);
const state = store.getWorkspaceState(workspaceId);
expect(state.agentStatus).toEqual(activitySnapshot.todoStatus ?? undefined);
});

it("derives active workspace status from the current todo list", () => {
const workspaceId = "active-todo-status-workspace";
createAndAddWorkspace(store, workspaceId);
seedPinnedTodos(store, workspaceId, [
{ content: "Run typecheck", status: "in_progress" },
{ content: "Add regression test", status: "pending" },
]);

const state = store.getWorkspaceState(workspaceId);
expect(state.agentStatus).toEqual({ emoji: "🔄", message: "Run typecheck" });
});

it("prefers todo-derived activity status for inactive workspaces", async () => {
const workspaceId = "activity-fallback-todo-status-workspace";
const activitySnapshot: WorkspaceActivitySnapshot = {
recency: new Date("2024-01-04T12:00:00.000Z").getTime(),
streaming: true,
lastModel: "claude-sonnet-4",
lastThinkingLevel: "high",
todoStatus: { emoji: "🔄", message: "Run typecheck" },
hasTodos: true,
};

store.dispose();
store = new WorkspaceStore(mockOnModelUsed);
mockActivityList.mockResolvedValue({ [workspaceId]: activitySnapshot });
// eslint-disable-next-line @typescript-eslint/no-unsafe-argument, @typescript-eslint/no-explicit-any
store.setClient(mockClient as any);
await new Promise((resolve) => setTimeout(resolve, 0));

createAndAddWorkspace(store, workspaceId, { createdAt: "2020-01-01T00:00:00.000Z" }, false);

const state = store.getWorkspaceState(workspaceId);
expect(state.agentStatus).toEqual(activitySnapshot.todoStatus ?? undefined);
});

it("prefers transient displayStatus over todo-derived status for inactive workspaces", async () => {
const workspaceId = "activity-fallback-display-status-workspace";
const activitySnapshot: WorkspaceActivitySnapshot = {
recency: new Date("2024-01-04T15:00:00.000Z").getTime(),
streaming: false,
lastModel: "claude-sonnet-4",
lastThinkingLevel: null,
displayStatus: { emoji: "🤔", message: "Deciding execution strategy" },
todoStatus: { emoji: "🔄", message: "Run typecheck" },
hasTodos: true,
};

store.dispose();
store = new WorkspaceStore(mockOnModelUsed);
mockActivityList.mockResolvedValue({ [workspaceId]: activitySnapshot });
// eslint-disable-next-line @typescript-eslint/no-unsafe-argument, @typescript-eslint/no-explicit-any
store.setClient(mockClient as any);
await new Promise((resolve) => setTimeout(resolve, 0));

createAndAddWorkspace(store, workspaceId, { createdAt: "2020-01-01T00:00:00.000Z" }, false);

const state = store.getWorkspaceState(workspaceId);
expect(state.agentStatus).toEqual(activitySnapshot.displayStatus ?? undefined);
});

it("suppresses stale legacy status fallback when activity says the todo list is empty", async () => {
const workspaceId = "activity-fallback-empty-todo-status";
const activitySnapshot: WorkspaceActivitySnapshot = {
recency: new Date("2024-01-04T18:00:00.000Z").getTime(),
streaming: false,
lastModel: "claude-sonnet-4",
lastThinkingLevel: null,
hasTodos: false,
};

localStorageBacking.set(
getStatusStateKey(workspaceId),
JSON.stringify({ emoji: "🔍", message: "Old persisted status" })
);

store.dispose();
store = new WorkspaceStore(mockOnModelUsed);
mockActivityList.mockResolvedValue({ [workspaceId]: activitySnapshot });
// eslint-disable-next-line @typescript-eslint/no-unsafe-argument, @typescript-eslint/no-explicit-any
store.setClient(mockClient as any);
await new Promise((resolve) => setTimeout(resolve, 0));

createAndAddWorkspace(store, workspaceId, { createdAt: "2020-01-01T00:00:00.000Z" }, false);

const state = store.getWorkspaceState(workspaceId);
expect(state.agentStatus).toBeUndefined();
});

it("fires response-complete callback when a background workspace stops streaming", async () => {
const activeWorkspaceId = "active-workspace";
const backgroundWorkspaceId = "background-workspace";
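Taken together, the new `WorkspaceStore` tests for inactive workspaces imply a precedence order for the sidebar status: a transient `displayStatus` wins, then the todo-derived `todoStatus`, and a stale legacy status is suppressed when the snapshot reports `hasTodos: false`. The function below is a hedged sketch of that ordering, not the store's actual implementation. `ActivitySnapshotLike`, `StatusLike`, and `resolveSidebarStatus` are hypothetical names, and returning the legacy status when `hasTodos` is absent is an assumption inferred from the test title rather than something the diff shows.

```typescript
// Hypothetical, trimmed-down shapes based on the fields used in the tests.
interface StatusLike {
  emoji: string;
  message: string;
  url?: string;
}

interface ActivitySnapshotLike {
  displayStatus?: StatusLike | null;
  todoStatus?: StatusLike | null;
  hasTodos?: boolean;
}

// Precedence implied by the tests: displayStatus > todoStatus > legacy fallback,
// with the legacy fallback suppressed once the activity snapshot says the list is empty.
function resolveSidebarStatus(
  snapshot: ActivitySnapshotLike,
  legacyStatus?: StatusLike
): StatusLike | undefined {
  if (snapshot.displayStatus) return snapshot.displayStatus;
  if (snapshot.todoStatus) return snapshot.todoStatus;
  if (snapshot.hasTodos === false) return undefined;
  // Assumption: with no todo information at all, an old persisted status may still be shown.
  return legacyStatus;
}
```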