Commit b637bc0

🤖 refactor: derive workspace status from todo list
Derive workspace sidebar and landing-page progress from the current todo list.

- publish todo-derived activity snapshots for live and background workspaces
- prefer todo-derived status over legacy status_set payloads in the UI
- remove status_set from the default tool surface and refresh agent/docs guidance

---

_Generated with `mux` • Model: `openai:gpt-5.4` • Thinking: `xhigh` • Cost: `$n/a`_

<!-- mux-attribution: model=openai:gpt-5.4 thinking=xhigh costs=n/a -->
1 parent 8ec0a10 commit b637bc0

33 files changed

Lines changed: 692 additions & 654 deletions

docs/AGENTS.md

Lines changed: 4 additions & 3 deletions
Original file line number | Diff line number | Diff line change
@@ -59,6 +59,7 @@ description: Agent instructions for AI assistants working on the Mux codebase
5959
Use `agent-browser` for web automation. Run `agent-browser --help` for all commands.
6060

6161
Core workflow:
62+
6263
1. `agent-browser open <url>` - Navigate to page
6364
2. `agent-browser snapshot -i` - Get interactive elements with refs (@e1, @e2)
6465
3. `agent-browser click @e1` / `fill @e2 "text"` - Interact using refs
@@ -128,7 +129,7 @@ Mobile app tests live in `mobile/src/**/*.test.ts` and use Bun's built-in test r
128129
- Never use emoji characters as UI icons or status indicators; emoji rendering varies across platforms and fonts.
129130
- Prefer SVG icons (usually from `lucide-react`) or shared icon components under `src/browser/components/icons/`.
130131
- For tool call headers, use `ToolIcon` from `src/browser/components/tools/shared/ToolPrimitives.tsx`.
131-
- If a tool/agent provides an emoji string (e.g., `status_set` or `displayStatus`), render via `EmojiIcon` (`src/browser/components/icons/EmojiIcon.tsx`) instead of rendering the emoji.
132+
- If a tool/agent provides an emoji string (e.g., todo-derived status or `displayStatus`), render via `EmojiIcon` (`src/browser/components/icons/EmojiIcon.tsx`) instead of rendering the emoji.
132133
- If a new emoji appears in tool output, extend `EmojiIcon` to map it to an SVG icon.
133134
- Colors defined in `src/browser/styles/globals.css` (`:root @theme` block). Reference via CSS variables (e.g., `var(--color-plan-mode)`), never hardcode hex values.
134135
- For incrementing numeric UI (costs, timers, token counts, percentages), use semantic numeric typography utilities (`counter-nums` / `counter-nums-mono`) to prevent width jitter.
@@ -229,9 +230,9 @@ Freely make breaking changes, and reorganize / cleanup IPC as needed.
229230
- E2E tests (tests/e2e) work with Radix but are slow (~2min startup); reserve for scenarios that truly need real Electron.
230231
- Only use `validateApiKeys()` in tests that actually make AI API calls.
231232

232-
## Tool: status_set
233+
## Tool: todo_write
233234

234-
- Set status url to the Pull Request once opened
235+
- Keep the TODO list current during multi-step work; sidebar progress is derived from it.
235236

236237
## GitHub
237238

docs/agents/index.mdx

Lines changed: 0 additions & 1 deletion
@@ -632,7 +632,6 @@ tools:
632632
- ask_user_question
633633
- todo_read
634634
- todo_write
635-
- status_set
636635
- notify
637636
- analytics_query
638637
---

docs/agents/instruction-files.mdx

Lines changed: 4 additions & 4 deletions
@@ -69,7 +69,7 @@ Be terse and to the point.
6969

7070
## Model: openai:.\*codex
7171

72-
Use status reporting tools every few minutes.
72+
Keep the todo list current every few minutes while a task is in flight.
7373
```
7474

7575
### Tool Prompts
@@ -92,12 +92,12 @@ Customize how the AI uses specific tools by appending instructions to their desc
9292

9393
- Run `prettier --write` after editing files
9494

95-
## Tool: status_set
95+
## Tool: todo_write
9696

97-
- Set status URL to the Pull Request once opened
97+
- Keep the TODO list current during multi-step work; sidebar progress is derived from it.
9898
```
9999

100-
**Common tools** (varies by model/provider): `bash`, `file_read`, `file_edit_replace_string`, `file_edit_insert`, `propose_plan`, `ask_user_question`, `todo_write`, `todo_read`, `status_set`, `web_fetch`, `web_search`.
100+
**Common tools** (varies by model/provider): `bash`, `file_read`, `file_edit_replace_string`, `file_edit_insert`, `propose_plan`, `ask_user_question`, `todo_write`, `todo_read`, `web_fetch`, `web_search`.
101101

102102
## Practical layout
103103

docs/config/notifications.mdx

Lines changed: 2 additions & 2 deletions
@@ -43,7 +43,7 @@ The recommended way to configure the `notify` tool is via a `Tool: notify` scope
4343
- Notify on CI failures or deployment issues
4444
- Notify when waiting for user input longer than 30 seconds
4545
- Do not notify for routine status updates
46-
- Use status_set for progress updates instead
46+
- Use `todo_write` for routine progress updates instead
4747
```
4848

4949
See [Instruction Files](/agents/instruction-files) for more on scoped instructions.
@@ -94,7 +94,7 @@ notify: {
9494
description:
9595
"Send a system notification to the user. Use this to alert the user about important events that require their attention, such as long-running task completion, errors requiring intervention, or questions. " +
9696
"Notifications appear as OS-native notifications (macOS Notification Center, Windows Toast, Linux). " +
97-
"Infer whether to send notifications from user instructions. If no instructions provided, reserve notifications for major wins or blocking issues. Do not use for routine status updates (use status_set instead).",
97+
"Infer whether to send notifications from user instructions. If no instructions provided, reserve notifications for major wins or blocking issues. Do not use for routine progress updates — keep the todo list current instead.",
9898
schema: z
9999
.object({
100100
title: z

docs/hooks/tools.mdx

Lines changed: 0 additions & 11 deletions
@@ -574,17 +574,6 @@ If a value is too large for the environment, it may be omitted (not set). Mux al
574574

575575
</details>
576576

577-
<details>
578-
<summary>status_set (3)</summary>
579-
580-
| Env var | JSON path | Type | Description |
581-
| ------------------------ | --------- | ------ | -------------------------------------------------------------------------------------------------------------------------------------------- |
582-
| `MUX_TOOL_INPUT_EMOJI` | `emoji` | string | A single emoji character representing the current activity |
583-
| `MUX_TOOL_INPUT_MESSAGE` | `message` | string | A brief description of the current activity (auto-truncated to 60 chars with ellipsis if needed) |
584-
| `MUX_TOOL_INPUT_URL` | `url` | string | Optional URL to external resource with more details (e.g., Pull Request URL). The URL persists and is displayed to the user for easy access. |
585-
586-
</details>
587-
588577
<details>
589578
<summary>switch_agent (3)</summary>
590579

src/browser/features/Tools/ProposePlan/ProposePlanToolCall.stories.tsx

Lines changed: 3 additions & 12 deletions
@@ -6,12 +6,9 @@ import {
66
createUserMessage,
77
createAssistantMessage,
88
createProposePlanTool,
9-
createStatusTool,
9+
createTodoWriteTool,
1010
} from "@/browser/stories/mockFactory";
11-
import {
12-
PLAN_AUTO_ROUTING_STATUS_EMOJI,
13-
PLAN_AUTO_ROUTING_STATUS_MESSAGE,
14-
} from "@/common/constants/planAutoRoutingStatus";
11+
import { PLAN_AUTO_ROUTING_STATUS_MESSAGE } from "@/common/constants/planAutoRoutingStatus";
1512

1613
const meta = { ...appMeta, title: "App/Chat/Tools/ProposePlan" };
1714
export default meta;
@@ -221,13 +218,7 @@ export const ProposePlanAutoRoutingDecisionGap: AppStory = {
221218
createAssistantMessage("msg-3", "Selecting the right executor for this plan.", {
222219
historySequence: 3,
223220
timestamp: STABLE_TIMESTAMP - 220000,
224-
toolCalls: [
225-
createStatusTool(
226-
"call-status-1",
227-
PLAN_AUTO_ROUTING_STATUS_EMOJI,
228-
PLAN_AUTO_ROUTING_STATUS_MESSAGE
229-
),
230-
],
221+
toolCalls: [createTodoWriteTool("call-status-1", PLAN_AUTO_ROUTING_STATUS_MESSAGE)],
231222
}),
232223
],
233224
})

src/browser/features/Tools/Shared/getToolComponent.ts

Lines changed: 1 addition & 3 deletions
@@ -22,7 +22,6 @@ import { WebSearchToolCall } from "../WebSearchToolCall";
2222
import { AskUserQuestionToolCall } from "../AskUserQuestionToolCall";
2323
import { ProposePlanToolCall } from "../ProposePlanToolCall";
2424
import { TodoToolCall } from "../TodoToolCall";
25-
import { StatusSetToolCall } from "../StatusSetToolCall";
2625
import { SwitchAgentToolCall } from "../SwitchAgentToolCall";
2726
import { NotifyToolCall } from "../NotifyToolCall";
2827
import { BashBackgroundListToolCall } from "../BashBackgroundListToolCall";
@@ -56,7 +55,7 @@ interface ToolRegistryEntry {
5655
* Registry mapping tool names to their components and validation schemas.
5756
* Adding a new tool: add one line here.
5857
*
59-
* Note: Some tools (ask_user_question, propose_plan, todo_write, status_set) require
58+
* Note: Some tools (ask_user_question, propose_plan, todo_write) require
6059
* props like workspaceId/toolCallId that aren't available in nested context. This is
6160
* fine because the backend excludes these from code_execution sandbox (see EXCLUDED_TOOLS
6261
* in src/node/services/ptc/toolBridge.ts). They can never appear in nested tool calls.
@@ -120,7 +119,6 @@ const TOOL_REGISTRY: Record<string, ToolRegistryEntry> = {
120119
schema: TOOL_DEFINITIONS.propose_plan.schema,
121120
},
122121
todo_write: { component: TodoToolCall, schema: TOOL_DEFINITIONS.todo_write.schema },
123-
status_set: { component: StatusSetToolCall, schema: TOOL_DEFINITIONS.status_set.schema },
124122
switch_agent: {
125123
component: SwitchAgentToolCall,
126124
schema: TOOL_DEFINITIONS.switch_agent.schema,
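
Note on the registry change above: dropping the `status_set` entry means a lookup for that tool name no longer resolves to a dedicated component. The following is a minimal sketch of that lookup pattern, not the actual `getToolComponent` implementation. The names `ToolRegistryEntrySketch`, `TOOL_REGISTRY_SKETCH`, `getToolComponentSketch`, and the `"GenericToolCall"` fallback are hypothetical; component names are plain strings here, and a hand-written predicate stands in for the real zod schema.

```typescript
// Hypothetical stand-in: the real entries pair a React component with a zod schema.
interface ToolRegistryEntrySketch {
  component: string;
  isValidArgs: (args: unknown) => boolean;
}

const isRecord = (value: unknown): value is Record<string, unknown> =>
  typeof value === "object" && value !== null;

// One line per tool, as the registry comment in the diff describes.
const TOOL_REGISTRY_SKETCH: Record<string, ToolRegistryEntrySketch> = {
  todo_write: {
    component: "TodoToolCall",
    isValidArgs: (args) => isRecord(args) && Array.isArray(args.todos),
  },
  // status_set is intentionally absent, mirroring its removal in this commit.
};

// Assumption: unknown tools or invalid args fall back to a generic renderer.
function getToolComponentSketch(toolName: string, args: unknown): string {
  const known = Object.prototype.hasOwnProperty.call(TOOL_REGISTRY_SKETCH, toolName);
  const entry = known ? TOOL_REGISTRY_SKETCH[toolName] : undefined;
  return entry && entry.isValidArgs(args) ? entry.component : "GenericToolCall";
}
```

Under these assumptions, old transcripts that still contain `status_set` calls would render through whatever the generic path is, rather than failing the lookup.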

src/browser/features/Tools/StatusSet/StatusSetToolCall.stories.tsx

Lines changed: 0 additions & 31 deletions
This file was deleted.

src/browser/features/Tools/TodoToolCall.tsx

Lines changed: 18 additions & 1 deletion
@@ -1,5 +1,8 @@
11
import React from "react";
2+
import { EmojiIcon } from "@/browser/components/icons/EmojiIcon/EmojiIcon";
3+
import { TodoList } from "@/browser/components/TodoList/TodoList";
24
import type { TodoWriteToolArgs, TodoWriteToolResult } from "@/common/types/tools";
5+
import { deriveTodoStatus } from "@/common/utils/todoList";
36
import {
47
ToolContainer,
58
ToolHeader,
@@ -9,7 +12,6 @@ import {
912
ToolIcon,
1013
} from "./Shared/ToolPrimitives";
1114
import { useToolExpansion, getStatusDisplay, type ToolStatus } from "./Shared/toolUtils";
12-
import { TodoList } from "@/browser/components/TodoList/TodoList";
1315

1416
interface TodoToolCallProps {
1517
args: TodoWriteToolArgs;
@@ -24,12 +26,27 @@ export const TodoToolCall: React.FC<TodoToolCallProps> = ({
2426
}) => {
2527
const { expanded, toggleExpanded } = useToolExpansion(false); // Collapsed by default
2628
const statusDisplay = getStatusDisplay(status);
29+
const todoStatusPreview = deriveTodoStatus(args.todos);
30+
const fallbackPreview =
31+
args.todos.length === 0
32+
? "Cleared todo list"
33+
: `${args.todos.length} item${args.todos.length === 1 ? "" : "s"}`;
2734

2835
return (
2936
<ToolContainer expanded={expanded}>
3037
<ToolHeader onClick={toggleExpanded}>
3138
<ExpandIcon expanded={expanded}></ExpandIcon>
3239
<ToolIcon toolName="todo_write" />
40+
<span className="text-muted-foreground flex min-w-0 flex-1 items-center gap-1 italic">
41+
{todoStatusPreview ? (
42+
<>
43+
<EmojiIcon emoji={todoStatusPreview.emoji} className="h-3 w-3 shrink-0" />
44+
<span className="truncate">{todoStatusPreview.message}</span>
45+
</>
46+
) : (
47+
<span className="truncate">{fallbackPreview}</span>
48+
)}
49+
</span>
3350
<StatusIndicator status={status}>{statusDisplay}</StatusIndicator>
3451
</ToolHeader>
3552

src/browser/stores/WorkspaceStore.test.ts

Lines changed: 93 additions & 2 deletions
@@ -8,6 +8,7 @@ import {
88
getAutoCompactionThresholdKey,
99
getAutoRetryKey,
1010
getPinnedTodoExpandedKey,
11+
getStatusStateKey,
1112
} from "@/common/constants/storage";
1213
import type { TodoItem } from "@/common/types/tools";
1314
import { WorkspaceStore } from "./WorkspaceStore";
@@ -1783,7 +1784,8 @@ describe("WorkspaceStore", () => {
17831784
streaming: true,
17841785
lastModel: "claude-sonnet-4",
17851786
lastThinkingLevel: "high",
1786-
agentStatus: { emoji: "🔧", message: "Running checks", url: "https://example.com" },
1787+
todoStatus: { emoji: "🔄", message: "Run checks" },
1788+
hasTodos: true,
17871789
};
17881790

17891791
// Recreate the store so the first activity.list call uses this test snapshot.
@@ -1809,10 +1811,99 @@ describe("WorkspaceStore", () => {
18091811
expect(state.canInterrupt).toBe(true);
18101812
expect(state.currentModel).toBe(activitySnapshot.lastModel);
18111813
expect(state.currentThinkingLevel).toBe(activitySnapshot.lastThinkingLevel);
1812-
expect(state.agentStatus).toEqual(activitySnapshot.agentStatus ?? undefined);
1814+
expect(state.agentStatus).toEqual(activitySnapshot.todoStatus ?? undefined);
18131815
expect(state.recencyTimestamp).toBe(activitySnapshot.recency);
18141816
});
18151817

1818+
it("derives active workspace status from the current todo list", () => {
1819+
const workspaceId = "active-todo-status-workspace";
1820+
createAndAddWorkspace(store, workspaceId);
1821+
seedPinnedTodos(store, workspaceId, [
1822+
{ content: "Run typecheck", status: "in_progress" },
1823+
{ content: "Add regression test", status: "pending" },
1824+
]);
1825+
1826+
const state = store.getWorkspaceState(workspaceId);
1827+
expect(state.agentStatus).toEqual({ emoji: "🔄", message: "Run typecheck" });
1828+
});
1829+
1830+
it("prefers todo-derived activity status for inactive workspaces", async () => {
1831+
const workspaceId = "activity-fallback-todo-status-workspace";
1832+
const activitySnapshot: WorkspaceActivitySnapshot = {
1833+
recency: new Date("2024-01-04T12:00:00.000Z").getTime(),
1834+
streaming: true,
1835+
lastModel: "claude-sonnet-4",
1836+
lastThinkingLevel: "high",
1837+
todoStatus: { emoji: "🔄", message: "Run typecheck" },
1838+
hasTodos: true,
1839+
};
1840+
1841+
store.dispose();
1842+
store = new WorkspaceStore(mockOnModelUsed);
1843+
mockActivityList.mockResolvedValue({ [workspaceId]: activitySnapshot });
1844+
// eslint-disable-next-line @typescript-eslint/no-unsafe-argument, @typescript-eslint/no-explicit-any
1845+
store.setClient(mockClient as any);
1846+
await new Promise((resolve) => setTimeout(resolve, 0));
1847+
1848+
createAndAddWorkspace(store, workspaceId, { createdAt: "2020-01-01T00:00:00.000Z" }, false);
1849+
1850+
const state = store.getWorkspaceState(workspaceId);
1851+
expect(state.agentStatus).toEqual(activitySnapshot.todoStatus ?? undefined);
1852+
});
1853+
1854+
it("prefers transient displayStatus over todo-derived status for inactive workspaces", async () => {
1855+
const workspaceId = "activity-fallback-display-status-workspace";
1856+
const activitySnapshot: WorkspaceActivitySnapshot = {
1857+
recency: new Date("2024-01-04T15:00:00.000Z").getTime(),
1858+
streaming: false,
1859+
lastModel: "claude-sonnet-4",
1860+
lastThinkingLevel: null,
1861+
displayStatus: { emoji: "🤔", message: "Deciding execution strategy" },
1862+
todoStatus: { emoji: "🔄", message: "Run typecheck" },
1863+
hasTodos: true,
1864+
};
1865+
1866+
store.dispose();
1867+
store = new WorkspaceStore(mockOnModelUsed);
1868+
mockActivityList.mockResolvedValue({ [workspaceId]: activitySnapshot });
1869+
// eslint-disable-next-line @typescript-eslint/no-unsafe-argument, @typescript-eslint/no-explicit-any
1870+
store.setClient(mockClient as any);
1871+
await new Promise((resolve) => setTimeout(resolve, 0));
1872+
1873+
createAndAddWorkspace(store, workspaceId, { createdAt: "2020-01-01T00:00:00.000Z" }, false);
1874+
1875+
const state = store.getWorkspaceState(workspaceId);
1876+
expect(state.agentStatus).toEqual(activitySnapshot.displayStatus ?? undefined);
1877+
});
1878+
1879+
it("suppresses stale legacy status fallback when activity says the todo list is empty", async () => {
1880+
const workspaceId = "activity-fallback-empty-todo-status";
1881+
const activitySnapshot: WorkspaceActivitySnapshot = {
1882+
recency: new Date("2024-01-04T18:00:00.000Z").getTime(),
1883+
streaming: false,
1884+
lastModel: "claude-sonnet-4",
1885+
lastThinkingLevel: null,
1886+
hasTodos: false,
1887+
};
1888+
1889+
localStorageBacking.set(
1890+
getStatusStateKey(workspaceId),
1891+
JSON.stringify({ emoji: "🔍", message: "Old persisted status" })
1892+
);
1893+
1894+
store.dispose();
1895+
store = new WorkspaceStore(mockOnModelUsed);
1896+
mockActivityList.mockResolvedValue({ [workspaceId]: activitySnapshot });
1897+
// eslint-disable-next-line @typescript-eslint/no-unsafe-argument, @typescript-eslint/no-explicit-any
1898+
store.setClient(mockClient as any);
1899+
await new Promise((resolve) => setTimeout(resolve, 0));
1900+
1901+
createAndAddWorkspace(store, workspaceId, { createdAt: "2020-01-01T00:00:00.000Z" }, false);
1902+
1903+
const state = store.getWorkspaceState(workspaceId);
1904+
expect(state.agentStatus).toBeUndefined();
1905+
});
1906+
18161907
it("fires response-complete callback when a background workspace stops streaming", async () => {
18171908
const activeWorkspaceId = "active-workspace";
18181909
const backgroundWorkspaceId = "background-workspace";
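
Note on the new tests above: together they imply a precedence order for the sidebar status of an inactive workspace. The sketch below encodes that order as a pure function. It is a reading of the test expectations, not the `WorkspaceStore` code; `StatusPayload`, `ActivitySnapshotSketch`, and `resolveAgentStatus` are hypothetical names, and the handling of snapshots where `hasTodos` is `true` or missing but `todoStatus` is absent is an assumption the tests do not cover.

```typescript
// Hypothetical shapes modeled on the fields used in the test snapshots.
interface StatusPayload {
  emoji: string;
  message: string;
  url?: string;
}

interface ActivitySnapshotSketch {
  displayStatus?: StatusPayload | null;
  todoStatus?: StatusPayload | null;
  hasTodos?: boolean;
}

function resolveAgentStatus(
  snapshot: ActivitySnapshotSketch,
  legacyStatus?: StatusPayload
): StatusPayload | undefined {
  // 1. A transient displayStatus wins over everything else.
  if (snapshot.displayStatus) return snapshot.displayStatus;
  // 2. Otherwise prefer the todo-derived status.
  if (snapshot.todoStatus) return snapshot.todoStatus;
  // 3. An explicitly empty todo list suppresses the stale persisted status.
  if (snapshot.hasTodos === false) return undefined;
  // 4. Assumption: remaining cases fall back to the legacy persisted status.
  return legacyStatus;
}
```

For example, a snapshot with `hasTodos: false` and a persisted `{ emoji: "🔍", message: "Old persisted status" }` resolves to `undefined`, matching the last new test.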
