Commit 196a5ce

🤖 refactor: derive workspace status from todo list
Derive workspace sidebar and landing-page progress from the current todo list.

- publish todo-derived activity snapshots for live and background workspaces
- prefer todo-derived status over legacy status_set payloads in the UI
- remove status_set from the default tool surface and refresh agent/docs guidance

---

_Generated with `mux` • Model: `openai:gpt-5.4` • Thinking: `xhigh` • Cost: `$n/a`_

<!-- mux-attribution: model=openai:gpt-5.4 thinking=xhigh costs=n/a -->
1 parent 8ec0a10 commit 196a5ce

20 files changed

Lines changed: 265 additions & 43 deletions

docs/AGENTS.md

Lines changed: 4 additions & 3 deletions
@@ -59,6 +59,7 @@ description: Agent instructions for AI assistants working on the Mux codebase
 Use `agent-browser` for web automation. Run `agent-browser --help` for all commands.

 Core workflow:
+
 1. `agent-browser open <url>` - Navigate to page
 2. `agent-browser snapshot -i` - Get interactive elements with refs (@e1, @e2)
 3. `agent-browser click @e1` / `fill @e2 "text"` - Interact using refs
@@ -128,7 +129,7 @@ Mobile app tests live in `mobile/src/**/*.test.ts` and use Bun's built-in test r
 - Never use emoji characters as UI icons or status indicators; emoji rendering varies across platforms and fonts.
 - Prefer SVG icons (usually from `lucide-react`) or shared icon components under `src/browser/components/icons/`.
 - For tool call headers, use `ToolIcon` from `src/browser/components/tools/shared/ToolPrimitives.tsx`.
-- If a tool/agent provides an emoji string (e.g., `status_set` or `displayStatus`), render via `EmojiIcon` (`src/browser/components/icons/EmojiIcon.tsx`) instead of rendering the emoji.
+- If a tool/agent provides an emoji string (e.g., todo-derived status or `displayStatus`), render via `EmojiIcon` (`src/browser/components/icons/EmojiIcon.tsx`) instead of rendering the emoji.
 - If a new emoji appears in tool output, extend `EmojiIcon` to map it to an SVG icon.
 - Colors defined in `src/browser/styles/globals.css` (`:root @theme` block). Reference via CSS variables (e.g., `var(--color-plan-mode)`), never hardcode hex values.
 - For incrementing numeric UI (costs, timers, token counts, percentages), use semantic numeric typography utilities (`counter-nums` / `counter-nums-mono`) to prevent width jitter.
@@ -229,9 +230,9 @@ Freely make breaking changes, and reorganize / cleanup IPC as needed.
 - E2E tests (tests/e2e) work with Radix but are slow (~2min startup); reserve for scenarios that truly need real Electron.
 - Only use `validateApiKeys()` in tests that actually make AI API calls.

-## Tool: status_set
+## Tool: todo_write

-- Set status url to the Pull Request once opened
+- Keep the TODO list current during multi-step work; sidebar progress is derived from it.

 ## GitHub

docs/agents/index.mdx

Lines changed: 0 additions & 1 deletion
@@ -632,7 +632,6 @@ tools:
 - ask_user_question
 - todo_read
 - todo_write
-- status_set
 - notify
 - analytics_query
 ---

docs/agents/instruction-files.mdx

Lines changed: 4 additions & 4 deletions
@@ -69,7 +69,7 @@ Be terse and to the point.

 ## Model: openai:.\*codex

-Use status reporting tools every few minutes.
+Keep the todo list current every few minutes while a task is in flight.
 ```

 ### Tool Prompts
@@ -92,12 +92,12 @@ Customize how the AI uses specific tools by appending instructions to their desc

 - Run `prettier --write` after editing files

-## Tool: status_set
+## Tool: todo_write

-- Set status URL to the Pull Request once opened
+- Keep the TODO list current during multi-step work; sidebar progress is derived from it.
 ```

-**Common tools** (varies by model/provider): `bash`, `file_read`, `file_edit_replace_string`, `file_edit_insert`, `propose_plan`, `ask_user_question`, `todo_write`, `todo_read`, `status_set`, `web_fetch`, `web_search`.
+**Common tools** (varies by model/provider): `bash`, `file_read`, `file_edit_replace_string`, `file_edit_insert`, `propose_plan`, `ask_user_question`, `todo_write`, `todo_read`, `web_fetch`, `web_search`.

 ## Practical layout

docs/config/notifications.mdx

Lines changed: 2 additions & 2 deletions
@@ -43,7 +43,7 @@ The recommended way to configure the `notify` tool is via a `Tool: notify` scope
 - Notify on CI failures or deployment issues
 - Notify when waiting for user input longer than 30 seconds
 - Do not notify for routine status updates
-- Use status_set for progress updates instead
+- Use `todo_write` for routine progress updates instead
 ```

 See [Instruction Files](/agents/instruction-files) for more on scoped instructions.
@@ -94,7 +94,7 @@ notify: {
   description:
     "Send a system notification to the user. Use this to alert the user about important events that require their attention, such as long-running task completion, errors requiring intervention, or questions. " +
     "Notifications appear as OS-native notifications (macOS Notification Center, Windows Toast, Linux). " +
-    "Infer whether to send notifications from user instructions. If no instructions provided, reserve notifications for major wins or blocking issues. Do not use for routine status updates (use status_set instead).",
+    "Infer whether to send notifications from user instructions. If no instructions provided, reserve notifications for major wins or blocking issues. Do not use for routine progress updates — keep the todo list current instead.",
   schema: z
     .object({
       title: z

src/browser/stores/WorkspaceStore.test.ts

Lines changed: 37 additions & 0 deletions
@@ -1813,6 +1813,43 @@ describe("WorkspaceStore", () => {
     expect(state.recencyTimestamp).toBe(activitySnapshot.recency);
   });

+  it("derives active workspace status from the current todo list", () => {
+    const workspaceId = "active-todo-status-workspace";
+    createAndAddWorkspace(store, workspaceId);
+    seedPinnedTodos(store, workspaceId, [
+      { content: "Run typecheck", status: "in_progress" },
+      { content: "Add regression test", status: "pending" },
+    ]);
+
+    const state = store.getWorkspaceState(workspaceId);
+    expect(state.agentStatus).toEqual({ emoji: "🔄", message: "Run typecheck" });
+  });
+
+  it("prefers todo-derived activity status over legacy agent status for inactive workspaces", async () => {
+    const workspaceId = "activity-fallback-todo-status-workspace";
+    const activitySnapshot: WorkspaceActivitySnapshot = {
+      recency: new Date("2024-01-04T12:00:00.000Z").getTime(),
+      streaming: true,
+      lastModel: "claude-sonnet-4",
+      lastThinkingLevel: "high",
+      agentStatus: { emoji: "🔧", message: "Legacy status" },
+      todoStatus: { emoji: "🔄", message: "Run typecheck" },
+      hasTodos: true,
+    };
+
+    store.dispose();
+    store = new WorkspaceStore(mockOnModelUsed);
+    mockActivityList.mockResolvedValue({ [workspaceId]: activitySnapshot });
+    // eslint-disable-next-line @typescript-eslint/no-unsafe-argument, @typescript-eslint/no-explicit-any
+    store.setClient(mockClient as any);
+    await new Promise((resolve) => setTimeout(resolve, 0));
+
+    createAndAddWorkspace(store, workspaceId, { createdAt: "2020-01-01T00:00:00.000Z" }, false);
+
+    const state = store.getWorkspaceState(workspaceId);
+    expect(state.agentStatus).toEqual(activitySnapshot.todoStatus ?? undefined);
+  });
+
   it("fires response-complete callback when a background workspace stops streaming", async () => {
     const activeWorkspaceId = "active-workspace";
     const backgroundWorkspaceId = "background-workspace";

src/browser/stores/WorkspaceStore.ts

Lines changed: 11 additions & 2 deletions
@@ -44,6 +44,7 @@ import {
 } from "@/common/types/stream";
 import { MapStore } from "./MapStore";
 import { createDisplayUsage, recomputeUsageCosts } from "@/common/utils/tokens/displayUsage";
+import { deriveTodoStatus } from "@/common/utils/todoList";
 import { getModelStats } from "@/common/utils/tokens/modelStats";
 import { resolveModelForMetadata } from "@/common/utils/providers/modelEntries";
 import { computeProvidersConfigFingerprint } from "@/common/utils/providers/configFingerprint";
@@ -1566,11 +1567,17 @@ export class WorkspaceStore {
         !canInterrupt;
       const isHydratingTranscript =
         isActiveWorkspace && transient.isHydratingTranscript && !transient.caughtUp;
-      const agentStatus = useAggregatorState
+      const aggregatorTodos = aggregator.getCurrentTodos();
+      const todoStatus = useAggregatorState
+        ? deriveTodoStatus(aggregatorTodos)
+        : (activity?.todoStatus ??
+          (activity?.hasTodos === false ? undefined : deriveTodoStatus(aggregatorTodos)));
+      const fallbackAgentStatus = useAggregatorState
         ? aggregator.getAgentStatus()
         : activity
           ? (activity.agentStatus ?? undefined)
           : aggregator.getAgentStatus();
+      const agentStatus = todoStatus ?? fallbackAgentStatus;

       // Live streaming stats
       const activeStreamMessageId = aggregator.getActiveStreamMessageId();
@@ -1597,7 +1604,7 @@ export class WorkspaceStore {
         currentModel,
         currentThinkingLevel,
         recencyTimestamp,
-        todos: aggregator.getCurrentTodos(),
+        todos: aggregatorTodos,
         loadedSkills: aggregator.getLoadedSkills(),
         skillLoadErrors: aggregator.getSkillLoadErrors(),
         lastAbortReason: aggregator.getLastAbortReason(),
@@ -2275,6 +2282,8 @@ export class WorkspaceStore {
       previous?.lastModel !== snapshot?.lastModel ||
       previous?.lastThinkingLevel !== snapshot?.lastThinkingLevel ||
       previous?.recency !== snapshot?.recency ||
+      previous?.hasTodos !== snapshot?.hasTodos ||
+      !areAgentStatusesEqual(previous?.todoStatus, snapshot?.todoStatus) ||
       !areAgentStatusesEqual(previous?.agentStatus, snapshot?.agentStatus);

     if (!changed) {
src/common/constants/storage.ts

Lines changed: 2 additions & 1 deletion
@@ -478,8 +478,9 @@ export function getFileTreeExpandStateKey(workspaceId: string): string {
 export const REVIEW_FILE_TREE_VIEW_MODE_KEY = "reviewFileTreeViewMode";

 /**
- * Get the localStorage key for persisted agent status for a workspace
+ * Get the localStorage key for persisted legacy agent status for a workspace.
  * Stores the most recent successful status_set payload (emoji, message, url)
+ * so historical status rows and older sessions can still be reconstructed.
  * Format: "statusState:{workspaceId}"
  */

src/common/orpc/schemas/workspace.ts

Lines changed: 5 additions & 1 deletion
@@ -173,7 +173,11 @@ export const WorkspaceActivitySnapshotSchema = z.object({
   }),
   agentStatus: WorkspaceAgentStatusSchema.nullable().optional().meta({
     description:
-      "Most recent status_set value for this workspace (used to surface background progress in sidebar).",
+      "Most recent legacy status value for this workspace (used for non-todo progress surfaces).",
+  }),
+  todoStatus: WorkspaceAgentStatusSchema.nullable().optional().meta({
+    description:
+      "Status derived from the current todo list (preferred background progress surface in the sidebar).",
   }),
   hasTodos: z.boolean().optional().meta({
     description: "Whether the workspace still had todos when streaming last stopped",

src/common/utils/todoList.ts

Lines changed: 31 additions & 0 deletions
@@ -6,6 +6,11 @@ interface TodoLikeItem {
   status: TodoLikeStatus;
 }

+export interface TodoStatusSummary {
+  emoji: "✓" | "🔄" | "○";
+  message: string;
+}
+
 export function renderTodoItemsAsMarkdownList(todos: TodoItem[]): string {
   return todos
     .map((todo) => {
@@ -16,6 +21,32 @@ export function renderTodoItemsAsMarkdownList(todos: TodoItem[]): string {
     .join("\n");
 }

+/**
+ * Sidebar and landing-card status should reflect the most actionable todo item,
+ * so we surface in-progress work first, then the next pending task, and finally
+ * the most recent completion while the finished list is still visible.
+ */
+export function deriveTodoStatus(todos: readonly TodoItem[]): TodoStatusSummary | undefined {
+  const inProgressTodo = todos.find((todo) => todo.status === "in_progress");
+  if (inProgressTodo) {
+    return { emoji: "🔄", message: inProgressTodo.content };
+  }
+
+  const pendingTodo = todos.find((todo) => todo.status === "pending");
+  if (pendingTodo) {
+    return { emoji: "○", message: pendingTodo.content };
+  }
+
+  for (let index = todos.length - 1; index >= 0; index--) {
+    const todo = todos[index];
+    if (todo.status === "completed") {
+      return { emoji: "✓", message: todo.content };
+    }
+  }
+
+  return undefined;
+}
+
 /**
  * `propose_plan` ends the active planning turn immediately, so any in-progress
  * todo steps need to flip to completed even though the model does not get a
src/common/utils/tools/toolDefinitions.ts

Lines changed: 3 additions & 2 deletions
@@ -1449,6 +1449,8 @@ export const TOOL_DEFINITIONS = {
     description: "Read the current todo list",
     schema: z.object({}),
   },
+  // Legacy-only: keep the schema for historical transcript rendering, but new runs
+  // should drive progress through todo_write so sidebar status can derive from the todo list.
   status_set: {
     description:
       "Set a status indicator to show what Assistant is currently doing. The status is set IMMEDIATELY \n" +
@@ -1645,7 +1647,7 @@ CREATE TABLE IF NOT EXISTS delegation_rollups (
     description:
       "Send a system notification to the user. Use this to alert the user about important events that require their attention, such as long-running task completion, errors requiring intervention, or questions. " +
       "Notifications appear as OS-native notifications (macOS Notification Center, Windows Toast, Linux). " +
-      "Infer whether to send notifications from user instructions. If no instructions provided, reserve notifications for major wins or blocking issues. Do not use for routine status updates (use status_set instead).",
+      "Infer whether to send notifications from user instructions. If no instructions provided, reserve notifications for major wins or blocking issues. Do not use for routine progress updates — keep the todo list current instead.",
     schema: z
       .object({
         title: z
@@ -2118,7 +2120,6 @@ export function getAvailableTools(
     "system1_keep_ranges",
     "todo_write",
     "todo_read",
-    "status_set",
     "notify",
     ...(enableAnalyticsQuery ? ["analytics_query"] : []),
     "web_fetch",