Background
When a session runs a background subagent, the CLI tracks it internally via backgroundAgentRegistry (a Dbe instance). The registry has a cancel(agentId) method that fires the agent's AbortController, marks it cancelled, and triggers the change callback. However, this method is not exposed over the JSON-RPC interface that PolyPilot uses to communicate with the headless CLI.
The Problem
If a background agent never emits SubagentCompletedEvent or SubagentFailedEvent, PolyPilot's SessionIdleEvent keeps reporting backgroundTasks: { agents: 1 }. The IDLE-DEFER guard (PR #399) correctly blocks CompleteResponse while background tasks are active, but with a zombie subagent it keeps deferring until the watchdog's Case B stale checks fire (roughly 6–30 minutes, depending on events.jsonl activity).
The only recovery path today is a full session abort, which discards all accumulated response content.
What Already Exists in the CLI
// backgroundAgentRegistry (Dbe class), internal to the CLI process
cancel(agentId) {
  // fires AbortController, marks status = "cancelled", calls onChangeCallback
}

// Exposed on the session context
async cancelBackgroundTask(id) {
  return id.startsWith("agent-")
    ? this.backgroundAgentRegistry.cancel(id)
    : this.detachedShellRegistry.kill(id);
}
The LLM also has a list_agents tool (returns agentId, status, startedAt) and a read_agent tool, but no cancel_agent tool and no RPC method for callers outside the CLI process.
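For readers unfamiliar with the pattern, here is a minimal sketch of what a cancel(agentId) with the behaviour described above might look like. It is not the CLI's actual Dbe class: MockBackgroundAgentRegistry, start, activeCount, and the callback signature are all hypothetical names chosen for illustration. Only the three steps (abort, mark cancelled, notify) come from the description above.

```javascript
// Hypothetical stand-in for the CLI's internal registry. Everything except
// the abort / mark-cancelled / notify sequence is an assumption.
class MockBackgroundAgentRegistry {
  constructor(onChange) {
    this.agents = new Map(); // agentId -> { status, controller }
    this.onChange = onChange; // called with the number of still-running agents
  }

  // Register a running agent and hand back the signal its work would observe.
  start(agentId) {
    const controller = new AbortController();
    this.agents.set(agentId, { status: "running", controller });
    return controller.signal;
  }

  // Abort the agent, mark it cancelled, and notify the change listener.
  // Returns false if the agent is unknown or no longer running.
  cancel(agentId) {
    const agent = this.agents.get(agentId);
    if (!agent || agent.status !== "running") return false;
    agent.controller.abort();
    agent.status = "cancelled";
    this.onChange(this.activeCount());
    return true;
  }

  activeCount() {
    let count = 0;
    for (const agent of this.agents.values()) {
      if (agent.status === "running") count++;
    }
    return count;
  }
}

const changes = [];
const registry = new MockBackgroundAgentRegistry((active) => changes.push(active));
const signal = registry.start("agent-1");
const cancelled = registry.cancel("agent-1");
```

In this sketch the change callback receives the remaining active count, which is the kind of value a caller would need in order to report backgroundTasks: { agents: 0 } afterwards.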
Request
Add a CancelBackgroundTaskAsync(string agentId) method to the Copilot SDK's CopilotSession (or equivalent RPC surface), backed by the existing cancelBackgroundTask implementation in the CLI.
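On the CLI side, the wiring could be as small as one additional JSON-RPC handler that forwards to the existing cancelBackgroundTask. The sketch below is an assumption, not the CLI's real dispatcher: the method name "session.cancelBackgroundTask", the taskId parameter, and the { cancelled } result shape are hypothetical, and the stand-in session is synchronous for brevity (the real method is async). The error codes are the standard JSON-RPC 2.0 ones.

```javascript
// Hypothetical method name; the real RPC surface may differ.
const CANCEL_METHOD = "session.cancelBackgroundTask";

// Stand-in for the session context. It only records which ids were cancelled.
function makeFakeSession() {
  const cancelledIds = [];
  return {
    cancelledIds,
    cancelBackgroundTask(id) {
      cancelledIds.push(id);
      return true;
    },
  };
}

// Handle one JSON-RPC 2.0 request object and return the response object.
function handleRequest(session, request) {
  const base = { jsonrpc: "2.0", id: request.id };
  if (request.method !== CANCEL_METHOD) {
    return { ...base, error: { code: -32601, message: "Method not found" } };
  }
  const taskId = request.params && request.params.taskId;
  if (typeof taskId !== "string" || taskId.length === 0) {
    return { ...base, error: { code: -32602, message: "Invalid params" } };
  }
  const cancelled = Boolean(session.cancelBackgroundTask(taskId));
  return { ...base, result: { cancelled } };
}

const fakeSession = makeFakeSession();
const okResponse = handleRequest(fakeSession, {
  jsonrpc: "2.0",
  id: 1,
  method: CANCEL_METHOD,
  params: { taskId: "agent-42" },
});
const badResponse = handleRequest(fakeSession, {
  jsonrpc: "2.0",
  id: 2,
  method: CANCEL_METHOD,
  params: {},
});
```

The SDK-side CancelBackgroundTaskAsync would then be a thin wrapper that sends this request and awaits the response.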
This would allow PolyPilot to:
- Detect a zombie subagent (started but no completion event after N minutes)
- Call CancelBackgroundTaskAsync(agentId) to force the CLI to mark it cancelled and fire onChangeCallback
- Receive a subsequent SessionIdleEvent with backgroundTasks: { agents: 0 } and complete normally, with no abort required
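The client-side flow in the list above can be sketched as follows. This is illustrative only: SubagentTracker, cancelZombies, and the 10-minute threshold are hypothetical, and the cancel function is passed in as a plain callback standing in for the requested CancelBackgroundTaskAsync. JavaScript is used to match the CLI snippet; PolyPilot's real implementation would live in its own codebase.

```javascript
// Hypothetical tracker fed by SubagentStartedEvent / SubagentCompletedEvent /
// SubagentFailedEvent. Times are plain millisecond timestamps.
class SubagentTracker {
  constructor(zombieAfterMs) {
    this.zombieAfterMs = zombieAfterMs;
    this.startedAt = new Map(); // agentId -> start time in ms
  }

  onStarted(agentId, nowMs) {
    this.startedAt.set(agentId, nowMs);
  }

  // Call on completion or failure, or after a cancel has been requested.
  onFinished(agentId) {
    this.startedAt.delete(agentId);
  }

  // Agents that started at least zombieAfterMs ago and never finished.
  zombies(nowMs) {
    return [...this.startedAt]
      .filter(([, started]) => nowMs - started >= this.zombieAfterMs)
      .map(([agentId]) => agentId);
  }
}

// Request cancellation for every zombie and stop tracking it, so the same
// agent is not cancelled twice while waiting for the next SessionIdleEvent.
function cancelZombies(tracker, nowMs, cancelBackgroundTask) {
  const ids = tracker.zombies(nowMs);
  for (const id of ids) {
    cancelBackgroundTask(id);
    tracker.onFinished(id);
  }
  return ids;
}

const MINUTE = 60 * 1000;
const tracker = new SubagentTracker(10 * MINUTE); // threshold is an assumption
tracker.onStarted("agent-1", 0);
tracker.onStarted("agent-2", 8 * MINUTE);

const requested = [];
const cancelledNow = cancelZombies(tracker, 11 * MINUTE, (id) => requested.push(id));
```

Passing the clock in as a parameter keeps the check deterministic, which makes the threshold easy to tune and test.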
Related
- IDLE-DEFER guard: CopilotService.Events.cs, HasActiveBackgroundTasks()
- SubagentStartedEvent / SubagentCompletedEvent / SubagentFailedEvent: already tracked in PolyPilot for UI feedback
- backgroundAgentRegistry.cancel(): CLI source, already implemented