[Feature] Background Agents stream back through stream-json with ownership information #1002

@EdanStarfire

Description

What Works

Right now, when using foreground agents, their tool usages (but not thinking blocks, if there are any?) are streamed back through the parent session's stream-json responses. This lets you group and track tool usages, responses, etc. via the stream and expose them to the system that uses the SDK.
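To make the grouping concrete, here is a minimal sketch of what a consumer can do with foreground agent output today. It assumes each stream-json line is a JSON object with a top-level parent_tool_use_id (null for the main session) and a message.content list containing tool_use blocks; the sample lines and the helper name group_tool_uses are made up for illustration and are not taken from the SDK.

```python
import json
from collections import defaultdict


def group_tool_uses(lines):
    """Group tool names by the owner of the message that carried them.

    ASSUMPTION: every line is a JSON object with a top-level
    "parent_tool_use_id" (None for the main session) and a
    "message.content" list of blocks.
    """
    groups = defaultdict(list)
    for raw in lines:
        raw = raw.strip()
        if not raw:
            continue  # skip blank lines between events
        event = json.loads(raw)
        if event.get("type") != "assistant":
            continue  # only assistant messages carry tool_use blocks
        owner = event.get("parent_tool_use_id") or "main"
        for block in event.get("message", {}).get("content", []):
            if isinstance(block, dict) and block.get("type") == "tool_use":
                groups[owner].append(block.get("name"))
    return dict(groups)


# Illustrative sample: the main session spawns an agent, which then uses two tools.
SAMPLE = [
    json.dumps({"type": "assistant", "parent_tool_use_id": None,
                "message": {"content": [{"type": "tool_use", "id": "toolu_01", "name": "Task"}]}}),
    json.dumps({"type": "assistant", "parent_tool_use_id": "toolu_01",
                "message": {"content": [{"type": "tool_use", "id": "toolu_02", "name": "Read"}]}}),
    json.dumps({"type": "assistant", "parent_tool_use_id": "toolu_01",
                "message": {"content": [{"type": "tool_use", "id": "toolu_03", "name": "Grep"}]}}),
]

grouped = group_tool_uses(SAMPLE)
```

With this sample, everything the spawned agent did ends up under the id of the tool call that spawned it, which is the grouping that disappears for background agents.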

The Problem

If you use background agents instead, tool usages and thinking blocks do not flow through the SDK's main session stream, and neither do SendMessage responses or the agent's final results. This means a massive lack of visibility into those background agents:

  • No knowledge of whether it's stuck or done
  • No knowledge of what tools it's using (or whether it's off track)
  • No feedback when it completes or about what it has completed
  • No visibility into messages sent to/from those agents (or even to the main session), losing the inter-agent comms benefit

These visibility losses make background agents "fire and hope" shots. This is exacerbated if the primary session compacts while the other agents are working: it tends to stop receiving messages (or stops polling its inbox?), so the user has to re-prompt it to check on the agents and how they're doing.

The Recommendation:

  1. When spawning background agents, attach all tool usages (and possibly thinking blocks, if enabled) to stream-json messages that flow back through the main session's stream, tagged with parent_tool_use_id as for foreground agents. Handling async tool streaming adds some complexity for integrations, but the visibility gained would be well worth it.
  2. When any agent uses SendMessage, both the send AND the receive should be treated as tool uses and streamed, so it is visible where in the message stream the main session / team lead received the message. The same applies to status checks.
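As a rough sketch of what the two recommendations would enable on the consumer side, the snippet below tracks per-agent activity from already-parsed events. Everything about the event shape is an assumption: background messages are taken to carry parent_tool_use_id like foreground ones, and the SendMessage input fields "direction" and "peer" are hypothetical names invented here to mark a send versus a receive. The issue does not specify a format.

```python
from dataclasses import dataclass, field


@dataclass
class AgentView:
    tools: list = field(default_factory=list)     # tool names, in stream order
    messages: list = field(default_factory=list)  # (stream_index, direction, peer)
    last_seen: int = -1                            # index of the latest event seen


def track(events):
    """Build a per-agent view from parsed stream events.

    HYPOTHETICAL shape: SendMessage tool_use blocks carry
    input["direction"] ("send" or "receive") and input["peer"].
    """
    views = {}
    for index, event in enumerate(events):
        owner = event.get("parent_tool_use_id") or "main"
        view = views.setdefault(owner, AgentView())
        view.last_seen = index  # useful for spotting agents that went quiet
        for block in event.get("message", {}).get("content", []):
            if block.get("type") != "tool_use":
                continue
            if block.get("name") == "SendMessage":
                payload = block.get("input", {})
                view.messages.append((index, payload.get("direction"), payload.get("peer")))
            else:
                view.tools.append(block.get("name"))
    return views


def tool_event(owner, name, tool_input=None):
    """Helper that builds one illustrative event for the demo below."""
    block = {"type": "tool_use", "name": name, "input": tool_input or {}}
    return {"type": "assistant", "parent_tool_use_id": owner,
            "message": {"content": [block]}}


events = [
    tool_event(None, "Task"),                                           # main spawns agent
    tool_event("toolu_bg1", "Bash"),                                    # agent works
    tool_event("toolu_bg1", "SendMessage", {"direction": "send", "peer": "main"}),
    tool_event(None, "SendMessage", {"direction": "receive", "peer": "toolu_bg1"}),
]

views = track(events)
```

Because the receive is recorded with its own stream index, a consumer could see that the main session picked up the message at index 3, one event after the agent sent it, and could flag an agent whose last_seen has not advanced for a while.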

Caveats

With the current token-chunk streaming mechanism, there's no identifier tying each chunk to the content it belongs to, so interleaved chunks from different agents would likely be complicated to handle. I would recommend that only the main session use chunk-based streaming (if enabled), and that background agent activity be emitted only as full-message JSONs. This would be a fine trade-off from my PoV.
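The trade-off above keeps consumer-side routing simple, as this sketch suggests. It assumes chunk events are distinguishable by a type value (the name "stream_event" is an assumption here) and that full messages carry parent_tool_use_id; the function name route and the sample events are illustrative only.

```python
def route(event):
    """Return (owner, kind) for one parsed stream event.

    ASSUMPTIONS: chunk events have type "stream_event" and no usable
    owner id; full messages carry "parent_tool_use_id" (None for the
    main session).
    """
    if event.get("type") == "stream_event":
        # Under the proposal only the main session emits chunks,
        # so chunks can be attributed without any identifier.
        return ("main", "chunk")
    owner = event.get("parent_tool_use_id")
    if owner is None:
        return ("main", "message")
    return (owner, "message")


routed = [
    route({"type": "stream_event", "event": {"delta": "Hel"}}),
    route({"type": "assistant", "parent_tool_use_id": None}),
    route({"type": "assistant", "parent_tool_use_id": "toolu_bg1"}),
]
```

No per-chunk reassembly across agents is needed in this scheme: chunks always belong to the main session, and anything from a background agent arrives as a complete, owner-tagged message.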

Metadata

Labels: enhancement (New feature or request)
