Repo: amplifier-module-hooks-streaming-ui
File: amplifier_module_hooks_streaming_ui/__init__.py — handle_content_block_end()
Evidence
Real session with sub-agent delegation:
└ Input: 11,506 (caching...) | Output: 218 | Total: 11,724 | Cost: $0.02 ← first LLM call (mid-turn)
[tool calls, sub-agent output...]
└ Input: 12,530 (64% cached) | Output: 145 | Total: 12,675 | Cost: $0.00711955
💰 Turn: $0.38 | Session: $2.00 ← right above user prompt
In a turn with multiple LLM calls (tool use, sub-agents), cost appears on every individual call's token line, scattered throughout the output, so the first cost line can sit many screens above the next > prompt. For simple single-call turns the same value appears twice, once on the per-call line and again on the 💰 Turn line, which users can read as the cost being charged twice.
Intended behavior
The design intent was "cost above the user input, not above the assistant response." Cost should only appear on the 💰 Turn: $X | Session: $Y line from orchestrator:complete, which fires once per user turn immediately before the > prompt.
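A minimal sketch of the intended flow, assuming per-call costs are accumulated silently and printed only once when the turn completes. The `CostTracker` class and its method names are hypothetical illustrations, not the module's actual API; only the 💰 line format is taken from the session output above.

```python
from dataclasses import dataclass


@dataclass
class CostTracker:
    """Hypothetical accumulator; not the module's real class."""

    turn_usd: float = 0.0
    session_usd: float = 0.0

    def record_call(self, cost_usd: float) -> None:
        # Called once per LLM call; accumulates but prints nothing.
        self.turn_usd += cost_usd
        self.session_usd += cost_usd

    def complete_turn(self) -> str:
        # Called once per user turn (the orchestrator:complete moment);
        # returns the single cost line and resets the per-turn total.
        line = f"💰 Turn: ${self.turn_usd:.2f} | Session: ${self.session_usd:.2f}"
        self.turn_usd = 0.0
        return line


tracker = CostTracker()
tracker.record_call(0.02)        # first LLM call (mid-turn), no output
tracker.record_call(0.00711955)  # follow-up call after tool use, no output
print(tracker.complete_turn())   # the only place cost is shown
```

With this shape, a single-call turn shows its cost exactly once, so the apparent doubling cannot occur.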
Fix (already applied on feat/m0-cost-management)
Remove the cost_usd/cost_part block from handle_content_block_end. Per-call token lines become tokens-only:
└ Input: 11,506 (caching...) | Output: 218 | Total: 11,724
All cost consolidated at orchestrator:complete (right above user prompt):
💰 Turn: $0.38 | Session: $2.00
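As an illustration of the tokens-only per-call line, here is a rough sketch. The function `format_token_line` and its parameters are hypothetical, and it assumes the total is simply input plus output tokens, which matches the sample lines above but may not be how the module computes it.

```python
from typing import Optional


def format_token_line(
    input_tokens: int,
    output_tokens: int,
    cache_note: Optional[str] = None,
) -> str:
    """Build a per-call token line with no cost segment (hypothetical helper)."""
    input_part = f"Input: {input_tokens:,}"
    if cache_note:
        # e.g. "caching..." or "64% cached", as seen in the session output
        input_part += f" ({cache_note})"
    # Assumption: total = input + output
    total = input_tokens + output_tokens
    return f"└ {input_part} | Output: {output_tokens:,} | Total: {total:,}"


print(format_token_line(11506, 218, "caching..."))
print(format_token_line(12530, 145, "64% cached"))
```

Because the helper takes no cost argument at all, a per-call line cannot carry a cost segment by accident.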