### Optimization Target: Q (Agentic Workflow Optimizer)

**Selected because**: Highest-token consumer (9.95M tokens today) not recently optimized
**Analysis period**: 2026-04-07 to 2026-04-12 (4 snapshot days)
**Runs analyzed**: 11 runs across 4 days
### Token Usage Profile

| Metric | Value |
|---|---|
| Total tokens (snapshot period) | 34,945,652 |
| Avg tokens/run | 3,176,877 |
| Avg turns/run (weighted) | 46.2 |
| Failure rate | 45% (5/11 runs) |
| Cache efficiency (per-turn) | ~96% cache-read hits |
| Cache write tokens | 0 across all 3 audited runs |
| Successful run avg duration | 4.4 min (266s) |
| Failed runaway run duration | 11.8 min (709s) |
Cache write tokens = 0 indicates the system prompt and tool schemas are already fully warm in cache; per-turn uncached cost is ~2-3K tokens/turn, which is healthy.
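The ~2-3K figure can be cross-checked against the audited runs table further down. The sketch below assumes that "uncached" means total tokens minus cache-read tokens, which is an interpretation of the table columns rather than something the audit data states; the dictionary keys and function names are illustrative.

```python
# Figures copied from the "Audited Runs Detail (2026-04-12)" table.
audited_runs = {
    "run 2 (runaway)": {"turns": 103, "tokens": 9_117_621, "cache_read": 8_911_025},
    "run 3 (success)": {"turns": 39, "tokens": 2_896_536, "cache_read": 2_773_017},
}


def uncached_per_turn(run: dict) -> float:
    """Tokens not served from cache, averaged over the run's turns.

    Assumption: total tokens include cache reads, so the difference
    approximates the uncached portion.
    """
    return (run["tokens"] - run["cache_read"]) / run["turns"]


def cache_read_share(run: dict) -> float:
    """Fraction of a run's tokens that were cache reads."""
    return run["cache_read"] / run["tokens"]


for name, run in audited_runs.items():
    print(
        f"{name}: {uncached_per_turn(run):,.0f} uncached tokens/turn, "
        f"{cache_read_share(run):.1%} cache-read share"
    )
```

Under that assumption the two long runs come out near 2.0K and 3.2K uncached tokens per turn, which is in line with the range quoted above.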
### Recommendations

#### 1. Reduce `timeout-minutes` from 15 to 10 (est. savings: ~1.4M tokens per runaway run)
The single most damaging observed run used 103 turns over 11.8 minutes before failing (9.1M tokens). The most recent successful run completed in just 39 turns / 4.4 minutes. The 15-minute timeout let the runaway continue for ~2.7× the duration of that successful run.
At the observed rate of ~6.9s/turn, a 10-minute timeout caps a run at ~87 turns. That is still ~2× a typical successful run, but it would have cut the final ~16 turns of the runaway (103 - 87), saving approximately 1.4M tokens on that single run at ~88K tokens/turn.
Evidence across all snapshots:
- 2026-04-10: avg_turns = 68.7 (vs 46.2 overall) with 2/3 runs failing
- 2026-04-12: one success (39 turns, 4.4 min), one failure (103 turns, 11.8 min), one quick failure (2 turns)
**Action**: Change in `q.md` frontmatter:
```yaml
timeout-minutes: 10 # was: 15; successful runs complete in ~4-5 min
```
Note: `max-turns` is not supported for the Copilot engine. `timeout-minutes` is the correct control knob.
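The timeout arithmetic for this recommendation can be reproduced with a few lines. This is a rough sketch that assumes a constant pace per turn and a flat token cost per turn, both taken from the runaway run's averages; in practice later turns tend to cost more as context grows, so the estimate is likely conservative. The function and constant names are illustrative.

```python
# Observed values for the runaway run (run 2) as reported in this analysis.
RUNAWAY_SECONDS = 709
RUNAWAY_TURNS = 103
RUNAWAY_TOKENS = 9_117_621


def turn_cap(timeout_minutes: float, seconds_per_turn: float) -> int:
    """Whole turns that fit inside the timeout at a constant pace."""
    return int(timeout_minutes * 60 // seconds_per_turn)


seconds_per_turn = RUNAWAY_SECONDS / RUNAWAY_TURNS  # about 6.9 s
tokens_per_turn = RUNAWAY_TOKENS / RUNAWAY_TURNS    # about 88.5K

cap = turn_cap(10, seconds_per_turn)
turns_cut = max(0, RUNAWAY_TURNS - cap)
tokens_saved = turns_cut * tokens_per_turn

print(f"turn cap at 10 min: {cap}")
print(f"turns cut from the runaway run: {turns_cut}")
print(f"estimated tokens saved: {tokens_saved:,.0f}")
```

With these inputs the cap is 87 turns, 16 turns are cut, and the saving lands at roughly 1.4M tokens.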
#### 2. Remove Serena (Go LSP) import (est. savings: ~5% per-turn token reduction)
The `imports: [shared/mcp/serena-go.md]` entry adds a full Serena LSP MCP server designed for Go semantic analysis (`.go` files in `pkg/`). But Q's purpose is to optimize agentic workflows: it modifies `.md` workflow files, not Go source code. The `serena-go.md` import itself explicitly states:
"Only analyze .go files - Ignore all other file types"
"Focus on pkg/ directory"
Q uses `edit`, `bash`, and `github` tools to read and modify `.md` workflow files. Serena's LSP capabilities (go-to-definition, find-references, type inference) are not applicable to YAML/Markdown workflows.
Removing Serena eliminates one MCP server container and its tool schemas from every turn's context window, reducing per-turn input size across all 39-103 turns in a run.
Estimated savings: ~5K tokens/turn × 46 avg turns × 11 runs = ~2.5M tokens over the analysis period, plus eliminated MCP server startup overhead.
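Because the ~5K tokens/turn figure is an estimate rather than a measurement, it may help to see how the period total moves with it. The sketch below reproduces the product above and adds two alternative per-turn values (3K and 7K) that are purely illustrative and do not come from the audit data.

```python
def serena_savings(tokens_per_turn: int, avg_turns: int = 46, runs: int = 11) -> int:
    """Period savings if Serena's schemas cost `tokens_per_turn` on every turn."""
    return tokens_per_turn * avg_turns * runs


# 5_000 is the estimate used in this report; 3_000 and 7_000 are
# hypothetical bounds added only to show sensitivity.
for estimate in (3_000, 5_000, 7_000):
    print(f"{estimate:,} tokens/turn -> {serena_savings(estimate):,} tokens over the period")
```

The middle case gives 2,530,000 tokens, matching the ~2.5M quoted above.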
**Action**: Remove the Serena import from `q.md`:
```yaml
# Remove this import:
imports:
  - shared/mcp/serena-go.md
```
If Q ever needs code analysis, it can use `bash` with `grep`/`glob` for `.md` workflow files.

---

#### 3. Add investigation depth guardrails in the prompt (est. savings: variable)
The Phase 1 prompt instructs downloading 10-20 runs of logs. Each full-run JSON log can be large, and its content stays in the conversation context across subsequent turns. Constraining log downloads prevents context bloat.

**Action**: In the Phase 1 prompt, tighten the log download instruction:
```
- Count: 5 recent runs (reduced from 10-20)
- After analysis, summarize findings in <250 words before proceeding to Phase 2
```
This reduces input token growth per turn as the conversation lengthens.
#### 4. Review `discussions` toolset necessity (est. savings: minor)
The GitHub MCP is configured with `default + actions + discussions` toolsets. Q does handle discussion triggers (the prompt has `{{#if discussion.number}}` branches), so the `discussions` toolset is justified. However, if telemetry shows Q is rarely triggered from discussions, removing it saves tool schema tokens on every invocation.
**Action**: Monitor discussion-trigger rate. If <10% of runs originate from discussions, remove the `discussions` toolset and let operators add it back if needed.
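One way to apply the <10% rule is sketched below. It assumes a list of trigger event names per run is available (for example, exported from run metadata); the sample list, the `min_runs` guard, and both function names are hypothetical and not part of Q or its tooling. The event names `discussion` and `discussion_comment` are the standard GitHub Actions event names.

```python
from collections import Counter


def discussion_share(trigger_events: list[str]) -> float:
    """Fraction of runs triggered by a discussion or discussion comment."""
    if not trigger_events:
        return 0.0
    counts = Counter(trigger_events)
    discussion_runs = sum(
        n for event, n in counts.items() if event.startswith("discussion")
    )
    return discussion_runs / len(trigger_events)


def should_remove_discussions(
    trigger_events: list[str], threshold: float = 0.10, min_runs: int = 20
) -> bool:
    """Suggest removal only when the sample is large enough and the share is low.

    `min_runs` is a hypothetical guard against deciding on too few runs.
    """
    if len(trigger_events) < min_runs:
        return False
    return discussion_share(trigger_events) < threshold


# Hypothetical sample: 22 runs, one of them from a discussion comment.
sample = ["issue_comment"] * 18 + ["discussion_comment"] + ["workflow_dispatch"] * 3
print(f"{discussion_share(sample):.1%}", should_remove_discussions(sample))
```

With only the 11 runs in this analysis, the guard would return `False` regardless of the share, which matches the advice to monitor before removing.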
### Tool Usage Matrix

| Tool / Server | Configured | Evidence of Need | Recommendation |
|---|---|---|---|
| `github` (default toolset) | ✅ | Issues/PR access for context | Keep |
| `github` (actions toolset) | ✅ | Phase 1 log download via gh-aw MCP | Keep |
| `github` (discussions toolset) | ✅ | Discussion trigger handler in prompt | Keep (review if rarely triggered) |
| `agentic-workflows` MCP | ✅ | Core tool for logs/audit/compile | Keep |
| `edit` | ✅ | Writing workflow `.md` changes | Keep |
| `bash` | ✅ | Script execution in investigation | Keep |
| `cache-memory` | ✅ | Pattern storage across invocations | Keep |
| Serena (Go LSP) | ✅ configured | Not needed: Q modifies `.md`, not `.go` | Remove |
### Audited Runs Detail (2026-04-12)

| Run | Created | Turns | Tokens | Cache Read | Input/Output Ratio | Conclusion |
|---|---|---|---|---|---|---|
| 1 | 2026-04-12 10:24 | 2 | 92,874 | 46,103 | 99.7% / 0.3% | failure (quick) |
| 2 | 2026-04-12 16:39 | 103 | 9,117,621 | 8,911,025 | 99.6% / 0.4% | failure (runaway) |
| 3 | 2026-04-12 18:38 | 39 | 2,896,536 | 2,773,017 | 99.6% / 0.4% | success |
**Key observation**: Run 2 (103 turns) and Run 3 (39 turns) used the same model at a similar per-turn cost (~88K vs ~74K tokens/turn); they differ mainly in how long the agent was allowed to run. Run 2 used 3.1× more tokens than Run 3 and still failed.
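The per-turn costs and the 3.1× ratio follow directly from the table. A minimal check, using only the turn and token counts for runs 2 and 3 (variable names are illustrative):

```python
# (turns, total tokens) for the two long runs in the audited runs table.
runs = {2: (103, 9_117_621), 3: (39, 2_896_536)}

per_turn = {run_id: tokens / turns for run_id, (turns, tokens) in runs.items()}
token_ratio = runs[2][1] / runs[3][1]
turn_ratio = runs[2][0] / runs[3][0]

for run_id, cost in per_turn.items():
    print(f"run {run_id}: {cost:,.0f} tokens/turn")
print(f"token ratio: {token_ratio:.2f}x, turn ratio: {turn_ratio:.2f}x")
```

The token ratio (about 3.1×) is somewhat larger than the turn ratio (about 2.6×). One plausible reading is that per-turn cost rises as the conversation context grows, though the audit data does not confirm the cause.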
Multi-day pattern (from audit snapshots):

| Date | Runs | Total Tokens | Avg Turns | Errors |
|---|---|---|---|---|
| 2026-04-07 | 4 | 13,880,345 | 42.8 | 1 |
| 2026-04-09 | 1 | 834,473 | 14.0 | 0 |
| 2026-04-10 | 3 | 10,284,760 | 68.7 | 2 |
| 2026-04-12 | 3 | 9,946,074 | 39.0 | 2 |
| Total | 11 | 34,945,652 | 46.2 | 5 |
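The totals row, including the run-weighted average of 46.2 turns, can be re-derived from the per-day rows. A short sketch with the snapshot values copied from the table (the tuple layout is just a convenience for this example):

```python
# (date, runs, total_tokens, avg_turns, errors) per snapshot day.
snapshots = [
    ("2026-04-07", 4, 13_880_345, 42.8, 1),
    ("2026-04-09", 1, 834_473, 14.0, 0),
    ("2026-04-10", 3, 10_284_760, 68.7, 2),
    ("2026-04-12", 3, 9_946_074, 39.0, 2),
]

total_runs = sum(runs for _, runs, _, _, _ in snapshots)
total_tokens = sum(tokens for _, _, tokens, _, _ in snapshots)
total_errors = sum(errors for _, _, _, _, errors in snapshots)

# Weight each day's average by its run count so busy days count more.
weighted_turns = sum(runs * avg for _, runs, _, avg, _ in snapshots) / total_runs
failure_rate = total_errors / total_runs
avg_tokens_per_run = total_tokens / total_runs

print(f"runs={total_runs}, tokens={total_tokens:,}, errors={total_errors}")
print(f"weighted avg turns={weighted_turns:.1f}, failure rate={failure_rate:.0%}")
print(f"avg tokens/run={avg_tokens_per_run:,.0f}")
```

Weighting by run count matters here: a plain mean of the four daily averages would give a different figure, because 2026-04-09 contributes only one run.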
### ⚠️ Caveats

- Tool-level usage data (`tools_used` field) was `null` for all 3 audited runs, so the Serena removal recommendation is based on workflow design analysis, not observed call counts. Verify Serena is not actually being called before removing it.
- The 103-turn failure cause is unknown; it may be a complex legitimate task, not an infinite loop. The `timeout-minutes` reduction preserves the ability to complete ~87-turn tasks.
- These recommendations are based on 11 runs over 4 days. Edge cases (e.g., a complex code refactoring requested via `/q`) may benefit from Serena or longer timeouts.
- Quick 2-turn failure (run 1) appears to be an authentication or initialization error. This is a separate issue and is not addressed here.
Generated by Copilot Token Usage Optimizer · 1.2M