feat(scratchpad): context window management for large mcp tool outputs #32
Open
justintime4tea wants to merge 2 commits into feature/orchestration-mode from
I must have rebased and pulled in a couple of commits from main (the ones that aren't mine). I can dig those out, but those changes are minimal and unrelated to scratchpad, so the scratchpad changes should still be easy to see.
Intercept large MCP tool results before they fill the context window,
save them to disk, and give the LLM eight read-only exploration tools
to selectively extract only what it needs.
Key components:
- ScratchpadWrapper: intercepts tool outputs exceeding per-tool byte
  thresholds, writes to disk, returns a compact pointer to the LLM
- Exploration tools: schema, item_schema, head, grep, get_in,
  iterate_over, slice, read; each is budget-checked before returning
- ContextBudget: tracks estimated token consumption and prevents
  context overflow from scratchpad reads
- Usage tracking: records bytes_intercepted (diverted to disk) vs
  bytes_extracted (read back into context) across all workers
- aura.orchestrator.scratchpad_usage SSE event emitted at
  orchestration end with aggregate totals
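The interception step described in the component list can be sketched as follows. This is a minimal illustration and not the PR's code: `ScratchpadSketch`, `ToolResult`, the `call_id` parameter, and the `.json` file naming are all hypothetical, and treating the threshold as inclusive (`>= min_bytes`) is an assumption.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// What the model receives after a tool call: the raw output, or a compact
/// pointer to a file on disk. (Hypothetical type, for illustration only.)
#[derive(Debug, PartialEq)]
enum ToolResult {
    Inline(String),
    Pointer { path: PathBuf, bytes: usize },
}

/// Hypothetical stand-in for the ScratchpadWrapper named in the commit message.
struct ScratchpadSketch {
    dir: PathBuf,
    min_bytes: usize,
    /// Running total of bytes diverted to disk instead of the context window.
    bytes_intercepted: usize,
}

impl ScratchpadSketch {
    fn new(dir: impl AsRef<Path>, min_bytes: usize) -> Self {
        Self {
            dir: dir.as_ref().to_path_buf(),
            min_bytes,
            bytes_intercepted: 0,
        }
    }

    /// Pass small outputs through unchanged; write large ones to disk and
    /// return a pointer. Assumption: outputs of exactly `min_bytes` are
    /// intercepted as well.
    fn intercept(&mut self, call_id: &str, output: String) -> io::Result<ToolResult> {
        if output.len() < self.min_bytes {
            return Ok(ToolResult::Inline(output));
        }
        fs::create_dir_all(&self.dir)?;
        let path = self.dir.join(format!("{call_id}.json"));
        fs::write(&path, &output)?;
        self.bytes_intercepted += output.len();
        Ok(ToolResult::Pointer { path, bytes: output.len() })
    }
}
```

With a 2000-byte threshold, a two-byte result comes back as `ToolResult::Inline`, while a 5000-byte result is written under `dir` and returned as a `ToolResult::Pointer` that the exploration tools could later read from.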
Configuration:
```toml
[mcp.servers.<name>.scratchpad]
"*" = { min_bytes = 2000 }
"tool_name" = { min_bytes = 4096 } # override for specific tool
[orchestration.scratchpad]
enabled = true
context_safety_margin = 0.20
```
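One plausible reading of `context_safety_margin` is that a fraction of the context window is held back and every scratchpad read is checked against what remains. The sketch below works under that assumption. `BudgetSketch`, its methods, and the 4-bytes-per-token estimate are hypothetical and not taken from the PR.

```rust
/// Hypothetical stand-in for the ContextBudget named in the commit message.
struct BudgetSketch {
    /// Total tokens the model accepts.
    context_window: usize,
    /// Fraction of the window held in reserve, e.g. 0.20.
    safety_margin: f64,
    /// Estimated tokens already consumed.
    used: usize,
}

impl BudgetSketch {
    fn new(context_window: usize, safety_margin: f64) -> Self {
        Self { context_window, safety_margin, used: 0 }
    }

    /// Tokens available once the safety margin is set aside.
    fn usable(&self) -> usize {
        ((self.context_window as f64) * (1.0 - self.safety_margin)).round() as usize
    }

    /// Tokens still free for scratchpad reads.
    fn remaining(&self) -> usize {
        self.usable().saturating_sub(self.used)
    }

    /// Record a read if it fits; otherwise reject it and report how many
    /// tokens are still available so the caller can request less.
    fn try_consume(&mut self, tokens: usize) -> Result<(), usize> {
        let remaining = self.remaining();
        if tokens > remaining {
            return Err(remaining);
        }
        self.used += tokens;
        Ok(())
    }
}

/// Crude heuristic (assumption): roughly four bytes per token, rounded up.
fn estimate_tokens(text: &str) -> usize {
    (text.len() + 3) / 4
}
```

Under these assumptions, a 100,000-token window with a 0.20 margin leaves 80,000 usable tokens, and a read that would exceed the remainder is rejected rather than truncated.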
Ref: LOG-23439
Token Budget, Per-Call Limits and Turns:
- Rename min_bytes → min_tokens thresholds
- Track tokens_intercepted/tokens_extracted
- Per-call extraction limit (max_extraction_tokens)
- Auto-increase turn_depth when scratchpad active
- Feed LLM input_tokens back into ContextBudget
- Replace heuristic estimates with real tokenization of tool
  schemas, preambles, scratchpad definitions
- Per-worker worst-case via mcp_filter
- Record task prompt tokens before worker execution

Pretty-print & companion file extraction:
- Pretty-print JSON at write time for line tools
- Auto-extract large JSON string values as companions:
  escaped JSON → .json, markdown → .md
- ContentFormat::Markdown variant with detection
- WriteResult struct replaces tuple returns
- Shared prepare_write/make_companion helpers

Markdown schema support:
- analyze_markdown_structure() parses headers into section tree
  with line ranges and keys
- schema tool dispatches by file extension
- Format-aware schema_too_large error suggestions

Scratchpad tool get_in string pagination:
- offset/limit params for large string values
- Raw string content returned (not JSON-encoded)
- Early return for non-string, extracted helper

Scratchpad tool iterate_over large string truncation:
- Strings >5 lines truncated with preview + hint
- Prevents memory spikes from cloning huge strings

Ref: LOG-23533
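The get_in string pagination and the iterate_over truncation described in this commit could look roughly like the sketch below. It is an illustration under stated assumptions: the function names, the choice to count `offset`/`limit` in characters, and the wording of the hint are hypothetical. The commit message only states that offset/limit parameters exist and that strings longer than 5 lines are truncated with a preview and a hint.

```rust
/// Assumed cutoff, matching the "Strings >5 lines" note in the commit message.
const MAX_PREVIEW_LINES: usize = 5;

/// Return up to `limit` characters starting at character `offset`.
/// Counting in chars rather than bytes (an assumption) means a multi-byte
/// UTF-8 sequence is never split. The raw string is returned, not JSON-encoded.
fn paginate_str(value: &str, offset: usize, limit: usize) -> String {
    value.chars().skip(offset).take(limit).collect()
}

/// Return short strings unchanged; for longer ones, keep the first
/// MAX_PREVIEW_LINES lines and append a hint about how to read the rest.
/// Only the preview lines are copied, never the whole string.
fn preview(value: &str) -> String {
    let total = value.lines().count();
    if total <= MAX_PREVIEW_LINES {
        return value.to_string();
    }
    let head: Vec<&str> = value.lines().take(MAX_PREVIEW_LINES).collect();
    // Hypothetical hint text; the real wording is not given in the source.
    format!(
        "{}\n... [{} more lines; use get_in with offset/limit to read the rest]",
        head.join("\n"),
        total - MAX_PREVIEW_LINES
    )
}
```

An out-of-range `offset` yields an empty string instead of an error, which keeps the paging loop on the caller's side simple: stop when a page comes back empty.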