feat(scratchpad): context window management for large mcp tool outputs by justintime4tea · Pull Request #32 · mezmo/aura

justintime4tea · 2026-03-25T18:36:43Z

Ref: LOG-23439

Intercept large MCP tool results before they fill the context window,
save them to disk, and give the LLM eight read-only exploration tools
to selectively extract only what it needs.

Key components:

- ScratchpadWrapper: intercepts tool outputs exceeding per-tool byte thresholds, writes them to disk, and returns a compact pointer to the LLM
- Exploration tools: schema, item_schema, head, grep, get_in, iterate_over, slice, read; each is budget-checked before returning
- ContextBudget: tracks estimated token consumption and prevents context overflow from scratchpad reads
- Usage tracking: records bytes_intercepted (diverted to disk) vs bytes_extracted (read back into context) across all workers
- aura.orchestrator.scratchpad_usage SSE event emitted at orchestration end with aggregate totals
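
For readers skimming the description, the interception step can be pictured roughly as below. This is a minimal sketch, not the ScratchpadWrapper code in this PR: the `ToolResult` enum, the `intercept` function, the `call_id` file naming, and the choice to treat an output of exactly `min_bytes` as large are all assumptions made for illustration.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// What the LLM receives after a tool call (hypothetical type).
#[derive(Debug, PartialEq)]
pub enum ToolResult {
    /// Small output, passed through unchanged.
    Inline(String),
    /// Large output, diverted to disk; only this pointer enters the context.
    Pointer { path: PathBuf, bytes: usize },
}

/// Divert `output` to a file under `dir` when it reaches `min_bytes`.
/// Assumption: the boundary is inclusive; the PR may use a strict comparison.
pub fn intercept(
    dir: &Path,
    call_id: &str,
    output: String,
    min_bytes: usize,
) -> io::Result<ToolResult> {
    if output.len() < min_bytes {
        return Ok(ToolResult::Inline(output));
    }
    fs::create_dir_all(dir)?;
    let path = dir.join(format!("{}.json", call_id));
    fs::write(&path, &output)?;
    Ok(ToolResult::Pointer { path, bytes: output.len() })
}
```

The point of the design is that the pointer costs a few dozen tokens regardless of how large the diverted payload is.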

Configuration:

  [mcp.servers.<name>.scratchpad]
  "*" = { min_bytes = 2000 }
  "tool_name" = { min_bytes = 4096 } # override for specific tool

  [orchestration.scratchpad]
  enabled = true
  context_safety_margin = 0.20
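
The per-server table above implies a lookup in which a tool-specific entry wins over the `"*"` default. A minimal sketch of that resolution, assuming the thresholds have already been parsed into a map (the function name `resolve_min_bytes` and the "no entry means no interception" behaviour are assumptions, not taken from the PR):

```rust
use std::collections::HashMap;

/// Resolve the interception threshold for `tool`.
/// A tool-specific entry overrides the "*" wildcard; `None` means neither is
/// configured, which this sketch treats as "do not intercept".
pub fn resolve_min_bytes(thresholds: &HashMap<String, usize>, tool: &str) -> Option<usize> {
    thresholds
        .get(tool)
        .or_else(|| thresholds.get("*"))
        .copied()
}
```

With the example configuration, `tool_name` would resolve to 4096 and any other tool on that server to 2000.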

Ref: LOG-23533

Token Budget, Per-Call Limits and Turns

- Rename min_bytes → min_tokens with token-based interception thresholds
- Track tokens_intercepted/tokens_extracted instead of bytes throughout
- Add per-call extraction limit (max_extraction_tokens, default 10k)
- Auto-increase worker turn_depth when scratchpad active (turn_depth_bonus)
- Feed LLM-reported input_tokens back into ContextBudget via set_used_from_llm
- Replace heuristic initial_used estimates with real tokenization of MCP tool schemas, worker preambles, scratchpad/builtin tool definitions
- Compute per-worker worst-case using mcp_filter to count only filtered tools
- Record actual task prompt tokens in budget before worker execution
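
To make the budget mechanics concrete, here is a rough sketch of how a safety margin, a per-call limit, and LLM-reported usage could interact. Only the names `ContextBudget`, `set_used_from_llm`, `max_extraction_tokens`, and `context_safety_margin` come from this PR; the fields, the `try_reserve` method, and the arithmetic are assumptions for illustration.

```rust
/// Hypothetical token budget; not the PR's actual ContextBudget.
#[derive(Debug)]
pub struct ContextBudget {
    context_window: usize,
    safety_margin: f64,
    max_extraction_tokens: usize,
    used: usize,
}

impl ContextBudget {
    pub fn new(context_window: usize, safety_margin: f64, max_extraction_tokens: usize) -> Self {
        Self { context_window, safety_margin, max_extraction_tokens, used: 0 }
    }

    /// Tokens available once the safety margin is held back.
    pub fn usable(&self) -> usize {
        (self.context_window as f64 * (1.0 - self.safety_margin)) as usize
    }

    pub fn remaining(&self) -> usize {
        self.usable().saturating_sub(self.used)
    }

    /// Replace the running estimate with the input token count reported by the LLM.
    pub fn set_used_from_llm(&mut self, input_tokens: usize) {
        self.used = input_tokens;
    }

    /// Admit a scratchpad read only if it fits both the per-call cap and the
    /// remaining budget; on success the tokens are counted as used.
    pub fn try_reserve(&mut self, tokens: usize) -> Result<(), String> {
        if tokens > self.max_extraction_tokens {
            return Err(format!(
                "read of {} tokens exceeds per-call limit of {}",
                tokens, self.max_extraction_tokens
            ));
        }
        if tokens > self.remaining() {
            return Err(format!(
                "read of {} tokens exceeds remaining budget of {}",
                tokens,
                self.remaining()
            ));
        }
        self.used += tokens;
        Ok(())
    }
}
```

Overwriting `used` with the provider's figure, rather than adding to it, lets the estimate self-correct after every turn instead of drifting.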

Pretty-print & minified JSON fix:

- Pretty-print JSON at write time so line-based tools work on minified responses
- WriteResult struct replaces tuple returns, includes companion file info
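
Why pretty-printing matters for head, grep, and slice: a minified response is a single line, so line-based tools either return everything or nothing. The sketch below re-indents JSON using only the standard library to show the effect. It is an illustration under stated assumptions (input is already valid JSON, two-space indent); the PR itself presumably relies on a JSON library rather than a hand-rolled formatter like this.

```rust
/// Re-indent JSON so that each member sits on its own line.
/// Assumes `input` is valid JSON; nothing is validated here.
pub fn pretty_print(input: &str) -> String {
    let mut out = String::with_capacity(input.len() * 2);
    let mut depth = 0usize;
    let mut in_string = false;
    let mut escaped = false;
    let mut chars = input.chars().peekable();

    while let Some(c) = chars.next() {
        if in_string {
            // Copy string contents verbatim, tracking backslash escapes.
            out.push(c);
            if escaped {
                escaped = false;
            } else if c == '\\' {
                escaped = true;
            } else if c == '"' {
                in_string = false;
            }
            continue;
        }
        match c {
            '"' => {
                in_string = true;
                out.push(c);
            }
            '{' | '[' => {
                out.push(c);
                // Keep empty containers compact: `{}` and `[]`.
                if matches!(chars.peek().copied(), Some('}') | Some(']')) {
                    out.push(chars.next().unwrap());
                } else {
                    depth += 1;
                    newline(&mut out, depth);
                }
            }
            '}' | ']' => {
                depth = depth.saturating_sub(1);
                newline(&mut out, depth);
                out.push(c);
            }
            ',' => {
                out.push(c);
                newline(&mut out, depth);
            }
            ':' => out.push_str(": "),
            c if c.is_whitespace() => {} // drop whitespace outside strings
            _ => out.push(c),
        }
    }
    out
}

fn newline(out: &mut String, depth: usize) {
    out.push('\n');
    for _ in 0..depth {
        out.push_str("  ");
    }
}
```

After this pass, a one-line payload such as `{"a":[1,2],"b":"x,y"}` spans seven lines, so a line range or a grep hit refers to a single member rather than the whole document.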

Companion file extraction:

- Auto-extract large structured string values from JSON as companion files
- Escaped JSON strings → pretty-printed .json companions
- Markdown content (headers + lists) → .md companions
- ContentFormat::Markdown variant with detection heuristic
- Shared prepare_write / make_companion helpers to reduce duplication
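
The "headers + lists" detection could look something like the following. This is a guess at the shape of such a heuristic, not the one implemented in the PR: the function name `looks_like_markdown` and the thresholds (at least one ATX header and two list items) are assumptions.

```rust
/// Guess whether `text` is Markdown by counting ATX headers and list items.
/// Thresholds are illustrative assumptions.
pub fn looks_like_markdown(text: &str) -> bool {
    let mut headers = 0;
    let mut list_items = 0;
    for line in text.lines() {
        let t = line.trim_start();
        // An ATX header is one to six '#' characters followed by a space.
        let hashes = t.chars().take_while(|&c| c == '#').count();
        if (1..=6).contains(&hashes) && t[hashes..].starts_with(' ') {
            headers += 1;
        } else if t.starts_with("- ") || t.starts_with("* ") {
            list_items += 1;
        }
    }
    headers >= 1 && list_items >= 2
}
```

Requiring both signals keeps plain text that merely contains a stray `#` or a single dash from being written out as a .md companion.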

Markdown schema support:

- analyze_markdown_structure() parses ### headers into section tree with line ranges and keys
- schema tool dispatches by file extension (.json → JSON, .md → markdown sections)
- Format-aware schema_too_large error suggestions
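
As a rough picture of what a header-based schema yields, the sketch below scans for ATX headers and assigns each section a line range. It is not `analyze_markdown_structure()` itself: the `Section` type and `markdown_sections` function are hypothetical, the result is a flat list with levels rather than a nested tree, and headers inside fenced code blocks are not skipped.

```rust
/// One Markdown section (hypothetical type); lines are 1-based and inclusive.
#[derive(Debug, PartialEq)]
pub struct Section {
    pub level: usize,
    pub title: String,
    pub start_line: usize,
    pub end_line: usize,
}

pub fn markdown_sections(text: &str) -> Vec<Section> {
    let mut sections: Vec<Section> = Vec::new();
    let mut total = 0;
    for (idx, line) in text.lines().enumerate() {
        total = idx + 1;
        let hashes = line.chars().take_while(|&c| c == '#').count();
        if !(1..=6).contains(&hashes) || !line[hashes..].starts_with(' ') {
            continue;
        }
        sections.push(Section {
            level: hashes,
            title: line[hashes..].trim().to_string(),
            start_line: idx + 1,
            end_line: 0, // filled in below
        });
    }
    // A section runs until the next header at the same or a shallower level.
    for i in 0..sections.len() {
        let end = sections[i + 1..]
            .iter()
            .find(|s| s.level <= sections[i].level)
            .map(|s| s.start_line - 1)
            .unwrap_or(total);
        sections[i].end_line = end;
    }
    sections
}
```

The line ranges are what make the output useful to the other tools: a section can be handed straight to a line-based read without loading the rest of the file.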

Scratchpad tool get_in string pagination:

- offset/limit params for paginating large string values by line
- Raw string content returned (not JSON-encoded) so embedded newlines render as real lines
- Early return for non-string values, extracted get_in_paginated helper
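
Line-based pagination of a string value can be sketched as follows. The `offset`/`limit` parameters match the description above; the `Page` type, the `paginate_lines` name, and the `next_offset` field are assumptions added for illustration and may not match `get_in_paginated`.

```rust
/// One page of a large string value, split by line (hypothetical type).
#[derive(Debug, PartialEq)]
pub struct Page {
    /// Raw text of the page, with real newlines rather than JSON escapes.
    pub text: String,
    pub total_lines: usize,
    /// Offset to request next, or `None` when the value is exhausted.
    pub next_offset: Option<usize>,
}

pub fn paginate_lines(value: &str, offset: usize, limit: usize) -> Page {
    let total_lines = value.lines().count();
    let text = value
        .lines()
        .skip(offset)
        .take(limit)
        .collect::<Vec<_>>()
        .join("\n");
    let end = offset.saturating_add(limit).min(total_lines);
    // A zero limit never advances, so it never advertises a next page.
    let next_offset = if limit > 0 && end < total_lines { Some(end) } else { None };
    Page { text, total_lines, next_offset }
}
```

Returning `total_lines` alongside the page lets the caller decide whether continuing is worth the tokens.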

Scratchpad tool iterate_over large string truncation:

- String values >5 lines truncated with preview + get_in path hint
- Prevents memory spikes from cloning huge strings across all array items
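
The truncation rule above might be implemented along these lines. The five-line cutoff comes from the description; the function name `truncate_for_preview` and the wording of the hint are assumptions.

```rust
/// Strings longer than this many lines are truncated (from the PR description).
const MAX_LINES: usize = 5;

/// Return `value` unchanged when short; otherwise return a preview plus a
/// hint pointing at get_in. Only the preview is copied, never the full string.
pub fn truncate_for_preview(value: &str, path: &str) -> String {
    let total = value.lines().count();
    if total <= MAX_LINES {
        return value.to_string();
    }
    let preview: Vec<&str> = value.lines().take(MAX_LINES).collect();
    format!(
        "{}\n... [{} more lines; use get_in with path \"{}\" and offset/limit to read the rest]",
        preview.join("\n"),
        total - MAX_LINES,
        path
    )
}
```

Because the function borrows the value and allocates only the preview, iterating over an array of very large strings stays proportional to the preview size per item.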

Companion pointer hints in wrapper:

- Pointer message lists companion files with format-specific tool suggestions

justintime4tea · 2026-03-25T18:53:21Z

I must have rebased and pulled in a couple of commits from main (the ones that aren't mine). I can remove them, but those changes are minimal and unrelated to the scratchpad, so the scratchpad changes should still be easy to pick out.

Shearerbeard

LGTM

justintime4tea changed the title ~~Justingross/log 23439 scratchpad poc~~ feat(context-management): add scratchpad for large MCP results on Mar 25, 2026

justintime4tea force-pushed the justingross/LOG-23439-scratchpad-poc branch from a8cab0b to 003b2c6 on March 25, 2026 20:42

Shearerbeard force-pushed the feature/orchestration-mode branch from 822f762 to 6620083 on March 26, 2026 19:26

justintime4tea force-pushed the justingross/LOG-23439-scratchpad-poc branch from 003b2c6 to 5393ab3 on March 27, 2026 15:17

justintime4tea changed the title ~~feat(context-management): add scratchpad for large MCP results~~ feat(scratchpad): context window management for large mcp tool outputs on Mar 27, 2026

justintime4tea force-pushed the justingross/LOG-23439-scratchpad-poc branch from 5393ab3 to a87a6dc on March 27, 2026 15:18

Shearerbeard force-pushed the feature/orchestration-mode branch from 6620083 to 5d38339 on March 27, 2026 15:23

justintime4tea force-pushed the justingross/LOG-23439-scratchpad-poc branch from a87a6dc to b2717a1 on March 27, 2026 15:37

justintime4tea marked this pull request as ready for review on March 27, 2026 15:39

justintime4tea force-pushed the justingross/LOG-23439-scratchpad-poc branch from b2717a1 to 53b38ef on March 27, 2026 15:40

Shearerbeard force-pushed the feature/orchestration-mode branch from d3e4741 to 0f80f34 on March 30, 2026 17:30

justintime4tea force-pushed the justingross/LOG-23439-scratchpad-poc branch 18 times, most recently from 7653f22 to f0b4665 on April 6, 2026 14:22

justintime4tea force-pushed the justingross/LOG-23439-scratchpad-poc branch 4 times, most recently from 586b610 to 14a88d2 on April 6, 2026 16:46

Shearerbeard force-pushed the feature/orchestration-mode branch from 1a7ad9f to d1fa6bc on April 6, 2026 18:19

justintime4tea force-pushed the justingross/LOG-23439-scratchpad-poc branch 2 times, most recently from 815b878 to dab3566 on April 8, 2026 14:54

Shearerbeard approved these changes on Apr 9, 2026

justintime4tea force-pushed the justingross/LOG-23439-scratchpad-poc branch 8 times, most recently from 79770c1 to 65cf471 on April 10, 2026 22:42

justintime4tea added 2 commits on April 15, 2026 13:28

justintime4tea force-pushed the justingross/LOG-23439-scratchpad-poc branch from 65cf471 to 1a2887b on April 15, 2026 20:32

justintime4tea wants to merge 2 commits into feature/orchestration-mode from justingross/LOG-23439-scratchpad-poc
