Conversation
Intercepts /v1/chat/completions and /v1/messages, extracts documents from system prompts (XML tags, numbered, separator formats), reorders them via ContextPilot for optimal prefix sharing, and forwards to the backend. Users just change their API endpoint URL — no code changes needed. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…e, and deployment files - Extract and reorder documents from tool_result messages (OpenAI role="tool" and Anthropic type="tool_result"), with camelCase compat for OpenClaw internal format - Add markdown_header extraction mode (split on # / ## headers) - Extend XML tag recognition with <files>/<file> - Add X-ContextPilot-Scope header (system / tool_results / all) - Refactor _intercept_and_forward to use MultiExtractionResult for multi-source reordering - Expand _contextpilot response metadata with total_documents and sources breakdown - Add OpenClaw examples: setup.sh, Docker Compose, provider config template - Add integration guide at docs/guides/openclaw.md - 95 tests passing (34 new) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…den forwarding - Move _contextpilot response metadata from JSON body to X-ContextPilot-Result header so strict API parsers (OpenClaw SDK) receive unmodified responses - Broaden request header forwarding from 4-header whitelist to blacklist (only strip x-contextpilot-* and hop-by-hop), fixing dropped anthropic-beta etc. - Forward backend response headers and status code in streaming mode - Replace noisy print() in schedule_only() with logger.debug() Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
… paths - Move proxy_completions metadata from response body to X-ContextPilot-Result header - Inject rid in intercept path when running in stateful mode (index active) - Add tests for header metadata, rid injection, and stateless bypass Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…ing timeline, package structure
… rule trigger points
….insert, kv.reuse, auto-batch)
… add no-accuracy-impact constraint
…artition, Reorder)
…ential, not parallel) Add MUST come before Reorder: it expands the block set so Reorder has more permutation options. Solves cases where no permutation of existing blocks hits a prefix, but adding a block makes it possible. Updated: both design docs, SVG diagram, Notion callout + CRUD table.
…icable Dedup skips on Turn 1 or no duplicates; Repartition skips when all blocks have dependencies; Add skips when no prefix-hit candidate exists; Reorder skips when order is already optimal. Updated EXPLAIN example to show a SKIPPED primitive. Updated Notion primitives table with skip conditions.
… how results persist across turns
Default 13, configurable via --chunk-modulus. Passed through to all dedup calls. Added tuning guide in how_it_works.md with M value recommendations.
…s, fix broken docker link
…line changes, FakeSession mock
the three functions in block_dedup.py can be made into one for simplicity
dalongbao left a comment
Review Summary
Solid PR overall. The adapter pattern, intercept parser, and TTL eviction module are well-structured. Docs are accurate, tests have decent coverage, and the OpenClaw example works. Below are the findings grouped by priority.
Bugs (should fix before merge)
- `_chunk_modulus` not in `global` declaration (http_server.py ~L2148) — `main()` declares `global _max_tokens, _infer_api_url, ...` but omits `_chunk_modulus`. The `--chunk-modulus` CLI flag is silently ignored; the value always stays at the default of 13.
- `proxy_engine` hardcodes `temperature=0` (http_server.py ~L1933) — The generic `/v1/{path:path}` catch-all unconditionally sets `body["temperature"] = 0`, overwriting the user's value on every proxied request.
- `"\n\n".join` corrupts content (block_dedup.py L149/241/323, conversation_tracker.py L307) — Content is split on `"\n"` but reassembled with `"\n\n"`, inserting phantom blank lines at every chunk boundary even for non-deduped blocks. Breaks structured content (JSON, YAML, code).
- `hash()` is non-deterministic across processes (block_dedup.py L64) — Python randomizes `hash()` per process via PYTHONHASHSEED. Chunk boundaries differ on every restart and across workers. Should use a deterministic hash.
- `default_ttl_seconds=0` silently becomes 300 (ttl_eviction.py L115) — Uses `or` instead of `is not None`. `0` is falsy, so it falls through to `default_ttl.seconds`.
- `default_ttl` setter is a no-op (ttl_eviction.py L129-132) — Updates the enum `_default_ttl` but not `_default_ttl_seconds`, which is what `add_entry` actually reads.
- Reconstruction uses default config (intercept_parser.py ~L572/621/962/1002) — `reconstruct_*` re-runs extraction with `InterceptConfig()` defaults instead of the original config. Silently fails for non-auto modes like `mode=separator`.
- `_apply_block_dedup` mutates the caller's dict (conversation_tracker.py L306-307) — Hidden side effect: modifies `doc_contents` in-place with no indication in the signature or return value.
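The `default_ttl_seconds=0` item above is a classic falsy-zero trap. A minimal sketch of the fix, with names and the 300-second fallback assumed from the review rather than taken from the actual module:

```python
# Hypothetical sketch of the ttl_eviction.py fix: `or` treats 0 as "unset",
# while an explicit `is not None` check preserves a caller-supplied 0.
DEFAULT_TTL_FALLBACK = 300  # assumed default (5 minutes)

def resolve_ttl_buggy(default_ttl_seconds=None):
    # Buggy: 0 is falsy, so it falls through to the fallback.
    return default_ttl_seconds or DEFAULT_TTL_FALLBACK

def resolve_ttl_fixed(default_ttl_seconds=None):
    # Fixed: only substitute the fallback when the value is truly absent.
    if default_ttl_seconds is not None:
        return default_ttl_seconds
    return DEFAULT_TTL_FALLBACK
```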
Resource / safety issues
- Streaming connection leak (http_server.py ~L1770-1806) — `_stream_with_headers` has no `finally` cleanup. Client disconnect mid-stream can leak aiohttp connections. Compare with `proxy_engine`, which has `finally: response.close()`.
- Double deep-copy (http_server.py ~L1363+1486) — `_strip_external_content_ids` recursively copies the body, then `copy.deepcopy(body)` copies it again. 2x memory pressure on every request.
- API key could leak in errors (http_server.py ~L1851) — `aiohttp.ClientError` can include URLs/headers in `str(e)`, which is returned verbatim in the 502 `detail`.
- Non-JSON upstream error crashes (http_server.py ~L1815) — `resp.json()` on a plain-text 502 from a load balancer raises `JSONDecodeError` instead of a clean error.
- `get_conversation_chain` has no cycle detection (conversation_tracker.py L137-146) — Infinite loop if the parent chain has a cycle.
- `_requests` dict grows unbounded (conversation_tracker.py L77) — No TTL, no max size, no automatic cleanup. The `timestamp` field exists but is never read.
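The missing cycle detection in `get_conversation_chain` can be fixed by tracking visited ids. A hedged sketch — the real signature and storage layout are assumptions, not the module's actual API:

```python
def get_conversation_chain(requests, rid):
    """Walk the parent chain from `rid`, stopping if a cycle is found.

    `requests` is assumed to map request id -> {"parent": parent_id_or_None}.
    """
    chain, seen = [], set()
    while rid is not None:
        if rid in seen:  # cycle: bail out instead of looping forever
            break
        seen.add(rid)
        chain.append(rid)
        rid = requests.get(rid, {}).get("parent")
    return chain
```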
Cloud adapters
- Cache breakpoint limit (anthropic_adapter.py L105-117) — Injects `cache_control` on every qualifying tool result. Anthropic limits requests to 4 cache breakpoints. Will 400 on real agentic conversations with many tool results.
- No streaming cache metrics (http_server.py ~L1768-1806) — `parse_cache_metrics` only runs in the non-streaming path. TTL policy never updates for streaming requests.
- TTL label mismatch (confirmed via manual testing) — `--extended-cache` with OpenAI shows `"default_ttl": "5m"` but `"default_ttl_seconds": 86400`. The enum and seconds are set independently.
- `update_from_response` double-counts (ttl_eviction.py L247-268) — On partial cache hits (both read and creation tokens), calls `touch_entry` then `add_entry` non-atomically. The hit counter is inflated.
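The breakpoint-limit issue could be addressed by capping injections at Anthropic's documented maximum of 4. A sketch under assumed message shapes (the function name and marking strategy are illustrative, not the adapter's actual code):

```python
MAX_CACHE_BREAKPOINTS = 4  # Anthropic's per-request limit

def inject_cache_control(messages):
    """Add cache_control to tool results, newest first, capped at the limit.

    Assumes Anthropic-style messages whose content is a list of blocks with a
    "type" field. Marking the newest results keeps the longest stable prefix
    of the conversation cached.
    """
    budget = MAX_CACHE_BREAKPOINTS
    for msg in reversed(messages):
        for block in reversed(msg.get("content", [])):
            if budget == 0:
                return messages
            if isinstance(block, dict) and block.get("type") == "tool_result":
                block["cache_control"] = {"type": "ephemeral"}
                budget -= 1
    return messages
```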
Docs
- Auto-detection priority is wrong (docs/guides/openclaw.md L218) — Says "XML > Numbered > Separator > Markdown headers" but the code does `xml_tag > numbered > json_results`. Separator and markdown_header are not auto-detected.
- `json_results` missing from format table (docs/guides/openclaw.md L210-216) — The document extraction table omits `json_results`, which is an auto-detected format.
Dedup module
- Core dedup loop duplicated 3x (block_dedup.py) — `dedup_chat_completions`, `_dedup_assistant_code_blocks`, and `dedup_responses_api` share near-identical logic. Should be extracted into a shared helper.
- `blocks_total` undercounts (block_dedup.py L129/222/303) — Single-block messages are registered in `seen_blocks` but never counted in `blocks_total`.
- No unit tests for `_content_defined_chunking`, `_hash_block`, or the dedup functions directly.
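One way to collapse the three near-identical loops is a shared helper parameterized on the hash function, with each entry point handling only its own extraction and reassembly. A rough sketch — function names and the placeholder format here are illustrative, not the module's:

```python
import hashlib

def md5_hex(block):
    # Deterministic content hash (also addresses the hash() finding above).
    return hashlib.md5(block.encode("utf-8")).hexdigest()

def dedup_blocks(blocks, seen_hashes, hash_fn=md5_hex):
    """Core dedup: replace already-seen blocks with a short reference marker.

    `blocks` is a list of text blocks; `seen_hashes` persists across calls.
    Each entry point (chat completions, assistant code blocks, responses API)
    would extract its blocks, call this, and reassemble in its own format.
    """
    out = []
    for block in blocks:
        h = hash_fn(block)
        if h in seen_hashes:
            out.append(f"[dedup:{h[:8]}]")  # placeholder reference format
        else:
            seen_hashes.add(h)
            out.append(block)
    return out
```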
Minor / nits (non-blocking)
- `FrozenSet` imported but unused in all cloud adapter files
- `tool_results_skipped` initialized but never incremented (dead code, http_server.py L1371)
- Debug SHA-256 hashing runs unconditionally, not gated on log level (http_server.py L1380)
- `_intercept_index` not reset when conversation changes (http_server.py L1189)
- `alpha` header not validated — a non-numeric value crashes (intercept_parser.py L135)
- `clear_conversation` only walks ancestors, leaks child requests (conversation_tracker.py L357)
- live_index.py `schedule_only` converted to `logger.debug()` but other methods still use `print()`
- Test `test_single_separator_returns_none` asserts `result is not None` — name contradicts assertion
- MiniMax listed in News section of README but omitted from the Drop-in solutions line
for line in lines:
    current.append(line)
    line_hash = hash(line.strip()) & 0xFFFFFFFF
use hashlib.md5 for determinism
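A hedged sketch of the suggested change, keeping the 32-bit mask the snippet uses (the helper name is illustrative):

```python
import hashlib

def line_hash(line):
    # Deterministic across processes and restarts, unlike built-in hash(),
    # which is salted per process via PYTHONHASHSEED.
    digest = hashlib.md5(line.strip().encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") & 0xFFFFFFFF
```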
if current:
    if blocks and len(current) < CHUNK_MIN_LINES:
        blocks[-1] += "\n" + "\n".join(current)
tested the cloud adapter, no problems
Close #8