Add OpenClaw and Cloud API Support #34

Open
SecretSettler wants to merge 79 commits into main from cloud-cache-proxy

Conversation

@SecretSettler
Member

Closes #8

SecretSettler and others added 30 commits March 1, 2026 17:54
Intercepts /v1/chat/completions and /v1/messages, extracts documents
from system prompts (XML tags, numbered, separator formats), reorders
them via ContextPilot for optimal prefix sharing, and forwards to the
backend. Users just change their API endpoint URL — no code changes needed.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…e, and deployment files

- Extract and reorder documents from tool_result messages (OpenAI role="tool"
  and Anthropic type="tool_result"), with camelCase compat for OpenClaw internal format
- Add markdown_header extraction mode (split on # / ## headers)
- Extend XML tag recognition with <files>/<file>
- Add X-ContextPilot-Scope header (system / tool_results / all)
- Refactor _intercept_and_forward to use MultiExtractionResult for multi-source reordering
- Expand _contextpilot response metadata with total_documents and sources breakdown
- Add OpenClaw examples: setup.sh, Docker Compose, provider config template
- Add integration guide at docs/guides/openclaw.md
- 95 tests passing (34 new)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…den forwarding

- Move _contextpilot response metadata from JSON body to X-ContextPilot-Result
  header so strict API parsers (OpenClaw SDK) receive unmodified responses
- Broaden request header forwarding from 4-header whitelist to blacklist
  (only strip x-contextpilot-* and hop-by-hop), fixing dropped anthropic-beta etc.
- Forward backend response headers and status code in streaming mode
- Replace noisy print() in schedule_only() with logger.debug()

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
… paths

- Move proxy_completions metadata from response body to X-ContextPilot-Result header
- Inject rid in intercept path when running in stateful mode (index active)
- Add tests for header metadata, rid injection, and stateless bypass

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…ential, not parallel)

Add MUST come before Reorder: it expands the block set so Reorder
has more permutation options. Solves cases where no permutation of
existing blocks hits a prefix, but adding a block makes it possible.

Updated: both design docs, SVG diagram, Notion callout + CRUD table.
…icable

Dedup skips on Turn 1 or no duplicates; Repartition skips when all
blocks have dependencies; Add skips when no prefix-hit candidate
exists; Reorder skips when order is already optimal.

Updated EXPLAIN example to show a SKIPPED primitive.
Updated Notion primitives table with skip conditions.
Default 13, configurable via --chunk-modulus. Passed through to all dedup calls.
Added tuning guide in how_it_works.md with M value recommendations.
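The modulus controls where chunk boundaries fall in content-defined chunking: a line becomes a cut point when its hash is congruent to 0 mod M, so the expected chunk length is roughly M lines. A minimal sketch of this idea, assuming the hypothetical name `chunk_boundaries` and using a deterministic hash (the actual splitting logic in block_dedup.py may differ):

```python
import hashlib

CHUNK_MODULUS = 13  # hypothetical default, matching --chunk-modulus


def chunk_boundaries(lines, modulus=CHUNK_MODULUS):
    """Yield runs of lines, cutting after any line whose deterministic
    32-bit hash is congruent to 0 mod `modulus` (content-defined chunking:
    boundaries depend only on content, so edits stay local to one chunk)."""
    chunk = []
    for line in lines:
        chunk.append(line)
        digest = hashlib.md5(line.strip().encode("utf-8")).digest()
        if int.from_bytes(digest[:4], "big") % modulus == 0:
            yield chunk
            chunk = []
    if chunk:
        yield chunk
```

With M = 13, about 1 in 13 lines ends a chunk, so larger M means longer chunks and coarser dedup granularity.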
@SecretSettler SecretSettler requested a review from dalongbao March 24, 2026 00:46
@dalongbao
Collaborator

the three functions in block_dedup.py can be made into one for simplicity


@dalongbao dalongbao left a comment


Review Summary

Solid PR overall. The adapter pattern, intercept parser, and TTL eviction module are well-structured. Docs are accurate, tests have decent coverage, and the OpenClaw example works. Below are the findings grouped by priority.


Bugs (should fix before merge)

  1. _chunk_modulus not in global declaration (http_server.py ~L2148) — main() declares global _max_tokens, _infer_api_url, ... but omits _chunk_modulus. The --chunk-modulus CLI flag is silently ignored; value always stays at default 13.

  2. proxy_engine hardcodes temperature=0 (http_server.py ~L1933) — The generic /v1/{path:path} catch-all unconditionally sets body["temperature"] = 0, overwriting the user's value on every proxied request.

  3. "\n\n".join corrupts content (block_dedup.py L149/241/323, conversation_tracker.py L307) — Content is split on "\n" but reassembled with "\n\n", inserting phantom blank lines at every chunk boundary even for non-deduped blocks. Breaks structured content (JSON, YAML, code).

  4. hash() is non-deterministic across processes (block_dedup.py L64) — Python randomizes hash() per process via PYTHONHASHSEED. Chunk boundaries differ on every restart and across workers. Should use a deterministic hash.

  5. default_ttl_seconds=0 silently becomes 300 (ttl_eviction.py L115) — Uses or instead of is not None. 0 is falsy so it falls through to default_ttl.seconds.

  6. default_ttl setter is a no-op (ttl_eviction.py L129-132) — Updates the enum _default_ttl but not _default_ttl_seconds, which is what add_entry actually reads.

  7. Reconstruction uses default config (intercept_parser.py ~L572/621/962/1002) — reconstruct_* re-runs extraction with InterceptConfig() defaults instead of the original config. Silently fails for non-auto modes like mode=separator.

  8. _apply_block_dedup mutates caller's dict (conversation_tracker.py L306-307) — Hidden side effect: modifies doc_contents in-place with no indication in the signature or return value.
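Bug 5 is the classic falsy-zero trap. A minimal sketch of the failure and the fix, using hypothetical helper names (the real code reads `_default_ttl.seconds` in ttl_eviction.py):

```python
DEFAULT_TTL_SECONDS = 300


def resolve_ttl_buggy(default_ttl_seconds=None):
    # Bug: `or` treats 0 as "unset", so an explicit TTL of 0 becomes 300.
    return default_ttl_seconds or DEFAULT_TTL_SECONDS


def resolve_ttl_fixed(default_ttl_seconds=None):
    # Fix: fall back only when the caller passed nothing at all.
    if default_ttl_seconds is not None:
        return default_ttl_seconds
    return DEFAULT_TTL_SECONDS
```

The same `is not None` pattern applies anywhere 0, empty string, or empty list is a legitimate caller-supplied value.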


Resource / safety issues

  1. Streaming connection leak (http_server.py ~L1770-1806) — _stream_with_headers has no finally cleanup. Client disconnect mid-stream can leak aiohttp connections. Compare with proxy_engine which has finally: response.close().

  2. Double deep-copy (http_server.py ~L1363+1486) — _strip_external_content_ids recursively copies the body, then copy.deepcopy(body) copies it again. 2x memory pressure on every request.

  3. API key could leak in errors (http_server.py ~L1851) — aiohttp.ClientError can include URLs/headers in str(e), which is returned verbatim in the 502 detail.

  4. Non-JSON upstream error crashes (http_server.py ~L1815) — resp.json() on a plain-text 502 from a load balancer raises JSONDecodeError instead of a clean error.

  5. get_conversation_chain has no cycle detection (conversation_tracker.py L137-146) — Infinite loop if parent chain has a cycle.

  6. _requests dict grows unbounded (conversation_tracker.py L77) — No TTL, no max size, no automatic cleanup. timestamp field exists but is never read.
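For item 5, a visited-set is enough to make the parent walk terminate. A sketch under assumed data shapes (a `parents` mapping of request id to parent id; the real `get_conversation_chain` presumably reads the `_requests` dict instead):

```python
def get_conversation_chain(request_id, parents):
    """Walk the parent chain from request_id to the root, stopping
    early if a cycle is detected instead of looping forever."""
    chain = []
    seen = set()
    current = request_id
    while current is not None:
        if current in seen:
            break  # cycle detected: stop the walk
        seen.add(current)
        chain.append(current)
        current = parents.get(current)
    return chain
```

This costs O(n) extra memory for the visited set, which is negligible next to the unbounded `_requests` dict flagged in item 6.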


Cloud adapters

  1. Cache breakpoint limit (anthropic_adapter.py L105-117) — Injects cache_control on every qualifying tool result. Anthropic limits to 4 breakpoints per request. Will 400 on real agentic conversations with many tool results.

  2. No streaming cache metrics (http_server.py ~L1768-1806) — parse_cache_metrics only runs in the non-streaming path. TTL policy never updates for streaming requests.

  3. TTL label mismatch (confirmed via manual testing) — --extended-cache with OpenAI shows "default_ttl": "5m" but "default_ttl_seconds": 86400. The enum and seconds are set independently.

  4. update_from_response double-counts (ttl_eviction.py L247-268) — On partial cache hits (both read and creation tokens), calls touch_entry then add_entry non-atomically. Hit counter is inflated.
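For item 1, the injection needs a cap. A sketch of one possible policy, marking only the most recent qualifying tool results so the stable prefix stays cached (the selection heuristic is an assumption, not what anthropic_adapter.py currently does):

```python
MAX_CACHE_BREAKPOINTS = 4  # Anthropic's per-request cache_control limit


def inject_cache_control(blocks, max_breakpoints=MAX_CACHE_BREAKPOINTS):
    """Attach cache_control to at most `max_breakpoints` tool_result
    blocks, preferring the most recent ones; requests with more
    qualifying blocks would otherwise be rejected with a 400."""
    candidates = [b for b in blocks if b.get("type") == "tool_result"]
    allowed = {id(b) for b in candidates[-max_breakpoints:]}
    for block in blocks:
        if id(block) in allowed:
            block["cache_control"] = {"type": "ephemeral"}
    return blocks
```

Any capping policy works against the 400; choosing the latest breakpoints keeps the cacheable prefix as long as possible on the next turn.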


Docs

  1. Auto-detection priority is wrong (docs/guides/openclaw.md L218) — Says "XML > Numbered > Separator > Markdown headers" but code does xml_tag > numbered > json_results. Separator and markdown_header are not auto-detected.

  2. json_results missing from format table (docs/guides/openclaw.md L210-216) — The document extraction table omits json_results, which is an auto-detected format.


Dedup module

  1. Core dedup loop duplicated 3x (block_dedup.py) — dedup_chat_completions, _dedup_assistant_code_blocks, and dedup_responses_api share near-identical logic. Should extract into a shared helper.

  2. blocks_total undercounts (block_dedup.py L129/222/303) — Single-block messages are registered in seen_blocks but never counted in blocks_total.

  3. No unit tests for _content_defined_chunking, _hash_block, or the dedup functions directly.


Minor / nits (non-blocking)

  • FrozenSet imported but unused in all cloud adapter files
  • tool_results_skipped initialized but never incremented (dead code, http_server.py L1371)
  • Debug SHA-256 hashing runs unconditionally, not gated on log level (http_server.py L1380)
  • _intercept_index not reset when conversation changes (http_server.py L1189)
  • alpha header not validated — non-numeric value crashes (intercept_parser.py L135)
  • clear_conversation only walks ancestors, leaks child requests (conversation_tracker.py L357)
  • live_index.py schedule_only converted to logger.debug() but other methods still use print()
  • Test test_single_separator_returns_none asserts result is not None — name contradicts assertion
  • MiniMax listed in News section of README but omitted from Drop-in solutions line


for line in lines:
    current.append(line)
    line_hash = hash(line.strip()) & 0xFFFFFFFF

use hashlib.md5 for determinism
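A sketch of what that substitution could look like: the same 32-bit truncation, but derived from a digest that is stable across processes and restarts (helper name is illustrative):

```python
import hashlib


def line_hash(line):
    """Deterministic 32-bit line hash. Built-in hash() is randomized
    per process via PYTHONHASHSEED, so chunk boundaries would differ
    across workers and restarts; md5 is stable everywhere."""
    digest = hashlib.md5(line.strip().encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")
```

md5 is fine here since the hash only picks chunk boundaries; nothing security-sensitive depends on it.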


if current:
    if blocks and len(current) < CHUNK_MIN_LINES:
        blocks[-1] += "\n" + "\n".join(current)

why add two \n?
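This is the same root cause as bug 3 in the review summary: content split on a single "\n" must be rejoined with a single "\n", or every chunk boundary gains a phantom blank line. A minimal round-trip sketch (function name is illustrative):

```python
def reassemble(chunks):
    """Rejoin chunks of lines exactly as they were split: one "\n"
    between lines and one "\n" between chunks. Using "\n\n" between
    chunks would insert a blank line at every chunk boundary, which
    corrupts structured content such as JSON, YAML, or code."""
    return "\n".join("\n".join(chunk) for chunk in chunks)
```

A quick invariant worth adding to the test suite: `reassemble(split(text)) == text` for any input.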

@dalongbao
Collaborator

Tested the cloud adapter; no issues.


Development

Successfully merging this pull request may close these issues.

Support OpenClaw

2 participants