feat: content attachment (Paper I Atom(p))#42
Conversation
B-DOC-1: markdownlint in CI for MD040 and similar issues. B-DOC-2: syntax-check JS/TS code blocks in markdown specs.
…ent() Implements the write-side of content attachment (Paper I Atom(p)): - CONTENT_PROPERTY_KEY constant in KeyCodec.js - attachContent(nodeId, content) writes blob + sets _content property - attachEdgeContent(from, to, label, content) for edge content - commit() now embeds content blob OIDs in the patch commit tree under _blob/0, _blob/1, etc. for GC protection - _contentBlobs array tracks blob OIDs during patch building
Delegates attachContent() and attachEdgeContent() from PatchSession to PatchBuilderV2, with _ensureNotCommitted() guard. Both are async (returns Promise<this>) because writeBlob() is I/O. Also removes unnecessary eslint-disable directive for max-params in PatchBuilderV2.attachEdgeContent().
Adds four auto-wired query methods to WarpGraph: - getContentOid(nodeId) → string|null - getContent(nodeId) → Buffer|null - getEdgeContentOid(from, to, label) → string|null - getEdgeContent(from, to, label) → Buffer|null These read the _content property and resolve it via readBlob(). Auto-wired via wireWarpMethods() — no manual registration needed.
- index.js: export CONTENT_PROPERTY_KEY from KeyCodec - index.d.ts: add attachContent/attachEdgeContent to PatchBuilderV2 and PatchSession; add getContent/getContentOid/getEdgeContent/ getEdgeContentOid to WarpGraph; export CONTENT_PROPERTY_KEY const - type-surface.m8.json: register all new methods and constant - consumer.ts: exercise new API with positive + negative type tests
Covers single-writer, getContentOid, null cases, edge content, multi-writer LWW, time-travel with ceiling, node deletion, Writer API, GC durability (git gc --prune=now), and binary round-trip. Also fixes _blob/ → _blob_ in tree entry paths (git mktree does not support subdirectory paths).
- CONTENT_ATTACHMENT.md: status Proposal → Implemented; add Final API section with write/read examples; add Durability/Git GC section explaining blob anchoring; update summary table - CHANGELOG.md: add 11.5.0 entry documenting all new API surface
New feature (content attachment), backward-compatible.
Scans state.prop for _content values during checkpoint creation and embeds the referenced blob OIDs as _blob_* entries in the checkpoint commit tree. This ensures content blobs remain reachable via the checkpoint even if patch commits are ever pruned in the future. Without this, git gc could nuke content blobs whose only anchor was a pruned patch commit tree.
Wraps stream.collect() result with Buffer.from() when the underlying
plumbing returns a Uint8Array. This makes the public API contract
honest — consumers can call .toString('utf8') without wrapping.
Simplifies integration tests to use content.toString('utf8') directly.
Verifies the invariant: attach content → create checkpoint → git gc --prune=now → content still loads. Proves checkpoint tree anchoring works independently of patch commit reachability.
- CHANGELOG.md: add Fixed section for checkpoint content anchoring and readBlob Buffer guarantee - CONTENT_ATTACHMENT.md: add checkpoint anchoring invariant to durability section
- Add Content Attachment section to README under Patch Operations - Update "When NOT to Use It" blob storage note for accuracy - Add ADR-001 Folds entry to CHANGELOG - Remove LLM preamble from ADR-001, fix title numbering
- Add Content Attachment section to README under Patch Operations - Update "When NOT to Use It" blob storage note for accuracy - Add ADR-001 Folds entry to CHANGELOG - Remove LLM preamble from ADR-001, fix title numbering - Fix consumer.ts PatchSession content test ordering (ps used before declaration)
Add getContent, getContentOid, getEdgeContent, getEdgeContentOid to the API surface snapshot. Method count 75 → 79.
|
No actionable comments were generated in the recent review. 🎉 📝 WalkthroughWalkthroughAdds content-attachment v11.5.0: content-addressed blob write/read APIs for nodes and edges, GC-anchoring of blob OIDs in checkpoints/commits, Buffer-normalized blob reads, public CONTENT_PROPERTY_KEY export, docs/ADR/spec updates, and extensive tests and type-surface updates. Changes
Sequence Diagram(s)sequenceDiagram
participant Client as Client
participant PatchSession as PatchSession
participant PatchBuilder as PatchBuilderV2
participant BlobPort as BlobPort
participant Repo as Repository
participant Checkpoint as CheckpointService
participant GitAdapter as GitGraphAdapter
Client->>PatchSession: attachContent(nodeId, content)
PatchSession->>PatchBuilder: attachContent(nodeId, content)
PatchBuilder->>BlobPort: writeBlob(content)
BlobPort->>Repo: store blob object
Repo-->>BlobPort: oid
BlobPort-->>PatchBuilder: oid
PatchBuilder->>PatchBuilder: _contentBlobs.push(oid)\nset node/edge _content = oid
PatchBuilder-->>PatchSession: this
PatchSession-->>Client: this
Client->>PatchSession: commit()
PatchSession->>PatchBuilder: commit()
PatchBuilder->>Repo: create tree + entries (_content_<oid> / patch blob)
Repo-->>PatchBuilder: commit
PatchBuilder-->>PatchSession: commit result
PatchSession-->>Client: commit result
Note over Checkpoint,Repo: On checkpoint creation
Checkpoint->>Repo: scan checkpoint state for _content entries
Checkpoint->>Repo: write anchor tree entries (_content_<oid>, ...)
Note over Repo: Anchored blob OIDs prevent GC pruning
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Release Preflight
If you tag this commit as |
There was a problem hiding this comment.
Actionable comments posted: 8
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@BACKLOG.md`:
- Around line 122-126: Update the B-DOC-2 entry to replace the incorrect tool
recommendation: note that remark-lint-fenced-code-flag only enforces a language
tag and does not perform syntax checking, and change the suggested
implementation to use eslint-plugin-markdown (run ESLint on JS/TS fenced blocks)
or an ESLint --stdin-based extraction script; mention adding a CI step or script
that targets docs/**/*.md and explicitly runs ESLint with your TS/JS config so
code blocks are syntax-checked for duplicate const/TDZ and similar errors.
In `@docs/ADR-001-Folds.md`:
- Around line 98-113: The TypeScript code fence opened before the
FoldMode/FoldPolicy types is not closed, causing the subsequent "Default
behavior..." and "Rationale:" paragraphs to be rendered as code; close the code
fence immediately after the closing brace/semicolon of the FoldPolicy definition
(after "};") and convert the following lines into normal Markdown prose/bulleted
list (the text describing Default behavior and the Rationale for Fold) so they
render as readable Markdown rather than part of the code block; locate the types
by the FoldMode and FoldPolicy identifiers to find where to insert the closing
fence.
- Line 172: Update the fenced code block near the bullet "• Cons: not structural
recursion; merges become coarse (LWW)" in ADR-001-Folds.md to include a language
specifier (e.g., ```text for plain/pseudocode or ```ts/```js for
TypeScript/JavaScript examples) so the linter MD040 is resolved and the snippet
is syntax-highlighted correctly.
- Around line 203-208: Remove the accidental AI-generated boilerplate block that
starts with "If you want this even *hotter*, I can also generate:" and the
following bullet list (the four lines of assistant-style copy) from the ADR
document; edit ADR-001-Folds.md to delete that snippet and leave the surrounding
ADR prose intact, and scan for any other similar assistant leftover language to
remove so the committed ADR contains only author-written content.
- Around line 115-193: Sections 3–Rollout Plan use plain text headings, inline
code, and Apple bullets causing Markdown regressions; update each named heading
("3) API Surface (Proposed)", "3.1 View API", "3.2 Attachment Graph Accessors",
"4) Invariants", "Consequences", "Benefits", "Costs / Risks", "Alternatives
Considered", "Rollout Plan", "Decision Summary") to proper Markdown headings
(e.g., ### / #### levels consistent with earlier sections), wrap code examples
like the const view = graph.view(...) snippet and the graph.getFoldRootForNode /
graph.getFoldViewForNode accessors in fenced code blocks with a language tag
(```js), and convert Apple bullet characters (•) to standard Markdown list
markers (- or *) with correct indentation so lists render properly; keep the
original text and identifiers (const view = graph.view, view.traverse,
view.query, view.renderAscii, graph.getFoldRootForNode,
graph.getFoldViewForNode, FoldPolicy) to locate the changes.
In `@docs/specs/CONTENT_ATTACHMENT.md`:
- Around line 172-176: The unnamed fenced code block in CONTENT_ATTACHMENT.md
(the GC anchoring section showing "patch.cbor → CBOR-encoded patch blob" and
blob lines) needs a language tag to satisfy markdownlint MD040; update the
opening fence from ``` to ```text for that specific block so the fence is
explicitly labeled (look for the block containing "patch.cbor →
CBOR-encoded patch blob" and change its opening fence).
In `@README.md`:
- Around line 331-350: The example incorrectly reuses the committed Patch
builder: after calling patch.commit() on the Patch returned by
graph.createPatch(), subsequent calls like patch.attachEdgeContent(...) will
throw because the builder is already committed; fix by creating a fresh patch
for the edge content (e.g., call graph.createPatch() again and use that new
Patch to call attachEdgeContent(...) and commit it), and leave the read examples
using graph.getContent(), graph.getContentOid(), and graph.getEdgeContent()
unchanged.
In `@src/domain/services/PatchBuilderV2.js`:
- Around line 468-472: In attachEdgeContent, _persistence.writeBlob is called
and the resulting oid is pushed into _contentBlobs before setEdgeProperty
validates the edge; if setEdgeProperty throws the blob remains pinned. Change
the flow so you either validate the edge first (e.g., call a pre-check like
hasEdge / getEdge) or call setEdgeProperty successfully before pushing oid into
_contentBlobs and returning; specifically ensure the push to this._contentBlobs
happens only after setEdgeProperty(from, to, label, CONTENT_PROPERTY_KEY, oid)
completes without throwing.
Delete trailing LLM prompt residue (lines 203-208), close unclosed code fence at line 106, convert tab-bullet formatting to standard Markdown dashes, restore heading hierarchy (##/###/####), wrap code examples in fences, and remove emoji from section headings.
Content anchor tree entries in patch commits and checkpoints now use the blob's hex OID as the filename (_content_<sha>) instead of a sequential counter (_blob_0, _blob_1). This is self-documenting, unique by construction, and eliminates the counter variable.
- PatchBuilderV2.attachContent: note that node must exist for getContent - README: clarify sync vs async in content attachment example - README: note that edge content assumes nodes/edge already exist - query.methods.js: document Map .get() vs Record [] access asymmetry - CONTENT_ATTACHMENT.md: remove stale "Edge attachments deferred" bullet (edge content shipped in v11.5.0)
- B-FEAT-1: mark content attachment as shipped in v11.5.0 - B-FEAT-2: determinism fuzzer for tree construction order - B-FEAT-3: reconcile Map vs Record asymmetry in getNodeProps/getEdgeProps - B-TYPE-3: establish test-file wildcard ratchet for ts-policy-check
Release Preflight
If you tag this commit as |
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (3)
src/domain/warp/query.methods.js (1)
309-315:getContentdelegates well, but consider documenting error behavior for missing blobs.If a node's
_contentOID references a blob that has been garbage-collected (despite the GC anchoring),readBlobwill throw. This is probably the right behavior (fail loud), but the JSDoc doesn't mention it. A@throwsline would help consumers handle the edge case.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/domain/warp/query.methods.js` around lines 309 - 315, Add JSDoc to getContent explaining its error behavior: document that getContent calls getContentOid and this._persistence.readBlob and that readBlob may throw (e.g., if the referenced blob OID has been garbage-collected), include a `@throws` line describing the thrown error and when consumers should expect it so callers can handle the edge case appropriately.test/unit/domain/services/PatchBuilderV2.content.test.js (2)
112-141: Missing test:attachEdgeContenton a non-existent edge.
setEdgePropertythrows when the edge doesn't exist, but_contentBlobsis already pushed (per the ordering issue in the builder). A test for this case would both document the expected error and verify that_contentBlobsisn't polluted on failure — once the ordering fix is applied.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/unit/domain/services/PatchBuilderV2.content.test.js` around lines 112 - 141, Add a new unit test for PatchBuilderV2.attachEdgeContent that calls attachEdgeContent on a non-existent edge (so setEdgeProperty will throw) and asserts that the call throws and that the builder's internal _contentBlobs array was not mutated; construct the builder with a mocked persistence.writeBlob (e.g., vi.fn().mockResolvedValue(...)) and a state that does NOT contain the target edge, call await expect(builder.attachEdgeContent(...)).rejects.toThrow(), and then verify builder._contentBlobs remains empty (or unchanged) and persistence.writeBlob was not called or its result was not retained in the builder after the failure so the test documents the error behavior and ensures no blob pollution.
257-288: Consider adding a test for duplicate content (same blob attached to two nodes).This test uses distinct blob OIDs for each attachment, but if the same content is attached to two different nodes,
writeBlobreturns the same OID. The commit tree would then contain duplicate_content_<oid>entries with the same path — which is the dedup issue flagged inPatchBuilderV2.js. Adding a test for this scenario would document the expected behavior.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/unit/domain/services/PatchBuilderV2.content.test.js` around lines 257 - 288, Add a unit test to cover duplicate-content deduping: create a test in the same style as the existing one that uses PatchBuilderV2, mocks persistence.writeBlob to return the same blob OID for two attachContent calls, call builder.addNode('n1').addNode('n2'), attachContent with the same payload to both nodes, commit, then inspect persistence.writeTree.mock.calls[0][0] (treeEntries) and assert it contains only one `_content_<oid>` entry (and the patch.cbor entry) — i.e., treeEntries length should be 2 and the single content entry equals `100644 blob <oid>\t_content_<oid>`; reference PatchBuilderV2.attachContent, persistence.writeBlob, and persistence.writeTree to locate the code under test.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@CHANGELOG.md`:
- Line 23: Update the CHANGELOG entry so the method signature for
getEdgeContentOid matches the others: change the text from "getEdgeContentOid()"
to "getEdgeContentOid(from, to, label)" to reflect its parameters and arity;
ensure the entry that currently reads "**`WarpGraph.getEdgeContent(from, to,
label)`** / **`getEdgeContentOid()`** — edge variants." is updated to
"**`WarpGraph.getEdgeContent(from, to, label)`** / **`getEdgeContentOid(from,
to, label)`** — edge variants." so consumers see consistent signatures for
WarpGraph.getEdgeContent and getEdgeContentOid.
In `@src/domain/services/PatchBuilderV2.js`:
- Around line 626-632: The tree builder is emitting duplicate tree entry paths
because this._contentBlobs can contain the same OID multiple times
(content-addressed blobs); before calling this._persistence.writeTree you must
deduplicate the OIDs so each tree path `_content_<oid>` is unique: replace the
loop that pushes into treeEntries with logic that iterates over a deduplicated
set (e.g., Array.from(new Set(this._contentBlobs))) or otherwise filter
duplicates, keeping the initial patchBlobOid entry unchanged, then call
this._persistence.writeTree(treeEntries) so writeTree receives unique paths;
check symbols: this._contentBlobs, treeEntries, patchBlobOid, and
this._persistence.writeTree.
---
Duplicate comments:
In `@docs/specs/CONTENT_ATTACHMENT.md`:
- Around line 171-174: The fenced code block containing "patch.cbor →
CBOR-encoded patch blob" and "_content_<oid> → content blob" lacks a language
tag; update that triple-backtick fence in CONTENT_ATTACHMENT.md so it reads
```text (i.e., add "text" after the opening ```), leaving the inner lines
unchanged to satisfy markdownlint MD040.
In `@src/domain/services/PatchBuilderV2.js`:
- Around line 472-477: In attachEdgeContent, an OID is pushed into _contentBlobs
before setEdgeProperty validates the edge, so if setEdgeProperty throws the blob
is incorrectly retained; reorder the operations in attachEdgeContent so you
first write the blob (await this._persistence.writeBlob(content)), then call
this.setEdgeProperty(from, to, label, CONTENT_PROPERTY_KEY, oid) and only after
that succeeds call this._contentBlobs.push(oid), preserving the final return
this; this ensures blobs are only recorded when the edge property is
successfully set.
---
Nitpick comments:
In `@src/domain/warp/query.methods.js`:
- Around line 309-315: Add JSDoc to getContent explaining its error behavior:
document that getContent calls getContentOid and this._persistence.readBlob and
that readBlob may throw (e.g., if the referenced blob OID has been
garbage-collected), include a `@throws` line describing the thrown error and when
consumers should expect it so callers can handle the edge case appropriately.
In `@test/unit/domain/services/PatchBuilderV2.content.test.js`:
- Around line 112-141: Add a new unit test for PatchBuilderV2.attachEdgeContent
that calls attachEdgeContent on a non-existent edge (so setEdgeProperty will
throw) and asserts that the call throws and that the builder's internal
_contentBlobs array was not mutated; construct the builder with a mocked
persistence.writeBlob (e.g., vi.fn().mockResolvedValue(...)) and a state that
does NOT contain the target edge, call await
expect(builder.attachEdgeContent(...)).rejects.toThrow(), and then verify
builder._contentBlobs remains empty (or unchanged) and persistence.writeBlob was
not called or its result was not retained in the builder after the failure so
the test documents the error behavior and ensures no blob pollution.
- Around line 257-288: Add a unit test to cover duplicate-content deduping:
create a test in the same style as the existing one that uses PatchBuilderV2,
mocks persistence.writeBlob to return the same blob OID for two attachContent
calls, call builder.addNode('n1').addNode('n2'), attachContent with the same
payload to both nodes, commit, then inspect
persistence.writeTree.mock.calls[0][0] (treeEntries) and assert it contains only
one `_content_<oid>` entry (and the patch.cbor entry) — i.e., treeEntries length
should be 2 and the single content entry equals `100644 blob
<oid>\t_content_<oid>`; reference PatchBuilderV2.attachContent,
persistence.writeBlob, and persistence.writeTree to locate the code under test.
…tree entries - attachEdgeContent: push to _contentBlobs only after setEdgeProperty succeeds, preventing orphaned blob anchors when the edge doesn't exist - commit(): deduplicate _contentBlobs via Set before building tree entries, preventing duplicate _content_<oid> paths when identical content is attached to multiple nodes - Add test: attachEdgeContent on non-existent edge doesn't pollute _contentBlobs - Add test: duplicate content produces a single deduplicated tree entry
readBlob() uses Buffer.isBuffer/Buffer.from which are Node.js globals not available in Deno without --unstable-node-globals. Explicit import fixes the 16 Deno test failures.
…JSDoc - CONTENT_ATTACHMENT.md: add `text` language tag to tree layout code fence - README.md: fix patch reuse — edge content example now creates a fresh patch - CHANGELOG.md: add params to getEdgeContentOid(from, to, label) signature - BACKLOG.md B-DOC-2: replace remark-lint-fenced-code-flag with eslint-plugin-markdown (correct tool for syntax-checking code blocks) - query.methods.js: add @throws to getContent JSDoc for missing blob case
Feedback Round 1 — All Issues AddressedCommits 1-4 (initial fixes from self-review):
Commits 5-7 (CodeRabbit feedback):
Verification
@coderabbitai review please 🥕 |
|
I can see you've systematically tackled:
The verification suite is green across the board. I'll kick off a fresh review of the PR now. ✅ Actions performedReview triggered.
|
Release Preflight
If you tag this commit as |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/domain/warp/query.methods.js`:
- Around line 339-357: getEdgeContent is missing the `@throws` JSDoc that
getContent includes to warn callers that this._persistence.readBlob(oid) will
throw if the OID is missing (e.g., due to aggressive GC/partial clones); update
the JSDoc for getEdgeContent to mirror getContent by adding a `@throws` tag
describing the readBlob error and the same note about aggressive GC/partial
clones, and reference the related helper getEdgeContentOid and
this._persistence.readBlob in the comment so callers understand the source of
the exception.
---
Duplicate comments:
In `@src/domain/services/PatchBuilderV2.js`:
- Around line 626-633: Summary: Duplicate tree entries can produce malformed Git
trees; ensure content blob OIDs are deduplicated before building treeEntries.
Fix: keep the deduplication step using uniqueBlobs = [...new
Set(this._contentBlobs)] and iterate over uniqueBlobs (not this._contentBlobs)
when pushing entries into treeEntries (the patchBlobOid entry stays first), then
call this._persistence.writeTree(treeEntries) with that deduplicated list;
verify symbols: _contentBlobs, uniqueBlobs, treeEntries, patchBlobOid, and
_persistence.writeTree are used exactly as shown.
- Around line 472-477: Ensure attachEdgeContent preserves the safe ordering:
call this._persistence.writeBlob(content) to get oid, then immediately call
setEdgeProperty(from, to, label, CONTENT_PROPERTY_KEY, oid) (which may throw for
unknown edges), and only after that push oid into this._contentBlobs; update the
attachEdgeContent implementation so the sequence is writeBlob -> setEdgeProperty
-> _contentBlobs.push and do not push the blob before setEdgeProperty to avoid
orphaned blobs on failures.
JoinReducer.join() merged only the patch's context VV (pre-creation state) into observedFrontier, never incorporating the patch's own lamport tick. This caused the frontier to lag by one tick per writer. After the existing vvMerge with contextVV, also set max(current, patch.lamport) for the patch's writer on both the fast path and receipt-enabled path. This preserves the semilattice property (pointwise max is commutative, associative, idempotent). Update WarpGraph VV test to expect the actual lamport tick (6) rather than the artificially low counter (1) that resulted from the stale frontier being cloned into _versionVector.
Mirror the @throws tag already present on getContent, noting that readBlob may throw if the referenced OID is missing from the store.
- Add observedFrontier staleness fix (#43) to CHANGELOG - Reorder attachContent to match attachEdgeContent error-safe ordering - Extract foldPatchDot helper in JoinReducer (DRY two call sites) - Add @throws JSDoc to getContentOid and getEdgeContentOid - Rename shadowed propKey destructuring in CheckpointService
Release Preflight
If you tag this commit as |
|
@coderabbitai review please |
Rate Limit Exceeded
|
package.json was bumped to 11.5.0 but jsr.json was left at 11.4.0. Also adds release workflow instructions to CLAUDE.md and new backlog items from PR #42 review.
Summary
attachContent(),getContent(),getContentOid()+ edge variants for attaching content-addressed blobs to graph nodes and edges as first-classAtom(p)payloads (Paper I)git gc --prune=now_contentLWW properties, inheriting merge, time-travel, and observer scoping for freeindex.d.tsdeclarations,type-surface.m8.jsonmanifest, consumer type tests, API surface snapshots updated (75 → 79 methods)readBlob()fix — always returns real NodeBuffer(wrapsUint8Array)Test plan
tsc --noEmit) passesSummary by CodeRabbit
New Features
Documentation
Bug Fixes
Tests