fix: Send binary files via application/octet-stream in boxel-cli sync (CS-11075)#4852
fix: Send binary files via application/octet-stream in boxel-cli sync (CS-11075)#4852FadhlanR wants to merge 4 commits into
Conversation
Make boxel-cli's push / pull / sync / watch and the single-file `boxel file write` / `boxel file read` commands handle binary assets (images, PDFs, fonts, etc.) byte-identically by mirroring the host package's wire format: a per-file POST with `Content-Type: application/octet-stream` and raw bytes as the body, which the realm-server already routes to `upsertBinaryFile`. Detection uses the existing `isBinaryFilename` helper from runtime-common, so SVG (XML-based) stays on the text path while `image/*`, `font/*`, `application/pdf`, and `.eot` switch to bytes. The /_atomic batch endpoint stays text-only — binary entries are carved out and POSTed per-file alongside the atomic call, matching the host package which never sends binary through /_atomic. Adds binary roundtrip tests across realm-push, realm-pull, realm-watch, file-write, and file-read using an inline 1x1 transparent PNG fixture so non-UTF-8 bytes (PNG signature, IDAT chunk) round-trip verbatim. Linear: CS-11075 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 7a60cacb53
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
When uploadFilesAtomic split binaries out of /_atomic, a partial binary failure left the atomic text successes unrecorded on disk — the next push hit a 409 against the files the server already had. Record result.succeeded unconditionally in push and sync, drop the brittle status-from-message regex in favor of a status field on the thrown Error (with the response body in the message), and reject source/destination binary-classification mismatches in `file write`. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The bare `from '@cardstack/runtime-common'` pulled the whole barrel into the bundle and the type-check program: transpile.ts → content-tag, plus ~50 modules that import `https://cardstack.com/base/*`. That broke the smoke tests (`ENOENT: dist/content_tag_bg.wasm` at startup) and `lint:types` (TS2307 cascade). Switching to `/infer-content-type` keeps the bundle to the leaf file's tiny graph, matching the subpath-only convention every other boxel-cli source file follows. Bundle shrinks from 7.2 MB to 370 KB; lint clean; integration suite green. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Pull request overview
Note
Copilot was unable to run its full agentic suite in this review.
Fixes binary file corruption in boxel-cli realm sync/push/pull and file read/write by routing binary payloads through application/octet-stream while keeping /_atomic text-only.
Changes:
- Detect binary files via
isBinaryFilenameand upload/download them as raw bytes (octet-stream) instead of UTF-8 text. - Split mixed atomic uploads into a text
/_atomicbatch plus per-file binary POSTs, and persist manifest updates for successful files even on partial failure. - Add integration coverage for PNG/PDF binary round-trips plus new inline byte fixtures.
Reviewed changes
Copilot reviewed 11 out of 11 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
| packages/boxel-cli/src/lib/realm-sync-base.ts | Adds binary upload/download support, structured upload errors, and mixed-batch splitting for /_atomic. |
| packages/boxel-cli/src/commands/realm/sync.ts | Persists manifest updates based on succeeded operations even when some uploads fail. |
| packages/boxel-cli/src/commands/realm/push.ts | Updates manifest/mtimes even on partial failure so successful uploads aren’t retried. |
| packages/boxel-cli/src/commands/file/write.ts | Allows writing Uint8Array as octet-stream and adds CLI protection against src/dst binary mismatch. |
| packages/boxel-cli/src/commands/file/read.ts | Returns bytes for binary reads and adds base64 JSON output for binary content. |
| packages/boxel-cli/tests/helpers/binary-fixtures.ts | Adds tiny PNG/PDF byte fixtures for regression tests. |
| packages/boxel-cli/tests/integration/realm-push.test.ts | Adds binary push + mixed batch tests (including partial failure behavior). |
| packages/boxel-cli/tests/integration/realm-pull.test.ts | Adds binary pull test for PNG byte identity. |
| packages/boxel-cli/tests/integration/realm-watch.test.ts | Adds watcher pull test for byte-identical binary download. |
| packages/boxel-cli/tests/integration/file-write.test.ts | Adds binary write coverage for PNG/PDF. |
| packages/boxel-cli/tests/integration/file-read.test.ts | Adds binary read coverage and asserts bytes is used instead of content. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| const binaryResults = await Promise.all( | ||
| binaryEntries.map(([relativePath, localPath]) => | ||
| this.remoteLimit(async () => { | ||
| try { | ||
| await this.uploadBinaryFile(relativePath, localPath); | ||
| console.log(` Uploaded: ${relativePath}`); | ||
| return { relativePath, ok: true as const }; | ||
| } catch (err) { | ||
| const errWithStatus = err as { status?: number }; | ||
| const status = | ||
| typeof errWithStatus?.status === 'number' | ||
| ? errWithStatus.status | ||
| : 500; | ||
| const title = err instanceof Error ? err.message : String(err); | ||
| return { | ||
| relativePath, | ||
| ok: false as const, | ||
| status, | ||
| title, | ||
| }; | ||
| } | ||
| }), | ||
| ), | ||
| ); |
There was a problem hiding this comment.
Good catch — the PR description claimed parallelism that the code didn't actually have. Fixed in 8c2c95f: extracted uploadBinaryBatch / uploadTextAtomic helpers and now run them via Promise.all([…]), so the per-file octet-stream POSTs and the /_atomic batch are truly concurrent.
| let url = new URL(path, ensureTrailingSlash(realmUrl)).href; | ||
| let isBinary = typeof content !== 'string'; | ||
|
|
||
| try { | ||
| let response = await pm.authedRealmFetch(url, { | ||
| method: 'POST', | ||
| headers: { | ||
| Accept: SupportedMimeType.CardSource, | ||
| 'Content-Type': SupportedMimeType.CardSource, | ||
| }, | ||
| body: content, | ||
| headers: isBinary | ||
| ? { 'Content-Type': SupportedMimeType.OctetStream } | ||
| : { | ||
| Accept: SupportedMimeType.CardSource, | ||
| 'Content-Type': SupportedMimeType.CardSource, | ||
| }, | ||
| // Both branches of `content: string | Uint8Array` are valid | ||
| // BodyInit values, but TS narrows them as a union that doesn't | ||
| // unify against the fetch signature without a hint. | ||
| body: content as BodyInit, | ||
| }); |
There was a problem hiding this comment.
Addressed in 8c2c95f — added an isBinaryFilename(path) vs typeof content guard inside the library-level write(). The CLI wrapper's --file mismatch check stays (it can short-circuit before reading the source file with the wrong encoding), but the library guard catches programmatic callers (BoxelClient.write and tests) that don't go through the CLI path.
| await watcher.initialize(); | ||
| await watcher.poll(); | ||
| let result = await watcher.flushPending(); | ||
| expect(result.pulled).toContain('image.png'); | ||
|
|
||
| let pulled = fs.readFileSync(path.join(localDir, 'image.png')); | ||
| expect(pulled.equals(Buffer.from(TINY_PNG_BYTES))).toBe(true); | ||
|
|
||
| watcher.shutdown(); |
There was a problem hiding this comment.
Addressed in 8c2c95f. Wrapped every RealmWatcher site in the file in try/finally — the new PNG test, the six pre-existing watcher tests, and the primer in the post-cleanup race test — so a failing assertion can't leak a running watcher into the next test.
| } | ||
| } | ||
|
|
||
| const binaryResults = await Promise.all( |
There was a problem hiding this comment.
do we want to use Promise.allSettled here so an individual failure doesn't reject for the whole set?
There was a problem hiding this comment.
Switched to Promise.allSettled in 8c2c95f. The inner try/catch already prevented rejection, but allSettled makes intent explicit and survives any future change that drops the inner catch — a rogue throw now surfaces as a rejected outcome that we fold into a structured failure (status 500, String(reason) title) instead of silently aborting the rest of the fan-out.
- realm-sync-base.ts: extract uploadBinaryBatch / uploadTextAtomic helpers and run them concurrently via Promise.all, matching the "in parallel alongside" claim in the PR description. Switch the per-file binary fan-out to Promise.allSettled so an unexpected reject (future change that drops the inner try/catch) still surfaces a structured failure instead of aborting the rest. - commands/file/write.ts: add a library-level isBinaryFilename(path) vs typeof content guard. The CLI wrapper already rejects this mismatch up-front, but BoxelClient.write reaches the function directly and needed the same defense-in-depth. - tests/integration/realm-watch.test.ts: wrap every RealmWatcher test (including the primer in the post-cleanup race test) in try/finally so a failing assertion can no longer leak a running watcher into the next test. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Summary
U+FFFD).isBinaryFilename(already exported byruntime-common) and route binary files through a per-file POST withContent-Type: application/octet-stream— exactly the wire formatpackages/host'sFileUploadServiceandWriteBinaryFileCommandalready use, which the realm-server routes toupsertBinaryFile./_atomicbatch endpoint stays text-only (it embeds content inside a JSON string). When a push batch contains binary files, they are carved out and POSTed per-file in parallel alongside the atomic call for the text files. Failures are folded back into the existingperFileerror shape so callers don't have to learn a second contract.boxel file write/boxel file readcommands grow a symmetric branch:write()now acceptsUint8Arrayand switches to octet-stream;read()populates a newbytes?: Uint8Arrayfield for binary paths and renders bytes to stdout (or base64 under--json).Test plan
pnpm lintclean inpackages/boxel-cli(the runtime-commonhttps://cardstack.com/base/*resolution errors that surface duringlint:typesare pre-existing and unrelated to this change).pnpm test:unit-exclude-smoke— 187/187 unit tests pass.packages/boxel-cli/tests/integration/realm-push.test.ts— 23/23 pass (4 new binary tests: PNG, PDF, mixed batch, SVG-stays-text).packages/boxel-cli/tests/integration/realm-pull.test.ts— 13/13 pass (1 new binary PNG test).packages/boxel-cli/tests/integration/realm-sync.test.ts— 16/16 pass (no new tests; regression coverage for the partitioned upload path).packages/boxel-cli/tests/integration/realm-watch.test.ts— 15/15 pass (1 new binary PNG test).packages/boxel-cli/tests/integration/file-write.test.ts— 5/5 pass (2 new binary tests: PNG, PDF).packages/boxel-cli/tests/integration/file-read.test.ts— 5/5 pass (1 new binary PNG test)..png+.pdf+.woff2+.gts, pull into an empty dir, hash both sides withshasum -a 256to confirm byte-equality. (Steps documented indocs/cs-11075-binary-upload-plan.md.)Test fixture is a 67-byte 1x1 transparent PNG defined inline in
packages/boxel-cli/tests/helpers/binary-fixtures.ts— the IDAT chunk contains non-UTF-8 byte sequences (0x89, embedded nulls) that get mangled if anything in the pipeline forces UTF-8 round-tripping, so a byte-levelBuffer.equalsassertion is enough to catch any regression.Linear: https://linear.app/cardstack/issue/CS-11075
🤖 Generated with Claude Code