Skip to content

fix: Send binary files via application/octet-stream in boxel-cli sync (CS-11075)#4852

Open
FadhlanR wants to merge 4 commits into
mainfrom
cs-11075-ensure-that-uploading-binary-files-to-realm-works-reliably
Open

fix: Send binary files via application/octet-stream in boxel-cli sync (CS-11075)#4852
FadhlanR wants to merge 4 commits into
mainfrom
cs-11075-ensure-that-uploading-binary-files-to-realm-works-reliably

Conversation

@FadhlanR
Copy link
Copy Markdown
Contributor

Summary

  • boxel-cli's realm-sync code read every file as UTF-8 and wrote every downloaded file back as UTF-8, silently corrupting PNG / JPEG / PDF / font bytes (invalid UTF-8 sequences round-trip as U+FFFD).
  • Branch the upload + download sites on isBinaryFilename (already exported by runtime-common) and route binary files through a per-file POST with Content-Type: application/octet-stream — exactly the wire format packages/host's FileUploadService and WriteBinaryFileCommand already use, which the realm-server routes to upsertBinaryFile.
  • The /_atomic batch endpoint stays text-only (it embeds content inside a JSON string). When a push batch contains binary files, they are carved out and POSTed per-file in parallel alongside the atomic call for the text files. Failures are folded back into the existing perFile error shape so callers don't have to learn a second contract.
  • The single-file boxel file write / boxel file read commands grow a symmetric branch: write() now accepts Uint8Array and switches to octet-stream; read() populates a new bytes?: Uint8Array field for binary paths and renders bytes to stdout (or base64 under --json).

Test plan

  • pnpm lint clean in packages/boxel-cli (the runtime-common https://cardstack.com/base/* resolution errors that surface during lint:types are pre-existing and unrelated to this change).
  • pnpm test:unit-exclude-smoke — 187/187 unit tests pass.
  • packages/boxel-cli/tests/integration/realm-push.test.ts — 23/23 pass (4 new binary tests: PNG, PDF, mixed batch, SVG-stays-text).
  • packages/boxel-cli/tests/integration/realm-pull.test.ts — 13/13 pass (1 new binary PNG test).
  • packages/boxel-cli/tests/integration/realm-sync.test.ts — 16/16 pass (no new tests; regression coverage for the partitioned upload path).
  • packages/boxel-cli/tests/integration/realm-watch.test.ts — 15/15 pass (1 new binary PNG test).
  • packages/boxel-cli/tests/integration/file-write.test.ts — 5/5 pass (2 new binary tests: PNG, PDF).
  • packages/boxel-cli/tests/integration/file-read.test.ts — 5/5 pass (1 new binary PNG test).
  • End-to-end manual smoke: push a folder containing .png + .pdf + .woff2 + .gts, pull into an empty dir, hash both sides with shasum -a 256 to confirm byte-equality. (Steps documented in docs/cs-11075-binary-upload-plan.md.)

Test fixture is a 67-byte 1x1 transparent PNG defined inline in packages/boxel-cli/tests/helpers/binary-fixtures.ts — the IDAT chunk contains non-UTF-8 byte sequences (0x89, embedded nulls) that get mangled if anything in the pipeline forces UTF-8 round-tripping, so a byte-level Buffer.equals assertion is enough to catch any regression.

Linear: https://linear.app/cardstack/issue/CS-11075

🤖 Generated with Claude Code

Make boxel-cli's push / pull / sync / watch and the single-file
`boxel file write` / `boxel file read` commands handle binary assets
(images, PDFs, fonts, etc.) byte-identically by mirroring the host
package's wire format: a per-file POST with
`Content-Type: application/octet-stream` and raw bytes as the body,
which the realm-server already routes to `upsertBinaryFile`.

Detection uses the existing `isBinaryFilename` helper from
runtime-common, so SVG (XML-based) stays on the text path while
`image/*`, `font/*`, `application/pdf`, and `.eot` switch to bytes.
The /_atomic batch endpoint stays text-only — binary entries are
carved out and POSTed per-file alongside the atomic call, matching
the host package which never sends binary through /_atomic.

Adds binary roundtrip tests across realm-push, realm-pull,
realm-watch, file-write, and file-read using an inline 1x1
transparent PNG fixture so non-UTF-8 bytes (PNG signature, IDAT
chunk) round-trip verbatim.

Linear: CS-11075

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 7a60cacb53

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread packages/boxel-cli/src/lib/realm-sync-base.ts
FadhlanR and others added 2 commits May 18, 2026 16:21
When uploadFilesAtomic split binaries out of /_atomic, a partial
binary failure left the atomic text successes unrecorded on disk —
the next push hit a 409 against the files the server already had.
Record result.succeeded unconditionally in push and sync, drop the
brittle status-from-message regex in favor of a status field on the
thrown Error (with the response body in the message), and reject
source/destination binary-classification mismatches in `file write`.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The bare `from '@cardstack/runtime-common'` pulled the whole barrel
into the bundle and the type-check program: transpile.ts → content-tag,
plus ~50 modules that import `https://cardstack.com/base/*`. That
broke the smoke tests (`ENOENT: dist/content_tag_bg.wasm` at startup)
and `lint:types` (TS2307 cascade). Switching to
`/infer-content-type` keeps the bundle to the leaf file's tiny graph,
matching the subpath-only convention every other boxel-cli source
file follows. Bundle shrinks from 7.2 MB to 370 KB; lint clean;
integration suite green.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@FadhlanR FadhlanR requested a review from a team May 19, 2026 04:08
@habdelra habdelra requested a review from Copilot May 19, 2026 13:04
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

Fixes binary file corruption in boxel-cli realm sync/push/pull and file read/write by routing binary payloads through application/octet-stream while keeping /_atomic text-only.

Changes:

  • Detect binary files via isBinaryFilename and upload/download them as raw bytes (octet-stream) instead of UTF-8 text.
  • Split mixed atomic uploads into a text /_atomic batch plus per-file binary POSTs, and persist manifest updates for successful files even on partial failure.
  • Add integration coverage for PNG/PDF binary round-trips plus new inline byte fixtures.

Reviewed changes

Copilot reviewed 11 out of 11 changed files in this pull request and generated 3 comments.

Show a summary per file
File Description
packages/boxel-cli/src/lib/realm-sync-base.ts Adds binary upload/download support, structured upload errors, and mixed-batch splitting for /_atomic.
packages/boxel-cli/src/commands/realm/sync.ts Persists manifest updates based on succeeded operations even when some uploads fail.
packages/boxel-cli/src/commands/realm/push.ts Updates manifest/mtimes even on partial failure so successful uploads aren’t retried.
packages/boxel-cli/src/commands/file/write.ts Allows writing Uint8Array as octet-stream and adds CLI protection against src/dst binary mismatch.
packages/boxel-cli/src/commands/file/read.ts Returns bytes for binary reads and adds base64 JSON output for binary content.
packages/boxel-cli/tests/helpers/binary-fixtures.ts Adds tiny PNG/PDF byte fixtures for regression tests.
packages/boxel-cli/tests/integration/realm-push.test.ts Adds binary push + mixed batch tests (including partial failure behavior).
packages/boxel-cli/tests/integration/realm-pull.test.ts Adds binary pull test for PNG byte identity.
packages/boxel-cli/tests/integration/realm-watch.test.ts Adds watcher pull test for byte-identical binary download.
packages/boxel-cli/tests/integration/file-write.test.ts Adds binary write coverage for PNG/PDF.
packages/boxel-cli/tests/integration/file-read.test.ts Adds binary read coverage and asserts bytes is used instead of content.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +522 to +545
const binaryResults = await Promise.all(
binaryEntries.map(([relativePath, localPath]) =>
this.remoteLimit(async () => {
try {
await this.uploadBinaryFile(relativePath, localPath);
console.log(` Uploaded: ${relativePath}`);
return { relativePath, ok: true as const };
} catch (err) {
const errWithStatus = err as { status?: number };
const status =
typeof errWithStatus?.status === 'number'
? errWithStatus.status
: 500;
const title = err instanceof Error ? err.message : String(err);
return {
relativePath,
ok: false as const,
status,
title,
};
}
}),
),
);
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch — the PR description claimed parallelism that the code didn't actually have. Fixed in 8c2c95f: extracted uploadBinaryBatch / uploadTextAtomic helpers and now run them via Promise.all([…]), so the per-file octet-stream POSTs and the /_atomic batch are truly concurrent.

Comment on lines 54 to 70
let url = new URL(path, ensureTrailingSlash(realmUrl)).href;
let isBinary = typeof content !== 'string';

try {
let response = await pm.authedRealmFetch(url, {
method: 'POST',
headers: {
Accept: SupportedMimeType.CardSource,
'Content-Type': SupportedMimeType.CardSource,
},
body: content,
headers: isBinary
? { 'Content-Type': SupportedMimeType.OctetStream }
: {
Accept: SupportedMimeType.CardSource,
'Content-Type': SupportedMimeType.CardSource,
},
// Both branches of `content: string | Uint8Array` are valid
// BodyInit values, but TS narrows them as a union that doesn't
// unify against the fetch signature without a hint.
body: content as BodyInit,
});
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed in 8c2c95f — added an isBinaryFilename(path) vs typeof content guard inside the library-level write(). The CLI wrapper's --file mismatch check stays (it can short-circuit before reading the source file with the wrong encoding), but the library guard catches programmatic callers (BoxelClient.write and tests) that don't go through the CLI path.

Comment on lines +502 to +510
await watcher.initialize();
await watcher.poll();
let result = await watcher.flushPending();
expect(result.pulled).toContain('image.png');

let pulled = fs.readFileSync(path.join(localDir, 'image.png'));
expect(pulled.equals(Buffer.from(TINY_PNG_BYTES))).toBe(true);

watcher.shutdown();
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed in 8c2c95f. Wrapped every RealmWatcher site in the file in try/finally — the new PNG test, the six pre-existing watcher tests, and the primer in the post-cleanup race test — so a failing assertion can't leak a running watcher into the next test.

}
}

const binaryResults = await Promise.all(
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we want to use Promise.allSettled here so an individual failure doesn't reject for the whole set?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Switched to Promise.allSettled in 8c2c95f. The inner try/catch already prevented rejection, but allSettled makes intent explicit and survives any future change that drops the inner catch — a rogue throw now surfaces as a rejected outcome that we fold into a structured failure (status 500, String(reason) title) instead of silently aborting the rest of the fan-out.

- realm-sync-base.ts: extract uploadBinaryBatch / uploadTextAtomic
  helpers and run them concurrently via Promise.all, matching the
  "in parallel alongside" claim in the PR description. Switch the
  per-file binary fan-out to Promise.allSettled so an unexpected
  reject (future change that drops the inner try/catch) still
  surfaces a structured failure instead of aborting the rest.
- commands/file/write.ts: add a library-level isBinaryFilename(path)
  vs typeof content guard. The CLI wrapper already rejects this
  mismatch up-front, but BoxelClient.write reaches the function
  directly and needed the same defense-in-depth.
- tests/integration/realm-watch.test.ts: wrap every RealmWatcher
  test (including the primer in the post-cleanup race test) in
  try/finally so a failing assertion can no longer leak a running
  watcher into the next test.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@FadhlanR FadhlanR changed the title Send binary files via application/octet-stream in boxel-cli sync (CS-11075) fix: Send binary files via application/octet-stream in boxel-cli sync (CS-11075) May 20, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants