Skip to content

perf!: buf+offsets SourceMap (no lifetime) + BorrowedSourceMap<'a>#333

Closed
Boshen wants to merge 4 commits into
mainfrom
perf/sourcemap-buffer-offsets
Closed

perf!: buf+offsets SourceMap (no lifetime) + BorrowedSourceMap<'a>#333
Boshen wants to merge 4 commits into
mainfrom
perf/sourcemap-buffer-offsets

Conversation

@Boshen
Copy link
Copy Markdown
Member

@Boshen Boshen commented May 25, 2026

Summary

Hybrid sourcemap design that separates ownership concerns:

  • `SourceMap` — no lifetime parameter. All string data (names, sources, sourcesContent, file, source_root, debug_id) lives in a single `Box` buffer, accessed via `StrRef { start, end }` offsets. ~5 allocations per SourceMap regardless of how many strings it holds.
  • `BorrowedSourceMap<'a>` — `Cow<'a, str>`-based zero-copy parse view, for callers who want to read a sourcemap without copying its strings into an owned buffer. Convert to `SourceMap` via `.into_owned()`.

Supersedes #329, which made `SourceMap` itself lifetime-parameterized as `SourceMap<'a>`. That design forced every downstream consumer (oxc, rolldown) to annotate `SourceMap<'static>` on every struct field that held a sourcemap — 20+ places in rolldown alone, for a feature (zero-copy parsing) that rolldown never actually uses.

Why

Downstream code in rolldown almost always wants an owned sourcemap as a long-lived struct field (chunk, asset, plugin hook output). The Cow design from #329 had:

```rust
pub struct Asset {
pub map: Option<SourceMap<'static>>, // <-- lifetime noise everywhere
// ...
}
```

…because rolldown stores sourcemaps as fields of long-lived structs, and Rust requires the lifetime parameter to be spelled out. With the buf+offsets design, `SourceMap` has no lifetime and the field is just `Option`. Callers who genuinely want zero-copy parse use `BorrowedSourceMap<'a>` explicitly.

Allocation count

For 1000 modules × 100 strings each:

Design Allocations per SourceMap Total
Arc-based (main) O(strings) — 1 Arc per name/source/content ~100k
Cow-based (#329) O(strings) when detached ~100k
buf+offsets O(1) — 1 buf + 4 small offset arrays + tokens ~5k

Wall-clock benchmark (xlarge fixture)

Benchmark main (Arc) #329 (Cow) This PR: `SourceMap` This PR: `BorrowedSourceMap`
parse/real_xlarge ~120 µs 119 µs 145 µs 135 µs
parse/real_large 4.29 µs 3.45 µs 4.26 µs 3.82 µs
serialize/real_xlarge 44.6 µs 44.6 µs 44.6 µs n/a
concat/from_sourcemaps ~19 µs 10.2 µs 16.9 µs n/a

Trade-off

`SourceMap` is slower than #329 on parse and concat because every string gets a `memcpy` into the buffer at construction time, where the Cow design just stored a borrowed pointer. The Cow approach pays that copy later in `.into_owned()` — same total work, just deferred. For rolldown's actual workflow (parse → store-as-owned), the totals are roughly tied; this design wins decisively on allocation count.

If you have a pure read-only workload, use `BorrowedSourceMap` instead — its parse numbers are within ~10% of #329.

API surface

```rust
// Owned: one buffer holds every string; no lifetime
let owned = SourceMap::from_json_string(json)?;
let owned = SourceMapBuilder::default().add_name("x").into_sourcemap();
let owned = ConcatSourceMapBuilder::from_sourcemaps(&[...]).into_sourcemap();

// Borrowed: zero-copy view into the input JSON
let borrowed = BorrowedSourceMap::from_json_string(json)?;
let owned: SourceMap = borrowed.into_owned();

// Both expose the same accessors
fn process(sm: &SourceMap) {
let name = sm.get_name(0); // Option<&str>
let sources = sm.get_sources(); // impl Iterator<Item = &str>
}
```

Downstream verification

Patched locally and rebuilt:

  • oxc: 3-line diff in `oxc_codegen` — that's all. No `<'static>` annotations.
  • rolldown: 6 files, ~140 line diff (down from ~190 with perf!: borrow strings in SourceMap, drop Arc<str> #329). Every `Option<SourceMap<'static>>` field becomes `Option`. `rolldown_sourcemap::adjust_sourcemap_dst_lines` uses the new `set_tokens` mutator; `collapse_sourcemaps` uses `SourceMap::new` accepting `&str` slices. Both `oxc` and `rolldown` build clean; relevant tests pass.

Breaking changes (warrants 7.0)

  • `SourceMap` no longer has the previous `Arc`-based public fields/accessors. `get_file` / `get_name` / `get_source` / `get_source_content` / `get_sources` / `get_names` / `get_source_contents` now return `Option<&str>` (or iterators of `&str`) instead of `Option<&Arc>`.
  • `SourceMap::new` takes `Option<&str>` / `Vec<&str>` (interning happens internally) instead of pre-built `Arc` values.
  • New: `SourceMap::set_tokens`, `BorrowedSourceMap<'a>`, `BorrowedSourceMap::into_owned`.

Boshen and others added 4 commits May 25, 2026 18:29
…ototype)

Replace `Vec<Cow<'a, str>>` / `Vec<Arc<str>>` storage with one
`Box<str>` buffer + per-array `Box<[StrRef]>` offset tables. The
`SourceMap` struct no longer carries a lifetime parameter, and the
allocation count drops from O(names + sources + sources_content) per
sourcemap to O(1).

New internal types:

- `StrRef { start: u32, end: u32 }` — substring view into `SourceMap::buf`.
- `OptionalStrRef` — same shape, with `start = u32::MAX` as the `None`
  sentinel (8 bytes, no enum tag).
- `StringInterner` — appends to a growing `String`, dedups via
  `FxHashMap<Box<str>, StrRef>`; the dedup keys are dropped at
  `into_buf()`.

Per-sourcemap allocations after this refactor:

- 1 `Box<str>` for `buf`
- 1 `Box<[StrRef]>` for names
- 1 `Box<[StrRef]>` for sources
- 1 `Box<[OptionalStrRef]>` for sources_content
- 1 `Box<[Token]>` for tokens
- optional `Vec<TokenChunk>`, `Vec<u32>` for x_google_ignore_list

Constructors / accessors:

- `SourceMap::new` now takes `&str` slices and interns them.
- `set_file` / `set_sources` / `set_source_contents` / `set_debug_id`
  rebuild the buffer (via `SourceMapRebuild`) so each mutation
  preserves the single-buf invariant.

Trade-off: parse path now copies every string from the deserialized
`String` into `buf` (vs. the borrowed-Cow design in #329 which kept the
JSON buffer alive and stored zero-copy refs). For the parse benchmark
alone this is slower than #329; for the realistic
`parse → store-as-owned` pipeline (which rolldown always wants),
allocation count drops dramatically — N per-string allocations → ~5
per-map allocations.

Prototype; not pushed. Bench numbers vs PR #329 (in commit msg of the
follow-up).
Implements the hybrid design (Option 3):

- `SourceMap` (no lifetime, buf+offsets) — primary type, used wherever
  the map needs to outlive its input or be stored as a struct field.
- `BorrowedSourceMap<'a>` (Cow<'a, str>) — zero-copy parse view, used
  when the caller wants to read tokens / strings without paying for an
  owned string buffer.

`SourceMap::from_json_string` now goes through the borrowed parse path
internally (skipping the serde_json String allocations for unescaped
fields), then copies into the owned buffer. So parse-into-owned has
zero per-string heap allocations.

API surface:

```rust
// Direct: parses with allocations (one buf for all strings)
let owned = SourceMap::from_json_string(json)?;

// Zero-copy: parses with Cow::Borrowed where possible
let borrowed = BorrowedSourceMap::from_json_string(json)?;
// Same accessors as SourceMap (return &str)
let name = borrowed.get_name(0);
// Promote when you need to detach from `json`:
let owned: SourceMap = borrowed.into_owned();
```

Trade-off vs. PR #329's single `SourceMap<'a>`:
- API: no lifetime parameter on the primary type (resolves the
  "<'static> noise" in rolldown).
- Allocations: ~5 per owned SourceMap (buf + offset arrays) instead of
  O(names + sources + sources_content).
- Wall-clock parse: ~10-25% slower than PR #329 because every string
  gets memcpy'd into the buffer at construction time.
- Concat: ~65% slower than PR #329 for the same reason.

Users who want max parse throughput on read-only workloads use
`BorrowedSourceMap`; users who need owned storage use `SourceMap`.

Numbers (xlarge fixture):
- parse/real_xlarge:           145 µs (vs 119 µs in #329, +22%)
- parse_borrowed/real_xlarge:  135 µs (vs 119 µs in #329, +13%)
- concat/from_sourcemaps:       17 µs (vs  10 µs in #329, +65%)
- serialize/real_xlarge:      44.6 µs (matches #329)
Required by downstream code (rolldown's `adjust_sourcemap_dst_lines`)
that wants to rewrite a SourceMap's tokens without touching its
strings or rebuilding the underlying buffer. Drops `token_chunks`
since they reference token indices that are now invalid.
@codspeed-hq
Copy link
Copy Markdown

codspeed-hq Bot commented May 25, 2026

Merging this PR will degrade performance by 5.33%

⚠️ Different runtime environments detected

Some benchmarks with significant performance changes were compared across different runtime environments,
which may affect the accuracy of the results.

Open the report in CodSpeed to investigate

⚡ 8 improved benchmarks
❌ 6 regressed benchmarks
✅ 2 untouched benchmarks
🆕 4 new benchmarks
⏩ 5 skipped benchmarks1

Warning

Please fix the performance issues or acknowledge them on CodSpeed.

Performance Changes

Benchmark BASE HEAD Efficiency
build_single 7.1 µs 8.5 µs -16.6%
add_sourcemap_loop 363 µs 518.7 µs -30.02%
🆕 parse_borrowed[real_large] N/A 50.5 µs N/A
serialize[real_small] 4.3 µs 4.1 µs +5.37%
serialize[real_medium] 4.9 µs 4.7 µs +4.61%
🆕 parse_borrowed[real_medium] N/A 15 µs N/A
from_sourcemaps 344.3 µs 507.3 µs -32.12%
serialize[real_xlarge] 593.4 µs 573.1 µs +3.54%
from_json_string_inline 15.8 µs 16.2 µs -2.47%
🆕 parse_borrowed[real_small] N/A 11.9 µs N/A
lookup_table[real_large] 4.5 µs 4.5 µs +1.31%
🆕 parse_borrowed[real_xlarge] N/A 1.3 ms N/A
lookup_table[real_medium] 1.5 µs 1.5 µs +1.97%
serialize[real_large] 20.3 µs 19.7 µs +3.05%
parse[real_medium] 16.5 µs 17.8 µs -7.25%
parse[real_large] 54.7 µs 54 µs +1.33%
parse[real_small] 13.2 µs 14.4 µs -8.53%
parse[real_xlarge] 1.7 ms 1.5 ms +15.1%

Tip

Investigate this regression by commenting @codspeedbot fix this regression on this PR, or directly use the CodSpeed MCP with your agent.


Comparing perf/sourcemap-buffer-offsets (730e06f) with main (9323530)

Open in CodSpeed

Footnotes

  1. 5 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, click here and archive them to remove them from the performance reports.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 730e06f7e0

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread src/sourcemap_builder.rs
Comment on lines +41 to +43
let start = self.buf.len() as u32;
self.buf.push_str(s);
let end = self.buf.len() as u32;
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Guard offset casts against >4GiB buffers

StringInterner truncates self.buf.len() from usize to u32 without bounds checks, so once concatenated string data exceeds u32::MAX bytes, start/end wrap and subsequent StrRef::resolve uses those wrapped offsets in get_unchecked, which is undefined behavior. This can be triggered by parsing or building very large source maps (untrusted inputs included), so the cast needs a checked conversion (and an error path) before storing offsets.

Useful? React with 👍 / 👎.

@Boshen
Copy link
Copy Markdown
Member Author

Boshen commented May 25, 2026

Superseded — buf+offsets has wall-clock regressions on parse and concat. Going with the borrow-string approach in #329 plus an OwnedSourceMap wrapper instead.

@Boshen Boshen closed this May 25, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant