perf: misc allocation/branch reductions (round 2) by Boshen · Pull Request #335 · oxc-project/oxc-sourcemap

Boshen · 2026-05-25T16:55:16Z

Summary

Five small, independent perf wins on top of main. Each removes an allocation, a branch, or a bounds check from a hot path.

encode: x_google_ignoreList integers now go through an inline stack-buffer u32 → bytes converter instead of u32::to_string(). Zero allocations on the ignoreList encode path.
encode (VLQ): the 64-entry B64_CHARS table is now indexed via get_unchecked. digit & 0b11111 is provably 0..=31, but the optimizer doesn't reliably elide the bounds check across the loop break. Worth ~3-4% on small/medium serialize.
decode: tokens are constructed via a new pub(crate) Token::new_raw(...) that takes raw u32 ids with INVALID_ID as the absent-sentinel. The decoder already tracks ids that way, so this skips the previous u32 → Option<u32> → u32 roundtrip through Token::new. Also marks the small Token getters #[inline]. Worth ~1-4% on small/medium/large parse.
concat builder: add_sourcemap extends sources / source_contents / names by iterating the input Vec<Cow>s directly. Going through the get_* accessors returned impl Iterator<Item = &str>, which hid the ExactSizeIterator impl from extend and forced geometric growth. Direct field iteration preserves the exact-size hint so each extend pre-reserves once.
builder: drop the explicit self.tokens.shrink_to_fit() from into_sourcemap — Vec::into_boxed_slice below already drops excess capacity in one allocation+copy.
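As an illustration of the first item, a stack-buffer u32 → bytes conversion could look roughly like the sketch below. The helper names `u32_to_ascii` and `push_ignore_list` are hypothetical and not taken from the PR; the only assumption about the real code is that it writes decimal digits into a fixed-size buffer instead of calling `u32::to_string()`.

```rust
/// Hypothetical helper: write `n` as decimal ASCII into the tail of a stack
/// buffer and return the used part. `u32::MAX` has 10 digits, so 10 bytes
/// are always enough.
fn u32_to_ascii(mut n: u32, buf: &mut [u8; 10]) -> &[u8] {
    let mut i = buf.len();
    loop {
        i -= 1;
        buf[i] = b'0' + (n % 10) as u8;
        n /= 10;
        if n == 0 {
            break;
        }
    }
    &buf[i..]
}

/// Hypothetical caller: append a JSON integer array to `out` without
/// creating a temporary `String` per element.
fn push_ignore_list(out: &mut String, ids: &[u32]) {
    out.push('[');
    let mut buf = [0u8; 10];
    for (idx, &id) in ids.iter().enumerate() {
        if idx != 0 {
            out.push(',');
        }
        let digits = u32_to_ascii(id, &mut buf);
        // The bytes are ASCII digits, so this conversion cannot fail.
        out.push_str(std::str::from_utf8(digits).expect("ASCII digits"));
    }
    out.push(']');
}
```

The output `String` may still grow while digits are appended; what the sketch avoids is the intermediate allocation per integer.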

Benchmarks

Wall-clock differences are mostly inside the criterion noise floor (±2-3%) on the existing perf fixtures. The wins are most visible on workloads that concatenate many sourcemaps (where extend previously grew the output vecs geometrically instead of reserving once) and on workloads with thousands of tokens (where the per-token bounds check and Option roundtrip add up). Composes additively with #330 and #331.
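To make the per-token Option roundtrip concrete, here is a minimal sketch of a sentinel-based token. The two-field `Token`, the value chosen for `INVALID_ID`, and the `to_option` helper are simplifying assumptions for illustration; the crate's real struct has more fields and may differ in detail.

```rust
/// Sentinel meaning "no id". The name comes from the PR text; the value
/// used here is an assumption.
const INVALID_ID: u32 = u32::MAX;

/// Simplified, hypothetical token that stores ids in sentinel form.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Token {
    source_id: u32,
    name_id: u32,
}

impl Token {
    /// Option-based constructor: each `Option` is folded back to the sentinel.
    fn new(source_id: Option<u32>, name_id: Option<u32>) -> Self {
        Self {
            source_id: source_id.unwrap_or(INVALID_ID),
            name_id: name_id.unwrap_or(INVALID_ID),
        }
    }

    /// Raw constructor: the ids are already in sentinel form, so they are
    /// stored as-is with no branching.
    fn new_raw(source_id: u32, name_id: u32) -> Self {
        Self { source_id, name_id }
    }

    /// Small getter that converts the sentinel back to `Option` on read.
    #[inline]
    fn source_id(&self) -> Option<u32> {
        if self.source_id == INVALID_ID { None } else { Some(self.source_id) }
    }
}

/// What a decoder tracking raw ids would have to do before calling
/// `Token::new`: convert to `Option`, only for `new` to convert it back.
fn to_option(id: u32) -> Option<u32> {
    if id == INVALID_ID { None } else { Some(id) }
}
```

Both constructors produce the same token; `new_raw` just skips the two conversions per id.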

Three small, independent wins on top of main:

* **encode**: `x_google_ignoreList` integers no longer go through `u32::to_string()` per element. Inline a stack-buffer u32 → bytes conversion so the rare ignoreList encode path does zero allocations.
* **concat builder**: `add_sourcemap` now extends `sources` / `source_contents` / `names` by iterating the input `Vec<Cow>`s directly. The previous `get_*()` accessors return `impl Iterator<Item = &str>` which hides the `ExactSizeIterator` impl from `extend`, forcing geometric growth of the output vecs. Going through `.iter().map(...)` preserves the exact-size hint, so each `extend` pre-reserves in one shot.
* **builder**: drop the explicit `self.tokens.shrink_to_fit()` from `into_sourcemap`. `Vec::into_boxed_slice` already drops any excess capacity in a single allocation+copy; the standalone shrink was duplicate work on the same Vec.
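The two builder changes can be sketched together as follows. `NamesBuilder`, `add_names`, and `finish` are hypothetical stand-ins, not the crate's API; the sketch only shows the shape of extending from the backing slice and finishing with `Vec::into_boxed_slice`, and it makes no claim about the measured effect.

```rust
use std::borrow::Cow;

/// Hypothetical, simplified stand-in for the concat builder.
#[derive(Default)]
struct NamesBuilder {
    names: Vec<String>,
}

impl NamesBuilder {
    /// Extend from the input's backing slice directly.
    fn add_names(&mut self, input: &[Cow<'_, str>]) {
        // A slice iterator mapped through a closure reports an exact
        // size_hint, which `Vec::extend` can use to reserve before pushing.
        self.names.extend(input.iter().map(|s| s.to_string()));
    }

    /// `Vec::into_boxed_slice` discards excess capacity itself, so no
    /// separate `shrink_to_fit` call is made first.
    fn finish(self) -> Box<[String]> {
        self.names.into_boxed_slice()
    }
}
```

A caller would invoke `add_names` once per input sourcemap and `finish` once at the end.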

Two more small wins:

* **encode**: lookup into the 64-entry `B64_CHARS` table now uses `get_unchecked`. The optimizer doesn't reliably elide the bounds check across the loop-break boundary, even though `digit & 0b11111` is provably in `0..=31`. Worth ~3-4% on small/medium serialize.
* **decode**: tokens are now constructed via `Token::new_raw`, a pub(crate) constructor that takes raw u32 ids using `INVALID_ID` as the absent-sentinel. The decoder already tracks ids that way, so this skips the previous `u32 → Option<u32> → u32` roundtrip through `Token::new`. Also marks Token's small getters `#[inline]` so accessor calls in hot loops reliably collapse to direct field reads. Worth ~1-4% across the parse sizes.
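For the encode change, a Base64 VLQ encoder with an unchecked table lookup might look like the sketch below. The table name `B64_CHARS` comes from the commit message; the function `encode_vlq`, its signature, and the loop structure are assumptions and may not match the crate's encoder.

```rust
/// Standard Base64 alphabet; the array type enforces exactly 64 entries.
const B64_CHARS: [u8; 64] =
    *b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/// Hypothetical encoder: append the Base64 VLQ form of `value` to `out`.
fn encode_vlq(out: &mut Vec<u8>, value: i32) {
    // Widen first so that negating i32::MIN cannot overflow.
    let v = i64::from(value);
    // The sign goes in the lowest bit, the magnitude in the bits above it.
    let mut n: u64 = if v < 0 { (((-v) as u64) << 1) | 1 } else { (v as u64) << 1 };
    loop {
        let mut digit = (n & 0b11111) as usize; // low five bits: 0..=31
        n >>= 5;
        if n != 0 {
            digit |= 0b100000; // continuation bit: more digits follow
        }
        // SAFETY: `digit` is at most 0b111111 (63) and the table has 64 entries.
        out.push(unsafe { *B64_CHARS.get_unchecked(digit) });
        if n == 0 {
            break;
        }
    }
}
```

The safety argument depends only on the masking and the table length, so it is straightforward to audit; a safe fallback would be plain indexing, `B64_CHARS[digit]`, at the cost of the bounds check.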

codspeed-hq · 2026-05-25T16:56:00Z

Merging this PR will degrade performance by 1.35%

⚠️ Different runtime environments detected

Some benchmarks with significant performance changes were compared across different runtime environments,
which may affect the accuracy of the results.

Open the report in CodSpeed to investigate

❌ 3 regressed benchmarks
✅ 13 untouched benchmarks
⏩ 5 skipped benchmarks¹

Warning

Please fix the performance issues or acknowledge them on CodSpeed.

Performance Changes

	Benchmark	`BASE`	`HEAD`	Efficiency
❌	`parse[real_medium]`	14.9 µs	15.1 µs	-1.35%
❌	`parse[real_small]`	11.6 µs	11.8 µs	-1.48%
❌	`from_json_string_inline`	14.1 µs	14.3 µs	-1.22%

Tip

Investigate this regression by commenting @codspeedbot fix this regression on this PR, or directly use the CodSpeed MCP with your agent.

Comparing perf/round2-misc-allocations (25593e5) with main (db883f9)

¹ 5 benchmarks were skipped, so the baseline results were used instead.

Boshen · 2026-05-26T03:16:34Z

Closing per investigation: no sub-change in this PR touches the parse path algorithmically, yet CodSpeed reports a ~1.35% parse regression and itself warns of "different runtime environments". The effect appears to be binary-layout / i-cache sensitivity from reshuffling unrelated functions, not a real perf issue. Not worth the noise for the tiny wins on the existing fixtures.

Boshen added 2 commits May 26, 2026 00:46

Boshen closed this May 26, 2026

Boshen deleted the perf/round2-misc-allocations branch May 26, 2026 03:16
