[flags-core] perf improvements#382
Draft
dferber90 wants to merge 2 commits into
Summary
Three independent micro-optimizations on the flag evaluation hot path, identified from a CPU profile of `client.evaluate()` under load. No behavior change, no public API change. All 426 existing tests pass.

- Cached `scaledWeights` for split outcomes — `handleOutcome` was recomputing `sum(outcome.weights)` and `outcome.weights.map(w => w / sum * UINT32_MAX)` on every evaluation. The result is now cached on first call under a Symbol-keyed property on the outcome object.
- Cached `RegExp` for REGEX / NOT_REGEX conditions — `matchConditions` was calling `new RegExp(rhs.pattern, rhs.flags)` on every evaluation. The compiled regex is now cached under a Symbol on the `rhs` object.
- Cached `Controller.read()` / `getDatafile()` — both methods used to destructure `_origin` and spread the entire datafile on every call. The destructure result is now cached keyed on the `this.data` reference; the cache rebuilds once when stream/poll replaces the underlying data.

Bench results
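The first two optimizations share one pattern: memoize a derived value under a Symbol-keyed property on the object it is derived from. A minimal sketch of that pattern (function and symbol names here are illustrative, not the actual implementation; Symbol keys are invisible to `Object.keys` and `JSON.stringify`, so the cache never leaks into serialized datafiles):

```javascript
// Hypothetical sketch of the Symbol-keyed caching pattern described above.
const SCALED_WEIGHTS = Symbol("scaledWeights");
const UINT32_MAX = 0xffffffff;

function getScaledWeights(outcome) {
  // First call computes and stores; subsequent calls return the cached array.
  if (outcome[SCALED_WEIGHTS] === undefined) {
    const sum = outcome.weights.reduce((acc, w) => acc + w, 0);
    outcome[SCALED_WEIGHTS] = outcome.weights.map((w) => (w / sum) * UINT32_MAX);
  }
  return outcome[SCALED_WEIGHTS];
}

const COMPILED_REGEXP = Symbol("compiledRegExp");

function getCompiledRegExp(rhs) {
  // Compile once per condition object instead of once per evaluation.
  if (rhs[COMPILED_REGEXP] === undefined) {
    rhs[COMPILED_REGEXP] = new RegExp(rhs.pattern, rhs.flags);
  }
  return rhs[COMPILED_REGEXP];
}
```

Because the cache lives on the outcome/condition object itself, a new datafile brings fresh objects and therefore an implicitly fresh cache, with no invalidation logic needed.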
Median over 3 runs of 2,000,000 iterations on Node 24 / darwin-arm64. The benchmark and CPU-profile summarizer are not committed in this PR; ping me if you want to land them.
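For reference, a median-of-runs harness of the kind described (the actual `bench/bench.mjs` is not in this PR; this is a generic sketch, not its contents):

```javascript
// Hypothetical minimal bench loop: run `fn` for `iterations` calls per run,
// repeat `runs` times, report the median run time in milliseconds.
function bench(fn, { iterations = 2_000_000, runs = 3 } = {}) {
  const times = [];
  for (let r = 0; r < runs; r++) {
    const start = process.hrtime.bigint();
    for (let i = 0; i < iterations; i++) fn();
    times.push(Number(process.hrtime.bigint() - start) / 1e6); // ns -> ms
  }
  times.sort((a, b) => a - b);
  return times[Math.floor(runs / 2)]; // median run
}
```

Taking the median rather than the mean makes the number robust to a single run being disturbed by GC or OS scheduling.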
- Pure `evaluate()` — wins where expected, no regression elsewhere.
- Full `client.evaluate()` path (offline, datafile provided) — every scenario faster thanks to the `read()` cache.

CPU profile delta
Captured with `node --cpu-prof --cpu-prof-interval=100`, ~10s of sampled CPU. Self-% is the share of total elapsed time spent inside that frame itself (excluding descendants).
`Controller.read()` self-time more than halved. `handleOutcome` shed the per-call `sum` + `weights.map` allocations. `new RegExp` no longer appears in the top frames at all. The remaining big costs (`matchConditions` dispatch, xxHash32 + UTF-8 encoding inside it) are intrinsic to the work and out of scope for this PR.
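The `Controller.read()` improvement hinges on keying the cache by reference identity of `this.data`: the snapshot is rebuilt only when stream/poll assigns a new data object. A minimal sketch of that idea (class shape and field names are assumptions for illustration, not the actual code):

```javascript
// Hypothetical sketch of the reference-keyed read() cache. Invalidation is
// implicit: assigning a new object to this.data makes the identity check
// fail exactly once, after which the rebuilt snapshot is cached again.
class Controller {
  constructor(data) {
    this.data = data;
    this._cacheKey = undefined;   // the data reference the cache was built from
    this._cacheValue = undefined; // the destructured snapshot
  }

  read() {
    if (this._cacheKey !== this.data) {
      // Destructure/spread once per data object instead of once per call.
      const { _origin, ...datafile } = this.data;
      this._cacheValue = { origin: _origin, datafile };
      this._cacheKey = this.data;
    }
    return this._cacheValue;
  }
}
```

Comparing references is a single pointer check per call, so the cache hit path adds essentially no overhead on top of returning the stored snapshot.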
Why this is safe
Test plan
bench/bench.mjs
bench/summarize-profile.mjs