util/metric: add benchmark comparing classic vs native histograms #167487

Open: angles-n-daemons wants to merge 1 commit into cockroachdb:master from angles-n-daemons:benchmark-native-histograms

Summary

  • Adds BenchmarkClassicVsNativeHistogram comparing classic Prometheus histograms against native (exponential) histograms across sequential observation, parallel observation, and quantile computation.
  • Tests two bucket configs (IOLatency, Count1K) with three variants: classic, native-fine (factor=1.1), and native-matched (factor matching classic growth factor) to isolate whether the ~2x overhead is from bucket count or the underlying sync.Map data structure.
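  • Results show the ~2x overhead in sequential observation and quantile computation is a fixed cost of sync.Map vs []uint64: matching the growth factor does not close the gap. Under parallel contention (11 goroutines) the native variants are slightly faster (~0.9x).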

Benchmark Results (Apple M3 Pro, arm64)

| Scenario | Classic | Native-fine (1.1) | Native-matched |
|---|---|---|---|
| Observe (sequential) | ~31 ns/op | ~65 ns/op (2.1x) | ~64 ns/op (2.1x) |
| Observe (parallel, 11 goroutines) | ~315 ns/op | ~284 ns/op (0.9x) | ~289 ns/op (0.9x) |
| Quantile (p50/p99/p99.9) | ~2.3 us/op | ~4.5 us/op (2.0x) | ~4.7 us/op (2.0x) |

Epic: none

Add BenchmarkClassicVsNativeHistogram to measure the performance
difference between classic Prometheus histograms and native (exponential)
histograms across three axes: sequential observation, parallel
(contended) observation, and quantile computation.

The benchmark tests two bucket configs (IOLatency, Count1K) and three
histogram variants: classic, native-fine (factor=1.1, schema 3), and
native-matched (factor matching the classic growth factor) to isolate
whether the performance gap is driven by bucket count or by the
underlying sync.Map data structure.

Results on Apple M3 Pro (arm64):

  Observe (sequential):
    classic:        ~31 ns/op
    native-fine:    ~65 ns/op  (2.1x)
    native-matched: ~64 ns/op  (2.1x)

  Observe (parallel, 11 goroutines):
    classic:        ~315 ns/op
    native-fine:    ~284 ns/op (0.9x)
    native-matched: ~289 ns/op (0.9x)

  Quantile (p50/p99/p99.9, 10k observations):
    classic:        ~2.3 us/op
    native-fine:    ~4.5 us/op (2.0x)
    native-matched: ~4.7 us/op (2.0x)

The overhead is a fixed cost of sync.Map vs []uint64, not a function
of bucket count — matching the growth factor does not close the gap.

Epic: none

Release note: None

Co-Authored-By: roachdev-claude <roachdev-claude-bot@cockroachlabs.com>

angles-n-daemons requested review from a team as code owners (April 3, 2026 17:30)
angles-n-daemons requested review from arjunmahishi and dhartunian, and removed the request for a team (April 3, 2026 17:30)