
perf(process): reduce sysinfo CPU refresh overhead by removing frequency polling #67

Open: tosynthegeek wants to merge 1 commit into chainbound:main from tosynthegeek:main


tosynthegeek commented May 7, 2026

Benchmarks confirm that CpuRefreshKind::everything() is the dominant cost in ProcessCollector::collect() on Linux, largely due to CPU frequency polling via sysfs reads. This PR replaces it with CpuRefreshKind::nothing().with_cpu_usage(), reducing collection time by ~74% on Linux with no meaningful impact on macOS.

Changes

  • Replace CpuRefreshKind::everything() with CpuRefreshKind::nothing().with_cpu_usage() in ProcessCollector::new()
  • Add benches/process_collector.rs with six RefreshKind configurations benchmarked under Criterion to isolate per-component refresh cost

Benchmark Results

Environment

macOS

  • Machine: MacBook Pro (Intel Core i7-9750H @ 2.60GHz)
  • OS: macOS 26.3.1
  • Logical CPUs: 12
  • Rust: rustc 1.95.0 (59807616e 2026-04-14)

Linux

  • Runner: Docker container via act (catthehacker/ubuntu:act-22.04) on the same MacBook
  • Rust: rustc 1.95.0 (59807616e 2026-04-14)

Benchmarks use Criterion, measuring System::refresh_specifics() wall-time for six RefreshKind configurations. Each benchmark performs one warm-up refresh before the measurement loop, mirroring ProcessCollector::new() which also does one eager refresh so the first collect() call has a valid prior sample to diff against.
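The warm-up-then-measure pattern described above can be sketched without Criterion. This is a simplified illustration using only the standard library, not the code in benches/process_collector.rs; `mean_after_warmup` and the counting closure are hypothetical names, and the closure merely stands in for a call to System::refresh_specifics().

```rust
use std::time::{Duration, Instant};

/// Runs `op` once untimed (the warm-up), then times `iters` further calls
/// and returns the mean wall-time per measured call.
fn mean_after_warmup<F: FnMut()>(mut op: F, iters: u32) -> Duration {
    assert!(iters > 0, "need at least one measured iteration");
    // Warm-up: plays the role of the eager refresh in ProcessCollector::new(),
    // so every measured call has a prior sample to diff against.
    op();
    let start = Instant::now();
    for _ in 0..iters {
        op();
    }
    start.elapsed() / iters
}

fn main() {
    let mut calls = 0u32;
    // Hypothetical stand-in for System::refresh_specifics(); it only counts calls.
    let mean = mean_after_warmup(|| calls += 1, 100);
    println!("{calls} calls in total, mean {mean:?} per measured call");
}
```

Criterion adds its own warm-up phase and statistical sampling on top of this, so the numbers in this PR come from Criterion rather than from a loop like the one above.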

cargo bench --bench process_collector --features process -- \
  --output-format bencher --noplot

Linux (Ubuntu 22.04, Docker on the MacBook Pro described above)

| Configuration | ns/iter | ± | vs current_default |
|---|---|---|---|
| cpu_usage_only | 181,403 | ±13,820 | -90.1% |
| with_tasks | 357,596 | ±318,250 | -80.4% |
| proposed_slim | 468,905 | ±66,501 | -74.3% |
| with_disk_usage | 497,585 | ±25,474 | -72.7% |
| current_default | 1,824,332 | ±81,346 | baseline |
| cpu_with_frequency | 2,034,379 | ±703,758 | +11.5% |

The delta between cpu_usage_only and current_default (1,824,332 − 181,403 = 1,642,929 ns, or ~1.6ms) represents the cost of CPU frequency polling in our configuration.
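For anyone re-checking the "vs current_default" column and the ~1.6ms figure, the arithmetic can be sketched as follows. The helper names `pct_vs_baseline` and `gap_ms` are made up for this example; the ns/iter values are copied from the Linux results.

```rust
/// Relative change of `ns` against `baseline_ns`, in percent (negative = faster).
fn pct_vs_baseline(ns: u64, baseline_ns: u64) -> f64 {
    (ns as f64 - baseline_ns as f64) / baseline_ns as f64 * 100.0
}

/// Absolute gap between two timings, in milliseconds.
fn gap_ms(slow_ns: u64, fast_ns: u64) -> f64 {
    (slow_ns as f64 - fast_ns as f64) / 1_000_000.0
}

fn main() {
    // ns/iter values copied from the Linux results.
    let current_default = 1_824_332;
    let proposed_slim = 468_905;
    let cpu_usage_only = 181_403;

    println!("proposed_slim:  {:+.1}%", pct_vs_baseline(proposed_slim, current_default));
    println!("cpu_usage_only: {:+.1}%", pct_vs_baseline(cpu_usage_only, current_default));
    println!("default vs cpu_usage_only: {:.2} ms", gap_ms(current_default, cpu_usage_only));
    println!("default vs proposed_slim:  {:.2} ms", gap_ms(current_default, proposed_slim));
}
```

By the same arithmetic, the gap between current_default and proposed_slim is about 1.36 ms, which is the per-refresh saving if proposed_slim corresponds to the configuration this PR ships.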

macOS (native, same MacBook Pro)

| Configuration | ns/iter | ± | vs current_default |
|---|---|---|---|
| proposed_slim | 38,457,270 | ±2,938,117 | -3.1% |
| current_default | 39,676,476 | ±3,346,959 | baseline |
| cpu_usage_only | 39,883,871 | ±7,608,643 | +0.5% |
| cpu_with_frequency | 39,807,413 | ±3,404,677 | +0.3% |
| with_tasks | 40,222,138 | ±7,558,858 | +1.4% |
| with_disk_usage | 43,640,777 | ±4,139,449 | +10.0% |

All configurations cluster around 38–43ms with overlapping variance. sysinfo on macOS uses unified syscalls (proc_pidinfo, host_statistics) that return all data in one shot regardless of RefreshKind flags. The proposed change has no measurable negative impact on macOS: the -3.1% for proposed_slim (about 1.2ms) is smaller than the reported ± spread of either measurement.
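A rough way to check the "overlapping variance" claim is to compare the [ns − spread, ns + spread] ranges of two rows. The sketch below does only that; it is not a statistical significance test, and `Sample` and `ranges_overlap` are names invented for this example.

```rust
/// A bencher-style row: central estimate and the reported ± spread, both in ns.
#[derive(Clone, Copy)]
struct Sample {
    ns: u64,
    spread: u64,
}

impl Sample {
    fn low(self) -> u64 {
        self.ns.saturating_sub(self.spread)
    }
    fn high(self) -> u64 {
        self.ns + self.spread
    }
}

/// True when the two [ns - spread, ns + spread] ranges intersect.
fn ranges_overlap(a: Sample, b: Sample) -> bool {
    a.low() <= b.high() && b.low() <= a.high()
}

fn main() {
    // Values copied from the macOS and Linux results in this PR.
    let mac_slim = Sample { ns: 38_457_270, spread: 2_938_117 };
    let mac_default = Sample { ns: 39_676_476, spread: 3_346_959 };
    let linux_slim = Sample { ns: 468_905, spread: 66_501 };
    let linux_default = Sample { ns: 1_824_332, spread: 81_346 };

    println!("macOS ranges overlap: {}", ranges_overlap(mac_slim, mac_default));
    println!("Linux ranges overlap: {}", ranges_overlap(linux_slim, linux_default));
}
```

With the reported numbers the macOS ranges intersect while the Linux ranges do not, which matches the reading that the change is within noise on macOS and a clear improvement on Linux.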

Tradeoffs

| Metric | Impact |
|---|---|
| system_cpu_usage | Unaffected |
| process_cpu_usage | Unaffected |
| process_threads | Unaffected |
| process_disk_written_bytes_total | Unaffected |
| process_thread_usage (per-thread) | Unaffected |
| process_resident_memory_* | Unaffected |
| system_min_cpu_frequency | Returns 0 |
| system_max_cpu_frequency | Returns 0 |

Follow-up / open questions

system_min_cpu_frequency and system_max_cpu_frequency remain in the collector but will return 0 under the new configuration since CpuRefreshKind::nothing().with_cpu_usage() does not populate frequency data.

Possible directions:

  1. Leave as-is: the gauges return 0, which is non-breaking for existing scrapers; the benchmark report documents the tradeoff
  2. Remove the gauges: avoids exposing misleading zero-valued metrics, but is a breaking change for anyone scraping them
  3. Make frequency opt-in: add a builder flag like ProcessCollector::with_cpu_frequency() so users who need it can re-enable it and accept the ~1.6ms cost on Linux
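To make the third direction concrete, here is one possible shape for the opt-in flag. It is a sketch only and not part of this PR: `ProcessCollectorBuilder`, `CpuRefresh`, `cpu_refresh()` and `gauges()` are hypothetical stand-ins that use no sysinfo types, and the sketch additionally assumes the frequency gauges are registered only when the flag is set.

```rust
/// Hypothetical stand-in for the flags carried by sysinfo's CpuRefreshKind.
#[derive(Debug, Default, Clone, Copy, PartialEq)]
struct CpuRefresh {
    usage: bool,
    frequency: bool,
}

/// Hypothetical builder; the real ProcessCollector may be shaped differently.
#[derive(Debug, Default)]
struct ProcessCollectorBuilder {
    cpu_frequency: bool,
}

impl ProcessCollectorBuilder {
    /// Opt back in to frequency polling and its extra refresh cost on Linux.
    fn with_cpu_frequency(mut self) -> Self {
        self.cpu_frequency = true;
        self
    }

    /// CPU usage is always refreshed; frequency only when requested.
    fn cpu_refresh(&self) -> CpuRefresh {
        CpuRefresh { usage: true, frequency: self.cpu_frequency }
    }

    /// Gauge names to register (subset shown): the frequency gauges appear
    /// only when they would actually be populated.
    fn gauges(&self) -> Vec<&'static str> {
        let mut names = vec!["system_cpu_usage", "process_cpu_usage"];
        if self.cpu_frequency {
            names.extend(["system_min_cpu_frequency", "system_max_cpu_frequency"]);
        }
        names
    }
}

fn main() {
    let slim = ProcessCollectorBuilder::default();
    let full = ProcessCollectorBuilder::default().with_cpu_frequency();
    println!("default: {:?} -> {:?}", slim.cpu_refresh(), slim.gauges());
    println!("opt-in:  {:?} -> {:?}", full.cpu_refresh(), full.gauges());
}
```

Tying gauge registration to the flag would avoid zero-valued frequency metrics by default, at the price of the same breaking change as direction 2 for scrapers that do not opt in.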

Closes #31

…ncy polling

Replace `CpuRefreshKind::everything()` with `CpuRefreshKind::nothing().with_cpu_usage()`
to avoid CPU frequency collection, which increases refresh cost and is needed only by
the `system_min_cpu_frequency` and `system_max_cpu_frequency` gauges.

Benchmarking on Linux and macOS shows CPU frequency polling is the dominant source
of overhead in sysinfo refresh_specifics on Linux, while other process metrics
(tasks, disk usage, memory) remain relatively stable across configurations.

Benchmarked on:
- Linux: Ubuntu 22.04 (Docker, catthehacker/ubuntu:act-22.04)
- macOS: local machine (x86_64)

Closes chainbound#31


Development

Successfully merging this pull request may close these issues.

Process: benchmark & improve sysinfo usage

1 participant