perf(process): reduce sysinfo CPU refresh overhead by removing frequency polling #67
Open
tosynthegeek wants to merge 1 commit into
…ncy polling

Replace `CpuRefreshKind::everything()` with `CpuRefreshKind::nothing().with_cpu_usage()` to avoid unnecessary CPU frequency collection, which increases refresh cost without being required by any exported metrics.

Benchmarking on Linux and macOS shows CPU frequency polling is the dominant source of overhead in sysinfo `refresh_specifics` on Linux, while other process metrics (tasks, disk usage, memory) remain relatively stable across configurations.

Benchmarked on:

- Linux: Ubuntu 22.04 (Docker, catthehacker/ubuntu:act-22.04)
- macOS: local machine (x86_64)

Closes chainbound#31
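The comparison described here comes down to timing repeated refresh calls per configuration. A minimal std-only sketch of that measurement shape, assuming a stand-in closure in place of a real `System::refresh_specifics()` call and a plain `Instant` timer in place of Criterion. `mean_refresh_time` is a hypothetical helper for illustration and is not part of this PR:

```rust
use std::time::{Duration, Instant};

/// Hypothetical helper: runs `refresh` once as an untimed warm-up, then times
/// `iters` further calls and returns the mean wall-time per call.
fn mean_refresh_time<F: FnMut()>(mut refresh: F, iters: u32) -> Duration {
    assert!(iters > 0, "need at least one timed iteration");

    // Warm-up call: executed but excluded from the measurement.
    refresh();

    let start = Instant::now();
    for _ in 0..iters {
        refresh();
    }
    // Duration implements Div<u32>, giving the mean per call.
    start.elapsed() / iters
}

fn main() {
    // Stand-in workload (assumption): a real benchmark would call
    // `System::refresh_specifics(..)` with one RefreshKind configuration here.
    let mut calls = 0u32;
    let mean = mean_refresh_time(|| calls += 1, 100);
    println!("calls = {calls}, mean per call = {mean:?}");
}
```

Criterion adds statistical analysis and outlier handling on top of this, so the sketch is only useful for a rough sanity check of relative costs.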
Benchmarks confirm that `CpuRefreshKind::everything()` is the dominant cost in `ProcessCollector::collect()` on Linux, largely due to CPU frequency polling via sysfs reads. This PR replaces it with `CpuRefreshKind::nothing().with_cpu_usage()`, reducing collection time by ~74% on Linux with no meaningful impact on macOS.

## Changes

- Replace `CpuRefreshKind::everything()` with `CpuRefreshKind::nothing().with_cpu_usage()` in `ProcessCollector::new()`
- Add `benches/process_collector.rs` with six `RefreshKind` configurations benchmarked under Criterion to isolate per-component refresh cost

## Benchmark Results
### Environment

- macOS: native, on an Apple M2 MacBook
- Linux: `act` (`catthehacker/ubuntu:act-22.04`) on the same MacBook

Benchmarks use Criterion, measuring `System::refresh_specifics()` wall-time for six `RefreshKind` configurations. Each benchmark performs one warm-up refresh before the measurement loop, mirroring `ProcessCollector::new()`, which also does one eager refresh so the first `collect()` call has a valid prior sample to diff against.

### Linux (Ubuntu 22.04, Docker on Apple M2)
[Results table, per-configuration timings not recoverable. Rows: `cpu_usage_only`, `with_tasks`, `proposed_slim`, `with_disk_usage`, `current_default`, `cpu_with_frequency`]

The delta between `cpu_usage_only` and `current_default` (~1.6ms) represents the cost of CPU frequency polling in our configuration.

### macOS (Apple M2, native)
[Results table, per-configuration timings not recoverable. Rows: `proposed_slim`, `current_default`, `cpu_usage_only`, `cpu_with_frequency`, `with_tasks`, `with_disk_usage`]

All configurations cluster around 38–43ms with overlapping variance. `sysinfo` on macOS uses unified syscalls (`proc_pidinfo`, `host_statistics`) that return all data in one shot regardless of `RefreshKind` flags. The proposed change has no negative impact on macOS.

## Tradeoffs
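[Metrics table, per-metric impact column not recoverable. Rows:]

- `system_cpu_usage`
- `process_cpu_usage`
- `process_threads`
- `process_disk_written_bytes_total`
- `process_thread_usage` (per-thread)
- `process_resident_memory_*`
- `system_min_cpu_frequency`
- `system_max_cpu_frequency`

## Follow-up / open questions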
`system_min_cpu_frequency` and `system_max_cpu_frequency` remain in the collector but will return 0 under the new configuration, since `CpuRefreshKind::nothing().with_cpu_usage()` does not populate frequency data.

Possible directions:

- Keep the metrics in place, reporting `0`: non-breaking for existing scrapers, documents the tradeoff via the benchmark report
- Add `ProcessCollector::with_cpu_frequency()` so users can re-enable it and accept the ~1.6ms cost on Linux if they need it

Closes #31
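The opt-in direction could take the shape of a consuming builder method. The sketch below is std-only and models only the configuration surface: `CpuFlags` is a hypothetical stand-in for the CPU refresh flags (it is not sysinfo's `CpuRefreshKind`), the `ProcessCollector` here is a simplified placeholder rather than the real collector, and `collects_cpu_usage` and `reports_cpu_frequency` are illustrative helpers that do not exist in the codebase.

```rust
/// Hypothetical stand-in for the CPU refresh flags; not sysinfo's real type.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct CpuFlags {
    usage: bool,
    frequency: bool,
}

/// Simplified placeholder collector: only the configuration is modelled.
#[derive(Debug)]
struct ProcessCollector {
    cpu: CpuFlags,
}

impl ProcessCollector {
    /// Default proposed in this PR: CPU usage only, no frequency polling.
    fn new() -> Self {
        Self {
            cpu: CpuFlags { usage: true, frequency: false },
        }
    }

    /// Opt back in to frequency polling, accepting its extra cost on Linux.
    fn with_cpu_frequency(mut self) -> Self {
        self.cpu.frequency = true;
        self
    }

    /// Illustrative helper: is CPU usage refreshed?
    fn collects_cpu_usage(&self) -> bool {
        self.cpu.usage
    }

    /// Illustrative helper: would the min/max frequency metrics carry real
    /// values, rather than 0?
    fn reports_cpu_frequency(&self) -> bool {
        self.cpu.frequency
    }
}

fn main() {
    let default = ProcessCollector::new();
    let opted_in = ProcessCollector::new().with_cpu_frequency();

    println!(
        "default:  usage={} frequency={}",
        default.collects_cpu_usage(),
        default.reports_cpu_frequency()
    );
    println!(
        "opted in: usage={} frequency={}",
        opted_in.collects_cpu_usage(),
        opted_in.reports_cpu_frequency()
    );
}
```

Keeping frequency off by default preserves the cheaper refresh path for everyone, while the builder leaves the two frequency metrics available to users who need them.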