ci: [TEST RUN — DO NOT REVIEW] All CI optimizations combined for profiling #7039
Open
huth-stacks wants to merge 10 commits into stacks-network:develop from
The unit-tests job had continue-on-error: true and the check-tests job did not depend on unit-tests, causing test failures to be silently swallowed. Remove continue-on-error and add unit-tests to the check-tests needs array.
The paths-ignore block excluded **.yml files, which meant pushes to master/develop/next containing only workflow file changes would silently skip CI. Remove this exclusion so workflow changes are always validated.
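To illustrate why the `**.yml` exclusion skipped CI, here is a minimal sketch of the path-filter rule: a push triggers the workflow only if at least one changed file matches none of the ignore patterns. It approximates GitHub's glob matching with Python's `fnmatch`, which is an assumption rather than an exact model, and the `**.md` pattern and file names are hypothetical.

```python
from fnmatch import fnmatch


def workflow_runs(changed_files: list[str], paths_ignore: list[str]) -> bool:
    """Return True if any changed file is not covered by an ignore pattern.

    fnmatch's `*` also matches `/`, which roughly mirrors how `**` behaves
    in a workflow path filter (an approximation, not GitHub's matcher).
    """
    return any(
        not any(fnmatch(path, pattern) for pattern in paths_ignore)
        for path in changed_files
    )


# A push touching only a workflow file is skipped while `**.yml` is ignored.
only_workflow = [".github/workflows/ci.yml"]
skipped = not workflow_runs(only_workflow, ["**.yml", "**.md"])

# With the `**.yml` exclusion removed, the same push runs CI.
validated = workflow_runs(only_workflow, ["**.md"])
```

Under these assumptions, `skipped` and `validated` are both true, which matches the behavior the commit describes before and after the change.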
The create-cache workflow waited for rustfmt, changelog-check, and check-release before starting the 15-minute nextest archive build. These gates are independent from compilation, so create-cache no longer waits for them. Test workflows already depend on both create-cache and the format checks independently, so format failures still block test results.
Switch nextest-archive, cargo-hack native-targets, and constants-check to ubuntu-latest-m (4 vCPU, 16GB RAM) for faster compilation. The release workflow already uses these runners. Expected ~30-50% faster compilation with negligible cost difference.
Add a Python script that reads JUnit XML timing data from previous CI runs and uses greedy bin-packing to distribute tests into time-balanced partitions. Falls back to hash-based partitioning when no timing data is available (first run, or data expired).
New files:
- .github/scripts/split-tests-by-timing.py: bin-packing script
- .config/nextest.toml: enables JUnit XML output for CI profile
Each partition uploads its JUnit XML as an artifact (90-day retention). On subsequent runs, all partitions download this timing data and the script assigns tests to minimize the slowest partition's duration.
POC: inlines the nextest command to test this approach. Production implementation should integrate with stacks-network/actions.
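The partitioning idea can be sketched roughly as follows. This is not the contents of split-tests-by-timing.py; it is a minimal illustration assuming that each JUnit `<testcase>` carries `classname`, `name`, and `time` attributes, that tests are identified as `classname::name`, and that tests without timing data get the average known duration. All function names are hypothetical.

```python
import hashlib
import xml.etree.ElementTree as ET


def read_timings(junit_xml: str) -> dict[str, float]:
    """Map 'classname::name' to seconds for every <testcase> in a JUnit XML string."""
    timings: dict[str, float] = {}
    for case in ET.fromstring(junit_xml).iter("testcase"):
        test_id = f"{case.get('classname', '')}::{case.get('name', '')}"
        timings[test_id] = float(case.get("time") or 0.0)
    return timings


def hash_partition(tests: list[str], n: int) -> list[list[str]]:
    """Fallback: assign each test by a stable hash of its name."""
    parts: list[list[str]] = [[] for _ in range(n)]
    for test in sorted(tests):
        digest = hashlib.sha256(test.encode()).digest()
        parts[int.from_bytes(digest[:8], "big") % n].append(test)
    return parts


def split_by_timing(tests: list[str], timings: dict[str, float], n: int) -> list[list[str]]:
    """Greedy bin-packing: place the longest remaining test in the lightest partition."""
    if not timings:
        return hash_partition(tests, n)
    known = [timings[t] for t in tests if t in timings]
    default = sum(known) / len(known) if known else 1.0  # assumed cost of unseen tests
    ordered = sorted(tests, key=lambda t: (-timings.get(t, default), t))
    parts: list[list[str]] = [[] for _ in range(n)]
    loads = [0.0] * n
    for test in ordered:
        lightest = min(range(n), key=loads.__getitem__)
        parts[lightest].append(test)
        loads[lightest] += timings.get(test, default)
    return parts
```

Sorting by descending duration before assigning is the usual longest-processing-time heuristic; it does not guarantee an optimal split, but it tends to keep the slowest partition close to the average.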
Instead of compiling stacks-inspect from scratch (4+ minutes), the nextest-archive job now generates the constants JSON and uploads it as an artifact. The constants-check job downloads and diffs it, reducing the check from 4 minutes to ~10 seconds. Also adds create-cache to constants-check's needs in ci.yml so the artifact is available before the download step runs.
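The download-and-diff step could look something like the sketch below. It assumes the constants JSON is a flat top-level object and that the check compares it against an expected copy; the function names and the sample keys are hypothetical, and the actual job may simply use a textual diff.

```python
import json


def diff_constants(expected: dict, actual: dict) -> list[str]:
    """List keys that are missing, unexpected, or changed between two constant maps."""
    problems: list[str] = []
    for key in sorted(expected.keys() | actual.keys()):
        if key not in actual:
            problems.append(f"missing: {key}")
        elif key not in expected:
            problems.append(f"unexpected: {key}")
        elif expected[key] != actual[key]:
            problems.append(f"changed: {key}: {expected[key]!r} -> {actual[key]!r}")
    return problems


def check(expected_json: str, actual_json: str) -> int:
    """Print every difference and return a process-style exit code (0 means identical)."""
    problems = diff_constants(json.loads(expected_json), json.loads(actual_json))
    for line in problems:
        print(line)
    return 1 if problems else 0
```

Because the comparison only reads two small JSON documents, it needs no Rust toolchain, which is consistent with the drop from minutes to seconds described above.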
The clippy workflow explicitly disabled caching, causing full recompilation on every PR. Enable the built-in cargo/target caching from actions-rust-lang/setup-rust-toolchain.
Set CARGO_INCREMENTAL=0 and CARGO_PROFILE_DEV_DEBUG=0 for the test cache build. Incremental compilation adds ~10% overhead on clean CI builds and bloats the target directory. Debug info level 2 (default) increases binary size and link time without benefit in CI where we don't debug interactively.
Remove -Cinstrument-coverage from RUSTFLAGS in create-cache.yml. This eliminates ~15% compilation overhead, ~56% binary size bloat, and per-test .profraw I/O from every PR run. Coverage data is no longer collected per-PR. The coverage report job will skip gracefully. Standard practice for large Rust projects — coverage on merge, not per-PR. Can be restored by reverting this one-line change.
Remove pull_request from cargo-hack-check's trigger condition. Feature combination checks still run on merge queue, releases, and manual dispatch — catching issues before they reach develop. PRs skip this 13-minute job for faster feedback.
e30f2d5 to
a97066b
This PR exists solely to measure the aggregate CI timing impact of all optimizations combined.
It requires upstream org runners (ubuntu-latest-m), which are not available on personal forks.
Will be closed after timing data is collected.
What's included (10 atomic changes)
Total: ~30 lines changed across 7 files.
Baseline (median of 3 recent upstream runs)
Expected aggregate impact
Targeting sub-1-hour total pipeline time.
Security Checklist