Skip to content

Comments

Add benchmark utils#449

Merged
evisdren merged 4 commits intomainfrom
add_benchmark_utils
Feb 21, 2026
Merged

Add benchmark utils#449
evisdren merged 4 commits intomainfrom
add_benchmark_utils

Conversation

@evisdren
Copy link
Contributor

No description provided.

@evisdren evisdren requested a review from a team as a code owner February 20, 2026 22:19
Copilot AI review requested due to automatic review settings February 20, 2026 22:19
@cursor
Copy link

cursor bot commented Feb 20, 2026

PR Summary

Low Risk
Changes are limited to developer tooling/benchmarks and documentation; no production execution paths are modified.

Overview
Adds a new benchutil package used by benchmarks to generate realistic fixture data: temp git repos with many files/commits, .entire settings, session state files, synthetic JSONL transcripts, and seeded temporary/committed checkpoints via checkpoint.GitStore.

Introduces benchmark tests covering repo creation, transcript generation, session state creation, and seeding shadow/metadata branches.

Updates mise.toml with bench, bench:cpu, bench:mem, and bench:compare tasks (including benchstat-based comparison), and refreshes the Discord invite link in CONTRIBUTING.md.

Written by Cursor Bugbot for commit 5c61b11. Configure here.

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR adds comprehensive benchmarking infrastructure to the Entire CLI project, enabling performance testing and comparison of checkpoint operations. It introduces a new benchutil package with test fixture helpers and adds benchmark tasks to the build system.

Changes:

  • Adds benchutil package with realistic git repository fixtures, session state generators, and transcript generation for benchmarking checkpoint operations
  • Adds four benchmark tasks to mise.toml: bench (run all), bench:cpu (CPU profiling), bench:mem (memory profiling), and bench:compare (compare branches)
  • Updates Discord invite links in CONTRIBUTING.md

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 4 comments.

File Description
cmd/entire/cli/benchutil/benchutil.go New package providing test fixtures for benchmarks, including repo creation, session state management, and synthetic data generation
cmd/entire/cli/benchutil/benchutil_test.go Benchmark tests for the benchutil helpers themselves, using Go 1.24+ b.Loop() syntax
mise.toml Adds benchmark tasks with CPU/memory profiling and branch comparison capabilities
CONTRIBUTING.md Updates Discord invite links from old to new

mise.toml Outdated
tmpdir=$(mktemp -d)
new_out="$tmpdir/new.txt"
old_out="$tmpdir/old.txt"
trap 'rm -rf "$tmpdir"' EXIT
Copy link

Copilot AI Feb 20, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The bench:compare script has a trap for cleaning up the temporary directory but doesn't trap to restore the original branch if the script exits prematurely (e.g., via Ctrl+C or benchmark timeout). If interrupted during the base branch benchmarking, the user will be left on the base branch with uncommitted changes potentially stashed.

Consider adding branch restoration to the trap handler:

trap 'git checkout "$current_branch" --quiet 2>/dev/null; [ "$has_changes" = true ] && git stash pop --quiet 2>/dev/null; rm -rf "$tmpdir"' EXIT
Suggested change
trap 'rm -rf "$tmpdir"' EXIT
trap 'git checkout "$current_branch" --quiet 2>/dev/null; [ "$has_changes" = true ] && git stash pop --quiet 2>/dev/null; rm -rf "$tmpdir"' EXIT

Copilot uses AI. Check for mistakes.
mise.toml Outdated
# Install benchstat if not available
if ! command -v benchstat &>/dev/null; then
echo "Installing benchstat..."
go install golang.org/x/perf/cmd/benchstat@latest
Copy link

Copilot AI Feb 20, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the benchstat installation fails, the script continues and will fail later when trying to run benchstat at line 164. Consider checking the installation status:

if ! command -v benchstat &>/dev/null; then
  echo "Installing benchstat..."
  if ! go install golang.org/x/perf/cmd/benchstat@latest; then
    echo "Failed to install benchstat. Please install it manually: go install golang.org/x/perf/cmd/benchstat@latest"
    exit 1
  fi
fi
Suggested change
go install golang.org/x/perf/cmd/benchstat@latest
if ! go install golang.org/x/perf/cmd/benchstat@latest; then
echo "Failed to install benchstat. Please install it manually: go install golang.org/x/perf/cmd/benchstat@latest"
exit 1
fi

Copilot uses AI. Check for mistakes.
Comment on lines 25 to 40
repo := NewBenchRepo(b, RepoOpts{FileCount: 10})
sessionID := repo.CreateSessionState(b, SessionOpts{})

b.ResetTimer()
for b.Loop() {
// Each iteration seeds a fresh shadow branch
// (will append to existing, but that's fine for benchmarking)
repo.SeedShadowBranch(b, sessionID, 5, 3)
}
}

func BenchmarkSeedMetadataBranch(b *testing.B) {
repo := NewBenchRepo(b, RepoOpts{FileCount: 10})

b.ResetTimer()
for b.Loop() {
Copy link

Copilot AI Feb 20, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The benchmarks for SeedShadowBranch and SeedMetadataBranch may produce misleading results because each iteration appends to the same branches, causing performance to degrade as the branches grow longer. This means later iterations will be slower than earlier ones, skewing the average.

Consider either:

  1. Creating a fresh BenchRepo inside the b.Loop() so each iteration starts from a clean state, or
  2. If the goal is to benchmark appending to existing branches, document this more clearly and consider using b.Run() with different sizes as sub-benchmarks.
Suggested change
repo := NewBenchRepo(b, RepoOpts{FileCount: 10})
sessionID := repo.CreateSessionState(b, SessionOpts{})
b.ResetTimer()
for b.Loop() {
// Each iteration seeds a fresh shadow branch
// (will append to existing, but that's fine for benchmarking)
repo.SeedShadowBranch(b, sessionID, 5, 3)
}
}
func BenchmarkSeedMetadataBranch(b *testing.B) {
repo := NewBenchRepo(b, RepoOpts{FileCount: 10})
b.ResetTimer()
for b.Loop() {
for b.Loop() {
// Each iteration uses a fresh repo and session state to keep work per iteration consistent.
repo := NewBenchRepo(b, RepoOpts{FileCount: 10})
sessionID := repo.CreateSessionState(b, SessionOpts{})
repo.SeedShadowBranch(b, sessionID, 5, 3)
}
}
func BenchmarkSeedMetadataBranch(b *testing.B) {
for b.Loop() {
// Use a fresh repo each iteration so branch size does not accumulate across iterations.
repo := NewBenchRepo(b, RepoOpts{FileCount: 10})

Copilot uses AI. Check for mistakes.
Comment on lines 37 to 40
repo := NewBenchRepo(b, RepoOpts{FileCount: 10})

b.ResetTimer()
for b.Loop() {
Copy link

Copilot AI Feb 20, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar to the SeedShadowBranch benchmark, this benchmark appends to the same metadata branch on each iteration, causing performance to degrade as the branch grows. This may not accurately reflect the typical performance of the operation.

Consider creating a fresh BenchRepo inside b.Loop() or using b.Run() with different checkpoint counts as sub-benchmarks to get more meaningful performance data.

Suggested change
repo := NewBenchRepo(b, RepoOpts{FileCount: 10})
b.ResetTimer()
for b.Loop() {
b.ResetTimer()
for b.Loop() {
repo := NewBenchRepo(b, RepoOpts{FileCount: 10})

Copilot uses AI. Check for mistakes.
Copy link

@cursor cursor bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 2 potential issues.

Bugbot Autofix is OFF. To automatically fix reported issues with Cloud Agents, enable Autofix in the Cursor dashboard.

Comment @cursor review or bugbot run to trigger another review on this PR

mise.toml Outdated
tmpdir=$(mktemp -d)
new_out="$tmpdir/new.txt"
old_out="$tmpdir/old.txt"
trap 'rm -rf "$tmpdir"' EXIT
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Trap doesn't restore git branch or stash on failure

Medium Severity

The bench:compare EXIT trap only removes the temp directory but doesn't restore the git branch or pop the stash. Because set -euo pipefail is active, if git checkout "$BASE_REF" or git checkout "$current_branch" fails, the script exits immediately, leaving the repo on the wrong branch with uncommitted changes stuck in the stash. The trap needs to also restore the original branch and conditionally pop the stash.

Additional Locations (2)

Fix in Cursor Fix in Web

mise.toml Outdated

[tasks."bench:cpu"]
description = "Run benchmarks with CPU profile"
run = "go test -bench=. -benchmem -run='^$' -cpuprofile=cpu.prof -timeout=10m ./... && echo 'Profile saved to cpu.prof. View with: go tool pprof -http=:8080 cpu.prof'"
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Profile tasks fail with multiple packages flag

Medium Severity

The bench:cpu and bench:mem tasks pass -cpuprofile and -memprofile together with ./.... Go's test tool does not support test profile flags with multiple packages and will fail with "cannot use test profile flag with multiple packages". These tasks need to target a single package instead of ./....

Additional Locations (1)

Fix in Cursor Fix in Web

Entire-Checkpoint: fa2f482ddb24
Entire-Checkpoint: 8a0976de057c
@evisdren evisdren merged commit c71f4df into main Feb 21, 2026
3 checks passed
@evisdren evisdren deleted the add_benchmark_utils branch February 21, 2026 01:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Development

Successfully merging this pull request may close these issues.

1 participant