pr-review: CI polling wastes compute due to skill discrepancy and silent-verdict monitoring #1245

@worktrunk-bot

Problem

The pr-review skill's Step 5 tells the reviewer to poll CI manually in a loop
that sleeps 30–60s between checks, while the running-in-ci skill says to use
gh run watch. Because of this discrepancy, sessions that follow the pr-review
guidance spend far more runner time waiting on CI than sessions that follow
running-in-ci.

Evidence from the past 24 hours

| Session | PR | Approach | CI poll iterations | Notes |
| --- | --- | --- | --- | --- |
| Run 22540574040 | #1242 (renovate) | Manual poll (pr-review) | 43 | Self-authored, stayed silent |
| Run 22539129222 | #1241 (nightly review) | Manual poll (pr-review) | 8 | Self-authored, stayed silent |
| Run 22540314728 | #1241 (nightly review) | Manual poll (pr-review) | 7 | Self-authored, stayed silent |
| Run 22560341156 | #1243 (dependabot) | gh run watch (running-in-ci) | 0 | Approved |
| Run 22560379238 | #1244 (dependabot) | gh pr checks --watch (running-in-ci) | 0 | Approved |

The worst case, run 22540574040, spent 43 iterations of sleep 60
(~43 min of runner time) waiting for the benchmarks job on a self-authored
renovate PR where the bot was going to stay silent regardless. The benchmarks
job clones rust-lang/rust and takes ~2 hours, and it is unrelated to the
version-bump changes being reviewed.
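
As a rough check on these numbers, the time spent asleep in a poll loop is just iterations times the sleep interval. The sketch below applies the 30–60s range from Step 5 to the three manual-poll runs in the table. The function name `poll_sleep_seconds` is made up for illustration, and the estimate ignores the time taken by each status query itself.

```shell
# Seconds spent sleeping in a poll loop.
# $1 = number of iterations, $2 = sleep interval in seconds.
# (Hypothetical helper; not part of either skill.)
poll_sleep_seconds() {
  echo $(( $1 * $2 ))
}

# Iteration counts taken from the evidence table above.
for n in 43 8 7; do
  lo=$(poll_sleep_seconds "$n" 30)
  hi=$(poll_sleep_seconds "$n" 60)
  printf '%2d iterations: %d-%d s asleep (up to %d min)\n' \
    "$n" "$lo" "$hi" "$((hi / 60))"
done
```

For the 43-iteration run, the upper bound of 2580 s matches the ~43 min quoted above.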

Two sub-problems

  1. Skill discrepancy: running-in-ci says "Wait for completion with
    gh run watch" while pr-review Step 5 provides a manual
    gh pr view --json statusCheckRollup template with "poll until complete
    (sleep 30–60s between checks)." Sessions that follow running-in-ci get an
    efficient blocking wait; sessions that follow pr-review get an expensive
    poll loop.

  2. CI monitoring on silent verdicts: Step 5 says "After approving, check
    whether CI has finished" — but for self-authored PRs where the bot stays
    silent (no approval posted), CI monitoring adds no value. There is no
    approval to dismiss on failure and no comment to post. All three
    self-authored sessions (runs 22540574040, 22539129222, 22540314728) still
    monitored CI despite having nothing to act on.
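
The difference between the two waiting styles can be sketched as follows. This is an illustration of the shapes only, not the text of either skill: `poll_until`, `wait_for_run`, and the `POLL_INTERVAL` variable are hypothetical names, and the command passed to `poll_until` stands in for whatever status query the session runs (for example the statusCheckRollup check).

```shell
# Poll-loop shape (pr-review Step 5): rerun a status command until it
# succeeds, sleeping between attempts. Prints how many sleeps were needed.
# Usage: poll_until CMD [ARGS...]
poll_until() {
  iterations=0
  until "$@"; do
    iterations=$((iterations + 1))
    sleep "${POLL_INTERVAL:-60}"   # 30-60 s per the Step 5 guidance
  done
  echo "$iterations"
}

# Blocking shape (running-in-ci): one call that returns when the run ends.
# $1 = workflow run ID.
wait_for_run() {
  gh run watch "$1"
}
```

With the loop, every iteration is a separate query plus a sleep; with the blocking call, the session issues one command and waits for it to return.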

Relationship to #1186

#1186 flagged the same CI waste pattern and was closed with "compute is cheap"
(#1189). The new evidence shows the problem is worse than originally reported
(43 iterations vs 6–9 previously) and identifies a concrete fix: align the two
skills rather than cap iterations.

Suggested fix

  1. Replace the manual statusCheckRollup polling in pr-review Step 5 with
    gh run watch (consistent with running-in-ci)
  2. Skip Step 5 entirely when the verdict is "stay silent" (self-authored PR
    with no concerns, no approval posted)
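
Taken together, the two suggestions could look roughly like the sketch below. It is a minimal illustration under stated assumptions, not the skill's actual wording: the verdict labels (`silent`, `approved`) and the function names are hypothetical, and it uses gh pr checks --watch (the blocking call seen in run 22560379238) because it takes a PR number directly, whereas gh run watch takes a run ID.

```shell
# Decide whether CI monitoring has any follow-up action attached.
# $1 = review verdict; "silent" is a hypothetical label for a self-authored
# PR with no concerns and no approval posted.
should_monitor_ci() {
  case "$1" in
    silent) return 1 ;;   # nothing to dismiss or comment on
    *)      return 0 ;;
  esac
}

# Step 5 sketch: skip on silent verdicts, otherwise block until checks finish.
# $1 = verdict, $2 = PR number.
step5_wait_for_ci() {
  if ! should_monitor_ci "$1"; then
    echo "verdict is '$1': skipping CI monitoring"
    return 0
  fi
  gh pr checks "$2" --watch   # single blocking wait instead of a sleep loop
}
```

A session would call, for example, `step5_wait_for_ci approved 1243` after posting an approval, and `step5_wait_for_ci silent 1242` would return immediately without touching CI.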

Labels: claude-behavior (Issues with Claude CI bot behavior)