fix(review): mark judge infra failures inconclusive by leobaldock · Pull Request #240 · aetheronhq/agent-cube

leobaldock · 2026-05-26T14:09:47Z

Summary

Separate judge availability failures from code review verdicts.
Post a COMMENT review with "INCONCLUSIVE: judge infrastructure unavailable" when the available judges approve but expected judges fail or are missing.
Keep REQUEST_CHANGES only when an available judge actually rejects the code.
Apply matching semantics to branch review display and panel summaries.

Root cause

Cube treated missing judge decision files as a review rejection. That made infra failures look like code failures and turned partial approval panels into noisy REQUEST_CHANGES reviews.
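To make the failure mode concrete, here is a minimal sketch of the conflation described above. It is not Cube's actual code: the function name `conflated_verdict`, the decision strings, and the use of `None` to stand for a missing decision file are all assumptions made for illustration.

```python
from typing import Optional

# Hypothetical model: judge name -> decision string, or None when the judge's
# decision file is missing (for example, because the judge process crashed).
Decisions = dict[str, Optional[str]]


def conflated_verdict(decisions: Decisions) -> str:
    """Sketch of the old behaviour: anything short of unanimous approval rejects."""
    if all(decision == "APPROVED" for decision in decisions.values()):
        return "APPROVE"
    # A missing decision lands here too, so an infra failure reads as a code failure.
    return "REQUEST_CHANGES"


panel = {"judge_1": "APPROVED", "judge_2": "APPROVED", "judge_3": None}
print(conflated_verdict(panel))
```

In this sketch two judges approve and the third never reports, yet the panel still yields REQUEST_CHANGES, which is the noisy outcome this PR removes.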

Validation

pytest -q tests -> 512 passed
pytest -q tests/cli tests/core tests/automation -> 511 passed
pytest -q tests/cli/test_auto_approve_gate.py tests/cli/test_panel_summary.py tests/cli/test_peer_review_branch_summary.py -> 20 passed
pytest -q tests/core/test_judge_panel_skip_approved.py tests/core/test_session_reset_on_stale.py tests/cli/test_auto_approve_gate.py tests/cli/test_panel_summary.py tests/cli/test_peer_review_branch_summary.py -> 61 passed
ruff check on touched files -> no issues
mypy on touched runtime files -> no issues
python -m compileall -q python/cube -> passed
git diff --check -> passed

Overview

Distinguishes infrastructure failures from code review rejections by treating missing/failed judge decisions as availability issues rather than code rejections. When expected judges are unavailable but available judges approve, the system posts an INCONCLUSIVE comment instead of REQUEST_CHANGES.

Key Changes

Judge Infrastructure Handling

Added "unavailable" status to JudgeRunStatus type
Treats failed, missing, and unavailable states uniformly in status rendering
Panel summary now displays these as UNAVAILABLE (FAILED), UNAVAILABLE (MISSING), etc.
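The status handling above could look roughly like the following sketch. Only the "unavailable" member, the "failed" and "missing" states, and the UNAVAILABLE (FAILED) / UNAVAILABLE (MISSING) labels come from this PR; the other members of the type, the helper name `status_label`, and the bare UNAVAILABLE label for the "unavailable" status are assumptions.

```python
from typing import Literal

# Assumed shape of the type. The real JudgeRunStatus may have other members.
JudgeRunStatus = Literal["approved", "rejected", "failed", "missing", "unavailable"]

# The three states that are treated uniformly as "judge not available".
UNAVAILABLE_STATUSES = frozenset({"failed", "missing", "unavailable"})


def status_label(status: JudgeRunStatus) -> str:
    """Render a status for the panel summary (hypothetical helper)."""
    if status in UNAVAILABLE_STATUSES:
        if status == "unavailable":
            # Assumption: no extra qualifier when the status is already generic.
            return "UNAVAILABLE"
        return f"UNAVAILABLE ({status.upper()})"
    return status.upper()


for status in ("approved", "failed", "missing"):
    print(status_label(status))
```

Keeping the original reason in parentheses preserves the diagnostic detail while making it clear that none of these rows is a verdict on the code.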

Review Verdict Logic

New helper _available_review_rejected() distinguishes actual code rejections from infrastructure failures
Auto-approve gate now returns INCONCLUSIVE (COMMENT) when expected judges are missing but no available judge has rejected
REQUEST_CHANGES is only returned when an available judge explicitly rejects the code
Missing judge details are included in inconclusive comments
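The verdict rules listed above can be sketched as follows. This is an illustration under stated assumptions, not the code in auto_approve.py: the `JudgeResult` dataclass, the signature of `available_review_rejected`, the `gate_review` function, and the exact wording of the review bodies (apart from the "INCONCLUSIVE: judge infrastructure unavailable" prefix) are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class JudgeResult:
    judge: str
    # "APPROVED", "REJECTED", or None when the judge produced no decision.
    decision: Optional[str]


def available_review_rejected(results: list[JudgeResult]) -> bool:
    """True only if a judge that actually ran rejected the code."""
    return any(result.decision == "REJECTED" for result in results)


def gate_review(results: list[JudgeResult]) -> tuple[str, str]:
    """Return a (review event, review body) pair for the expected panel."""
    missing = [result.judge for result in results if result.decision is None]
    if available_review_rejected(results):
        body = "Changes requested by an available judge."
        if missing:
            # A real rejection still wins, but the availability gap is reported.
            body += f" Warning: unavailable judges: {', '.join(missing)}."
        return "REQUEST_CHANGES", body
    if missing:
        details = ", ".join(missing)
        return "COMMENT", f"INCONCLUSIVE: judge infrastructure unavailable (missing: {details})"
    return "APPROVE", "All expected judges approved."


panel = [
    JudgeResult("judge_1", "APPROVED"),
    JudgeResult("judge_2", "APPROVED"),
    JudgeResult("judge_3", None),
]
event, body = gate_review(panel)
print(event, "-", body)
```

Checking for a rejection first means a missing judge can never mask a real code problem; the inconclusive path is reached only when every judge that reported is satisfied.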

Branch Review Alignment

Refactored branch review decision logic into _branch_review_decision_and_summary() helper
Applies consistent inconclusive semantics to branch and PR review paths
Missing judge information now rendered consistently across both flows

Panel Summary Display

New _is_unavailable_result() helper centralises unavailable state detection
Unavailable rows display error/log information when available
Recovery commands shown only for unavailable rows that have recovery options
Gate status now includes "inconclusive" branch for availability issues
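A rough sketch of how the panel summary behaviour above might fit together is shown below. The `PanelRow` fields, the signature of `is_unavailable_result`, the row layout, and the "blocked" and "passed" gate names are assumptions; only the "inconclusive" gate status and the rule that error and recovery details appear solely on unavailable rows come from this PR.

```python
from dataclasses import dataclass
from typing import Optional

UNAVAILABLE_STATUSES = frozenset({"failed", "missing", "unavailable"})


@dataclass
class PanelRow:
    judge: str
    status: str
    error: Optional[str] = None             # error or log excerpt, if captured
    recovery_command: Optional[str] = None  # set only when a recovery option exists


def is_unavailable_result(row: PanelRow) -> bool:
    """Single place that decides whether a row is an availability problem."""
    return row.status in UNAVAILABLE_STATUSES


def render_row(row: PanelRow) -> list[str]:
    lines = [f"{row.judge}: {row.status.upper()}"]
    if is_unavailable_result(row):
        if row.error:
            lines.append(f"  error: {row.error}")
        if row.recovery_command:
            lines.append(f"  recover: {row.recovery_command}")
    return lines


def gate_status(rows: list[PanelRow]) -> str:
    if any(row.status == "rejected" for row in rows):
        return "blocked"
    if any(is_unavailable_result(row) for row in rows):
        return "inconclusive"
    return "passed"


rows = [
    PanelRow("judge_1", "approved"),
    PanelRow("judge_2", "failed", error="timeout", recovery_command="<recovery command>"),
    PanelRow("judge_3", "missing"),
]
for row in rows:
    print("\n".join(render_row(row)))
print("gate:", gate_status(rows))
```

Routing every check through one predicate keeps the row rendering and the gate status from drifting apart when a new unavailable state is introduced.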

Testing

Updated 3 auto-approve gate tests to expect INCONCLUSIVE behaviour
Updated 2 panel summary tests for new unavailable rendering
Added 2 new branch review tests for inconclusive/rejection scenarios
All 512 pytest tests passing; mypy and ruff checks clean

coderabbitai · 2026-05-26T14:09:56Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 32cd3dd3-2b82-495f-833d-c2817472040e

📥 Commits

Reviewing files that changed from the base of the PR and between 313928e and 1fa736c.

📒 Files selected for processing (7)

python/cube/automation/judge_panel.py
python/cube/commands/auto_approve.py
python/cube/commands/peer_review.py
python/cube/models/types.py
tests/cli/test_auto_approve_gate.py
tests/cli/test_panel_summary.py
tests/cli/test_peer_review_branch_summary.py

📜 Recent review details

🔇 Additional comments (7)

python/cube/models/types.py (1)

57-57: LGTM!

python/cube/automation/judge_panel.py (1)

631-635: LGTM!

Also applies to: 644-646, 665-666, 698-700, 707-709, 720-725

python/cube/commands/auto_approve.py (1)

30-40: LGTM!

Also applies to: 191-223, 228-233

python/cube/commands/peer_review.py (1)

511-547: LGTM!

Also applies to: 799-799, 812-814, 821-825, 965-976, 1001-1004, 1211-1220

tests/cli/test_auto_approve_gate.py (1)

63-94: LGTM!

Also applies to: 118-123, 146-147, 257-283

tests/cli/test_panel_summary.py (1)

140-143: LGTM!

Also applies to: 148-159, 189-192, 197-197, 281-281

tests/cli/test_peer_review_branch_summary.py (1)

1-49: LGTM!

Walkthrough

This PR treats the "failed", "missing", and "unavailable" judge statuses uniformly as unavailable states throughout the judge panel rendering and decision logic. A new "unavailable" value is added to the JudgeRunStatus type. The judge panel now labels failed and missing judges UNAVAILABLE (FAILED) and UNAVAILABLE (MISSING) respectively, and a centralised helper classifies unavailable rows. The gate block documentation is expanded to include the "inconclusive" status. When expected judges are missing, the auto-approve gate and peer review logic now distinguish infrastructure unavailability (missing judges) from actual code verdicts (available judges rejecting): they return an INCONCLUSIVE COMMENT only when no available judge has rejected, and otherwise return REQUEST_CHANGES with an availability warning.

Possibly related PRs

aetheronhq/agent-cube#196: Updates peer review logic for missing judge detection and verdict derivation based on expected panel size, with corresponding downstream decision text modifications and helper refactoring.

🚥 Pre-merge checks | ✅ 3

✅ Passed checks (3 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title directly addresses the main objective: distinguishing judge infrastructure failures from code review verdicts and marking them as inconclusive.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.


fix(review): mark judge infra failures inconclusive

1fa736c

leobaldock marked this pull request as ready for review May 26, 2026 14:10

jacsamell approved these changes May 26, 2026

jacsamell merged commit f09aaf0 into main May 26, 2026
2 checks passed

jacsamell merged 1 commit into main from codex/judge-infra-inconclusive
