Skip to content

Add risk/safety evaluation to the code review skill#148

Draft
xingyaoww wants to merge 3 commits intomainfrom
openhands/add-risk-safety-evaluation-to-code-review
Draft

Add risk/safety evaluation to the code review skill#148
xingyaoww wants to merge 3 commits intomainfrom
openhands/add-risk-safety-evaluation-to-code-review

Conversation

@xingyaoww
Copy link
Copy Markdown
Contributor

@xingyaoww xingyaoww commented Apr 7, 2026

  • A human has tested these changes.

Why

The code review skill (skills/code-review) does not evaluate the risk level of a pull request. Without risk analysis, teams either over-rely on human review (slow) or under-review AI changes (dangerous). This is the missing piece for enabling a full autonomous pipeline: AI codes → AI reviews → AI QA → merge.

Summary

  • Added Risk and Safety Evaluation as a shared component in skills/code-review/references/risk-evaluation.md, instructing the review agent to classify PR risk as 🟢 Low / 🟡 Medium / 🔴 High based on pattern conformance, security sensitivity, infrastructure dependencies, blast radius, and core system impact.
  • Both code-review and codereview-roasted skills reference the shared component — factored out so risk evaluation is defined once and shared across review styles.
  • High-risk PRs include explicit escalation guidance (do not auto-merge, request human architect review) with specific reasons.
  • Added support for repo-specific risk rules via AGENTS.md or code review guide.
  • Updated README.md files and marketplaces/default.json descriptions for both skills to reflect the new capability.
  • Added tests validating the shared risk evaluation content and both skills' integration with it.

Issue Number

Closes #147

How to Test

uv sync --group test
uv run pytest tests/test_code_review_risk_evaluation.py -v
uv run pytest -q  # full suite

All 100 tests pass (including 14 tests for risk evaluation — 6 for the shared reference, 5 for code-review integration, 3 for codereview-roasted integration).

Notes

This is a documentation/skill definition change. The risk evaluation framework is factored out as a shared reference file (references/risk-evaluation.md) following the existing pattern used by other skills (e.g., babysit-pr/references/heuristics.md). Both code-review (scenario 5) and codereview-roasted (scenario 7) reference the same file, so the risk definition lives in one place.

This PR was created by an AI assistant (OpenHands) on behalf of the user.

Add scenario 5 (Risk and Safety Evaluation) to the code-review skill that
instructs the review agent to classify PR risk as Low/Medium/High based on:
- Pattern conformance vs. architectural shifts
- Security sensitivity (auth, crypto, permissions)
- New infrastructure dependencies
- Blast radius (isolated vs. shared code)
- Core system impact (agent orchestration, LLM prompts)

High-risk PRs are flagged with explicit escalation guidance recommending
human architect review and blocking auto-merge. The skill also supports
repo-specific risk rules defined in AGENTS.md.

Updates SKILL.md, README.md, and marketplace description. Adds tests
validating the risk evaluation content structure.

Closes #147

Co-authored-by: openhands <openhands@all-hands.dev>
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@OpenHands do you think if we should add these to codereview-roasted skill too?

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm on it! xingyaoww can track my progress at all-hands.dev

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good question! After comparing both skills, I think no — we should not add this to codereview-roasted, at least not in its current form. Here's my reasoning:

1. It already covers risk implicitly. The roasted skill has scenario 4 ("Breaking Change Risk Assessment" — "We don't break user space!") and scenario 5 ("Security and Correctness"), which already evaluate risk through the Linus Torvalds lens. Adding a separate formal risk classification would partially duplicate those.

2. It has its own rating system. The roasted skill uses a Taste Rating (🟢/🟡/🔴) and a VERDICT (✅ Worth merging / ❌ Needs rework). Adding a third rating dimension (risk level) would make the output format heavier without clear benefit.

3. Tone mismatch. The risk/safety evaluation in code-review is designed for structured, process-oriented workflows (AI codes → AI reviews → auto-merge decisions). The corporate escalation language ("Request review from a human architect") doesn't fit the Linus Torvalds voice.

4. Different audiences. code-review targets autonomous CI/CD pipelines where a formal risk gate is valuable. codereview-roasted targets developers who want brutally honest technical feedback — they're already the human reviewer in the loop.

That said, if we wanted a lighter adaptation in the future, we could add a one-liner to the roasted VERDICT section like "⚠️ This touches auth/infra — get a second pair of eyes" that fits the voice. But that feels like a separate PR.

This comment was generated by an AI assistant (OpenHands).

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Summary

The PR comment from @xingyaoww asked a question: "Should we add the risk/safety evaluation to the codereview-roasted skill too?"

I analyzed both skills and replied directly on the PR thread with my recommendation: No, we should not add it, for four reasons:

  1. Already covers risk implicitlycodereview-roasted has "Breaking Change Risk Assessment" (scenario 4) and "Security and Correctness" (scenario 5), which already evaluate risk through the Linus Torvalds lens.
  2. Has its own rating system — The Taste Rating (🟢/🟡/🔴) and VERDICT (✅ Worth merging / ❌ Needs rework) already serve a similar gatekeeping purpose; adding a third rating dimension would make the output heavier without clear benefit.
  3. Tone mismatch — The formal escalation language in the code-review risk evaluation ("Request review from a human architect") doesn't fit the Linus Torvalds voice of the roasted skill.
  4. Different audiencescode-review targets autonomous CI/CD pipelines needing a formal risk gate; codereview-roasted targets developers who are the human reviewer in the loop.

I suggested that a lighter, voice-appropriate adaptation (e.g., a one-liner in the VERDICT section) could be considered in a separate PR if desired.

No code changes were made — this was purely a question that was answered with a reply on the PR review thread.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@OpenHands i actually want to put code-review and codereview-roasted under the same structure, and the only difference between both would be the "style" of reviews. In that case, would it make sence to have risk/safety evaluation to be factored out as a sepatrate component that is shared among both skills?

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm on it! xingyaoww can track my progress at all-hands.dev

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great idea — yes, factoring it out as a shared component makes total sense if both skills will share the same structure with only style differences.

I've implemented this in commit 43f2a74. Here's the approach:

Shared component: Created skills/code-review/references/risk-evaluation.md containing the full risk evaluation framework (risk levels, risk factors, escalation guidance, repo-specific rules, and output format).

Both skills reference the same file:

  • code-review/SKILL.mdreferences/risk-evaluation.md
  • codereview-roasted/SKILL.md../code-review/references/risk-evaluation.md

This follows the existing references/ pattern used by other skills (e.g., babysit-pr/references/heuristics.md). The risk evaluation definition lives in one place, and any future updates only need to change a single file.

For codereview-roasted, I added it as scenario 7 (between Testing/Regression Proof and PR Description Evidence) and added a [RISK ASSESSMENT] block to the output format. The risk evaluation is tone-neutral since it lives in the shared reference — the Linus voice comes from the rest of the SKILL.md.

Tests restructured into 3 classes (14 tests, all passing, full suite at 100):

  • TestSharedRiskEvaluationReference — validates the shared file content
  • TestCodeReviewSkillReferencesRisk — validates code-review integration
  • TestRoastedSkillReferencesRisk — validates codereview-roasted integration

This also sets up well for the future unification — when both skills move under the same structure, the shared reference path simplifies.

This comment was generated by an AI assistant (OpenHands).

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Summary of Work

The PR comment from @xingyaoww asked whether it makes sense to factor out the risk/safety evaluation as a separate shared component, given the plan to unify code-review and codereview-roasted under the same structure where only the "style" differs.

Answer: Yes, it makes sense. I implemented it in commit 43f2a74.

Checklist — Was the request fully addressed?

  • Risk evaluation factored out as a shared component — Created skills/code-review/references/risk-evaluation.md containing the full framework (risk levels, factors, escalation guidance, repo-specific rules, output format)
  • Both skills reference the same shared filecode-review/SKILL.md references references/risk-evaluation.md; codereview-roasted/SKILL.md references ../code-review/references/risk-evaluation.md
  • codereview-roasted updated with risk evaluation — Added as scenario 7 with a [RISK ASSESSMENT] block in its output format
  • README.md files updated for both skills
  • Marketplace descriptions updated for both skills
  • Tests restructured — 14 tests across 3 classes (shared reference validation, code-review integration, codereview-roasted integration); full suite passes (100 tests)
  • Changes pushed to PR branch and reply posted on PR thread
  • PR description updated to reflect the new factored-out approach

Are the changes concise?

Yes. Every changed file directly supports the factoring-out request:

  • 1 new file (the shared reference)
  • 6 modified files (both SKILL.md, both README.md, marketplace JSON, test file)
  • No extraneous changes were introduced

Extract risk evaluation content into skills/code-review/references/risk-evaluation.md
as a shared component referenced by both code-review and codereview-roasted skills.

- Created references/risk-evaluation.md with risk levels, factors, escalation
  guidance, repo-specific rules, and output format
- Updated code-review/SKILL.md to reference the shared component
- Added risk evaluation scenario to codereview-roasted/SKILL.md referencing
  the same shared component via ../code-review/references/risk-evaluation.md
- Updated both README.md files and marketplace descriptions
- Restructured tests into 3 classes: shared reference validation,
  code-review integration, and codereview-roasted integration (14 tests)

Co-authored-by: openhands <openhands@all-hands.dev>
…ross-skill reference

Instead of referencing ../code-review/references/risk-evaluation.md directly,
create a symlink at codereview-roasted/references/risk-evaluation.md that points
to the canonical file in code-review. This keeps each skill self-contained while
maintaining a single source of truth.

- Added symlink: codereview-roasted/references/risk-evaluation.md -> ../../code-review/references/risk-evaluation.md
- Updated SKILL.md and README.md to use the local references/ path
- Updated tests to verify the symlink exists and resolves correctly

Co-authored-by: openhands <openhands@all-hands.dev>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Add risk/safety evaluation to the code review skill

2 participants