Draft
19 commits
7564581
feat: add qa-changes plugin for automated PR QA validation
openhands-agent Apr 2, 2026
b6906a6
refactor: rewrite QA methodology - CI-aware, high-bar exercise, grace…
openhands-agent Apr 2, 2026
1c3516e
fix: add tmux as system dependency in action.yml
openhands-agent Apr 2, 2026
54ccfc4
fix: use Plugin.load() instead of manual skill loading
openhands-agent Apr 2, 2026
e5a31e4
fix: address review feedback — enable browser, harden security, add m…
openhands-agent Apr 6, 2026
454a555
feat: add cost guard, timeout, iteration limit, and tests
openhands-agent Apr 6, 2026
f6b6365
chore: bump default max-iterations from 200 to 500
openhands-agent Apr 6, 2026
57c6579
fix: replace unsupported timeout-minutes in composite action with she…
openhands-agent Apr 6, 2026
6d4cbae
chore: remove accidentally committed uv.lock
openhands-agent Apr 6, 2026
46d4769
feat(qa-changes): add Laminar tracing and evaluation workflow
openhands-agent Apr 6, 2026
fd9259c
test: add tests for evaluate_qa_changes.py
openhands-agent Apr 6, 2026
1938a52
chore: remove accidentally committed uv.lock
openhands-agent Apr 6, 2026
7a3a4c3
feat(qa-changes): post QA report as code review via /github-pr-review
openhands-agent Apr 6, 2026
ea104ea
chore: add uv.lock to .gitignore and remove from tracking
openhands-agent Apr 6, 2026
91189b8
fix(qa-changes): link github-pr-review skill to plugin
openhands-agent Apr 6, 2026
8da2d3e
qa-changes: use compact report format with collapsible evidence
openhands-agent Apr 12, 2026
6d8134e
feat: make qa-changes agent focus on whether PR fixes the original issue
openhands-agent Apr 13, 2026
97eedf0
refactor: reframe QA from "fix the issue" to "achieve the stated goal"
openhands-agent Apr 13, 2026
ef0c987
feat: require explicit before/after verification narrative in QA reports
openhands-agent Apr 13, 2026
3 changes: 3 additions & 0 deletions .gitignore
@@ -20,3 +20,6 @@ env/
# OS
.DS_Store
Thumbs.db

# uv lock file (no application code in this repo)
uv.lock
13 changes: 13 additions & 0 deletions marketplaces/default.json
@@ -326,6 +326,19 @@
"documents"
]
},
{
"name": "qa-changes",
"source": "./qa-changes",
"description": "Validate pull request changes by actually running the code \u2014 setting up the environment, exercising changed behavior, and posting a structured QA report.",
"category": "quality-assurance",
"keywords": [
"qa",
"testing",
"pull-request",
"validation",
"automation"
]
},
{
"name": "readiness-report",
"source": "./readiness-report",
8 changes: 8 additions & 0 deletions plugins/qa-changes/.plugin/plugin.json
@@ -0,0 +1,8 @@
{
"name": "qa-changes",
"version": "0.1.0",
"description": "Automated QA validation of PR changes — sets up the environment, runs tests, exercises changed behavior, and reports results",
"author": "OpenHands",
"license": "MIT",
"repository": "https://github.com/OpenHands/extensions"
}
185 changes: 185 additions & 0 deletions plugins/qa-changes/README.md
@@ -0,0 +1,185 @@
# QA Changes Plugin

Automated pull request QA validation using OpenHands agents. Unlike the [PR Review plugin](../pr-review/), which reads diffs and posts code review comments, this plugin **actually runs the code**: it sets up the environment, executes the test suite, exercises changed behavior, and posts a structured QA report.

## Quick Start

Copy the workflow file to your repository:

```bash
cp plugins/qa-changes/workflows/qa-changes-by-openhands.yml \
.github/workflows/qa-changes-by-openhands.yml
```

Then configure the required secrets (see [Installation](#installation) below).

## How It Works

The QA agent follows a four-phase methodology:

1. **Understand** — Reads the PR diff, title, and description. Classifies changes and identifies entry points (CLI commands, API endpoints, UI pages).
2. **Setup** — Bootstraps the repo: installs dependencies, builds the project. Checks CI status and only runs tests CI does not cover.
3. **Exercise** — The core phase. Actually uses the software the way a human would: spins up servers, opens browsers, runs CLI commands, makes HTTP requests. The bar is high — "tests pass" is not enough.
4. **Report** — Posts a structured QA report as a PR comment with evidence (commands, outputs, screenshots) and a verdict.

The agent knows when to give up: if a verification approach fails after three materially different attempts, it switches to a different approach. If two fundamentally different approaches fail, it reports honestly what could not be verified and suggests `AGENTS.md` guidance for future runs.
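
The escalation rule above can be sketched as a small loop. This is an illustrative sketch only, not the plugin's implementation: `Approach`, `verify_with_fallback`, and the attempt callables are hypothetical names, and only the two thresholds (three attempts per approach, two approaches) come from the text.

```python
from dataclasses import dataclass
from typing import Callable, List

MAX_ATTEMPTS_PER_APPROACH = 3  # "three materially different attempts"
MAX_APPROACHES = 2             # "two fundamentally different approaches"


@dataclass
class Approach:
    """A hypothetical verification approach, e.g. 'browser' or 'http'."""
    name: str
    attempts: List[Callable[[], bool]]  # each callable returns True on success


def verify_with_fallback(approaches: List[Approach]) -> dict:
    """Try each approach up to the attempt limit, then move on or give up."""
    tried = []
    for approach in approaches[:MAX_APPROACHES]:
        tried.append(approach.name)
        for attempt in approach.attempts[:MAX_ATTEMPTS_PER_APPROACH]:
            if attempt():
                return {"verified": True, "approach": approach.name, "tried": tried}
    # Nothing worked: the caller reports what could not be verified.
    return {"verified": False, "approach": None, "tried": tried}
```

When the result has `verified` set to `False`, the `tried` list is what a report would cite under "Unable to Verify".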

## Plugin Contents

```
plugins/qa-changes/
├── README.md              # This file
├── action.yml             # Composite GitHub Action
├── skills/                # Symbolic links to QA skills
│   └── qa-changes -> ../../../skills/qa-changes
├── workflows/             # Example GitHub workflow files
│   └── qa-changes-by-openhands.yml
└── scripts/               # Python scripts for QA execution
    ├── agent_script.py    # Main QA agent script
    └── prompt.py          # Prompt template
```

## Installation

### 1. Copy the Workflow File

Copy the workflow file to your repository's `.github/workflows/` directory:

```bash
mkdir -p .github/workflows
cp plugins/qa-changes/workflows/qa-changes-by-openhands.yml \
.github/workflows/qa-changes-by-openhands.yml
```

### 2. Configure Secrets

Add the following secrets in your repository settings (**Settings → Secrets and variables → Actions**):

| Secret | Required | Description |
|--------|----------|-------------|
| `LLM_API_KEY` | Yes | API key for your LLM provider |
| `GITHUB_TOKEN` | Auto | Provided automatically by GitHub Actions |

### 3. Create the QA Label (Optional)

Create a `qa-this` label for manual QA triggers:

1. Go to **Issues → Labels**
2. Click **New label**
3. Name: `qa-this`
4. Description: `Trigger OpenHands QA validation`

## Usage

### Automatic Triggers

QA validation is automatically triggered when:
- A new non-draft PR is opened (by trusted contributors)
- A draft PR is marked as ready for review
- The `qa-this` label is added
- `openhands-agent` is requested as a reviewer
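
The trigger conditions above can be expressed as a predicate over a `pull_request` webhook payload. The sketch below is a hedged illustration, not the workflow's actual `if:` expression: `should_run_qa` is a hypothetical helper, and applying the author-association check only to the `opened` event is an assumption based on the wording of the list (the excluded associations are taken from the Security section).

```python
EXCLUDED_ASSOCIATIONS = {"FIRST_TIME_CONTRIBUTOR", "NONE"}
QA_LABEL = "qa-this"
QA_REVIEWER = "openhands-agent"


def should_run_qa(event: dict) -> bool:
    """Decide whether a pull_request event payload should start a QA run."""
    action = event.get("action")
    pr = event.get("pull_request") or {}

    if action == "opened":
        # New PRs run only for non-draft PRs from trusted authors.
        trusted = pr.get("author_association") not in EXCLUDED_ASSOCIATIONS
        return trusted and not pr.get("draft", False)
    if action == "ready_for_review":
        return True
    if action == "labeled":
        return (event.get("label") or {}).get("name") == QA_LABEL
    if action == "review_requested":
        return (event.get("requested_reviewer") or {}).get("login") == QA_REVIEWER
    return False
```

Any other event, such as a new push to the PR branch, falls through to `False` in this sketch.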

### Requesting QA

**Option 1: Add Label**

Add the `qa-this` label to any PR.

**Option 2: Request as Reviewer**

Request `openhands-agent` as a reviewer on the PR.

## Action Inputs

| Input | Required | Default | Description |
|-------|----------|---------|-------------|
| `llm-model` | No | `anthropic/claude-sonnet-4-5-20250929` | LLM model to use |
| `llm-base-url` | No | `''` | Custom LLM endpoint URL |
| `extensions-repo` | No | `OpenHands/extensions` | Extensions repository |
| `extensions-version` | No | `main` | Git ref (tag, branch, or SHA) |
| `max-budget` | No | `10.0` | Maximum LLM cost in dollars — agent stops when exceeded |
| `timeout-minutes` | No | `30` | Wall-clock timeout for the QA step |
| `max-iterations` | No | `500` | Maximum agent iterations (each is one LLM call + action) |
| `lmnr-api-key` | No | `''` | Laminar API key for observability |
| `llm-api-key` | Yes | - | LLM API key |
| `github-token` | Yes | - | GitHub token for API access |
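
The action passes the three limit inputs to the agent script as the environment variables `MAX_BUDGET`, `MAX_ITERATIONS`, and `TIMEOUT_MINUTES`. As a minimal sketch of how a script could consume them, assuming the defaults from the table above: `QALimits`, `load_limits`, and `over_budget` are hypothetical names and do not describe what `agent_script.py` actually does.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class QALimits:
    max_budget: float     # dollars
    max_iterations: int
    timeout_seconds: int


def load_limits(env: Mapping[str, str] = os.environ) -> QALimits:
    """Read the limit inputs from the environment, falling back to the defaults."""
    max_budget = float(env.get("MAX_BUDGET", "10.0"))
    max_iterations = int(env.get("MAX_ITERATIONS", "500"))
    timeout_minutes = int(env.get("TIMEOUT_MINUTES", "30"))
    if max_budget <= 0 or max_iterations <= 0 or timeout_minutes <= 0:
        raise ValueError("limits must be positive")
    return QALimits(max_budget, max_iterations, timeout_minutes * 60)


def over_budget(accumulated_cost: float, limits: QALimits) -> bool:
    """True once spending has exceeded the configured budget."""
    return accumulated_cost > limits.max_budget
```

Failing fast on a non-positive value is a design choice of this sketch: a mistyped input then stops the run before any LLM call is made.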

## QA Report Format

The agent posts a PR comment with this structure:

```
## QA Report

**Summary**: [One-sentence verdict]

### Environment Setup
[Build/install results]

### CI & Test Status
[CI check results, any additional tests run beyond CI]

### Functional Verification
[Commands run, outputs observed, screenshots, behavior verified]

### Unable to Verify (if applicable)
[What could not be verified, what was attempted, suggested AGENTS.md guidance]

### Issues Found
- 🔴 **Blocker**: [Description]
- 🟠 **Issue**: [Description]
- 🟡 **Minor**: [Description]

### Verdict
✅ PASS / ⚠️ PASS WITH ISSUES / ❌ FAIL / 🟡 PARTIAL
```
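
To make the structure concrete, here is a hedged sketch of a renderer that emits a comment in this shape. `QAReport` and `render_report` are hypothetical names chosen for illustration; in the plugin the agent writes the report itself, so this only mirrors the section layout shown above and omits empty sections.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

SEVERITY_ICONS = {"blocker": "🔴", "issue": "🟠", "minor": "🟡"}


@dataclass
class QAReport:
    summary: str
    verdict: str
    setup: str = ""
    ci_status: str = ""
    verification: str = ""
    unable_to_verify: str = ""
    # (severity, description) pairs; severity is a key of SEVERITY_ICONS
    issues: List[Tuple[str, str]] = field(default_factory=list)


def render_report(report: QAReport) -> str:
    """Build the Markdown body, skipping sections that have no content."""
    lines = ["## QA Report", "", f"**Summary**: {report.summary}", ""]
    sections = [
        ("Environment Setup", report.setup),
        ("CI & Test Status", report.ci_status),
        ("Functional Verification", report.verification),
        ("Unable to Verify", report.unable_to_verify),
    ]
    for title, body in sections:
        if body:
            lines += [f"### {title}", body, ""]
    if report.issues:
        lines.append("### Issues Found")
        for severity, description in report.issues:
            icon = SEVERITY_ICONS[severity]
            lines.append(f"- {icon} **{severity.capitalize()}**: {description}")
        lines.append("")
    lines += ["### Verdict", report.verdict]
    return "\n".join(lines)
```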

## Customizing QA Guidelines

Add project-specific QA guidelines to your repository:

### Option 1: Custom QA Skill

Create `.agents/skills/qa-guide.md`:

```markdown
---
name: qa-guide
description: Project-specific QA guidelines
triggers:
- /qa-changes
---

# Project QA Guidelines

## Setup Commands
- `make install` to install dependencies
- `make build` to build the project

## Test Commands
- `make test` for unit tests
- `make test-integration` for integration tests

## Key Behaviors to Verify
- [List critical user flows]
- [List known fragile areas]
```

### Option 2: Repository AGENTS.md

Add setup and test commands to `AGENTS.md` at your repository root. The agent reads this file automatically.

## Security

The workflow uses `pull_request` (not `pull_request_target`) so that fork PRs do **not** get access to the base repository's secrets. Since the QA agent *executes* code from the PR (unlike a code-review agent which only reads diffs), using `pull_request_target` would allow untrusted fork code to run with the repo's `GITHUB_TOKEN` and `LLM_API_KEY`.

The trade-off is that fork PRs won't have access to repository secrets and the QA workflow won't run for them. Maintainers can run QA locally or set up a separate trusted workflow for those cases.

**Note**: The `FIRST_TIME_CONTRIBUTOR` and `NONE` author associations are excluded from automatic triggers as an additional safety layer.

## Contributing

See the main [extensions repository](https://github.com/OpenHands/extensions) for contribution guidelines.

## License

This plugin is part of the OpenHands extensions repository. See [LICENSE](/LICENSE) for details.
151 changes: 151 additions & 0 deletions plugins/qa-changes/action.yml
@@ -0,0 +1,151 @@
---
name: OpenHands QA Changes
description: Automated QA validation of PR changes using OpenHands agent
author: OpenHands

branding:
  icon: check-circle
  color: green

inputs:
  llm-model:
    description: LLM model to use for QA validation.
    required: false
    default: anthropic/claude-sonnet-4-5-20250929
  llm-base-url:
    description: LLM base URL (optional, for custom LLM endpoints)
    required: false
    default: ''
  extensions-repo:
    description: GitHub repository for extensions (owner/repo)
    required: false
    default: OpenHands/extensions
  extensions-version:
    description: Git ref to use for extensions (tag, branch, or commit SHA)
    required: false
    default: main
  max-budget:
    description: Maximum LLM cost in dollars. The agent stops when this budget is exceeded.
    required: false
    default: '10.0'
  timeout-minutes:
    description: Maximum wall-clock time in minutes for the QA job.
    required: false
    default: '30'
  max-iterations:
    description: Maximum number of agent iterations (each iteration is one LLM call + action).
    required: false
    default: '500'
  llm-api-key:
    description: LLM API key (required)
    required: true
  github-token:
    description: GitHub token for API access (required)
    required: true
  lmnr-api-key:
    description: Laminar API key for observability (optional)
    required: false
    default: ''

runs:
  using: composite
  steps:
    - name: Checkout extensions repository
      uses: actions/checkout@v4
      with:
        repository: ${{ inputs.extensions-repo }}
        ref: ${{ inputs.extensions-version }}
        path: extensions

    - name: Checkout PR repository
      uses: actions/checkout@v4
      with:
        repository: ${{ github.event.pull_request.head.repo.full_name }}
        ref: ${{ github.event.pull_request.head.ref }}
        fetch-depth: 0
        persist-credentials: false
        path: pr-repo
        submodules: recursive

    - name: Set up Python
      uses: actions/setup-python@v5
      with:
        python-version: '3.12'

    - name: Install uv
      uses: astral-sh/setup-uv@v6
      with:
        enable-cache: true

    - name: Install system dependencies
      shell: bash
      run: |
        sudo apt-get update
        # gh: GitHub CLI for posting QA reports
        # tmux: required by the OpenHands agent runtime
        sudo apt-get install -y gh tmux

    - name: Check required configuration
      shell: bash
      env:
        LLM_API_KEY: ${{ inputs.llm-api-key }}
        GITHUB_TOKEN: ${{ inputs.github-token }}
        # The PR title is author-controlled. Pass it through the environment
        # instead of expanding it inline, which would allow shell injection.
        PR_TITLE: ${{ github.event.pull_request.title }}
      run: |
        if [ -z "$LLM_API_KEY" ]; then
          echo "Error: llm-api-key is required."
          exit 1
        fi
        if [ -z "$GITHUB_TOKEN" ]; then
          echo "Error: github-token is required."
          exit 1
        fi

        echo "PR Number: ${{ github.event.pull_request.number }}"
        echo "PR Title: $PR_TITLE"
        echo "Repository: ${{ github.repository }}"
        echo "LLM model: ${{ inputs.llm-model }}"

    - name: Run QA validation
      shell: bash
      env:
        LLM_MODEL: ${{ inputs.llm-model }}
        LLM_BASE_URL: ${{ inputs.llm-base-url }}
        LLM_API_KEY: ${{ inputs.llm-api-key }}
        GITHUB_TOKEN: ${{ inputs.github-token }}
        LMNR_PROJECT_API_KEY: ${{ inputs.lmnr-api-key }}
        MAX_BUDGET: ${{ inputs.max-budget }}
        MAX_ITERATIONS: ${{ inputs.max-iterations }}
        TIMEOUT_MINUTES: ${{ inputs.timeout-minutes }}
        PR_NUMBER: ${{ github.event.pull_request.number }}
        PR_TITLE: ${{ github.event.pull_request.title }}
        PR_BODY: ${{ github.event.pull_request.body }}
        PR_BASE_BRANCH: ${{ github.event.pull_request.base.ref }}
        PR_HEAD_BRANCH: ${{ github.event.pull_request.head.ref }}
        REPO_NAME: ${{ github.repository }}
      run: |
        cd pr-repo
        # timeout-minutes is not supported on composite action steps,
        # so we enforce the time limit via the coreutils timeout command.
        TIMEOUT_SECONDS=$((TIMEOUT_MINUTES * 60))
        timeout "${TIMEOUT_SECONDS}" \
          uv run --no-project --with openhands-sdk --with openhands-tools --with lmnr \
          python ../extensions/plugins/qa-changes/scripts/agent_script.py

    - name: Upload logs as artifact
      uses: actions/upload-artifact@v4
      if: always()
      with:
        name: openhands-qa-changes-logs
        path: |
          *.log
          output/
        retention-days: 7

    - name: Upload Laminar trace info for evaluation
      uses: actions/upload-artifact@v4
      if: success()
      with:
        name: qa-changes-trace-${{ github.event.pull_request.number }}
        path: pr-repo/laminar_trace_info.json
        retention-days: 30
        if-no-files-found: ignore