fix: align standalone GRPO with WAA API format and add retry logic by abrichr · Pull Request #193 · OpenAdaptAI/openadapt-evals

abrichr · 2026-03-24T04:06:33Z

Summary

Fix screenshot(): WAA's /screenshot returns raw PNG bytes via send_file(), not base64-encoded JSON. Changed from resp.json() to resp.content (matching WAALiveAdapter)
Fix execute_action(): WAA's /execute_windows uses exec(command) directly, not subprocess. Removed python -c "..." wrapper that caused SyntaxError inside the VM. Now sends bare Python statements (matching WAALiveAdapter._build_pixel_command)
Fix train script import: scripts/train_grpo_standalone.py triggered openadapt_evals/__init__.py which eagerly imports open_clip (via demo_library), causing numpy ABI crashes. Now shims sys.modules to bypass the top-level init
Add retry logic: screenshot() retries 3 times with 2s delay; trainer does pre-rollout health check via new probe() method; training loop handles empty rollout groups gracefully
Add missing action types: double_click, right_click, scroll (matching WAALiveAdapter)

These two API format bugs (screenshot parsing + execute wrapping) are the root cause of the standalone GRPO trainer producing zero rewards.
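Both failure modes can be reproduced without a running WAA server. The sketch below is illustrative only: `parse_as_json` and `run_with_exec` are hypothetical stand-ins (not functions from this repository) for the old `resp.json()` path and for an endpoint that hands the command string to `exec()`.

```python
import json

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"  # the fixed 8-byte header of every PNG file


def parse_as_json(body: bytes) -> str:
    """Mimic the old screenshot() path: treat the response body as JSON."""
    try:
        json.loads(body)
        return "ok"
    except ValueError as exc:
        # JSONDecodeError and UnicodeDecodeError both subclass ValueError.
        return f"failed: {type(exc).__name__}"


def run_with_exec(command: str) -> str:
    """Hypothetical stand-in for a server that passes the command to exec()."""
    try:
        exec(command, {})
        return "ok"
    except SyntaxError:
        return "failed: SyntaxError"


# Stand-in for a raw PNG response body (signature plus filler bytes).
raw_png_body = PNG_SIGNATURE + b"\x00" * 16

print(parse_as_json(raw_png_body))               # raw bytes are not JSON
print(run_with_exec('python -c "x = 1 + 1"'))    # shell-style wrapper is not Python
print(run_with_exec("x = 1 + 1"))                # bare statement runs
```

The wrapped string is a shell command line, not a Python statement, so an `exec()`-based endpoint rejects it before any action runs.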

Test plan

With WAA running, verify WAADirect.screenshot() returns valid PNG bytes (len > 100, parseable by PIL)
Verify WAADirect.execute_action(SimpleAction(type="click", x=500, y=500)) succeeds (status 200, no SyntaxError)
Verify python scripts/train_grpo_standalone.py --help works without importing open_clip
Verify WAADirect.probe() returns {"reachable": True, "screenshot_ok": True} when server is up
Verify WAADirect.health_check() returns False when server is down (no hang)
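The `--help` check above relies on importing a submodule without executing the package's `__init__.py`. A minimal sketch of that `sys.modules` technique follows, assuming a throwaway package named `demo_pkg` (hypothetical, built in a temp directory); the actual shim in scripts/train_grpo_standalone.py may differ.

```python
import importlib
import sys
import tempfile
import types
from pathlib import Path

# Build a throwaway package whose __init__.py fails on import,
# standing in for a package that eagerly imports a heavy dependency.
root = Path(tempfile.mkdtemp())
pkg_dir = root / "demo_pkg"  # hypothetical package name
pkg_dir.mkdir()
(pkg_dir / "__init__.py").write_text("raise ImportError('heavy dependency unavailable')\n")
(pkg_dir / "light.py").write_text("VALUE = 42\n")

# Register an empty stand-in for the package before importing any submodule.
# __path__ tells the import system where to find submodules on disk;
# because the parent is already in sys.modules, its __init__.py never runs.
shim = types.ModuleType("demo_pkg")
shim.__path__ = [str(pkg_dir)]
sys.modules["demo_pkg"] = shim

light = importlib.import_module("demo_pkg.light")
print(light.VALUE)  # → 42
```

The trade-off is that anything the real `__init__.py` would have exported is absent from the shim, so this only suits scripts that import specific submodules.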

🤖 Generated with Claude Code

The standalone GRPO trainer produced zero rewards due to two API format bugs in WAADirect:

1. screenshot() tried resp.json() expecting base64-encoded JSON, but WAA's /screenshot returns raw PNG bytes via Flask's send_file(). Fixed to use resp.content (matching WAALiveAdapter).
2. execute_action() wrapped commands in `python -c "..."`, but WAA's /execute_windows uses exec() directly -- the wrapper caused SyntaxError inside the VM. Fixed to send bare Python statements (matching WAALiveAdapter._build_pixel_command).

Additional improvements:

- Add probe() method for structured health checking
- Add screenshot retry logic (3 attempts with 2s delay)
- Add double_click, right_click, scroll action types
- Fix type action to click target first then type (match WAALiveAdapter)
- Add pre-rollout health check in trainer._collect_group()
- Handle empty rollouts gracefully in training loop
- Fix train script to bypass openadapt_evals/__init__.py eager imports (open_clip -> numpy ABI crash in minimal training environments)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
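The screenshot retry behaviour (3 attempts with a 2s delay) could look roughly like the sketch below. It is not the code from this PR: `fetch_png_with_retry`, its injectable `fetch` and `sleep` parameters, the PNG-signature check, and returning `None` on failure are all assumptions made so the example runs without a server.

```python
import time
from typing import Callable, Optional

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"  # the fixed 8-byte header of every PNG file


def fetch_png_with_retry(
    fetch: Callable[[], bytes],
    attempts: int = 3,
    delay_s: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[bytes]:
    """Call fetch() up to `attempts` times; return the first PNG-looking body."""
    for attempt in range(1, attempts + 1):
        try:
            data = fetch()
            if data.startswith(PNG_SIGNATURE):
                return data
        except OSError:
            # Connection errors subclass OSError; treat them as a failed attempt.
            pass
        if attempt < attempts:
            sleep(delay_s)  # wait before the next attempt, not after the last
    return None


# Usage with a fake fetcher that fails twice, then returns PNG-like bytes.
calls = {"n": 0}


def flaky_fetch() -> bytes:
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("server not ready")
    return PNG_SIGNATURE + b"fake image payload"


result = fetch_png_with_retry(flaky_fetch, sleep=lambda _s: None)
print(calls["n"], result is not None)
```

Injecting `sleep` keeps the delay out of tests, and validating the signature means a non-PNG body (for example an error page) counts as a failed attempt rather than a screenshot.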

abrichr merged commit 43cac1c into main Mar 24, 2026
1 check passed

abrichr merged 1 commit into main from fix/standalone-grpo-waa-compat
