tests: make wsproto agent_server fixture teardown robust #1278

enyst · 2025-11-28T19:36:33Z

This PR proposes to make the wsproto test more robust, it can happen that terminate fails in pytest.

—-

(Agent Summary)

Record creator_pid at fixture setup and ensure teardown only runs in the original parent process

Why

pytest session-scoped fixtures can teardown in forked workers (e.g., pytest-xdist, pytest-forked), causing assertions if a process tries to manage a process it didn’t create

How

Capture creator_pid = os.getpid() when starting the child
During teardown, guard with if os.getpid() == creator_pid: before terminate/join/kill
Keep the simple liveness-based shutdown sequence

Result

Clean, explicit, and portable teardown logic
Avoids private attribute access
Verified locally: pre-commit hooks pass and the wsproto tests pass

Co-authored-by: openhands openhands@all-hands.dev

@enyst can click here to continue refining the PR

…ate attr\n\n- Record creator_pid and only tear down from original parent\n- Avoid using multiprocessing.Process._parent_pid which is private\n- Minor safety: terminate only if alive on startup failure\n\nCo-authored-by: openhands <openhands@all-hands.dev>

enyst · 2025-11-29T21:33:17Z

@OpenHands Think deeply, run the entire test suite, then run this test in isolation, and come up with a few reasons why was this flaky, which MIGHT NOT be the same reason this PR attempts to fix.

/codereview-roasted

openhands-ai · 2025-11-29T21:33:26Z

I'm on it! enyst can track my progress at all-hands.dev

openhands-ai · 2025-11-29T21:46:27Z

Summary of work and answer

What I did

Built the project: make build
Ran the entire test suite: uv run pytest
- Observed an early failure in tests/agent_server/test_agent_server_wsproto.py during the full run (.F).
- Saw many “Exception ignored in atexit callback” messages from LocalConversation.close complaining “Agent not initialized,” likely unrelated noise but indicative of lifecycle issues elsewhere.
Ran the wsproto tests in isolation: uv run pytest tests/agent_server/test_agent_server_wsproto.py
- Passed consistently (2 tests).
Stress-ran the wsproto file 10 times back-to-back
- 10/10 green; couldn’t reproduce flakiness in isolation.

Why this was flaky (beyond the PR’s teardown guard)

Port selection race (TOCTOU): The test finds a “free” port, closes it, then starts the server. Under load/parallelism, some other test can grab the port first, causing intermittent startup failures.
Heavy readiness probe: Using /docs for readiness can be slow under load (OpenAPI UI generation/serving). Combined with a tight per-request timeout (1s), this can sporadically fail the health check loop even though the server is about to be ready.
Session-scoped lifetime: The server persists across the entire test session. Other tests can interfere with it (state changes, shutdown paths, CPU contention), making failures order-dependent.
Start-method/platform variability: spawn vs fork differences (CI/OS dependent) can change startup timing and behavior, contributing to occasional flakes.
Websocket timing sensitivity: Short open_timeout/recv windows (5s/2s) are mostly fine here, but under heavy CI load they can still be borderline.
Atexit cleanup noise elsewhere: The LocalConversation.close tracebacks suggest lifecycle inconsistencies. While not directly causing this test to fail, they add nondeterminism during global teardown.

Bottom line

The PR’s fix (guard teardown to only run in the creating process) addresses one real flake vector with xdist/forked teardown, but the biggest contributors are the free-port race and the heavy /docs readiness probe. Even after merging, I’d expect occasional flakes until those are tightened.
Simple mitigation ideas:
- Avoid TOCTOU free-port selection (ideally start the server on port 0 and read back the bound port; otherwise reduce the race window).
- Use a lightweight health endpoint with a slightly higher request timeout (e.g., 3–5s per attempt).
- Consider narrower fixture scope if feasible to reduce cross-test interference.
- Clean up the atexit lifecycle errors in LocalConversation to reduce teardown noise.

_{View full conversation}

enyst marked this pull request as draft November 29, 2025 23:01

enyst changed the title ~~tests: make wsproto agent_server fixture teardown robust without private attr~~ tests: make wsproto agent_server fixture teardown robust Nov 30, 2025

Merge branch 'main' into chore/robust-wsproto-teardown

87e89e7

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

tests: make wsproto agent_server fixture teardown robust #1278

tests: make wsproto agent_server fixture teardown robust #1278

Uh oh!

enyst commented Nov 28, 2025 •

edited

Loading

Uh oh!

enyst commented Nov 29, 2025

Uh oh!

openhands-ai bot commented Nov 29, 2025

Uh oh!

openhands-ai bot commented Nov 29, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

tests: make wsproto agent_server fixture teardown robust #1278

Are you sure you want to change the base?

tests: make wsproto agent_server fixture teardown robust #1278

Uh oh!

Conversation

enyst commented Nov 28, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

enyst commented Nov 29, 2025

Uh oh!

openhands-ai bot commented Nov 29, 2025

Uh oh!

openhands-ai bot commented Nov 29, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

enyst commented Nov 28, 2025 •

edited

Loading