Draft
Each pytest process now auto-acquires an exclusive slot via file locks, giving it isolated PostgreSQL databases, Redis DBs, and Kafka topics. This enables safe concurrent test execution without shared-state collisions.
Key changes:
- New isolation module with slot allocation and per-resource helpers
- Session-scoped ClickHouse reset (reset_snuba) instead of per-test
- Snowflake ID preservation across Redis flushdb to prevent ID reuse
- Unique snowflake_id per worker for non-colliding model creation
- Worker-aware Kafka topics and Relay container names
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
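For readers unfamiliar with file-lock slot allocation, the idea can be sketched roughly as below. This is only an illustration, not the code in the isolation module: `acquire_slot`, `resource_names`, the lock-file layout, and the resource naming scheme are all hypothetical, and `fcntl` limits the sketch to POSIX systems.

```python
import fcntl
import os


def acquire_slot(lock_dir, max_slots=16):
    """Return (slot, fd) for the first slot whose lock file can be locked.

    Hypothetical helper: the caller keeps fd open for the whole test
    session, and the lock is released when fd is closed or the process exits.
    """
    os.makedirs(lock_dir, exist_ok=True)
    for slot in range(max_slots):
        path = os.path.join(lock_dir, f"slot-{slot}.lock")
        fd = os.open(path, os.O_CREAT | os.O_RDWR)
        try:
            # Non-blocking exclusive lock: fails at once if another holder exists.
            fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except BlockingIOError:
            os.close(fd)
            continue
        return slot, fd
    raise RuntimeError("no free test slot")


def resource_names(slot):
    """Derive per-slot resource names (naming scheme assumed for illustration)."""
    return {
        "postgres_suffix": f"_{slot}",
        "redis_db": slot,
        "kafka_topic_suffix": f"-{slot}",
    }
```

Because the lock is tied to the open file descriptor, a crashed pytest process frees its slot automatically, which is presumably why file locks were chosen over an explicit registry.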
Wire xdist's PYTEST_XDIST_WORKER env var ("gw0", "gw1", ...) into the
isolation module so each xdist worker gets its own DB/Redis/Kafka slot.
- isolation.py: parse xdist gateway ID to numeric worker slot
- sentry.py: add pytest_xdist_setupnodes for ClickHouse reset and
DJANGO_SETTINGS_MODULE stripping before workers spawn
- fixtures.py: skip session-scoped reset_snuba on xdist workers
- env.py: recognize PYTEST_XDIST_WORKER for in_test_environment()
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
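Parsing the gateway ID into a numeric slot might look roughly like this. The function name `xdist_worker_slot` and its signature are assumptions for illustration; only the `PYTEST_XDIST_WORKER` variable and its "gw0", "gw1", ... format come from the commit message above.

```python
import os
import re


def xdist_worker_slot(environ=None):
    """Return the numeric worker slot from PYTEST_XDIST_WORKER, or None.

    Hypothetical helper: "gw0" -> 0, "gw12" -> 12. Returns None when the
    variable is unset, i.e. when not running under an xdist worker.
    """
    environ = os.environ if environ is None else environ
    worker = environ.get("PYTEST_XDIST_WORKER")
    if worker is None:
        return None
    match = re.fullmatch(r"gw(\d+)", worker)
    if match is None:
        raise ValueError(f"unexpected xdist worker id: {worker!r}")
    return int(match.group(1))
```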
Scope test assertions to their own org/project IDs so concurrent workers with shared ClickHouse tables don't see each other's data. Add unique insert IDs to prevent ClickHouse deduplication across tests.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Contributor
🚨 Warning: This pull request contains Frontend and Backend changes! It's discouraged to make changes to Sentry's Frontend and Backend in a single pull request. The Frontend and Backend are not atomically deployed. If the changes are interdependent of each other, they must be separated into two pull requests and be made forward or backwards compatible, such that the Backend or Frontend can be safely deployed independently. Have questions? Please ask in the
Remove SENTRY_PYTEST_SERIAL and SENTRY_TEST_WORKER_ID env vars. Every pytest process now unconditionally acquires a file-lock slot, giving it an isolated Redis DB, PostgreSQL suffix, and Kafka topics. No configuration is needed: this works automatically across xdist workers, plain pytest invocations, and concurrent runs in separate worktrees.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
ac20c31 to
4d85e84
Replace xdist's default load-balancing scheduler with a deterministic one that assigns test files to workers via round-robin and preserves collection order within each file. This prevents test pollution and flakiness caused by shuffled execution order.
Key design: the xdist worker protocol requires a "shutdown" command to trigger execution of the last queued test, so the scheduler sends all work upfront then immediately shuts down each node.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
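The file-level round-robin assignment described above could be sketched as follows. This shows only the assignment step, not the scheduler's interaction with xdist nodes; `assign_by_file` is a hypothetical name, and the sketch assumes pytest node IDs of the form `path::test_name`.

```python
def assign_by_file(node_ids, num_workers):
    """Assign tests to workers by file, round-robin, preserving order.

    Hypothetical helper: files are dealt out to workers in the order they
    first appear, and every test stays in its collection order on the
    worker that owns its file.
    """
    file_to_worker = {}
    assignments = [[] for _ in range(num_workers)]
    for node_id in node_ids:
        path = node_id.split("::", 1)[0]
        if path not in file_to_worker:
            # The n-th distinct file goes to worker n mod num_workers.
            file_to_worker[path] = len(file_to_worker) % num_workers
        assignments[file_to_worker[path]].append(node_id)
    return assignments
```

Keeping whole files on one worker means module-scoped fixtures and any order-dependent tests within a file behave the same as in a serial run.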
Dynamic silo test classes (e.g. MyTest__InControlMode, MyTest__InRegionMode) were created by iterating a frozenset, whose order varies across Python processes due to hash randomization. This caused xdist workers to collect tests in different orders, aborting the run with a collection diff error.
Sort the silo modes before creating dynamic classes so all workers see identical collection order.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
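A minimal sketch of the fix, assuming an enum of silo modes: the `SiloMode` class here is a stand-in, and `make_silo_variants` is a hypothetical helper rather than the actual decorator. The point is only that sorting by a stable key removes the dependence on set iteration order.

```python
import enum


class SiloMode(enum.Enum):
    # Stand-in for the real silo mode enum; the members are assumed.
    CONTROL = "CONTROL"
    REGION = "REGION"


def make_silo_variants(base, modes):
    """Create one subclass of base per silo mode, in a stable order."""
    variants = []
    # Sorting by name makes the order identical in every process,
    # regardless of how the frozenset happens to iterate.
    for mode in sorted(modes, key=lambda m: m.name):
        name = f"{base.__name__}__In{mode.name.capitalize()}Mode"
        variants.append(type(name, (base,), {"silo_mode": mode}))
    return variants
```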
Workers using set/dict-based pytest.mark.parametrize produce collections in different orders due to hash randomization across Python processes. The scheduler now compares sorted collections (same tests, any order) instead of requiring identical ordering.
Also builds a canonical sorted collection for deterministic round-robin assignment, and uses O(1) index lookup per worker instead of list.index.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
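The order-insensitive comparison and the per-worker index lookup might look roughly like this. Both function names are hypothetical, and the sketch leaves out how the scheduler reports a mismatch.

```python
def same_tests(collections):
    """True if every worker collected the same tests, in any order."""
    if not collections:
        return True
    canonical = sorted(collections[0])
    return all(sorted(c) == canonical for c in collections[1:])


def build_index(collection):
    """Map each node ID to its position in one worker's own collection.

    Built once per worker, so translating a node ID into that worker's
    index is a dict lookup instead of a linear list.index scan.
    """
    return {node_id: i for i, node_id in enumerate(collection)}
```

With the canonical sorted list driving assignment, each test's destination is the same no matter which worker's ordering is consulted, and `build_index` translates the canonical node ID back into the index that a given worker expects.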
The rerunfailures plugin uses socket-based IPC between controller and workers that hangs during worker startup, causing 60-minute CI timeouts. Our deterministic scheduler sends all work upfront and shuts down nodes immediately, which is incompatible with the plugin's connection model.
Disable the plugin when -n is specified. Reruns are not meaningful with the deterministic scheduler anyway since work cannot be re-distributed.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
pytest-rerunfailures <=16.1 has a bug in SocketDB._sock_recv where recv(1) returning b"" on a closed connection never matches the newline delimiter, causing an infinite loop. The server's run_connection threads also crash on TimeoutError which isn't caught by suppress(ConnectionError).
Monkey-patch _sock_recv to raise ConnectionError on EOF and TimeoutError so the existing suppression handles both cases cleanly. This replaces the previous workaround of disabling rerunfailures entirely under xdist.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
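The failure mode and the fix can be illustrated with a standalone read-until-newline loop. This is not the plugin's code or the actual patch: `recv_line` is a hypothetical function that only demonstrates why an empty read must be treated as EOF, and how a timeout can be mapped onto ConnectionError.

```python
import socket


def recv_line(sock):
    """Read bytes up to (not including) a newline from sock.

    Raises ConnectionError on EOF or timeout. Without the EOF check, a
    closed peer makes recv(1) return b"" forever, so the loop never sees
    the newline delimiter and spins indefinitely.
    """
    buf = b""
    while True:
        try:
            chunk = sock.recv(1)
        except (socket.timeout, TimeoutError) as exc:
            # Map timeouts onto ConnectionError so one handler covers both.
            raise ConnectionError("socket timed out") from exc
        if chunk == b"":
            raise ConnectionError("peer closed the connection")
        if chunk == b"\n":
            return buf
        buf += chunk
```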
The previous fix handled server-side socket EOF but ConnectionError still propagated from the client side in xdist workers, crashing the worker process.
Patch ClientStatusDB._get/_set to catch connection errors and fall back to StatusDB no-op behavior (return 0 / no-op). This is safe because our DeterministicScheduling doesn't support mark_test_pending (crash reruns), and normal reruns are self-contained within each worker's pytest_runtest_protocol loop.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace the complex _sock_recv monkey-patch with a simple one-liner that disables the socket mechanism entirely by setting HAS_PYTEST_HANDLECRASHITEM = False at import time in conftest.py. This prevents ServerStatusDB/ClientStatusDB from ever being created.
Normal within-worker reruns still work (self-contained in pytest_runtest_protocol). Only crash-item reruns are disabled, which our DeterministicScheduling doesn't support anyway (mark_test_pending raises NotImplementedError).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Contributor
Backend Test Failures
Failures on
wip