fix(releases): Prevent row-lock contention on last_seen bump #115443
Open
yuvmen wants to merge 4 commits into
…aseProjectEnvironment and ReleaseEnvironment

High-volume releases cause concurrent workers to pile up on the same row's UPDATE for last_seen, hitting statement_timeout (SENTRY-5HQZ). This adds a cache-based distributed lock so only one worker per row per 60s attempts the DB update, and catches OperationalError so a failed bump doesn't abort the rest of the event save pipeline.

Fixes SENTRY-5HQZ
wedamija
approved these changes
May 12, 2026
    if cache.add(bump_key, "1", timeout=60):
        try:
            cls.objects.filter(
                id=instance.id, last_seen__lt=datetime - timedelta(seconds=60)
Member
This could probably be updated to last_seen__lt=datetime now, since the lock is doing the work for us
The cache lock handles the 60s throttle now, so the SQL filter only needs to prevent setting last_seen backwards.
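To illustrate the point about the filter, here is a minimal sketch of a forward-only guard, using the standard-library sqlite3 module rather than the Django ORM. The table name `release_env` and the `bump` helper are hypothetical and are not taken from this PR; the only idea carried over is that the UPDATE matches a row only when the new timestamp is later than the stored one.

```python
import sqlite3
from datetime import datetime, timedelta, timezone

# Hypothetical single-table stand-in for the release environment models.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE release_env (id INTEGER PRIMARY KEY, last_seen TEXT)")

start = datetime(2026, 5, 12, 12, 0, 0, tzinfo=timezone.utc)
conn.execute("INSERT INTO release_env VALUES (1, ?)", (start.isoformat(),))


def bump(row_id, seen_at):
    """Set last_seen only if it moves forward; return the number of rows changed."""
    stamp = seen_at.isoformat()  # same-offset ISO strings sort chronologically
    cur = conn.execute(
        "UPDATE release_env SET last_seen = ? WHERE id = ? AND last_seen < ?",
        (stamp, row_id, stamp),
    )
    return cur.rowcount


newer = bump(1, start + timedelta(seconds=5))   # later event: row is updated
older = bump(1, start - timedelta(seconds=30))  # late-arriving event: no match
stored = conn.execute("SELECT last_seen FROM release_env WHERE id = 1").fetchone()[0]
```

With the 60-second margin dropped from the WHERE clause, any strictly newer timestamp is written, and throttling is left entirely to whatever gate runs before the query.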
Summary
- High-volume releases cause concurrent workers to pile up on UPDATE last_seen on the same ReleaseProjectEnvironment/ReleaseEnvironment row, triggering PostgreSQL statement_timeout cancellations (SENTRY-5HQZ, 6105 occurrences).
- The resulting OperationalError aborts the rest of the event save pipeline (nodestore persistence, release counts, group release records), causing silent data loss.
- Adds a cache.add-based distributed lock so only one worker per row per 60s attempts the DB update, eliminating the thundering herd. Also catches OperationalError as a safety net so a failed best-effort last_seen bump never blocks event processing.
- Applies to both ReleaseProjectEnvironment and ReleaseEnvironment, which had identical vulnerable patterns.

Fixes SENTRY-5HQZ
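The throttling behaviour described in the summary can be sketched without Django as follows. Everything here is an assumption made for illustration: `InMemoryCache` only mimics the add-if-absent semantics of a shared cache, `FakeOperationalError` stands in for the database error, and the key format and `bump_last_seen` helper are hypothetical rather than the code in this PR.

```python
import threading
import time


class InMemoryCache:
    """Hypothetical stand-in mimicking add-if-absent semantics of a shared cache."""

    def __init__(self):
        self._expiry = {}
        self._lock = threading.Lock()

    def add(self, key, value, timeout):
        # Store the key only if it is absent or expired; report whether we stored it.
        # The value itself is ignored here because only key presence matters.
        now = time.monotonic()
        with self._lock:
            expires_at = self._expiry.get(key)
            if expires_at is not None and expires_at > now:
                return False
            self._expiry[key] = now + timeout
            return True


class FakeOperationalError(Exception):
    """Stand-in for the database driver's OperationalError."""


cache = InMemoryCache()
update_attempts = []


def bump_last_seen(row_id, update_row):
    bump_key = f"last_seen_bump:{row_id}"  # hypothetical key format
    if cache.add(bump_key, "1", timeout=60):
        try:
            update_row(row_id)
        except FakeOperationalError:
            # Best-effort: a failed bump must not abort the caller.
            pass


def record_update(row_id):
    update_attempts.append(row_id)


def failing_update(row_id):
    raise FakeOperationalError("canceling statement due to statement timeout")


# Twenty workers race on the same row; only the one that wins add() updates.
workers = [
    threading.Thread(target=bump_last_seen, args=(42, record_update))
    for _ in range(20)
]
for worker in workers:
    worker.start()
for worker in workers:
    worker.join()

# A failing update on another row is swallowed instead of propagating.
bump_last_seen(7, failing_update)
```

In this sketch the key stays set for the full 60 seconds even when the update fails, so a failed bump is not retried until the key expires. Whether the PR behaves the same way is not stated in the description.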
Test plan
- test_bump_skipped_when_cache_lock_held: verifies second worker skips the DB update when the cache lock is held
- test_bump_survives_operational_error: verifies OperationalError is caught and instance is returned
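For readers without the diff open, the two tests might look roughly like the sketch below. Only the test names come from the test plan. The `bump_and_return` helper, the dict used as an instance, and the local `OperationalError` class are hypothetical stand-ins so the example runs with the standard library alone; the real tests presumably exercise the Sentry models directly.

```python
import unittest
from unittest import mock


class OperationalError(Exception):
    """Stand-in for django.db.OperationalError so the sketch runs without Django."""


def bump_and_return(instance, cache, do_update):
    """Hypothetical helper: throttle via cache.add and swallow OperationalError."""
    if cache.add(f"bump:{instance['id']}", "1", timeout=60):
        try:
            do_update(instance)
        except OperationalError:
            pass
    return instance


class BumpTests(unittest.TestCase):
    def test_bump_skipped_when_cache_lock_held(self):
        cache = mock.Mock()
        cache.add.return_value = False  # another worker already holds the key
        do_update = mock.Mock()

        result = bump_and_return({"id": 1}, cache, do_update)

        do_update.assert_not_called()
        self.assertEqual(result, {"id": 1})

    def test_bump_survives_operational_error(self):
        cache = mock.Mock()
        cache.add.return_value = True  # this worker wins the key
        do_update = mock.Mock(side_effect=OperationalError("statement timeout"))
        instance = {"id": 1}

        self.assertIs(bump_and_return(instance, cache, do_update), instance)
        do_update.assert_called_once_with(instance)


# Build and run the suite explicitly so the sketch works outside a test runner.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(BumpTests)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```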