feat(consensus): add ClientTable with WAL-backed commit path and view-change safety #3023
Merged
hubcio merged 12 commits into apache:master on Apr 7, 2026
Codecov Report ❌

Coverage diff (master vs. #3023):

              master    #3023     +/-
Coverage      70.67%    70.48%    -0.20%
Complexity    943       943
Files         1114      1115      +1
Lines         94780     95388     +608
Branches      71980     72606     +626
Hits          66987     67234     +247
Misses        25320     25658     +338
Partials      2473      2496      +23
numinnex reviewed on Mar 26, 2026
Force-pushed from 364894a to 5ce819d
Author: Seems like a flaky test:
Author: Flaky test is fixed here: #3052
hubcio requested changes on Mar 30, 2026
Force-pushed from 796fcac to 7e17d46
Author: More flaky connector tests: Working on fixing them.
Author: Fixed here: #3077
Force-pushed from 7e17d46 to dbfdd5d
instead of HashMap for deterministic ordering across replicas
commit_min/commit_max split
Force-pushed from dbfdd5d to 1b1d8da
crashing. The behavior is still correct but suboptimal.
hubcio requested changes on Apr 7, 2026
hubcio approved these changes on Apr 7, 2026
numinnex approved these changes on Apr 7, 2026
spetz approved these changes on Apr 7, 2026
Summary
Implements the VSR client-table (VR Revisited, Section 4, Figure 2) with a WAL-backed commit path. Both metadata and partitions planes now execute committed ops through the same path on all replicas, ensuring the client table is always populated for view-change correctness.
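As a rough illustration of the client-table idea (one record per client holding its latest committed request and the cached reply), here is a minimal sketch. It assumes a fixed-size slot array with a `HashMap` index and evicts the entry with the lowest commit number when full. `ClientTableSketch`, `Entry`, and every field name are hypothetical and not taken from `client_table.rs`; the async notification of pending requests is left out.

```rust
use std::collections::HashMap;

/// Hypothetical entry: a client's latest committed request and its cached reply.
#[derive(Debug, Clone, PartialEq)]
pub struct Entry {
    pub client_id: u128,
    pub request: u64,
    pub commit: u64,
    pub reply: Vec<u8>,
}

/// Fixed-capacity table: a slot array plus a client_id -> slot index.
pub struct ClientTableSketch {
    slots: Vec<Option<Entry>>,
    index: HashMap<u128, usize>,
}

impl ClientTableSketch {
    pub fn new(capacity: usize) -> Self {
        assert!(capacity > 0, "capacity must be non-zero");
        Self { slots: vec![None; capacity], index: HashMap::new() }
    }

    /// O(1) lookup through the secondary index.
    pub fn get(&self, client_id: u128) -> Option<&Entry> {
        let slot = *self.index.get(&client_id)?;
        self.slots[slot].as_ref()
    }

    /// Record a committed reply. Returns the evicted client id, if any.
    pub fn commit_reply(&mut self, entry: Entry) -> Option<u128> {
        assert!(entry.client_id != 0, "client_id must be non-zero");

        // Known client: overwrite in place, enforcing monotonicity.
        if let Some(&slot) = self.index.get(&entry.client_id) {
            let old = self.slots[slot].as_ref().expect("indexed slot is occupied");
            assert!(entry.request > old.request, "request numbers must increase");
            assert!(entry.commit > old.commit, "commit numbers must increase");
            self.slots[slot] = Some(entry);
            return None;
        }

        // New client: take a free slot, or evict the lowest commit number.
        let free = self.slots.iter().position(|s| s.is_none());
        let (slot, evicted) = match free {
            Some(slot) => (slot, None),
            None => {
                let mut victim = 0;
                let mut lowest = u64::MAX;
                for (i, s) in self.slots.iter().enumerate() {
                    if let Some(e) = s {
                        if e.commit < lowest {
                            lowest = e.commit;
                            victim = i;
                        }
                    }
                }
                let old = self.slots[victim].take().expect("table is full");
                self.index.remove(&old.client_id);
                (victim, Some(old.client_id))
            }
        };
        self.index.insert(entry.client_id, slot);
        self.slots[slot] = Some(entry);
        evicted
    }
}
```

Because commit numbers are assigned by consensus, choosing the victim by lowest commit number gives every replica the same eviction decision, which is the property the PR calls deterministic eviction.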
- ClientTable (`client_table.rs`): fixed-size slot array with `HashMap` secondary index for O(1) lookups. Tracks pending requests via `Notify` primitive for async commit notification. Deterministic eviction by lowest commit number.
- PrepareJournal (renamed from `MetadataJournal`): reusable append-only WAL for consensus prepare messages, now shared by both planes.
- WAL-backed commit path: operations are applied at commit time (not replicate time), eliminating duplicate data after view-change re-execution (IGGY-66). Both `commit_journal` (backup) and `on_ack` (primary) read from the WAL, apply the state machine, flush, advance `commit_min`, and update the client table - per-op, in a single loop.
- `commit_min`/`commit_max` split: `commit_max` tracks what the cluster says is committed (advances immediately). `commit_min` tracks what this replica has actually executed (advances sequentially, +1 per op). DVC messages use `commit_min`. Fence and chain-replication use `commit_min`. Enables future message repair without data loss.

Additional fixes

- `build_reply_message` uses `prepare_header.op` (not `commit_max`) for deterministic eviction ordering across replicas
- `fence_old_prepare_by_commit` uses `commit_min` so retransmitted prepares needed for gap repair are not fenced out
- `handle_start_view` returns range-based `SendPrepareOk` - caller must verify WAL presence before sending (prevents false acks)
- `drain_committable_prefix` replaces `drain_committable_all` for strict global op ordering (prevents `advance_commit_min` invariant violation)
- Sequencer/checksum updated after WAL append (not before) - failed append no longer leaves consensus state inconsistent
- `commit_reply` asserts: `client_id != 0`, `client_id == header.client`, commit monotonicity, request monotonicity (all hard asserts)
- `advance_commit_min` sequential invariant is a hard assert (not debug-only)
- `checkpoint_if_needed` uses `commit_min` (not `commit_max`) - prevents draining unexecuted WAL entries
- `commit_messages` called per-op (no dedup) - each apply's data flushed before advancing `commit_min`
- `ns_commits` reset in `clear()` - prevents stale view-change state
- `IggyPartitions<C, J>` - partitions plane now generic over journal type

Closes #3022 (Clients table implementation from VSR paper, for sending the response)
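The `commit_min`/`commit_max` split and the per-op commit loop can be sketched roughly as follows. This is a simplified model under stated assumptions, not the PR's code: `CommitState` and `commit_from_wal` are hypothetical names, a `BTreeMap` stands in for the WAL, and "apply + flush" is reduced to collecting the payload.

```rust
use std::collections::BTreeMap;

/// Hypothetical per-replica commit state.
pub struct CommitState {
    /// Highest op the cluster is known to have committed.
    pub commit_max: u64,
    /// Highest op this replica has actually executed.
    pub commit_min: u64,
}

impl CommitState {
    pub fn new() -> Self {
        Self { commit_max: 0, commit_min: 0 }
    }

    /// commit_max may jump forward as soon as a higher commit is learned.
    pub fn advance_commit_max(&mut self, commit: u64) {
        self.commit_max = self.commit_max.max(commit);
    }

    /// commit_min moves exactly one op at a time (hard asserts, not debug-only).
    pub fn advance_commit_min(&mut self, op: u64) {
        assert_eq!(op, self.commit_min + 1, "ops must be executed in order");
        assert!(op <= self.commit_max, "cannot execute past commit_max");
        self.commit_min = op;
    }
}

/// Execute committed ops from a WAL stand-in (op -> payload), one op per
/// iteration, stopping at the first gap. Returns the payloads applied in order.
pub fn commit_from_wal(
    state: &mut CommitState,
    wal: &BTreeMap<u64, Vec<u8>>,
) -> Vec<Vec<u8>> {
    let mut applied = Vec::new();
    while state.commit_min < state.commit_max {
        let next = state.commit_min + 1;
        // A missing entry means the prepare is not in the WAL yet; stop here
        // so that only a contiguous prefix is ever executed.
        let payload = match wal.get(&next) {
            Some(payload) => payload,
            None => break,
        };
        applied.push(payload.clone()); // stand-in for "apply state machine + flush"
        state.advance_commit_min(next);
    }
    applied
}
```

With ops 1, 2 and 4 in the WAL and `commit_max` at 4, the loop executes ops 1 and 2 and leaves `commit_min` at 2; once op 3 arrives, a second call executes 3 and 4. Stopping at the gap is what keeps the sequential `advance_commit_min` invariant from firing.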