ISSUE-190: fix the issue that leader coredump once follower asks for … by jingyichen1223 · Pull Request #191 · eBay/Gringofts

jingyichen1223 · 2026-05-19T10:10:23Z

…raftlog that has been truncated

Copilot

Pull request overview

This PR addresses a Raft leader coredump scenario triggered when a follower responds based on raft logs that have been truncated (or otherwise become inconsistent with the leader’s cached peer indices), and adds a detailed scenario matrix documenting AppendEntries index-handling behaviors.

Changes:

Add early-return reset handling in RaftCore::handleAppendEntriesResponse() for (a) follower log rollback vs cached matchIndex, and (b) nextIndex falling at/below the leader’s firstLogIndex after truncation.
Improve logging around these abnormal index shapes to aid diagnosis.
Add a comprehensive AppendEntries scenario walkthrough document.
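
The two early-return cases listed above can be sketched as follows. This is a minimal illustration, not the code in the PR: `PeerState`, `RejectAction`, and `onAppendEntriesRejected` are hypothetical names, and the choice to reset `nextIndex` to the leader's last log index plus one is an assumption made for the example. Only the reset of `matchIndex` to 0 and the bulk-data suppression are suggested by the diff fragments.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the leader's cached view of one follower.
struct PeerState {
  uint64_t matchIndex = 0;
  uint64_t nextIndex = 1;
  bool suppressBulkData = false;
};

enum class RejectAction { kNormalBackoff, kFollowerRolledBack, kLeaderTruncated };

// Assumed reset policy: forget what we knew and probe again from the leader's tail.
inline void resetPeer(PeerState &peer, uint64_t leaderLastLogIndex) {
  peer.matchIndex = 0;
  peer.nextIndex = leaderLastLogIndex + 1;
  peer.suppressBulkData = true;  // send no bulk data until the indices are re-established
}

// Handle a rejected AppendEntries response without touching truncated log entries.
inline RejectAction onAppendEntriesRejected(PeerState &peer,
                                            uint64_t followerLastLogIndex,
                                            uint64_t leaderFirstLogIndex,
                                            uint64_t leaderLastLogIndex) {
  assert(peer.nextIndex >= 1);

  // (a) The follower reports fewer entries than the leader believed were replicated.
  if (followerLastLogIndex < peer.matchIndex) {
    resetPeer(peer, leaderLastLogIndex);
    return RejectAction::kFollowerRolledBack;
  }

  // Standard Raft back-off: step back by one, or jump to the follower's tail.
  const uint64_t candidate = std::min(peer.nextIndex - 1, followerLastLogIndex + 1);

  // (b) The next probe would need an entry the leader has already truncated.
  if (candidate <= leaderFirstLogIndex) {
    resetPeer(peer, leaderLastLogIndex);
    return RejectAction::kLeaderTruncated;
  }

  peer.nextIndex = candidate;
  return RejectAction::kNormalBackoff;
}
```

Returning early in both abnormal cases keeps the leader from looking up a term at an index it no longer holds, which is the kind of access the PR description associates with the coredump.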

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 5 comments.

File	Description
src/infra/raft/v2/RaftCore.cpp	Adds defensive handling for abnormal AE rejection responses to prevent crashes when indices fall into truncated/rolled-back regions.
docs/Gringofts_AE_Scenario_Matrix.md	New documentation describing AE index-shape scenarios and expected behaviors.


+    /// Two special cases out of raft consensus but practical happenings:
+    /// 1. follower's log is rolled back due to disk crash, so response.last_log_index() is less than peer.mMatchIndex.
+    /// 2. leader's log is truncated due to disk usage, so response.last_log_index() is less than leader's firstLogIndex


+      // Suppress bulk data until we find proper peer.mMatchIndex and peer.mNextIndex.
      peer.mSuppressBulkData = true;
-      SPDLOG_WARN("{} reset Follower {}: matchIndex from {} to 0, nextIndex from {} to {} ",
+      SPDLOG_WARN("Follower {} last_log_index({}) fall behind its matchIndex({})", response.id(),


+    }
+
+    /// there should be a gap between matchIndex and nextIndex
+    assert(peer.mMatchIndex + 1 < peer.mNextIndex);
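
To make the arithmetic of the assertion above concrete, here is a small sketch using hypothetical helper names (`hasGap`, `isOrdered`) that do not appear in the PR. It shows only that the strict form `matchIndex + 1 < nextIndex` rejects the caught-up state where `nextIndex == matchIndex + 1`. Whether that state can reach the assertion depends on the surrounding branch, which this excerpt does not show.

```cpp
#include <cstdint>

// Strict form, as written in the diff: at least one unconfirmed index must lie between the two.
constexpr bool hasGap(uint64_t matchIndex, uint64_t nextIndex) {
  return matchIndex + 1 < nextIndex;
}

// Weaker ordering that also accepts a fully caught-up follower.
constexpr bool isOrdered(uint64_t matchIndex, uint64_t nextIndex) {
  return matchIndex < nextIndex;
}

static_assert(!hasGap(4, 5), "caught-up follower: the strict form fails");
static_assert(hasGap(4, 6), "one unconfirmed index: the strict form holds");
static_assert(isOrdered(4, 5), "the weaker ordering accepts the caught-up state");
```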


+- `handleAppendEntriesRequest()`
+- `handleAppendEntriesResponse()`
+
+in `core/third_party/Gengar/third_party/Gringofts/src/infra/raft/v2/RaftCore.cpp`.


+This walkthrough covers these scenarios:
+- Normal cases
+- New leader cases
+- Follower disk was lost cases
+- Truncate cases
+


ISSUE-190: fix the issue that leader coredump once follower asks for …

98173b7

…raftlog that has been truncated

jingyichen1223 requested a review from Copilot May 22, 2026 08:50

Copilot started reviewing on behalf of jingyichen1223 May 22, 2026 08:51

Copilot AI reviewed May 22, 2026

jingyichen1223 wants to merge 1 commit into eBay:master from jingyichen1223:ISSUE-190


2 participants
