fix: correct diff output when duplicate functions exist in new file #94
Draft
…or duplicate functions

When a function is copied verbatim in the new file (e.g. old getNewPlan + new modified getNewPlan + old getNewPlan copy), the Myers algorithm matches the old content to the identical later copy (0 edit distance), leaving the modified earlier version as all-added.

Two complementary post-processing passes are applied after diffLines():

1. cleanupLongDistanceSame: handles the simple [ADDED(A), SAME(S)] pattern by finding S's first significant line inside A, splitting A at that point, and rearranging to [ADDED(before), REMOVED(S), ADDED(from+S)].
2. cleanupSplitMatches: handles the interleaved [ADDED,SAME,ADDED,SAME…] pattern by detecting consecutive SAME blocks whose new-file positions jump far more than their old-file positions (ratio > 3×), then consolidating to [ADDED(before), REMOVED(all-SAMEs-old), ADDED(new-content-from-first-SAME)].

Three new tests cover: the interleaved Myers variant, the simple variant, and a regression guard that prepending content to a file does not falsely trigger the fix.

Agent-Logs-Url: https://github.com/Aeolun/react-diff-viewer-continued/sessions/400ed340-4aa8-4e09-9c98-d583dcbfb7d1
Co-authored-by: Aeolun <1116482+Aeolun@users.noreply.github.com>
Copilot AI changed the title from "[WIP] Fix bug with difference algorithm for same named functions" to "fix: correct diff output when duplicate functions exist in new file" on Apr 27, 2026
When the new file contains both a modified version of a function and an identical copy of the original, Myers diff (minimum edit distance) silently matches the old function to the identical copy at zero cost. The modified version then appears entirely added (green) and the original appears as unchanged context, whereas the old function should be shown as modified.
Root cause
Myers prefers the zero-cost match to the identical copy over the more expensive match to the modified version. This is algorithmically correct but semantically wrong from a code-review perspective.
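The behaviour can be reproduced without the real library. The sketch below is a minimal LCS-based line diff used as a stand-in for Myers (both produce a minimum edit script); it is not the code in `compute-lines.ts`. The `Block` type, the `lcsDiff` name, and the sample function bodies are assumptions made for illustration. This sketch traces back from the end of both files, so ties resolve toward the later copy; a real Myers implementation may break ties differently and produce the interleaved variant instead.

```typescript
// Hypothetical block shape for illustration; not the library's actual types.
type Block = { type: "same" | "added" | "removed"; lines: string[] };

// Minimal LCS line diff: a stand-in for Myers, which also minimizes edits.
function lcsDiff(oldLines: string[], newLines: string[]): Block[] {
  const n = oldLines.length;
  const m = newLines.length;
  // dp[i][j] = LCS length of oldLines[0..i) and newLines[0..j)
  const dp: number[][] = Array.from({ length: n + 1 }, () =>
    new Array<number>(m + 1).fill(0),
  );
  for (let i = 1; i <= n; i++) {
    for (let j = 1; j <= m; j++) {
      dp[i][j] =
        oldLines[i - 1] === newLines[j - 1]
          ? dp[i - 1][j - 1] + 1
          : Math.max(dp[i - 1][j], dp[i][j - 1]);
    }
  }
  // Trace back from the end, collecting single-line operations in reverse.
  const ops: Array<{ type: Block["type"]; line: string }> = [];
  let i = n;
  let j = m;
  while (i > 0 && j > 0) {
    if (oldLines[i - 1] === newLines[j - 1]) {
      ops.push({ type: "same", line: oldLines[i - 1] });
      i--;
      j--;
    } else if (dp[i - 1][j] >= dp[i][j - 1]) {
      ops.push({ type: "removed", line: oldLines[i - 1] });
      i--;
    } else {
      ops.push({ type: "added", line: newLines[j - 1] });
      j--;
    }
  }
  while (i > 0) ops.push({ type: "removed", line: oldLines[--i] });
  while (j > 0) ops.push({ type: "added", line: newLines[--j] });
  ops.reverse();
  // Group consecutive operations of the same type into blocks.
  const blocks: Block[] = [];
  for (const op of ops) {
    const last = blocks[blocks.length - 1];
    if (last !== undefined && last.type === op.type) last.lines.push(op.line);
    else blocks.push({ type: op.type, lines: [op.line] });
  }
  return blocks;
}

// Old file: the original function. New file: a modified version, then a verbatim copy.
const original = [
  "function getNewPlan() {",
  "  setLoading(true);",
  "  return fetchPlan();",
  "}",
];
const modified = [
  "function getNewPlan() {",
  "  setLoading(true);",
  "  return fetchPlan({ fresh: true });",
  "}",
];
const demoBlocks = lcsDiff(original, [...modified, ...original]);
console.log(demoBlocks.map((b) => `${b.type}(${b.lines.length})`).join(", "));
// prints "added(4), same(4)"
```

Matching the old function to its verbatim copy costs 4 insertions and no deletions. Pairing it with the modified version would cost 1 deletion and 5 insertions, so a minimum-edit diff never chooses it.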
Fix: two post-processing passes in `compute-lines.ts` after `diffLines()`
1. `cleanupLongDistanceSame` handles the simple `[ADDED(A), SAME(S)]` pattern. It detects when S's first significant line (>8 trimmed chars) also appears inside A, meaning Myers matched old S to a later identical copy while a nearby modified version lives in A. It splits A at that line and rearranges to `[ADDED(A_before), REMOVED(S), ADDED(A_from + S)]`. `REMOVED(S)` is now paired with `ADDED(A_from)`, which has different content, so changes are highlighted.
2. `cleanupSplitMatches` handles the interleaved `[ADDED, SAME, ADDED, SAME, …]` pattern that Myers produces when old and new share one or more lines (e.g. a common `setLoading(true)` call). It detects consecutive SAME blocks whose new-file position gap is >3× their old-file gap (matched across different occurrences) and consolidates the affected run to `[ADDED(before), REMOVED(all-SAMEs-old), ADDED(new-content-from-first-SAME)]`.
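The gap-ratio check in the second pass can be sketched as follows. This is an illustration under stated assumptions, not the PR's implementation: the `Block` type and both function names are hypothetical, and "gap" is taken here to mean the start-to-start distance between two consecutive SAME blocks, which the description does not spell out. Only detection is shown; the consolidation step is omitted.

```typescript
// Hypothetical block shape for illustration; not the library's actual types.
type Block = { type: "same" | "added" | "removed"; lines: string[] };

interface SamePosition {
  index: number; // index of the SAME block in the block list
  oldStart: number; // line offset of the block in the old file
  newStart: number; // line offset of the block in the new file
}

// Walk the blocks once, tracking the current line offset in each file.
function locateSameBlocks(blocks: Block[]): SamePosition[] {
  const positions: SamePosition[] = [];
  let oldPos = 0;
  let newPos = 0;
  blocks.forEach((block, index) => {
    const len = block.lines.length;
    if (block.type === "same") {
      positions.push({ index, oldStart: oldPos, newStart: newPos });
      oldPos += len;
      newPos += len;
    } else if (block.type === "added") {
      newPos += len; // added lines exist only in the new file
    } else {
      oldPos += len; // removed lines exist only in the old file
    }
  });
  return positions;
}

// Return index pairs of consecutive SAME blocks whose new-file gap exceeds
// `ratio` times their old-file gap (assumed start-to-start distance).
function findSplitMatches(blocks: Block[], ratio = 3): Array<[number, number]> {
  const same = locateSameBlocks(blocks);
  const flagged: Array<[number, number]> = [];
  for (let k = 1; k < same.length; k++) {
    const oldGap = same[k].oldStart - same[k - 1].oldStart;
    const newGap = same[k].newStart - same[k - 1].newStart;
    if (newGap > ratio * oldGap) {
      flagged.push([same[k - 1].index, same[k].index]);
    }
  }
  return flagged;
}
```

For a one-line SAME block followed by five added lines and another SAME block, the old-file gap is 1 and the new-file gap is 6, so the pair is flagged. A lone `[ADDED, SAME]` sequence, as produced by prepending content, contains only one SAME block and is never flagged by this check.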
Both passes are no-ops when the diff already contains REMOVED blocks (non-pure-addition scenario).
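A minimal sketch of the first pass, including the no-op guard, might look like the following. It follows the description above rather than the actual source: the `Block` type and the function name `cleanupLongDistanceSameSketch` are hypothetical, and matching S's first significant line against A by exact string equality is an assumption.

```typescript
// Hypothetical block shape for illustration; not the library's actual types.
type Block = { type: "same" | "added" | "removed"; lines: string[] };

// A line is "significant" when it has more than 8 characters after trimming
// (threshold taken from the PR description).
const MIN_SIGNIFICANT_LENGTH = 8;

function cleanupLongDistanceSameSketch(blocks: Block[]): Block[] {
  // No-op when the diff already contains REMOVED blocks.
  if (blocks.some((b) => b.type === "removed")) return blocks;

  const out: Block[] = [];
  for (let i = 0; i < blocks.length; i++) {
    const a = blocks[i];
    const s = blocks[i + 1];
    if (a.type === "added" && s !== undefined && s.type === "same") {
      // First significant line of S, if any.
      const anchor = s.lines.find(
        (line) => line.trim().length > MIN_SIGNIFICANT_LENGTH,
      );
      // Assumption: exact string equality decides whether it "appears inside A".
      const splitAt = anchor === undefined ? -1 : a.lines.indexOf(anchor);
      if (splitAt >= 0) {
        const before = a.lines.slice(0, splitAt);
        const from = a.lines.slice(splitAt);
        if (before.length > 0) out.push({ type: "added", lines: before });
        // Old S is now shown as removed, next to the modified version.
        out.push({ type: "removed", lines: [...s.lines] });
        // The later identical copy becomes part of the added content.
        out.push({ type: "added", lines: [...from, ...s.lines] });
        i++; // S has been consumed
        continue;
      }
    }
    out.push(a);
  }
  return out;
}
```

The rewritten blocks still describe the same two files: the added lines concatenate to A followed by S, and the only old-file content is S. When content is merely prepended, S's first significant line does not occur in A, so the block list passes through unchanged.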
Tests added
Three cases in a new `describe` block: