Worktrees in-depth support, tests#419
Conversation
| let hash_len = if options.long_rev { | ||
| 40 // Full hash for long revision | ||
| } else if let Some(abbrev) = options.abbrev { | ||
| abbrev as usize | ||
| // git blame reserves one extra char in the hash column for boundary | ||
| if hunk.is_boundary { | ||
| abbrev as usize | ||
| } else { | ||
| (abbrev + 1) as usize | ||
| } |
There was a problem hiding this comment.
🟡 Default blame formatter adds an extra SHA character for non-boundary commits when --abbrev is set
When --abbrev is provided, the formatter computes hash_len as (abbrev + 1) for non-boundary commits, and as abbrev for boundary commits.
Actual behavior: non-boundary commits display one more hex character than requested, while boundary commits display exactly the requested length.
Expected behavior: --abbrev <n> should display exactly n hex characters for the abbreviated SHA (with boundary commits typically showing a ^ marker in addition to that, not by increasing the SHA length for non-boundary hunks).
Root Cause / Detailed Explanation
In output_default_format, the code says it is matching git blame’s behavior of reserving an extra column for boundary commits, but implements the opposite: it adds the extra character to non-boundary hashes:
if hunk.is_boundary { abbrev } else { abbrev + 1 }.
This makes the output inconsistent with the --abbrev contract and with the code’s own comment.
Impact: Users passing --abbrev get incorrect blame output formatting (hash column wider than requested, and different formatting for boundary vs non-boundary lines).
| let hash_len = if options.long_rev { | |
| 40 // Full hash for long revision | |
| } else if let Some(abbrev) = options.abbrev { | |
| abbrev as usize | |
| // git blame reserves one extra char in the hash column for boundary | |
| if hunk.is_boundary { | |
| abbrev as usize | |
| } else { | |
| (abbrev + 1) as usize | |
| } | |
| let hash_len = if options.long_rev { | |
| 40 // Full hash for long revision | |
| } else if let Some(abbrev) = options.abbrev { | |
| // git blame reserves one extra char in the hash column for boundary | |
| // (the '^' marker), but the abbreviated SHA itself should remain `abbrev`. | |
| abbrev as usize | |
| } else { | |
| 7 // Default 7 chars | |
| }; |
Was this helpful? React with 👍 or 👎 to provide feedback.
6cfec90 to
fb221ae
Compare
| let stdout = str::from_utf8(&output.stdout)?; | ||
| let filenames: HashSet<String> = stdout | ||
| .lines() | ||
| .filter(|line| !line.is_empty()) | ||
| .map(|line| line.to_string()) | ||
| .collect(); |
There was a problem hiding this comment.
🟡 get_staged_filenames drops -z flag, breaking filenames containing newlines
The -z flag (NUL-separated output) was removed from git diff --cached --name-only, and the output parsing was changed from NUL-splitting to line-splitting. Filenames containing literal newline characters will now be incorrectly split into multiple entries.
Detailed Explanation
The old code at src/git/status.rs:62 used -z for NUL-separated output and split on NUL bytes:
args.push("-z".to_string());
// ...
let filenames: HashSet<String> = output.stdout
.split(|&b| b == 0)
.filter(|bytes| !bytes.is_empty())
.filter_map(|bytes| String::from_utf8(bytes.to_vec()).ok())
.collect();The new code at src/git/status.rs:73-78 uses line-based parsing:
let stdout = str::from_utf8(&output.stdout)?;
let filenames: HashSet<String> = stdout
.lines()
.filter(|line| !line.is_empty())
.map(|line| line.to_string())
.collect();While filenames with newlines are rare, they are valid on Unix/macOS systems. Git's -z flag exists precisely to handle this edge case safely. The sibling method get_staged_and_unstaged_filenames (src/git/status.rs:88) and the status() method (src/git/status.rs:126) still use -z and NUL-splitting, creating an inconsistency in how the codebase handles filenames.
Impact: Files with newline characters in their names will be incorrectly reported as multiple separate files, potentially causing incorrect pathspec filtering and missing or phantom status entries.
| let stdout = str::from_utf8(&output.stdout)?; | |
| let filenames: HashSet<String> = stdout | |
| .lines() | |
| .filter(|line| !line.is_empty()) | |
| .map(|line| line.to_string()) | |
| .collect(); | |
| args.push("-z".to_string()); // NUL-separated output for proper filename handling | |
| let output = exec_git(&args)?; | |
| if !output.status.success() { | |
| return Err(GitAiError::Generic(format!( | |
| "git diff exited with status {}", | |
| output.status | |
| ))); | |
| } | |
| // With -z, output is NUL-separated. The output may contain a trailing NUL. | |
| let filenames: HashSet<String> = output | |
| .stdout | |
| .split(|&b| b == 0) | |
| .filter(|bytes| !bytes.is_empty()) | |
| .filter_map(|bytes| String::from_utf8(bytes.to_vec()).ok()) | |
| .collect(); | |
Was this helpful? React with 👍 or 👎 to provide feedback.
| let checkpoints_file = self.dir.join("checkpoints.jsonl"); | ||
| fs::write(&checkpoints_file, "")?; | ||
|
|
||
| // Clear INITIAL attributions file so stale attributions from a | ||
| // previous working state do not persist across resets | ||
| if self.initial_file.exists() { | ||
| fs::remove_file(&self.initial_file)?; | ||
| } | ||
|
|
||
| Ok(()) |
There was a problem hiding this comment.
🔴 reset_working_log no longer clears INITIAL attributions file, causing stale AI attributions to persist
The reset_working_log method previously deleted the INITIAL attributions file to prevent stale attributions from a prior working state from persisting across resets. This deletion was removed, which means stale INITIAL attributions can leak into subsequent commits.
Root Cause and Impact
The old code in src/git/repo_storage.rs included:
// Clear INITIAL attributions file so stale attributions from a
// previous working state do not persist across resets
if self.initial_file.exists() {
fs::remove_file(&self.initial_file)?;
}This was removed from reset_working_log. The method is called from several places:
-
src/commands/checkpoint.rs:268— whenreset=trueflag is set. After reset, the code starts with empty checkpoints but the stale INITIAL file persists. WhenVirtualAttributions::from_just_working_log()is called during the next commit, it reads the stale INITIAL attributions and incorrectly attributes lines as AI-authored. -
src/commands/hooks/reset_hooks.rs:328— during git reset handling. Merged checkpoints are written after the reset, but stale INITIAL attributions from the old HEAD persist and could incorrectly attribute lines. -
src/authorship/rebase_authorship.rs:1679— this caller is safe because it immediately writes new INITIAL attributions afterward, overwriting the stale file.
Impact: After a working log reset (e.g., via checkpoint --reset or git reset operations), AI attributions from a previous session can incorrectly persist and be attributed to new commits, leading to false AI authorship claims.
(Refers to lines 203-214)
Was this helpful? React with 👍 or 👎 to provide feedback.
status.rsare they necessary?