fix: retain the original line break style of the file and fix cross-platform CRLF/LF issues #2362

Open
Sisyphbaous-DT-Project wants to merge 3 commits into MoonshotAI:main from Sisyphbaous-DT-Project:fix-preserve-line-endings
Conversation

@Sisyphbaous-DT-Project Sisyphbaous-DT-Project commented May 24, 2026

Related Issue

Resolve #1952
Resolve #2191

Description

Fix StrReplaceFile and WriteFile tools corrupting the original newline style when editing files.

Root Cause: readtext() opens files in Python's default universal newlines mode (newline=None), which silently converts \r\n to \n on read. writetext() was already fixed in PR #1693 by adding newline="", so the read side was the remaining gap: the read→edit→write pipeline still converted CRLF files to LF, and git diff showed the entire file as changed even when only one line was modified.
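The read-side behavior can be reproduced with the standard library alone. The sketch below is illustrative and not taken from the PR; `read_both_ways` is a hypothetical helper that writes raw bytes to a temporary file and reads them back with and without `newline=""`.

```python
import os
import tempfile


def read_both_ways(raw: bytes) -> tuple[str, str]:
    """Write raw bytes to a temp file, then read it back in text mode twice."""
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(raw)
        # Default newline=None: universal newlines, so "\r\n" and "\r" become "\n".
        with open(path, encoding="utf-8") as f:
            translated = f.read()
        # newline="": line endings are returned to the caller untranslated.
        with open(path, encoding="utf-8", newline="") as f:
            untouched = f.read()
    finally:
        os.remove(path)
    return translated, untouched


translated, untouched = read_both_ways(b"a\r\nb\r\n")
print(repr(translated))  # 'a\nb\n'
print(repr(untouched))   # 'a\r\nb\r\n'
```

The translation on read happens on every platform, which is why the CRLF loss is not limited to Windows hosts.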

Fix Details:

  1. packages/kaos/src/kaos/local.py

    • Add newline="" to readtext() and readlines() to disable Python's universal newlines auto-conversion, aligning with the already-fixed writetext().
  2. packages/kaos/src/kaos/ssh.py

    • Change readtext() to use readbytes() + decode() to avoid newline conversion by asyncssh SFTP text mode.
    • Use splitlines(keepends=True) in readlines() to preserve trailing newlines.
    • Change writetext() to byte mode ("wb"/"ab") to prevent newline conversion by the remote server.
  3. src/kimi_cli/tools/file/replace.py

    • Change StrReplaceFile to read raw bytes and detect the file's dominant newline style (CRLF / LF / CR).
    • Normalize content to \n for model matching (model-generated old/new strings use \n).
    • Restore the original newline style after replacement and write to disk via write_bytes().
  4. src/kimi_cli/tools/file/write.py

    • In WriteFile, when in overwrite mode and the file already exists, detect its newline style and write the new content in that style.
    • For new files, preserve the newline style of the passed-in content.
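The detect, normalize, and restore steps described for StrReplaceFile can be sketched as follows. This is a minimal illustration under stated assumptions, not the PR's actual code: the function names (`detect_eol`, `normalize_eol`, `restore_eol`, `replace_preserving_eol`) and the tie-breaking rule (ties favor LF) are hypothetical.

```python
def detect_eol(raw: bytes) -> str:
    """Return the dominant line ending in raw: "\n", "\r\n", or "\r"."""
    crlf = raw.count(b"\r\n")
    lf = raw.count(b"\n") - crlf   # bare LF only
    cr = raw.count(b"\r") - crlf   # bare CR only
    counts = {"\n": lf, "\r\n": crlf, "\r": cr}
    # max() returns the first key with the highest count, so ties favor LF
    # (assumed tie-break; an empty file also falls back to LF).
    return max(counts, key=lambda eol: counts[eol])


def normalize_eol(text: str) -> str:
    """Collapse CRLF and bare CR to LF so model-supplied strings can match."""
    return text.replace("\r\n", "\n").replace("\r", "\n")


def restore_eol(text: str, eol: str) -> str:
    """Convert LF-normalized text back to the detected style."""
    return text if eol == "\n" else text.replace("\n", eol)


def replace_preserving_eol(raw: bytes, old: str, new: str,
                           encoding: str = "utf-8") -> bytes:
    """Replace old with new in raw file bytes, keeping the file's EOL style."""
    eol = detect_eol(raw)
    text = normalize_eol(raw.decode(encoding))
    text = text.replace(normalize_eol(old), normalize_eol(new))
    return restore_eol(text, eol).encode(encoding)


result = replace_preserving_eol(b"a\r\nb\r\nc\r\n", "b\n", "B\n")
print(result)  # b'a\r\nB\r\nc\r\n'
```

Because the restore step applies a single style to the whole file, a file with mixed line endings comes out unified to its dominant style in this sketch.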

Test Results:

  • packages/kaos/tests/test_local_kaos.py: 16 passed
  • Added 9 new EOL-preservation test cases, all passing.
  • Additional regression tests: 6/6 passed (covering no-match, replace_all, empty old, append, no-trailing-newline, real-code-scenario).

Checklist

  • I have read the CONTRIBUTING document.
  • I have linked the related issue, if any.
  • I have added tests that prove my fix is effective or that my feature works.
  • I have run make gen-changelog to update the changelog.
  • I have run make gen-docs to update the user documentation.

@github-actions github-actions Bot changed the title from "fix: 保留文件原始换行符风格,修复跨平台 CRLF/LF 问题" to "fix: 保留文件原始换行符风格,修复跨平台 CRLF/LF 问题 || fix: retain the original line break style of the file and fix cross-platform CRLF/LF issues" on May 24, 2026
@Randy-sin
There is a line-ending edge case worth covering before merge.

For CRLF files, both the WriteFile overwrite path and the StrReplaceFile restore path convert with content.replace("\n", "\r\n"). If the incoming replacement content already contains CRLF, this turns each \r\n into \r\r\n.

Example shape:

"a\r\nb\r\n".replace("\n", "\r\n") == "a\r\r\nb\r\r\n"

This can happen if the tool input already carries CRLF text, or if edit.new contains a CRLF block. A safer approach is to normalize incoming content first (\r\n/\r -> \n) before converting to the detected file EOL, and add a test for pre-CRLF input.
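One way to make the conversion safe for input that already contains CRLF is a single-pass substitution that treats \r\n, \r, and \n as one line break each. This is a sketch of an alternative to the two-step normalize-then-convert approach suggested above; `convert_eol` is a hypothetical name, not a function from the PR.

```python
import re

# "\r\n" is listed first so a CRLF pair is matched as one line break,
# never as a CR followed by a separate LF.
_ANY_EOL = re.compile(r"\r\n|\r|\n")


def convert_eol(content: str, eol: str) -> str:
    """Rewrite every line break in content to eol in a single pass."""
    # A function replacement avoids any escape processing of the eol string.
    return _ANY_EOL.sub(lambda _match: eol, content)


naive = "a\r\nb\r\n".replace("\n", "\r\n")
safe = convert_eol("a\r\nb\r\n", "\r\n")
print(repr(naive))  # 'a\r\r\nb\r\r\n'
print(repr(safe))   # 'a\r\nb\r\n'
```

The conversion is idempotent: applying `convert_eol` to its own output with the same `eol` leaves the text unchanged, which is the property the naive `replace` lacks.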

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 8174eceea8

Comment thread src/kimi_cli/tools/file/write.py Outdated
Comment thread src/kimi_cli/tools/file/replace.py Outdated
Comment thread packages/kaos/src/kaos/ssh.py Outdated
@Sisyphbaous-DT-Project Sisyphbaous-DT-Project changed the title from "fix: 保留文件原始换行符风格,修复跨平台 CRLF/LF 问题 || fix: retain the original line break style of the file and fix cross-platform CRLF/LF issues" to "fix: retain the original line break style of the file and fix cross-platform CRLF/LF issues" on May 24, 2026
Contributor

@devin-ai-integration devin-ai-integration Bot left a comment

Devin Review found 2 potential issues.

View 5 additional findings in Devin Review.

Comment thread src/kimi_cli/tools/file/write.py
Comment thread packages/kaos/src/kaos/ssh.py Outdated
Author

Sisyphbaous-DT-Project commented May 24, 2026

Thanks to all the review bots for the feedback. Commit 5a07501 addresses the following 3 issues:

@chatgpt-codex-connector

  • P1 Badge (write.py: normalize content before converting LF to CRLF): fixed

    • Added a normalization step before writing, content.replace("\r\n", "\n").replace("\r", "\n"), so that an existing CRLF is no longer double-converted into \r\r\n.
  • P2 Badge (replace.py: Preserve mixed line endings during string replacement): partially addressed

    • The same normalization step was added here to eliminate the double-conversion risk.
    • The file-wide EOL conversion remains intentionally simple by design. An exact line-by-line mapping for files with mixed line endings would add significant complexity (an earlier evaluation found it would require a per-line EOL mapping table), and such files are extremely rare in practice. The current solution covers most usage scenarios.
  • P2 Badge (ssh.py: returns character count written from SSH text): fixed

    • writetext() now calls await f.write(encoded) and then returns len(data), so it reports the number of characters rather than the number of bytes. A CJK regression test was added to verify this.
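The character-count versus byte-count distinction matters as soon as the text contains multi-byte characters. The sketch below is a synchronous stand-in that uses an in-memory buffer rather than asyncssh; `write_text_as_bytes` is a hypothetical helper that mirrors the described behavior (write the encoded bytes, return `len(data)`).

```python
import io
from typing import BinaryIO


def write_text_as_bytes(f: BinaryIO, data: str, encoding: str = "utf-8") -> int:
    """Write data through a binary handle and return the character count."""
    encoded = data.encode(encoding)
    f.write(encoded)       # bytes go out unchanged, so "\r\n" stays "\r\n"
    return len(data)       # characters, not len(encoded) bytes


buffer = io.BytesIO()
data = "换行\r\n"          # two CJK characters plus CRLF
written = write_text_as_bytes(buffer, data)
print(written)                  # 4
print(len(buffer.getvalue()))   # 8
```

Each of the two CJK characters takes three bytes in UTF-8, so returning the byte count would report 8 for a 4-character string.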

@devin-ai-integration

  • WriteFile append mode does not preserve CRLF line endings: by design

    • Append mode adds new content to the end of the file, and the content passed in by the model is usually the exact text it wants appended. Converting newlines in appended content may not match the model's intent (the model may deliberately append LF lines to a CRLF file). Overwrite needs consistency because it replaces the entire file, whereas append is incremental, so a mismatch in edge cases is acceptable.
  • SSH writetext returns number of bytes: fixed (same change as above)

With these changes, all 50 tests pass (including 12 EOL-related tests).

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 5a075014e3

Comment thread src/kimi_cli/tools/file/write.py Outdated