Skip to content

Add format validation for WriteFile tool#1738

Open
MaxwellGengYF wants to merge 2 commits intoMoonshotAI:mainfrom
MaxwellGengYF:main
Open

Add format validation for WriteFile tool#1738
MaxwellGengYF wants to merge 2 commits intoMoonshotAI:mainfrom
MaxwellGengYF:main

Conversation

@MaxwellGengYF
Copy link
Copy Markdown

@MaxwellGengYF MaxwellGengYF commented Apr 3, 2026

#1736

Format validation is triggered after a file is successfully written or appended. The performance impact is negligible — validation runs only on JSON, XML, and Markdown files after write completion, incurring minimal overhead from standard library parsers.

If validation fails:

  • The operation is marked as unsuccessful (returns ToolError)
  • Error messages exceeding 65,536 characters are saved to temporary files to prevent display issues
  • Error messages include line and column numbers to facilitate debugging

Open with Devin

Format validation is triggered after a file is successfully written or appended. **The performance impact is negligible** — validation runs only on JSON, XML, and Markdown files after write completion, incurring minimal overhead from standard library parsers.

If validation fails:
- The operation is marked as unsuccessful (returns `ToolError`)
- Error messages exceeding 65,536 characters are saved to temporary files to prevent display issues
- Error messages include line and column numbers to facilitate debugging
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 5bd4f923c3

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment on lines +216 to +217
if line.strip().startswith('```'):
continue
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Skip fenced code contents in markdown validators

check_md is now wired into WriteFile, but the markdown checks only skip lines that start with ``` and still lint the lines inside fenced blocks. This makes valid markdown fail when code samples contain characters like # or `[` (for example `#include <...>`), so `WriteFile` returns a `ToolError` for otherwise valid `.md` files. Track fence state across lines (as `_check_code_fence_balance` already does) and bypass header/link/table checks while inside a fenced block.

Useful? React with 👍 / 👎.

i = 0
while i < len(line):
# Look for opening bracket (not escaped)
if line[i] == '[' and not _is_escaped(line, i):
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Limit link-balance checks to actual markdown links

The validator treats every unescaped [ as the start of a markdown link and reports an error if no matching ] is found, but unmatched brackets are valid plain text in markdown. With this change, writing a .md file containing normal prose like array[index now fails format validation and surfaces as a WriteFile error. The scan should only enforce parenthesis/bracket balance for recognized link/image syntaxes rather than all bracket characters.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown
Contributor

@devin-ai-integration devin-ai-integration bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Devin Review found 4 potential issues.

View 5 additional findings in Devin Review.

Open in Devin Review


if md_callback:
md_callback(content)
return errors
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 check_md and check_md_str return empty list [] instead of None on valid input

When markdown validation passes (no errors), _validate_markdown returns an empty list []. The if errors: guard at line 80 is falsy for [], so the early return is skipped, and return errors at line 85 returns [] instead of the documented None. The same issue exists in check_md_str at line 110. While the current caller in write.py:203 uses if fmt_error: (which treats [] as falsy), this violates the function's str | None return contract and will break any future caller using if result is None: or similar identity checks.

Suggested change
return errors
return None
Open in Devin Review

Was this helpful? React with 👍 or 👎 to provide feedback.


if md_callback:
md_callback(content)
return errors
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 check_md_str returns empty list [] instead of None on valid input

Same issue as check_md at line 85: when markdown is valid, return errors returns [] instead of None, violating the documented str | None return type.

Suggested change
return errors
return None
Open in Devin Review

Was this helpful? React with 👍 or 👎 to provide feedback.

Comment on lines +171 to +197
def _check_header_syntax(lines: list[str]) -> list[str]:
"""Check if headers have proper syntax."""
errors: list[str] = []

for line_idx, line in enumerate(lines, start=1):
stripped = line.lstrip()

# Check for header syntax
if stripped.startswith('#'):
# Count leading hashes
match = re.match(r'^(#{1,6})\s', stripped)
if not match:
# Check if it's more than 6 hashes (invalid header level)
hash_match = re.match(r'^(#{7,})\s', stripped)
if hash_match:
col = line.index('#') + 1
errors.append(
f"Markdown validation error at line {line_idx}, column {col}: Invalid header level (max 6), found {len(hash_match.group(1))} # characters")
else:
# Header without space after hashes
hash_only_match = re.match(r'^(#{1,6})([^\s]|$)', stripped)
if hash_only_match:
col = line.index('#') + 1
errors.append(
f"Markdown validation error at line {line_idx}, column {col}: Header must be followed by a space")

return errors
Copy link
Copy Markdown
Contributor

@devin-ai-integration devin-ai-integration bot Apr 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔴 Markdown validators don't track code block state, causing false positive errors on valid files

The _check_header_syntax, _check_link_balance, and _check_table_syntax functions all iterate through every line of the markdown without tracking whether lines are inside a fenced code block. This means content inside code blocks (e.g., #comment in a Python block, data[0] split across lines, or piped shell output) is validated as if it were markdown prose. For instance, #comment inside a ```python block triggers "Header must be followed by a space", and a multi-line array index like items[\n 0\n] triggers "Unclosed link bracket". Since check_md is called from write.py:201 and any returned error causes a ToolError at write.py:205-208, valid .md files with code blocks will be reported as having format validation failures, potentially causing the LLM to attempt unnecessary and harmful "fixes". Note that _check_code_fence_balance already correctly tracks open/close fence state but this information is not shared with the other validators.

Reproduction: Python comment inside code block

A markdown file containing:

# Title

\`\`\`python
#comment
\`\`\`

_check_header_syntax flags line 4 (#comment) as "Header must be followed by a space" because it doesn't know it's inside a code fence.

Similarly, _check_link_balance at line 216 has a comment saying "Skip code blocks" but only skips the fence marker lines themselves (if line.strip().startswith('\``'): continue`), not lines inside the block.

Open in Devin Review

Was this helpful? React with 👍 or 👎 to provide feedback.

Comment on lines +214 to +217
for line_idx, line in enumerate(lines, start=1):
# Skip code blocks
if line.strip().startswith('```'):
continue
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔴 _check_link_balance only skips code fence lines, not content inside code blocks

At check_fmt.py:216, the function skips lines matching line.strip().startswith('\``')(the fence markers themselves), but does not track whether it is currently inside a fenced code block. All lines *between* opening and closing fences are still checked for bracket balance. Any[character inside a code block (e.g., array indexingarr[i], or example link syntax) that isn't closed on the same line will be reported as "Unclosed link bracket". This produces false positive ToolErrors from the WriteFile tool for valid .md` files.

Prompt for agents
The _check_link_balance function in src/kimi_cli/tools/file/check_fmt.py at line 216 only skips lines that ARE code fence markers (starting with ```), but does not skip content lines that are inside a fenced code block. This causes false positives for bracket characters inside code blocks.

The fix should add an in_code_block state variable that toggles when a ``` fence line is encountered, and skip all link balance checking while in_code_block is True. This is the same pattern needed by _check_header_syntax and _check_table_syntax.
Open in Devin Review

Was this helpful? React with 👍 or 👎 to provide feedback.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 122a6b94a3

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

"""
try:
js = None
with open(file_path, 'r', encoding='utf-8') as f:
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Validate files through KaosPath backend

check_json reads file_path with Python open(), but WriteFile writes via KaosPath (which can target non-local backends such as ACP/remote Kaos). In those sessions this validation step reads the local filesystem path instead of the file that was just written, so .json/.xml/.md writes can incorrectly fail (or validate the wrong file). Please run format validation through Kaos-backed reads so validation uses the same filesystem backend as the write.

Useful? React with 👍 / 👎.

Comment on lines +284 to +287
if stripped.startswith('|') and stripped.endswith('|'):
if not in_table:
in_table = True
table_start_line = line_idx
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Avoid treating any pipe-delimited line as a table

This branch marks every line that starts and ends with | as a table row, then later errors if no separator row is found. In Markdown, a lone line like | not a table | is valid plain text unless it is followed by a table separator, so this logic rejects valid .md content and causes WriteFile to return a format error unnecessarily. Detect a table only after confirming a valid header+separator pattern.

Useful? React with 👍 / 👎.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant