Skip to content

feat(logging): add diagnostic logging at key error paths and bundle logs in export#1756

Merged
RealKai42 merged 11 commits intomainfrom
kaiyi/delhi-v1
Apr 7, 2026
Merged

feat(logging): add diagnostic logging at key error paths and bundle logs in export#1756
RealKai42 merged 11 commits intomainfrom
kaiyi/delhi-v1

Conversation

@RealKai42
Copy link
Copy Markdown
Collaborator

@RealKai42 RealKai42 commented Apr 4, 2026

Description

Add diagnostic logging across the codebase so that when users export a session via kimi export, developers have sufficient error context to debug issues.

Changes

1. Logger calls at key error paths (25+ call sites)

Added logger.warning / logger.error at previously silent error paths:

  • Soul layer: agent loop step failures, retry attempts (now include error type/message), connection recovery exhaustion, compaction failures (now distinguishable from step failures)
  • Tool execution layer: toolset catch-all, JSON parse errors, MCP tool failures/timeouts/error responses
  • File tools: catch-all handlers in ReadFile, WriteFile, StrReplaceFile, Glob, Grep, ReadMediaFile
  • Web tools: SearchWeb (HTTP errors, timeouts, parse errors), FetchURL (HTTP errors, timeouts, network errors for both local and service paths)
  • Shell tool: foreground execution catch-all, background task start failures
  • Other: create_llm missing config, hook block events, hook callback failures with error details

2. Export bundles global diagnostic logs

kimi export now includes recent global log files (logs/ subdirectory in ZIP):

  • Collects log files near the session's active period (from wire.jsonl timestamps ± 2 days)
  • Also collects log files near the export time (± 2 days)
  • Falls back to export-time-only when wire.jsonl is absent
  • Prints a privacy notice when logs are included

3. Documentation

Updated kimi export descriptions in both EN/ZH docs to mention diagnostic logs.


Open with Devin

…rt bundling

- Added logging for various error scenarios in tools, including file read/write, glob, and shell command execution.
- Implemented logging for HTTP errors and timeouts in web fetch and search tools.
- Introduced a new test suite to validate logging behavior for diagnostic purposes, ensuring key error paths are logged appropriately.
- Updated the export functionality to include recent diagnostic logs and added warnings for potential sensitive information.
Copilot AI review requested due to automatic review settings April 4, 2026 16:01
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: c9a04f8bfa

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +255 to +256
for log_path in log_files:
zf.write(log_path, arcname=f"logs/{log_path.name}")
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Tolerate log rotation while adding files to export ZIP

The new log bundling loop assumes every path in log_files still exists when zf.write runs, but these files live in the global rotating log directory and can be renamed/deleted by rotation/retention between collection and archive creation. In that case zipfile raises FileNotFoundError/OSError and aborts the whole export, even though the session data itself is still exportable. Please guard each log write with error handling (skip missing logs) so kimi export remains reliable during concurrent log rotation.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR adds broader diagnostic logging across error paths and enhances kimi export to bundle recent global log files into the exported session ZIP, so exported artifacts contain enough context to debug failures.

Changes:

  • Added logger.warning / logger.error at multiple previously-silent failure points (tools, soul loop, hooks, LLM creation, subagent runner).
  • Updated kimi export to include recent global log files in the ZIP (with a privacy notice).
  • Added focused tests for log-file collection and for selected new logging call sites; updated EN/ZH docs to mention diagnostic logs.

Reviewed changes

Copilot reviewed 18 out of 18 changed files in this pull request and generated 5 comments.

Show a summary per file
File Description
tests/core/test_session_logging.py Adds tests for export log collection and for representative new logging call sites.
src/kimi_cli/cli/export.py Collects and zips recent global logs alongside session files; adds privacy notice.
src/kimi_cli/tools/web/search.py Logs HTTP/parse/timeout/network errors for SearchWeb.
src/kimi_cli/tools/web/fetch.py Logs HTTP/timeout/network errors for FetchURL (local + service paths).
src/kimi_cli/tools/shell/init.py Logs exceptions for foreground shell execution and background task start failures.
src/kimi_cli/tools/file/read.py Logs unexpected ReadFile failures before returning ToolError.
src/kimi_cli/tools/file/write.py Logs unexpected WriteFile failures before returning ToolError.
src/kimi_cli/tools/file/replace.py Logs unexpected StrReplaceFile failures before returning ToolError.
src/kimi_cli/tools/file/read_media.py Logs unexpected ReadMediaFile failures before returning ToolError.
src/kimi_cli/tools/file/glob.py Logs unexpected Glob failures before returning ToolError.
src/kimi_cli/tools/file/grep_local.py Logs unexpected Grep failures before returning ToolError.
src/kimi_cli/subagents/runner.py Adds warning logs for common subagent failure classes (max steps / provider errors).
src/kimi_cli/soul/toolset.py Logs tool JSON parse failures, tool execution failures, tool-load failures, and MCP tool error/timeout paths.
src/kimi_cli/soul/kimisoul.py Adds error/warning logs for compaction failures, step failures, connection recovery exhaustion, and retry attempts.
src/kimi_cli/llm.py Logs a warning when LLM creation fails due to missing base_url/model.
src/kimi_cli/hooks/engine.py Promotes some hook failure paths to warning logs and logs hook “block” events.
docs/en/reference/kimi-command.md Updates kimi export docs to mention diagnostic logs.
docs/zh/reference/kimi-command.md Updates kimi export docs to mention diagnostic logs.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +169 to +172
result.append(f)

result.sort(key=lambda p: p.stat().st_mtime)
return result
Copy link

Copilot AI Apr 4, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

_collect_recent_log_files() sorts using p.stat() again, which can raise (e.g., file rotated/deleted between the earlier stat() and the sort) and abort the entire export. Consider caching mtimes alongside paths and sorting on the cached value, or handling OSError during the sort to make export best-effort.

Copilot uses AI. Check for mistakes.
Comment on lines +334 to +335
except ImportError:
logger.warning("Tool module import failed: {module_name}", module_name=module_name)
Copy link

Copilot AI Apr 4, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The except ImportError: path logs only the module name, but not the actual exception, which makes diagnosing missing deps / import-time errors harder. Capture the exception (except ImportError as e) and include it in the log (or use logger.exception) so diagnostics contain the root cause.

Suggested change
except ImportError:
logger.warning("Tool module import failed: {module_name}", module_name=module_name)
except ImportError as e:
logger.warning(
"Tool module import failed: {module_name}: {error}",
module_name=module_name,
error=str(e),
)

Copilot uses AI. Check for mistakes.
Comment on lines 60 to +65
if response.status >= 400:
logger.warning(
"FetchURL HTTP error: status={status}, url={url}",
status=response.status,
url=params.url,
)
Copy link

Copilot AI Apr 4, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These warnings log the full params.url. URLs can embed credentials/tokens (userinfo or query params), and these logs are now bundled into kimi export. Consider redacting sensitive URL parts (e.g., strip userinfo and query string, or redact known secret keys) before logging.

Copilot uses AI. Check for mistakes.
Comment on lines +125 to +130
except Exception as e:
logger.error(
"Shell command execution failed: {command}: {error}",
command=params.command,
error=e,
)
Copy link

Copilot AI Apr 4, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Logging the full shell command can leak secrets (tokens/passwords embedded in CLI args), and these logs are now bundled by kimi export. Consider logging a redacted/normalized command (or only the executable + a hint) while keeping the full command in the approval display/output where appropriate.

Copilot uses AI. Check for mistakes.
Comment on lines 705 to +729
@@ -712,6 +721,12 @@ async def _agent_loop(self) -> TurnOutcome:
back_to_the_future = e
except Exception as e:
# any other exception should interrupt the step
logger.error(
"Agent step {step_no} failed: {error_type}: {error}",
step_no=step_no,
error_type=type(e).__name__,
error=e,
)
Copy link

Copilot AI Apr 4, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These exception handlers log only type/str via logger.error, which drops the traceback. Since this is a key diagnostic path (and logs are now exported for debugging), consider using logger.exception(...) (or equivalent) to include the stack trace for compaction/step failures.

Copilot uses AI. Check for mistakes.
devin-ai-integration[bot]

This comment was marked as resolved.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: fcc56c89a1

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

):
result.append(f)

result.sort(key=lambda p: p.stat().st_mtime)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Handle missing log files when sorting by mtime

The new export path can still fail under concurrent log rotation before ZIP creation: _collect_recent_log_files already handles OSError during the first stat, but result.sort(key=lambda p: p.stat().st_mtime) does a second uncaught stat. If retention/rotation removes or renames a log between collection and sort, FileNotFoundError bubbles up and aborts kimi export. Please sort using the already-read mtimes (or catch OSError in the sort key path) so export remains robust while logs rotate.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: f07c30a3a5

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines 304 to +306
for file_path in files:
zf.write(file_path, arcname=file_path.name)
with contextlib.suppress(OSError):
zf.write(file_path, arcname=str(file_path.relative_to(session_dir)))
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Fail export when session files cannot be archived

Wrapping session-file writes in contextlib.suppress(OSError) makes kimi export succeed even when core artifacts (for example context.jsonl or wire.jsonl) cannot be read due to transient deletion, permission changes, or I/O errors. That silently produces incomplete archives with no signal to the user, which undermines the purpose of exporting a session for debugging. The suppression should be limited to optional global logs (or at least surfaced as an explicit warning/error) rather than required session data.

Useful? React with 👍 / 👎.

Comment on lines +188 to +189
if total_size + size > _MAX_LOG_BYTES:
break
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Continue past oversized logs when enforcing size cap

The size-cap loop breaks on the first file that would exceed _MAX_LOG_BYTES, so one very large high-priority log can cause all subsequent smaller logs to be skipped even when they would fit within the cap. In practice this can yield an export with no diagnostic logs at all despite having many eligible small logs. To preserve diagnostic coverage, oversized entries should be skipped and iteration should continue.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 61b974e2c7

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".


# Collect candidates with (mtime, size, path)
candidates: list[tuple[float, int, Path]] = []
for f in log_dir.iterdir():
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Handle OSError when scanning the global log directory

_collect_recent_log_files() treats diagnostic logs as optional, but for f in log_dir.iterdir() is unguarded. If ~/.kimi/logs exists but is unreadable (permission change) or disappears between is_dir() and iteration, iterdir() raises OSError and aborts kimi export, even though session data is still exportable. Wrap directory iteration in try/except OSError (or fall back to []) so export remains best-effort for logs.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: c1688a3722

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".


# Collect candidates with (mtime, size, path)
candidates: list[tuple[float, int, Path]] = []
for f in log_dir.iterdir():
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Handle OSError when enumerating the global logs directory

_collect_recent_log_files only checks log_dir.is_dir() before iterating, but log_dir.iterdir() itself can raise OSError (for example if the logs directory is deleted, permission-changed, or transiently unavailable between the check and iteration). In that case kimi export aborts even though logs are optional and the session archive could still be produced. Please wrap the directory listing step in try/except OSError and treat this as a best-effort miss.

Useful? React with 👍 / 👎.

Comment on lines +307 to +309
for log_path in log_files:
with contextlib.suppress(OSError):
zf.write(log_path, arcname=f"logs/{log_path.name}")
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Avoid buffering log-heavy exports fully in memory

This change can now add up to 100MB of global logs to each export, but the archive is still assembled in an in-memory BytesIO and then copied again via getvalue(). In low-memory environments, exporting a large session plus diagnostic logs can consume hundreds of MB transiently and fail with OOM, which makes the diagnostic path unreliable exactly when needed. Stream the zip directly to the destination file instead of buffering the whole archive in RAM.

Useful? React with 👍 / 👎.

@RealKai42 RealKai42 merged commit 8bbdcd0 into main Apr 7, 2026
21 checks passed
@RealKai42 RealKai42 deleted the kaiyi/delhi-v1 branch April 7, 2026 08:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants