Merged

19 commits
142c9db
feat: add btw
RealKai42 Apr 1, 2026
207998c
Add QuestionRequestPanel and QuestionPromptDelegate for interactive q…
RealKai42 Apr 1, 2026
989d1d8
fix(render): ensure render_agent_status uses compose_agent_output to …
RealKai42 Apr 3, 2026
1811855
fix(visualize): improve task management and modal handling in _Prompt…
RealKai42 Apr 3, 2026
10640a3
fix(prompt): update input section header and hint messages for clarity
RealKai42 Apr 3, 2026
5329c60
fix(interactive): implement wait_for_btw_dismiss to handle modal dism…
RealKai42 Apr 3, 2026
ef01c86
merge from main
RealKai42 Apr 3, 2026
0fb6f0a
fix(shell): rename _set_active_approval_sink to _set_active_view for …
RealKai42 Apr 3, 2026
6e15d5c
fix(interactive): update steer handling to use counter instead of deq…
RealKai42 Apr 3, 2026
6f1adf1
fix: UI polish, bug fixes, and e2e test alignment
RealKai42 Apr 3, 2026
66311b8
fix: update wire protocol tests for v1.9 and btw slash command
RealKai42 Apr 3, 2026
b070a49
fix: shell-only btw, queue safety, and review feedback
RealKai42 Apr 4, 2026
3ffa6c2
fix(shell): improve queue drain safety, btw panel scroll borders, and…
RealKai42 Apr 4, 2026
1b54269
fix(btw): reject mixed text+tool responses and block shell commands d…
RealKai42 Apr 5, 2026
849ee2c
fix: escape Rich markup in btw spinner and panel title to prevent ren…
RealKai42 Apr 7, 2026
6b7e477
feat(datetime): add format_elapsed function for human-friendly elapse…
RealKai42 Apr 7, 2026
add0983
docs: update changelog
RealKai42 Apr 7, 2026
f702def
Merge branch 'main' into kaiyi/perth
RealKai42 Apr 7, 2026
9f29170
Merge branch 'main' into kaiyi/perth
RealKai42 Apr 7, 2026
6 changes: 6 additions & 0 deletions CHANGELOG.md
@@ -11,6 +11,12 @@ Only write entries that are worth mentioning to users.

## Unreleased

- Shell: Add `/btw` side question command — ask a quick question during streaming without interrupting the main conversation; uses the same system prompt and tool definitions for prompt cache alignment; responses display in a scrollable modal panel with streaming support
- Shell: Redesign bottom dynamic area — split the monolithic `visualize.py` (1865 lines) into a modular package (`visualize/`) with dedicated modules for input routing, interactive prompts, approval/question panels, and btw modal; unify input semantics with `classify_input()` for consistent command routing
- Shell: Add queue and steer dual-channel input during streaming — Enter queues messages for delivery after the current turn; Ctrl+S injects messages immediately into the running turn's context; queued messages display in the prompt area with count indicator and can be recalled with ↑
- Shell: Add `BtwBegin`/`BtwEnd` wire events for cross-client side question support
- Shell: Improve elapsed time formatting in spinners — durations over 60 seconds now display as `"1m 23s"` instead of `"83s"`; sub-second durations show `"<1s"`
- Shell: Fix Rich markup injection in btw panel — user questions containing `[`/`]` characters are now escaped to prevent broken rendering or style injection in spinner text and panel titles
- Core: Improve error diagnostics — enrich internal logging coverage, include relevant log files and system manifest in `kimi export` archives, and surface actionable error messages for common failures (auth, network, timeout, quota)
- Shell: Use `git ls-files` for `@` file mention discovery — file completer now queries `git ls-files --recurse-submodules` with a 5-second timeout as the primary discovery mechanism, falling back to `os.walk` for non-git repositories; this fixes large repositories (e.g., apache/superset with 65k+ files) where the 1000-file limit caused late-alphabetical directories to be unreachable (fixes #1375)
- Core: Add shared `file_filter` module — unifies file mention logic between shell and web UIs via `src/kimi_cli/utils/file_filter.py`, providing consistent path filtering, ignored directory exclusion, and git-aware file discovery
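The elapsed-time entry above amounts to three formatting rules. A minimal sketch of those rules, assuming a plain seconds-in, string-out signature (the actual `format_elapsed` added by this PR lives in the datetime utilities and may differ):

```python
def format_elapsed(seconds: float) -> str:
    """Human-friendly elapsed time: "<1s", "42s", "1m 23s" (sketch, not the PR's code)."""
    if seconds < 1:
        return "<1s"
    total = int(seconds)
    if total < 60:
        return f"{total}s"
    minutes, rest = divmod(total, 60)
    return f"{minutes}m {rest}s"
```

For example, a spinner running for 83 seconds would now show `1m 23s` rather than `83s`.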
13 changes: 13 additions & 0 deletions docs/en/reference/keyboard.md
@@ -11,6 +11,7 @@ Kimi Code CLI shell mode supports the following keyboard shortcuts.
| `Ctrl-O` | Edit in external editor (`$VISUAL`/`$EDITOR`) |
| `Ctrl-J` | Insert newline |
| `Alt-Enter` | Insert newline (same as `Ctrl-J`) |
| `Ctrl-S` | Steer: inject input immediately into the running turn (during streaming) |
| `Ctrl-V` | Paste (supports images and video files) |
| `Ctrl-E` | Expand full approval request content |
| `1`–`4` | Quick select approval option (`4` for decline with feedback) |
@@ -82,6 +83,18 @@ Paste clipboard content into the input box. Supports:
Image pasting requires the model to support `image_in` capability. Video pasting requires the model to support `video_in` capability.
:::

## Streaming input

### `Ctrl-S`: Steer

During streaming, press `Ctrl-S` to submit the current input (or pop the oldest queued message) and inject it immediately into the running turn's context. The model sees your message right away without waiting for the current turn to end.

If the input box is empty and there are queued messages, `Ctrl-S` pops the oldest queued message and steers it instead.

### `Enter`: Queue

During streaming, pressing `Enter` queues your message for delivery after the current turn completes. The queued message count is shown in the input header (e.g., `── input · 2 queued ──`). Press `↑` on an empty input to recall the last queued message for editing.

## Approval request operations

### `Ctrl-E`: Expand full content
14 changes: 14 additions & 0 deletions docs/en/reference/slash-commands.md
@@ -242,6 +242,20 @@ Directories already within the working directory do not need to be added, as the

## Others

### `/btw`

Ask a quick side question without interrupting the main conversation. Available both when idle and during streaming.

Usage: `/btw <question>`

The side question runs in an isolated context: it sees the conversation history but does not modify it. Tool calls are disabled — the response is text-only, based on the model's existing knowledge of the conversation.

During streaming, the response appears in a scrollable modal panel overlaying the prompt area. Use `↑`/`↓` to scroll, `Escape` to dismiss.

::: tip
This command is only available in interactive shell mode. Wire and ACP clients can use the `BtwBegin`/`BtwEnd` wire events with the `run_side_question()` API.
:::

### `/init`

Analyze the current project and generate an `AGENTS.md` file.
6 changes: 6 additions & 0 deletions docs/en/release-notes/changelog.md
@@ -4,6 +4,12 @@ This page documents the changes in each Kimi Code CLI release.

## Unreleased

- Shell: Add `/btw` side question command — ask a quick question during streaming without interrupting the main conversation; uses the same system prompt and tool definitions for prompt cache alignment; responses display in a scrollable modal panel with streaming support
- Shell: Redesign bottom dynamic area — split the monolithic `visualize.py` (1865 lines) into a modular package (`visualize/`) with dedicated modules for input routing, interactive prompts, approval/question panels, and btw modal; unify input semantics with `classify_input()` for consistent command routing
- Shell: Add queue and steer dual-channel input during streaming — Enter queues messages for delivery after the current turn; Ctrl+S injects messages immediately into the running turn's context; queued messages display in the prompt area with count indicator and can be recalled with ↑
- Shell: Add `BtwBegin`/`BtwEnd` wire events for cross-client side question support
- Shell: Improve elapsed time formatting in spinners — durations over 60 seconds now display as `"1m 23s"` instead of `"83s"`; sub-second durations show `"<1s"`
- Shell: Fix Rich markup injection in btw panel — user questions containing `[`/`]` characters are now escaped to prevent broken rendering or style injection in spinner text and panel titles
- Core: Improve error diagnostics — enrich internal logging coverage, include relevant log files and system manifest in `kimi export` archives, and surface actionable error messages for common failures (auth, network, timeout, quota)
- Shell: Use `git ls-files` for `@` file mention discovery — file completer now queries `git ls-files --recurse-submodules` with a 5-second timeout as the primary discovery mechanism, falling back to `os.walk` for non-git repositories; this fixes large repositories (e.g., apache/superset with 65k+ files) where the 1000-file limit caused late-alphabetical directories to be unreachable (fixes #1375)
- Core: Add shared `file_filter` module — unifies file mention logic between shell and web UIs via `src/kimi_cli/utils/file_filter.py`, providing consistent path filtering, ignored directory exclusion, and git-aware file discovery
13 changes: 13 additions & 0 deletions docs/zh/reference/keyboard.md
@@ -11,6 +11,7 @@ Kimi Code CLI Shell 模式支持以下键盘快捷键。
| `Ctrl-O` | 在外部编辑器中编辑(`$VISUAL`/`$EDITOR`) |
| `Ctrl-J` | 插入换行 |
| `Alt-Enter` | 插入换行(同 `Ctrl-J`) |
| `Ctrl-S` | Steer:在 streaming 期间将输入立即注入到正在运行的轮次 |
| `Ctrl-V` | 粘贴(支持图片和视频文件) |
| `Ctrl-E` | 展开审批请求完整内容 |
| `1`–`4` | 审批面板快速选择(`4` 为附带反馈拒绝) |
@@ -82,6 +83,18 @@ Kimi Code CLI Shell 模式支持以下键盘快捷键。
图片粘贴需要模型支持 `image_in` 能力,视频粘贴需要模型支持 `video_in` 能力。
:::

## Streaming 期间输入

### `Ctrl-S`:Steer(立即注入)

在 streaming 期间,按 `Ctrl-S` 提交当前输入(或弹出最早的排队消息)并立即注入到正在运行的轮次上下文中。模型会立即看到你的消息,无需等待当前轮次结束。

如果输入框为空且有排队消息,`Ctrl-S` 会弹出最早的排队消息并注入。

### `Enter`:排队

在 streaming 期间,按 `Enter` 将消息排入队列,等当前轮次完成后再发送。排队消息数量显示在输入区标题中(如 `── input · 2 queued ──`)。在空输入框中按 `↑` 可召回最后一条排队消息进行编辑。

## 审批请求操作

### `Ctrl-E`:展开完整内容
14 changes: 14 additions & 0 deletions docs/zh/reference/slash-commands.md
@@ -242,6 +242,20 @@ Flow Skill 也可以通过 `/skill:<name>` 调用,此时作为普通 Skill 加

## 其他

### `/btw`

在不打断主对话的情况下提出快速侧问。在空闲和 streaming 期间均可使用。

用法:`/btw <问题>`

侧问在隔离的上下文中运行:能看到对话历史但不会修改它。工具调用被禁用——响应仅基于模型对当前对话的已有了解,以纯文本形式回答。

在 streaming 期间,响应会显示在一个可滚动的模态面板中,覆盖在提示区域上方。使用 `↑`/`↓` 滚动,`Escape` 关闭。

::: tip
此命令仅在交互式 Shell 模式下可用。Wire 和 ACP 客户端可使用 `BtwBegin`/`BtwEnd` Wire 事件配合 `run_side_question()` API。
:::

### `/init`

分析当前项目并生成 `AGENTS.md` 文件。
6 changes: 6 additions & 0 deletions docs/zh/release-notes/changelog.md
@@ -4,6 +4,12 @@

## 未发布

- Shell:新增 `/btw` 侧问命令——在 streaming 期间提出快速问题,不打断主对话;使用相同的系统提示词和工具定义以对齐 Prompt 缓存;响应在可滚动的模态面板中显示,支持流式输出
- Shell:重新设计底部动态区——将单体 `visualize.py`(1865 行)拆分为模块化包(`visualize/`),包含输入路由、交互式提示、审批/提问面板和 btw 模态面板等独立模块;通过 `classify_input()` 统一输入语义,实现一致的命令路由
- Shell:新增 streaming 期间的排队和 steer 双通道输入——Enter 将消息排队,在当前轮次结束后发送;Ctrl+S 将消息立即注入到正在运行的轮次上下文中;排队消息在提示区域显示计数指示器,可通过 ↑ 键召回编辑
- Shell:新增 `BtwBegin`/`BtwEnd` Wire 事件,支持跨客户端侧问
- Shell:改进 spinner 中的耗时格式——超过 60 秒的时长现在显示为 `"1m 23s"` 而非 `"83s"`;低于 1 秒的显示为 `"<1s"`
- Shell:修复 btw 面板中的 Rich markup 注入问题——包含 `[`/`]` 字符的用户问题现在会被转义,防止 spinner 文本和面板标题出现渲染错误或样式注入
- Core:改进错误诊断——丰富内部日志覆盖,在 `kimi export` 导出的归档中包含相关日志文件和系统信息,并为常见错误(认证、网络、超时、配额)提供可操作的提示消息
- Shell:使用 `git ls-files` 进行 `@` 文件引用发现——文件补全器现在优先使用 `git ls-files --recurse-submodules` 查询文件列表(5 秒超时),非 Git 仓库则回退到 `os.walk`;此修复解决了大型仓库(如包含 6.5 万+文件的 apache/superset)中 1000 文件限制导致字母顺序靠后的目录无法访问的问题(修复 #1375)
- Core:新增共享的 `file_filter` 模块——通过 `src/kimi_cli/utils/file_filter.py` 统一 Shell 和 Web 的文件引用逻辑,提供一致的路径过滤、忽略目录排除和 Git 感知文件发现
212 changes: 212 additions & 0 deletions src/kimi_cli/soul/btw.py
@@ -0,0 +1,212 @@
"""Side question ("/btw") - answer a quick question without interrupting the main conversation.

Uses the same system_prompt + normalized history + tool definitions as the main
agent to maximize prompt cache hits. Tools are declared (for cache) but denied
at execution time. maxTurns=2 so if the LLM mistakenly calls a tool on the
first turn, the error result gives it a second chance to answer with text.

The question and response are NOT written to the main context history.
"""

from __future__ import annotations

import uuid
from collections.abc import Callable
from typing import TYPE_CHECKING

import kosong
from kosong.message import Message, ToolCall
from kosong.tooling import Tool, ToolError, ToolResult

from kimi_cli.soul import LLMNotSet, wire_send
from kimi_cli.soul.dynamic_injection import normalize_history
from kimi_cli.soul.message import system_reminder
from kimi_cli.utils.logging import logger
from kimi_cli.wire.types import BtwBegin, BtwEnd, TextPart

if TYPE_CHECKING:
    from kosong.chat_provider import StreamedMessagePart

    from kimi_cli.soul.kimisoul import KimiSoul

_BTW_MAX_TURNS = 2

SIDE_QUESTION_SYSTEM_REMINDER = """\
This is a side question from the user. Answer directly in a single response.

IMPORTANT:
- You are a separate, lightweight instance answering one question.
- The main agent continues independently — do NOT reference being interrupted.
- Do NOT call any tools. All tool calls are disabled and will be rejected.
  Even though tool definitions are visible in this request, they exist only
  for technical reasons (prompt cache). You MUST NOT use them.
- Respond ONLY with text based on what you already know from the conversation.
- This is a one-off response — no follow-up turns.
- If you don't know the answer, say so directly."""


# ---------------------------------------------------------------------------
# DenyAllToolset: advertises tools (cache match) but rejects every call
# ---------------------------------------------------------------------------


class _DenyAllToolset:
    """A toolset that exposes the same tool definitions as the agent (for prompt
    cache matching) but rejects every tool call with an error message."""

    def __init__(self, source_tools: list[Tool]) -> None:
        self._tools = source_tools

    @property
    def tools(self) -> list[Tool]:
        return self._tools

    def handle(self, tool_call: ToolCall) -> ToolResult:
        return ToolResult(
            tool_call_id=tool_call.id,
            return_value=ToolError(
                message="Tool calls are disabled for side questions. Answer with text only.",
                brief="denied",
            ),
        )


# ---------------------------------------------------------------------------
# Context construction
# ---------------------------------------------------------------------------


def _build_btw_context(soul: KimiSoul, question: str) -> tuple[str, list[Message], _DenyAllToolset]:
    """Build (system_prompt, history, toolset) aligned with the main agent.

    Uses the same system_prompt, normalize_history(), and tool definitions
    as ``KimiSoul._step`` so the LLM provider can reuse the prompt cache.
    """
    system_prompt = soul._agent.system_prompt  # pyright: ignore[reportPrivateUsage]
    effective_history = normalize_history(soul.context.history)

    wrapped = f"{system_reminder(SIDE_QUESTION_SYSTEM_REMINDER).text}\n\n{question}"
    side_message = Message(role="user", content=wrapped)

    toolset = _DenyAllToolset(soul._agent.toolset.tools)  # pyright: ignore[reportPrivateUsage]

    return system_prompt, [*effective_history, side_message], toolset


# ---------------------------------------------------------------------------
# Execution
# ---------------------------------------------------------------------------


async def execute_side_question(
    soul: KimiSoul,
    question: str,
    on_text_chunk: Callable[[str], None] | None = None,
) -> tuple[str | None, str | None]:
    """Execute a side question and return (response, error).

    Runs up to ``_BTW_MAX_TURNS`` steps. On the first step, if the LLM
    returns a tool call instead of text, the denied tool result is appended
    to the history and a second step gives the LLM another chance.

    Args:
        soul: The KimiSoul instance (for context and chat_provider access).
        question: The user's side question.
        on_text_chunk: Optional callback for streaming text chunks.

    Returns:
        (response_text, None) on success, (None, error_message) on failure.
    """
    if soul._runtime.llm is None:  # pyright: ignore[reportPrivateUsage]
        return None, "LLM is not set."

    try:
        chat_provider = soul._runtime.llm.chat_provider  # pyright: ignore[reportPrivateUsage]
        system_prompt, history, toolset = _build_btw_context(soul, question)

        text_chunks: list[str] = []

        def _on_part(part: StreamedMessagePart) -> None:
            if isinstance(part, TextPart) and part.text:
                text_chunks.append(part.text)
                if on_text_chunk is not None:
                    on_text_chunk(part.text)

        # Multi-turn loop: give the LLM a second chance if it calls tools
        for turn in range(_BTW_MAX_TURNS):
            result = await kosong.step(
                chat_provider,
                system_prompt,
                toolset,
                history,
                on_message_part=_on_part,
            )

            # Check for text response — but only accept it if the LLM
            # didn't also call tools (mixed text+tool = incomplete preamble).
            response_text = "".join(text_chunks).strip()
            if response_text and not result.tool_calls:
                return response_text, None
Comment on lines +147 to +149:

P2: Check tool calls before accepting streamed /btw text

The side-question loop returns as soon as any text chunk is present, before evaluating whether the same model turn also requested tools. In tool-capable models, mixed text+tool outputs are possible (e.g., a short preamble followed by a tool call), and this logic will return that partial preamble instead of entering the deny-and-retry path, yielding incomplete answers. Require result.tool_calls to be empty (or handle tool calls first) before treating streamed text as final.


            # No text — did the LLM try to call a tool?
            tool_results = await result.tool_results()
            if not result.tool_calls:
                break  # No text, no tool calls — give up

            # Tool calls were denied. If we have turns left, feed the error
            # back so the LLM can try again with text.
            if turn + 1 < _BTW_MAX_TURNS:
                # Build the next history: original + assistant message + tool error results
                history = [
                    *history,
                    result.message,
                    *[_tool_result_to_message(tr) for tr in tool_results],
                ]
                text_chunks.clear()  # Reset for next turn
                continue

            # Last turn and still no text — report the tool call attempt
            tool_names = [tc.function.name for tc in result.tool_calls]
            return None, (
                f"Side question tried to call tools ({', '.join(tool_names)}) "
                "instead of answering directly. Try rephrasing or ask in the main conversation."
            )

        return None, "No response received."
    except Exception as e:
        logger.warning("Side question failed: {error}", error=e)
        return None, str(e)


def _tool_result_to_message(tool_result: ToolResult) -> Message:
    """Convert a ToolResult to a tool-result Message for history."""
    content = tool_result.return_value.message or "Tool call denied."
    return Message(
        role="tool",
        content=content,
        tool_call_id=tool_result.tool_call_id,
    )


# ---------------------------------------------------------------------------
# Wire-based entry point (for web UI / non-interactive)
# ---------------------------------------------------------------------------


async def run_side_question(soul: KimiSoul, question: str) -> None:
    """Execute a side question via wire events."""
    if soul._runtime.llm is None:  # pyright: ignore[reportPrivateUsage]
        raise LLMNotSet()

    btw_id = uuid.uuid4().hex[:12]
    wire_send(BtwBegin(id=btw_id, question=question))

    try:
        response, error = await execute_side_question(soul, question)
        if response:
            wire_send(BtwEnd(id=btw_id, response=response))
        else:
            wire_send(BtwEnd(id=btw_id, error=error or "No response received."))
    except Exception as e:
        logger.warning("Side question failed: {error}", error=e)
        wire_send(BtwEnd(id=btw_id, error=str(e)))
3 changes: 3 additions & 0 deletions src/kimi_cli/soul/kimisoul.py
@@ -433,6 +433,9 @@ async def _consume_pending_steers(self) -> bool:
"""Drain the steer queue and inject as follow-up user messages.

Returns True if any steers were consumed.

Note: /btw is intercepted at the UI layer (``classify_input``) before
reaching the steer queue, so it never appears here.
"""
consumed = False
while not self._steer_queue.empty():