Merged

19 commits
142c9db
feat: add btw
RealKai42 Apr 1, 2026
207998c
Add QuestionRequestPanel and QuestionPromptDelegate for interactive q…
RealKai42 Apr 1, 2026
989d1d8
fix(render): ensure render_agent_status uses compose_agent_output to …
RealKai42 Apr 3, 2026
1811855
fix(visualize): improve task management and modal handling in _Prompt…
RealKai42 Apr 3, 2026
10640a3
fix(prompt): update input section header and hint messages for clarity
RealKai42 Apr 3, 2026
5329c60
fix(interactive): implement wait_for_btw_dismiss to handle modal dism…
RealKai42 Apr 3, 2026
ef01c86
merge from main
RealKai42 Apr 3, 2026
0fb6f0a
fix(shell): rename _set_active_approval_sink to _set_active_view for …
RealKai42 Apr 3, 2026
6e15d5c
fix(interactive): update steer handling to use counter instead of deq…
RealKai42 Apr 3, 2026
6f1adf1
fix: UI polish, bug fixes, and e2e test alignment
RealKai42 Apr 3, 2026
66311b8
fix: update wire protocol tests for v1.9 and btw slash command
RealKai42 Apr 3, 2026
b070a49
fix: shell-only btw, queue safety, and review feedback
RealKai42 Apr 4, 2026
3ffa6c2
fix(shell): improve queue drain safety, btw panel scroll borders, and…
RealKai42 Apr 4, 2026
1b54269
fix(btw): reject mixed text+tool responses and block shell commands d…
RealKai42 Apr 5, 2026
849ee2c
fix: escape Rich markup in btw spinner and panel title to prevent ren…
RealKai42 Apr 7, 2026
6b7e477
feat(datetime): add format_elapsed function for human-friendly elapse…
RealKai42 Apr 7, 2026
add0983
docs: update changelog
RealKai42 Apr 7, 2026
f702def
Merge branch 'main' into kaiyi/perth
RealKai42 Apr 7, 2026
9f29170
Merge branch 'main' into kaiyi/perth
RealKai42 Apr 7, 2026
6 changes: 6 additions & 0 deletions CHANGELOG.md
@@ -11,6 +11,12 @@ Only write entries that are worth mentioning to users.

## Unreleased

- Shell: Add `/btw` side question command — ask a quick question during streaming without interrupting the main conversation; uses the same system prompt and tool definitions for prompt cache alignment; responses display in a scrollable modal panel with streaming support
- Shell: Redesign bottom dynamic area — split the monolithic `visualize.py` (1865 lines) into a modular package (`visualize/`) with dedicated modules for input routing, interactive prompts, approval/question panels, and btw modal; unify input semantics with `classify_input()` for consistent command routing
- Shell: Add queue and steer dual-channel input during streaming — Enter queues messages for delivery after the current turn; Ctrl+S injects messages immediately into the running turn's context; queued messages display in the prompt area with count indicator and can be recalled with ↑
- Shell: Add `BtwBegin`/`BtwEnd` wire events for cross-client side question support
- Shell: Improve elapsed time formatting in spinners — durations over 60 seconds now display as `"1m 23s"` instead of `"83s"`; sub-second durations show `"<1s"`
- Shell: Fix Rich markup injection in btw panel — user questions containing `[`/`]` characters are now escaped to prevent broken rendering or style injection in spinner text and panel titles
- Core: Improve error diagnostics — enrich internal logging coverage, include relevant log files and system manifest in `kimi export` archives, and surface actionable error messages for common failures (auth, network, timeout, quota)
- Shell: Use `git ls-files` for `@` file mention discovery — file completer now queries `git ls-files --recurse-submodules` with a 5-second timeout as the primary discovery mechanism, falling back to `os.walk` for non-git repositories; this fixes large repositories (e.g., apache/superset with 65k+ files) where the 1000-file limit caused late-alphabetical directories to be unreachable (fixes #1375)
- Core: Add shared `file_filter` module — unifies file mention logic between shell and web UIs via `src/kimi_cli/utils/file_filter.py`, providing consistent path filtering, ignored directory exclusion, and git-aware file discovery
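The elapsed-time entry above amounts to three formatting rules. A minimal sketch of those rules, assuming a plain seconds-in, string-out signature (the actual `format_elapsed` added by this PR lives in the datetime utilities and may differ):

```python
def format_elapsed(seconds: float) -> str:
    """Human-friendly elapsed time: "<1s", "42s", "1m 23s" (sketch, not the PR's code)."""
    if seconds < 1:
        return "<1s"
    total = int(seconds)
    if total < 60:
        return f"{total}s"
    minutes, rest = divmod(total, 60)
    return f"{minutes}m {rest}s"
```

For example, a spinner running for 83 seconds would now show `1m 23s` rather than `83s`.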
13 changes: 13 additions & 0 deletions docs/en/reference/keyboard.md
@@ -11,6 +11,7 @@ Kimi Code CLI shell mode supports the following keyboard shortcuts.
| `Ctrl-O` | Edit in external editor (`$VISUAL`/`$EDITOR`) |
| `Ctrl-J` | Insert newline |
| `Alt-Enter` | Insert newline (same as `Ctrl-J`) |
| `Ctrl-S` | Steer: inject input immediately into the running turn (during streaming) |
| `Ctrl-V` | Paste (supports images and video files) |
| `Ctrl-E` | Expand full approval request content |
| `1`–`4` | Quick select approval option (`4` for decline with feedback) |
@@ -82,6 +83,18 @@ Paste clipboard content into the input box. Supports:
Image pasting requires the model to support `image_in` capability. Video pasting requires the model to support `video_in` capability.
:::

## Streaming input

### `Ctrl-S`: Steer

During streaming, press `Ctrl-S` to submit the current input (or pop the oldest queued message) and inject it immediately into the running turn's context. The model sees your message right away without waiting for the current turn to end.

If the input box is empty and there are queued messages, `Ctrl-S` pops the oldest queued message and steers it instead.

### `Enter`: Queue

During streaming, pressing `Enter` queues your message for delivery after the current turn completes. The queued message count is shown in the input header (e.g., `── input · 2 queued ──`). Press `↑` on an empty input to recall the last queued message for editing.

## Approval request operations

### `Ctrl-E`: Expand full content
14 changes: 14 additions & 0 deletions docs/en/reference/slash-commands.md
@@ -242,6 +242,20 @@ Directories already within the working directory do not need to be added, as the

## Others

### `/btw`

Ask a quick side question without interrupting the main conversation. Available both when idle and during streaming.

Usage: `/btw <question>`

The side question runs in an isolated context: it sees the conversation history but does not modify it. Tool calls are disabled — the response is text-only, based on the model's existing knowledge of the conversation.

During streaming, the response appears in a scrollable modal panel overlaying the prompt area. Use `↑`/`↓` to scroll, `Escape` to dismiss.

::: tip
This command is only available in interactive shell mode. Wire and ACP clients can use the `BtwBegin`/`BtwEnd` wire events with the `run_side_question()` API.
:::

### `/init`

Analyze the current project and generate an `AGENTS.md` file.
6 changes: 6 additions & 0 deletions docs/en/release-notes/changelog.md
@@ -4,6 +4,12 @@ This page documents the changes in each Kimi Code CLI release.

## Unreleased

- Shell: Add `/btw` side question command — ask a quick question during streaming without interrupting the main conversation; uses the same system prompt and tool definitions for prompt cache alignment; responses display in a scrollable modal panel with streaming support
- Shell: Redesign bottom dynamic area — split the monolithic `visualize.py` (1865 lines) into a modular package (`visualize/`) with dedicated modules for input routing, interactive prompts, approval/question panels, and btw modal; unify input semantics with `classify_input()` for consistent command routing
- Shell: Add queue and steer dual-channel input during streaming — Enter queues messages for delivery after the current turn; Ctrl+S injects messages immediately into the running turn's context; queued messages display in the prompt area with count indicator and can be recalled with ↑
- Shell: Add `BtwBegin`/`BtwEnd` wire events for cross-client side question support
- Shell: Improve elapsed time formatting in spinners — durations over 60 seconds now display as `"1m 23s"` instead of `"83s"`; sub-second durations show `"<1s"`
- Shell: Fix Rich markup injection in btw panel — user questions containing `[`/`]` characters are now escaped to prevent broken rendering or style injection in spinner text and panel titles
- Core: Improve error diagnostics — enrich internal logging coverage, include relevant log files and system manifest in `kimi export` archives, and surface actionable error messages for common failures (auth, network, timeout, quota)
- Shell: Use `git ls-files` for `@` file mention discovery — file completer now queries `git ls-files --recurse-submodules` with a 5-second timeout as the primary discovery mechanism, falling back to `os.walk` for non-git repositories; this fixes large repositories (e.g., apache/superset with 65k+ files) where the 1000-file limit caused late-alphabetical directories to be unreachable (fixes #1375)
- Core: Add shared `file_filter` module — unifies file mention logic between shell and web UIs via `src/kimi_cli/utils/file_filter.py`, providing consistent path filtering, ignored directory exclusion, and git-aware file discovery
13 changes: 13 additions & 0 deletions docs/zh/reference/keyboard.md
@@ -11,6 +11,7 @@ Kimi Code CLI Shell 模式支持以下键盘快捷键。
| `Ctrl-O` | 在外部编辑器中编辑(`$VISUAL`/`$EDITOR`) |
| `Ctrl-J` | 插入换行 |
| `Alt-Enter` | 插入换行(同 `Ctrl-J`) |
| `Ctrl-S` | Steer:在 streaming 期间将输入立即注入到正在运行的轮次 |
| `Ctrl-V` | 粘贴(支持图片和视频文件) |
| `Ctrl-E` | 展开审批请求完整内容 |
| `1`–`4` | 审批面板快速选择(`4` 为附带反馈拒绝) |
@@ -82,6 +83,18 @@ Kimi Code CLI Shell 模式支持以下键盘快捷键。
图片粘贴需要模型支持 `image_in` 能力,视频粘贴需要模型支持 `video_in` 能力。
:::

## Streaming 期间输入

### `Ctrl-S`:Steer(立即注入)

在 streaming 期间,按 `Ctrl-S` 提交当前输入(或弹出最早的排队消息)并立即注入到正在运行的轮次上下文中。模型会立即看到你的消息,无需等待当前轮次结束。

如果输入框为空且有排队消息,`Ctrl-S` 会弹出最早的排队消息并注入。

### `Enter`:排队

在 streaming 期间,按 `Enter` 将消息排入队列,等当前轮次完成后再发送。排队消息数量显示在输入区标题中(如 `── input · 2 queued ──`)。在空输入框中按 `↑` 可召回最后一条排队消息进行编辑。

## 审批请求操作

### `Ctrl-E`:展开完整内容
14 changes: 14 additions & 0 deletions docs/zh/reference/slash-commands.md
@@ -242,6 +242,20 @@ Flow Skill 也可以通过 `/skill:<name>` 调用,此时作为普通 Skill 加

## 其他

### `/btw`

在不打断主对话的情况下提出快速侧问。在空闲和 streaming 期间均可使用。

用法:`/btw <问题>`

侧问在隔离的上下文中运行:能看到对话历史但不会修改它。工具调用被禁用——响应仅基于模型对当前对话的已有了解,以纯文本形式回答。

在 streaming 期间,响应会显示在一个可滚动的模态面板中,覆盖在提示区域上方。使用 `↑`/`↓` 滚动,`Escape` 关闭。

::: tip
此命令仅在交互式 Shell 模式下可用。Wire 和 ACP 客户端可使用 `BtwBegin`/`BtwEnd` Wire 事件配合 `run_side_question()` API。
:::

### `/init`

分析当前项目并生成 `AGENTS.md` 文件。
6 changes: 6 additions & 0 deletions docs/zh/release-notes/changelog.md
@@ -4,6 +4,12 @@

## 未发布

- Shell:新增 `/btw` 侧问命令——在 streaming 期间提出快速问题,不打断主对话;使用相同的系统提示词和工具定义以对齐 Prompt 缓存;响应在可滚动的模态面板中显示,支持流式输出
- Shell:重新设计底部动态区——将单体 `visualize.py`(1865 行)拆分为模块化包(`visualize/`),包含输入路由、交互式提示、审批/提问面板和 btw 模态面板等独立模块;通过 `classify_input()` 统一输入语义,实现一致的命令路由
- Shell:新增 streaming 期间的排队和 steer 双通道输入——Enter 将消息排队,在当前轮次结束后发送;Ctrl+S 将消息立即注入到正在运行的轮次上下文中;排队消息在提示区域显示计数指示器,可通过 ↑ 键召回编辑
- Shell:新增 `BtwBegin`/`BtwEnd` Wire 事件,支持跨客户端侧问
- Shell:改进 spinner 中的耗时格式——超过 60 秒的时长现在显示为 `"1m 23s"` 而非 `"83s"`;低于 1 秒的显示为 `"<1s"`
- Shell:修复 btw 面板中的 Rich markup 注入问题——包含 `[`/`]` 字符的用户问题现在会被转义,防止 spinner 文本和面板标题出现渲染错误或样式注入
- Core:改进错误诊断——丰富内部日志覆盖,在 `kimi export` 导出的归档中包含相关日志文件和系统信息,并为常见错误(认证、网络、超时、配额)提供可操作的提示消息
- Shell:使用 `git ls-files` 进行 `@` 文件引用发现——文件补全器现在优先使用 `git ls-files --recurse-submodules` 查询文件列表(5 秒超时),非 Git 仓库则回退到 `os.walk`;此修复解决了大型仓库(如包含 6.5 万+文件的 apache/superset)中 1000 文件限制导致字母顺序靠后的目录无法访问的问题(修复 #1375)
- Core:新增共享的 `file_filter` 模块——通过 `src/kimi_cli/utils/file_filter.py` 统一 Shell 和 Web 的文件引用逻辑,提供一致的路径过滤、忽略目录排除和 Git 感知文件发现
212 changes: 212 additions & 0 deletions src/kimi_cli/soul/btw.py
@@ -0,0 +1,212 @@
"""Side question ("/btw") - answer a quick question without interrupting the main conversation.

Uses the same system_prompt + normalized history + tool definitions as the main
agent to maximize prompt cache hits. Tools are declared (for cache) but denied
at execution time. maxTurns=2 so if the LLM mistakenly calls a tool on the
first turn, the error result gives it a second chance to answer with text.

The question and response are NOT written to the main context history.
"""

from __future__ import annotations

import uuid
from collections.abc import Callable
from typing import TYPE_CHECKING

import kosong
from kosong.message import Message, ToolCall
from kosong.tooling import Tool, ToolError, ToolResult

from kimi_cli.soul import LLMNotSet, wire_send
from kimi_cli.soul.dynamic_injection import normalize_history
from kimi_cli.soul.message import system_reminder
from kimi_cli.utils.logging import logger
from kimi_cli.wire.types import BtwBegin, BtwEnd, TextPart

if TYPE_CHECKING:
    from kosong.chat_provider import StreamedMessagePart

    from kimi_cli.soul.kimisoul import KimiSoul

_BTW_MAX_TURNS = 2

SIDE_QUESTION_SYSTEM_REMINDER = """\
This is a side question from the user. Answer directly in a single response.

IMPORTANT:
- You are a separate, lightweight instance answering one question.
- The main agent continues independently — do NOT reference being interrupted.
- Do NOT call any tools. All tool calls are disabled and will be rejected.
  Even though tool definitions are visible in this request, they exist only
  for technical reasons (prompt cache). You MUST NOT use them.
- Respond ONLY with text based on what you already know from the conversation.
- This is a one-off response — no follow-up turns.
- If you don't know the answer, say so directly."""


# ---------------------------------------------------------------------------
# DenyAllToolset: advertises tools (cache match) but rejects every call
# ---------------------------------------------------------------------------


class _DenyAllToolset:
    """A toolset that exposes the same tool definitions as the agent (for prompt
    cache matching) but rejects every tool call with an error message."""

    def __init__(self, source_tools: list[Tool]) -> None:
        self._tools = source_tools

    @property
    def tools(self) -> list[Tool]:
        return self._tools

    def handle(self, tool_call: ToolCall) -> ToolResult:
        return ToolResult(
            tool_call_id=tool_call.id,
            return_value=ToolError(
                message="Tool calls are disabled for side questions. Answer with text only.",
                brief="denied",
            ),
        )


# ---------------------------------------------------------------------------
# Context construction
# ---------------------------------------------------------------------------


def _build_btw_context(soul: KimiSoul, question: str) -> tuple[str, list[Message], _DenyAllToolset]:
    """Build (system_prompt, history, toolset) aligned with the main agent.

    Uses the same system_prompt, normalize_history(), and tool definitions
    as ``KimiSoul._step`` so the LLM provider can reuse the prompt cache.
    """
    system_prompt = soul._agent.system_prompt  # pyright: ignore[reportPrivateUsage]
    effective_history = normalize_history(soul.context.history)

    wrapped = f"{system_reminder(SIDE_QUESTION_SYSTEM_REMINDER).text}\n\n{question}"
    side_message = Message(role="user", content=wrapped)

    toolset = _DenyAllToolset(soul._agent.toolset.tools)  # pyright: ignore[reportPrivateUsage]

    return system_prompt, [*effective_history, side_message], toolset


# ---------------------------------------------------------------------------
# Execution
# ---------------------------------------------------------------------------


async def execute_side_question(
    soul: KimiSoul,
    question: str,
    on_text_chunk: Callable[[str], None] | None = None,
) -> tuple[str | None, str | None]:
    """Execute a side question and return (response, error).

    Runs up to ``_BTW_MAX_TURNS`` steps. On the first step, if the LLM
    returns a tool call instead of text, the denied tool result is appended
    to the history and a second step gives the LLM another chance.

    Args:
        soul: The KimiSoul instance (for context and chat_provider access).
        question: The user's side question.
        on_text_chunk: Optional callback for streaming text chunks.

    Returns:
        (response_text, None) on success, (None, error_message) on failure.
    """
    if soul._runtime.llm is None:  # pyright: ignore[reportPrivateUsage]
        return None, "LLM is not set."

    try:
        chat_provider = soul._runtime.llm.chat_provider  # pyright: ignore[reportPrivateUsage]
        system_prompt, history, toolset = _build_btw_context(soul, question)

        text_chunks: list[str] = []

        def _on_part(part: StreamedMessagePart) -> None:
            if isinstance(part, TextPart) and part.text:
                text_chunks.append(part.text)
                if on_text_chunk is not None:
                    on_text_chunk(part.text)

        # Multi-turn loop: give the LLM a second chance if it calls tools
        for turn in range(_BTW_MAX_TURNS):
            result = await kosong.step(
                chat_provider,
                system_prompt,
                toolset,
                history,
                on_message_part=_on_part,
            )

            # Check for text response — but only accept it if the LLM
            # didn't also call tools (mixed text+tool = incomplete preamble).
            response_text = "".join(text_chunks).strip()
            if response_text and not result.tool_calls:
                return response_text, None
Comment on lines +147 to +149:

P2: Check tool calls before accepting streamed /btw text

The side-question loop returns as soon as any text chunk is present, before evaluating whether the same model turn also requested tools. In tool-capable models, mixed text+tool outputs are possible (e.g., a short preamble followed by a tool call), and this logic will return that partial preamble instead of entering the deny-and-retry path, yielding incomplete answers. Require result.tool_calls to be empty (or handle tool calls first) before treating streamed text as final.


            # No text — did the LLM try to call a tool?
            tool_results = await result.tool_results()
            if not result.tool_calls:
                break  # No text, no tool calls — give up

            # Tool calls were denied. If we have turns left, feed the error
            # back so the LLM can try again with text.
            if turn + 1 < _BTW_MAX_TURNS:
                # Build the next history: original + assistant message + tool error results
                history = [
                    *history,
                    result.message,
                    *[_tool_result_to_message(tr) for tr in tool_results],
                ]
                text_chunks.clear()  # Reset for next turn
                continue

            # Last turn and still no text — report the tool call attempt
            tool_names = [tc.function.name for tc in result.tool_calls]
            return None, (
                f"Side question tried to call tools ({', '.join(tool_names)}) "
                "instead of answering directly. Try rephrasing or ask in the main conversation."
            )

        return None, "No response received."
    except Exception as e:
        logger.warning("Side question failed: {error}", error=e)
        return None, str(e)


def _tool_result_to_message(tool_result: ToolResult) -> Message:
    """Convert a ToolResult to a tool-result Message for history."""
    content = tool_result.return_value.message or "Tool call denied."
    return Message(
        role="tool",
        content=content,
        tool_call_id=tool_result.tool_call_id,
    )


# ---------------------------------------------------------------------------
# Wire-based entry point (for web UI / non-interactive)
# ---------------------------------------------------------------------------


async def run_side_question(soul: KimiSoul, question: str) -> None:
    """Execute a side question via wire events."""
    if soul._runtime.llm is None:  # pyright: ignore[reportPrivateUsage]
        raise LLMNotSet()

    btw_id = uuid.uuid4().hex[:12]
    wire_send(BtwBegin(id=btw_id, question=question))

    try:
        response, error = await execute_side_question(soul, question)
        if response:
            wire_send(BtwEnd(id=btw_id, response=response))
        else:
            wire_send(BtwEnd(id=btw_id, error=error or "No response received."))
    except Exception as e:
        logger.warning("Side question failed: {error}", error=e)
        wire_send(BtwEnd(id=btw_id, error=str(e)))
3 changes: 3 additions & 0 deletions src/kimi_cli/soul/kimisoul.py
@@ -433,6 +433,9 @@ async def _consume_pending_steers(self) -> bool:
"""Drain the steer queue and inject as follow-up user messages.

Returns True if any steers were consumed.

Note: /btw is intercepted at the UI layer (``classify_input``) before
reaching the steer queue, so it never appears here.
"""
consumed = False
while not self._steer_queue.empty():