Environment
- kimi-cli 1.44.0
- Python 3.14.3
- Linux 6.6.114.1-microsoft-standard-WSL2 (x86_64)
- Provider: managed:kimi-code, base_url https://api.kimi.com/coding/v1
- Model: kimi-for-coding (max_context_size=262144)
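Symptom
Once a single long session grows to roughly 120k input tokens of context or more, each step has a fairly high chance of timing out while establishing a new HTTPS connection. The reported error is openai.APITimeoutError: Request timed out., and the underlying exception is httpcore.ConnectTimeout (the connection handshake never completes; this is not a read-phase timeout).
_run_with_connection_recovery retries once. Sometimes the retry succeeds, and sometimes recovery is exhausted right away ("Chat provider recovery exhausted for step").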
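To tell a connect-phase timeout apart from a read-phase one when triaging logs, it can help to walk the `raise ... from ...` chain rather than look only at the outermost APITimeoutError. The sketch below is illustrative only: the three exception classes are local stand-ins for the real httpcore/httpx/openai classes, and `cause_chain`, `is_connect_phase`, and `simulate` are hypothetical helper names, not part of kimi-cli.

```python
class ConnectTimeout(Exception):
    """Stand-in for httpcore.ConnectTimeout / httpx.ConnectTimeout."""


class ReadTimeout(Exception):
    """Stand-in for a read-phase timeout."""


class APITimeoutError(Exception):
    """Stand-in for openai.APITimeoutError."""


def cause_chain(exc: BaseException) -> list[str]:
    """Return class names along the exception chain, outermost first."""
    names: list[str] = []
    seen: set[int] = set()
    current: BaseException | None = exc
    while current is not None and id(current) not in seen:
        seen.add(id(current))  # guard against cyclic chains
        names.append(type(current).__name__)
        current = current.__cause__ or current.__context__
    return names


def is_connect_phase(exc: BaseException) -> bool:
    """True if any exception in the chain is named ConnectTimeout."""
    return "ConnectTimeout" in cause_chain(exc)


def simulate(inner: Exception) -> APITimeoutError:
    """Reproduce the wrapping seen in the stack trace: inner -> APITimeoutError."""
    try:
        try:
            raise inner
        except Exception as err:
            raise APITimeoutError("Request timed out.") from err
    except APITimeoutError as wrapped:
        return wrapped


if __name__ == "__main__":
    print(cause_chain(simulate(ConnectTimeout())))
    print(is_connect_phase(simulate(ReadTimeout())))
```

Matching on the class name rather than the class object is a deliberate shortcut here, since both httpcore and httpx name their exception ConnectTimeout.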
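Evidence
Timeout counts from the logs of one session:
| Log | Timeout count |
| --- | --- |
| 2026-05-25 (second log file) | 24 |
| 2026-05-27 | 18 |
| 2026-05-28 (today) | 53 |
Every failed step had an input of ≥ ~120k tokens, and this session has now grown to 154k.
There were also long stretches of normal operation (for example, 13:17–13:20: 11 consecutive steps all succeeded with inputs of 135k–149k).
The failures cluster in a few narrow time windows (09:40 / 11:28 / 12:38 / 14:41).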
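For anyone who wants to check the same clustering in their own logs, here is a minimal sketch that groups failure timestamps into bursts. The timestamps in `sample` are made up purely to exercise the function (they are not taken from the session logs), and `group_bursts`, `at`, and the 2-minute gap are hypothetical choices.

```python
from datetime import datetime, timedelta


def group_bursts(
    timestamps: list[datetime],
    max_gap: timedelta = timedelta(minutes=2),
) -> list[list[datetime]]:
    """Group timestamps so that consecutive entries in a burst are <= max_gap apart."""
    bursts: list[list[datetime]] = []
    for ts in sorted(timestamps):
        if bursts and ts - bursts[-1][-1] <= max_gap:
            bursts[-1].append(ts)  # close enough to the previous failure
        else:
            bursts.append([ts])  # start a new burst
    return bursts


def at(hhmmss: str) -> datetime:
    """Parse an HH:MM:SS string (date part is irrelevant for this sketch)."""
    return datetime.strptime(hhmmss, "%H:%M:%S")


# Synthetic sample input, for illustration only.
sample = [
    at("09:40:05"),
    at("09:40:50"),
    at("09:41:30"),
    at("11:28:10"),
    at("11:28:40"),
]
bursts = group_bursts(sample)

if __name__ == "__main__":
    for burst in bursts:
        print(burst[0].strftime("%H:%M"), len(burst))
```

If the failures were driven purely by context size, the bursts would be expected to spread out evenly once the session passes ~120k tokens rather than bunch into a few windows.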
Typical stack trace:
File ".../httpcore/_async/connection.py", line 124, in _connect
httpcore.ConnectTimeout
httpx.ConnectTimeout
raise APITimeoutError(request=request) from err
openai.APITimeoutError: Request timed out.
File ".../kosong/chat_provider/kimi.py", line 170, in generate
stream = await chat_provider.generate(system_prompt, tools, history)
kosong.chat_provider.APITimeoutError: Request timed out.
File ".../kimi_cli/soul/kimisoul.py", line 1417, in _run_with_connection_recovery
raise convert_error(e) from e
A direct connection to api.kimi.com was healthy at the time:
$ curl -o /dev/null -w "connect=%{time_connect}s tls=%{time_appconnect}s\n" https://api.kimi.com/coding/v1/
connect=0.234s tls=0.279s
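DNS resolves to volcddos.com (Volcano Engine's anti-DDoS edge), so my suspicion is that the edge nodes transiently refuse connections for large bodies or frequent new connections, while the httpx connect_timeout inside kimi-cli is on the short side (default ~5s) and simply gives up. The timeout is not user-configurable, and there is no exponential backoff.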
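Expected
- Expose httpx's connect_timeout / overall timeout in config.toml (something like providers.<name>.http.connect_timeout) so that users with large contexts can raise it
- Have _run_with_connection_recovery use exponential backoff (it currently looks like a single retry after a short fixed interval, so when the retry lands inside the node's rate-limiting window it fails as well)
- Optional: retry ConnectTimeout a few more times on its own, treating it differently from read timeouts (during the connect phase the server has not yet received any tokens, so retrying has no side effects)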
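The backoff and connect-only retry requested above could be sketched as follows. This is not how _run_with_connection_recovery works today; it is a minimal illustration in which `ConnectTimeout` is a local stand-in for httpx.ConnectTimeout, and `call_with_backoff`, `backoff_delays`, and all numeric defaults are hypothetical.

```python
import asyncio
import random
from collections.abc import Awaitable, Callable
from typing import TypeVar

T = TypeVar("T")


class ConnectTimeout(Exception):
    """Stand-in for httpx.ConnectTimeout."""


def backoff_delays(retries: int, base: float = 0.5, cap: float = 8.0) -> list[float]:
    """Upper bounds for each retry delay: base * 2**i, capped at `cap`."""
    return [min(cap, base * 2**i) for i in range(retries)]


async def call_with_backoff(
    make_request: Callable[[], Awaitable[T]],
    *,
    retries: int = 4,
    base: float = 0.5,
    cap: float = 8.0,
    sleep: Callable[[float], Awaitable[None]] = asyncio.sleep,
    rng: Callable[[], float] = random.random,
) -> T:
    """Retry only connect-phase timeouts, sleeping with exponential backoff and jitter."""
    delays = backoff_delays(retries, base, cap)
    for attempt in range(retries + 1):
        try:
            return await make_request()
        except ConnectTimeout:
            if attempt == retries:
                raise  # retries exhausted; surface the original error
            # Full jitter: sleep a random fraction of the current upper bound.
            await sleep(delays[attempt] * rng())
    raise AssertionError("unreachable")


async def _demo() -> str:
    attempts = {"n": 0}

    async def flaky() -> str:
        attempts["n"] += 1
        if attempts["n"] < 3:
            raise ConnectTimeout()
        return "ok"

    return await call_with_backoff(flaky, base=0.01, cap=0.05)


if __name__ == "__main__":
    print(asyncio.run(_demo()))
```

Only ConnectTimeout is caught, so read-phase timeouts and other errors propagate unchanged, which keeps the "no side effects" argument intact. The jitter is there so that several sessions hitting the same rate-limit window do not all retry at the same instant.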
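Reproduce
- Open a long session and keep it going until the history reaches ~120k+ input tokens
- Continue the conversation as usual; after a while, steps start failing with APITimeoutError
- At that point, a direct curl https://api.kimi.com/coding/v1/ usually still returns almost instantly → the underlying network is not down; this looks like the combined effect of new connections from the pool and WAF rate limiting
Session ID: c5567dae-2551-426a-9db7-a46bb3b7b225
(Full logs are available on request; no zip is attached to this issue.)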