Large-context requests frequently hit ConnectTimeout; httpx connect_timeout is not configurable #2384

## Environment
- `kimi-cli` 1.44.0
- Python 3.14.3
- Linux 6.6.114.1-microsoft-standard-WSL2 (x86_64)
- Provider: `managed:kimi-code`, base_url `https://api.kimi.com/coding/v1`
- Model: `kimi-for-coding` (max_context_size=262144)

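## Symptom
Once a single long session reaches roughly 120k or more input tokens of context, every step has a
fairly high chance of timing out while establishing a new HTTPS connection. The reported error is
`openai.APITimeoutError: Request timed out.`, and the underlying cause is `httpcore.ConnectTimeout`
(the connection handshake never completes; this is not a read-phase timeout).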

`_run_with_connection_recovery` retries once. Sometimes the retry succeeds; other times recovery is
exhausted immediately ("Chat provider recovery exhausted for step").

## Evidence

Timeout counts from the logs of one session:

| Log | Timeouts |
|---|---|
| 2026-05-25, second log file | 24 |
| 2026-05-27 | 18 |
| 2026-05-28 (today) | 53 |

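Every failed step had an `input` of roughly 120k tokens or more, and this session has grown to 154k.
There were also long stretches of normal operation (for example, 11 consecutive steps succeeded
between 13:17 and 13:20 with inputs of 135k–149k), while the failures clustered in a few narrow
time windows (09:40 / 11:28 / 12:38 / 14:41).

Typical stack trace: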
```
File ".../httpcore/_async/connection.py", line 124, in _connect
httpcore.ConnectTimeout
httpx.ConnectTimeout
    raise APITimeoutError(request=request) from err
openai.APITimeoutError: Request timed out.
  File ".../kosong/chat_provider/kimi.py", line 170, in generate
    stream = await chat_provider.generate(system_prompt, tools, history)
kosong.chat_provider.APITimeoutError: Request timed out.
  File ".../kimi_cli/soul/kimisoul.py", line 1417, in _run_with_connection_recovery
    raise convert_error(e) from e
```

A direct connection to `api.kimi.com` was healthy at the time:
```
$ curl -o /dev/null -w "connect=%{time_connect}s tls=%{time_appconnect}s\n" https://api.kimi.com/coding/v1/
connect=0.234s tls=0.279s
```
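DNS resolves to `volcddos.com` (Volcano Engine's anti-DDoS edge), so **I suspect the edge nodes
transiently refuse connections for large request bodies or frequent new connections**, while the
httpx `connect_timeout` inside kimi-cli is fairly short (default ~5s) and gives up right away.
Users cannot tune it, and there is no exponential backoff.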

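## Expected
1. Expose httpx's `connect_timeout` and overall `timeout` in `config.toml`
   (something like `providers.<name>.http.connect_timeout`) so that large-context users can raise them.
2. Use exponential backoff in `_run_with_connection_recovery` (it currently appears to retry once
   after a short fixed interval, so the retry fails too when it lands in the same edge rate-limit window).
3. Optional: give `ConnectTimeout` several extra retries of its own, handled separately from read timeouts
   (in the connect phase the server has not received any tokens yet, so retrying has no side effects).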
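
Items 2 and 3 could look roughly like the sketch below: connect timeouts are retried with capped exponential backoff plus jitter, while read timeouts propagate immediately. This is not kimi-cli's actual recovery code. `call_with_recovery` and `backoff_delays` are hypothetical names, the two exception classes are stand-ins for `httpx.ConnectTimeout` and `httpx.ReadTimeout`, and the retry count and delays are arbitrary.

```python
import asyncio
import random


class ConnectTimeout(Exception):
    """Stand-in for httpx.ConnectTimeout: no request bytes reached the server."""


class ReadTimeout(Exception):
    """Stand-in for httpx.ReadTimeout: the server may already be processing."""


def backoff_delays(attempts, base=0.5, cap=8.0):
    """Upper bounds for each wait: base, 2*base, 4*base, ... limited by cap."""
    return [min(cap, base * 2 ** i) for i in range(attempts)]


async def call_with_recovery(send, connect_retries=4, base=0.5, cap=8.0,
                             sleep=asyncio.sleep):
    """Await send(), retrying only on ConnectTimeout with jittered backoff."""
    delays = backoff_delays(connect_retries, base, cap)
    for attempt in range(connect_retries + 1):
        try:
            return await send()
        except ConnectTimeout:
            if attempt == connect_retries:
                raise  # recovery exhausted
            # Full jitter: wait a random time up to the current bound so
            # retries do not all land in the same rate-limit window.
            await sleep(random.uniform(0, delays[attempt]))
        # ReadTimeout and other errors are not caught here and propagate
        # immediately, since the request may already have been received.
```

Passing `sleep` as a parameter keeps the waiting strategy swappable, which also makes the retry loop easy to exercise without real delays.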

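## Reproduce
1. Start a long session and keep it going until the history reaches ~120k+ input tokens.
2. Keep chatting normally; after a while, steps start failing with `APITimeoutError`.
3. At that point, a direct `curl https://api.kimi.com/coding/v1/` usually still returns almost instantly,
   which suggests the underlying network is not down and the failures come from new connection-pool
   connections combined with WAF rate limiting.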
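
A single `curl` only samples one moment, so a repeated probe may help catch the narrow failure windows. The sketch below times a fresh TCP connect and TLS handshake, similar to curl's `time_connect` and `time_appconnect`, using only the standard library. `probe_handshake` and `failure_rate` are hypothetical helper names, and a small probe like this would not reproduce the large request bodies suspected above.

```python
import socket
import ssl
import time


def probe_handshake(host, port=443, timeout=5.0):
    """Return (tcp_seconds, tls_seconds) for one fresh connection, or None on failure.

    Both values are measured from the start of the attempt.
    """
    context = ssl.create_default_context()
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout) as raw:
            tcp_done = time.monotonic()
            with context.wrap_socket(raw, server_hostname=host):
                tls_done = time.monotonic()
    except OSError:
        # Covers refused connections, timeouts, DNS failures, and TLS errors
        # (ssl.SSLError is a subclass of OSError).
        return None
    return tcp_done - start, tls_done - start


def failure_rate(results):
    """Fraction of probes that failed (None entries); 0.0 for an empty list."""
    if not results:
        return 0.0
    return sum(1 for r in results if r is None) / len(results)


# Example usage (performs real network I/O, so it is left commented out):
# results = [probe_handshake("api.kimi.com") for _ in range(20)]
# print(f"failure rate: {failure_rate(results):.0%}")
```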

Session ID: `c5567dae-2551-426a-9db7-a46bb3b7b225`
(Full logs are available on request; no zip is attached to this issue.)

