fix(inference): replay deepseek reasoning_content on tool-call turns (Sentry TAURI-RUST-4KB) by M3gA-Mind · Pull Request #2876 · tinyhumansai/openhuman

M3gA-Mind · 2026-05-28T22:21:07Z

Summary

Rebased and fixed version of #2817 by @CodeGhost21.

What changed from #2817:

Added one commit to fix the two failing tests (parse_native_response_captures_reasoning_content and parse_native_response_blank_reasoning_is_none). The tests correctly expected reasoning_content to be trimmed and whitespace-only values to become None, but the implementation was cloning the raw, untrimmed string. Fixed with the same .as_deref().map(str::trim).filter(|s| !s.is_empty()).map(str::to_owned) pattern used in the chat_with_tools and streaming paths.
Merged current upstream/main to resolve the stale base (includes #2818 "fix(inference): preserve reasoning_content in multi-turn thinking model conversations", #2873 "fix(observability): classify list_models 404 as ProviderUserState (Sentry TAURI-RUST-YJ)", #2874 "fix(cron): accept bare cron-expression string in Schedule deserializer (Sentry CORE-RUST-FY)", etc.).

Note on supersede claim: The comment on #2817 says it was superseded by #2818, but that is not accurate. PR #2818 (already on main) fixes the main session-turn path via extra_metadata in turn.rs. PR #2817 fixes two separate paths (the tool loop and the subagent runner) and two provider functions:

tool_loop.rs: passes reasoning_content to build_native_assistant_history
subagent_runner/ops.rs: also passes reasoning_content to build_native_assistant_history
chat_with_tools: returns the actual reasoning_content instead of None
convert_messages_for_native: lifts reasoning_content from the JSON content (falling back to extra_metadata)

These are complementary to #2818, not redundant with it. On current main, multi-turn reasoning model tool calls via the tool loop still fail because build_native_assistant_history doesn't embed reasoning_content.

Closes #2817.

Original PR description from @CodeGhost21:

DeepSeek's thinking mode rejects multi-turn tool calls because we never replayed the model's reasoning_content on the follow-up request.
Round-trips reasoning_content for tool-call assistant turns through all four layers of the OpenAI-compatible inference path.
Gated by skip_serializing_if = Option::is_none so non-reasoning providers see zero change on the wire.
Fixes Sentry TAURI-RUST-4KB (issue 5236) — 31 events since v0.56.0, all multi-turn deepseek-reasoner tool calls.
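
The "zero change on the wire" claim rests on the field being left out entirely when it is None. In serde this is typically written as `#[serde(skip_serializing_if = "Option::is_none")]`. The sketch below imitates that effect with the standard library only, so it runs without extra crates. `WireMessage` and `wire_fields` are hypothetical names for illustration and are not the project's actual `NativeMessage` type.

```rust
/// Hypothetical, simplified wire message. The real `NativeMessage`
/// presumably carries more fields (tool calls, ids, and so on).
struct WireMessage {
    role: String,
    content: String,
    reasoning_content: Option<String>,
}

/// Returns the (key, value) pairs that would be sent on the wire.
/// `reasoning_content` is emitted only when it is `Some`, which mimics
/// what `skip_serializing_if = "Option::is_none"` does under serde.
fn wire_fields(msg: &WireMessage) -> Vec<(&'static str, &str)> {
    let mut fields = vec![
        ("role", msg.role.as_str()),
        ("content", msg.content.as_str()),
    ];
    if let Some(reasoning) = msg.reasoning_content.as_deref() {
        fields.push(("reasoning_content", reasoning));
    }
    fields
}
```

Under this sketch, a message from a non-reasoning provider yields only the `role` and `content` pairs, so its request body is unchanged, while a reasoning-model turn gains a third pair.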

Test plan

cargo test --lib inference::provider::compatible::tests::parse_native_response — 7 passed, 0 failed (includes the 2 previously-failing tests)
cargo test --lib "reasoning" (26 tests), cargo test --lib "agent::" (890 tests), cargo test --lib "inference::provider::" (316 tests) — all pass
cargo fmt --check clean
Diff coverage ≥ 80%

…(Sentry TAURI-RUST-4KB)

Resolves Sentry issue 5236 (TAURI-RUST-4KB): https://sentry.tinyhumans.ai/organizations/tinyhumans/issues/5236/

DeepSeek's thinking mode returns `reasoning_content` alongside `tool_calls` and requires that reasoning to be replayed on the follow-up request. Our OpenAI-compatible provider dropped it: `ChatResponse`, the assistant history JSON, and the `NativeMessage` wire type had no carrier for `reasoning_content`, so the next request omitted it and DeepSeek returned:

400 Bad Request: The `reasoning_content` in the thinking mode must be passed back to the API.

The agent loop (`run_chat_task`) then failed every multi-turn tool call against deepseek-reasoner (31 events since v0.56.0).

Fix: round-trip `reasoning_content` for tool-call assistant turns across all four layers:

- `ChatResponse.reasoning_content` (captured in `parse_native_response` and `chat_with_tools`, trimmed; empty -> None)
- `build_native_assistant_history` writes it into the assistant history JSON (omitted when empty)
- `convert_messages_for_native` lifts it back onto the wire message
- `NativeMessage.reasoning_content` serializes only when present

Because the field is `skip_serializing_if = Option::is_none` and only populated for reasoning models, non-reasoning providers see zero change on the wire.

Tests: provider capture (`parse_native_response_captures_reasoning_content`, blank -> None), wire round-trip (`convert_preserves/omits_reasoning_content`), and history-builder round-trip in `parse_tests`.

…Response initializers

The new `ChatResponse.reasoning_content` field was added to every `src/` initializer, but the `tests/calendar_grounding_e2e.rs` integration test was missed, so the test build failed to compile (error[E0063]: missing field `reasoning_content`). That broke the Rust Core Tests + Quality, Rust Core Coverage, and Linux Rust integration-suite checks on this PR.

Set the field to None at both mock-provider initializers; `cargo test --no-run` now compiles all test targets cleanly.

…ontent

Resolved conflicts in:

- inference/provider/traits.rs: doc comment wording (took main's)
- inference/provider/compatible_types.rs: doc comment wording (took main's)
- inference/provider/compatible.rs: combined both storage approaches: prefer JSON-content (tool_loop path) or fall back to extra_metadata (session-turn path), so both replay paths work correctly
- agent/dispatcher_tests.rs: indentation (took main's)
- agent/harness/session/turn_tests.rs: indentation (took main's)
- agent/tests.rs: indentation (took main's)
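
The combined lookup described for compatible.rs can be sketched roughly as follows. This is a minimal illustration rather than the merged code: `StoredTurn`, its field names, `non_blank`, `replay_reasoning`, and the `"reasoning_content"` metadata key are all assumptions made for the example.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a stored assistant turn.
struct StoredTurn {
    /// Reasoning embedded in the assistant history JSON (tool_loop path).
    content_reasoning: Option<String>,
    /// Side-channel metadata (session-turn path); the key name is assumed.
    extra_metadata: HashMap<String, String>,
}

/// Trims a candidate value and discards it if nothing is left.
fn non_blank(s: Option<&str>) -> Option<&str> {
    s.map(str::trim).filter(|t| !t.is_empty())
}

/// Prefers the JSON-content value and falls back to `extra_metadata`.
fn replay_reasoning(turn: &StoredTurn) -> Option<String> {
    non_blank(turn.content_reasoning.as_deref())
        .or_else(|| {
            non_blank(
                turn.extra_metadata
                    .get("reasoning_content")
                    .map(String::as_str),
            )
        })
        .map(str::to_owned)
}
```

In this sketch a blank JSON-content value is treated the same as a missing one, so the metadata fallback still applies. Whether the merged code makes the same choice is not stated in the commit message.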

Both the PR and main added parse_native_response_captures_reasoning_content testing different code paths. Rename the second one (non-streaming API response path) to avoid the duplicate symbol compile error.

The two tests added in this PR (parse_native_response_captures_reasoning_content and parse_native_response_blank_reasoning_is_none) expected the field to be normalised: trimmed, and empty-after-trim → None. The implementation was cloning the raw value verbatim, so whitespace-padded strings weren't trimmed and whitespace-only strings weren't collapsed to None. Apply the same `.as_deref().map(str::trim).filter(|s| !s.is_empty()).map(str::to_owned)` pattern already used in the chat_with_tools and streaming paths.
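
As a standalone illustration of that normalisation chain, the sketch below wraps it in a hypothetical helper named `normalize_reasoning`. The PR applies the chain inline, and the helper name and signature are assumptions for this example only.

```rust
/// Hypothetical helper: trims the reasoning text and maps blank or
/// missing values to `None`, returning an owned copy otherwise.
fn normalize_reasoning(raw: &Option<String>) -> Option<String> {
    raw.as_deref()                 // Option<String> -> Option<&str>, no clone yet
        .map(str::trim)            // strip leading and trailing whitespace
        .filter(|s| !s.is_empty()) // whitespace-only input collapses to None
        .map(str::to_owned)        // allocate only for values that survive
}
```

Compared with cloning the raw value, this trims padded input such as `"  think  "` to `"think"` and turns `"   \n"` into `None`, which matches what the two tests expect.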

…st-4kb-deepseek-reasoning-content

coderabbitai · 2026-05-28T22:21:15Z

Warning

Review limit reached

@M3gA-Mind, we couldn't start this review because you've reached your PR review rate limit.

More reviews will be available in 53 minutes. Learn how PR review limits work.

Your organization has run out of usage credits. Purchase more in the billing tab.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 9a664432-3cfc-4861-9e24-1bf47d90ef8e

📥 Commits

Reviewing files that changed from the base of the PR and between 7fbcbe8 and 6de6c1a.

📒 Files selected for processing (6)

src/openhuman/agent/harness/parse.rs
src/openhuman/agent/harness/parse_tests.rs
src/openhuman/agent/harness/subagent_runner/ops.rs
src/openhuman/agent/harness/tool_loop.rs
src/openhuman/inference/provider/compatible.rs
src/openhuman/inference/provider/compatible_tests.rs

CodeGhost21 and others added 6 commits May 28, 2026 10:40

fix(inference): rename duplicate test to resolve compile error

aab7b59

Merge remote-tracking branch 'upstream/main' into fix/sentry-tauri-ru…

6de6c1a

…st-4kb-deepseek-reasoning-content

M3gA-Mind requested a review from a team May 28, 2026 22:21

M3gA-Mind mentioned this pull request May 28, 2026

fix(inference): replay deepseek reasoning_content on tool-call turns (Sentry TAURI-RUST-4KB) #2817

Closed

12 tasks

M3gA-Mind merged commit fa50ceb into tinyhumansai:main May 28, 2026
28 checks passed

M3gA-Mind merged 6 commits into tinyhumansai:main from M3gA-Mind:fix/sentry-tauri-rust-4kb-deepseek-reasoning-content
