fix(agent): replay reasoning_content across native tool-call turns to prevent DeepSeek 400s by YellowSnnowmann · Pull Request #2918 · tinyhumansai/openhuman

YellowSnnowmann · 2026-05-29T07:55:37Z

Summary

Persisted assistant reasoning_content on ConversationMessage::AssistantToolCalls so thinking-mode context survives tool-call turns.
Updated native dispatcher serialization to include reasoning_content in assistant tool-call payloads sent back to providers.
Patched agent session turn flow to capture and store non-empty reasoning_content when persisting assistant tool-call intents.
Added a targeted regression test to verify reasoning_content is preserved in native tool-call history replay.
Updated affected test fixtures/constructors for the new optional field to keep behavior backward-compatible.

Problem

Sentry issue TAURI-RUST-4K9 reports repeated DeepSeek 400 failures: The reasoning_content in the thinking mode must be passed back to the API.
In multi-turn/native tool-call flows, reasoning output could be dropped before follow-up provider calls, causing request rejection.
This resulted in high error volume and degraded reliability for thinking-capable providers.

Solution

Extended ConversationMessage::AssistantToolCalls with optional reasoning_content (serde-defaulted for compatibility with existing persisted transcripts).
In agent turn execution, when assistant returns tool calls, capture trimmed non-empty response.reasoning_content and persist it alongside tool-call history.
In NativeToolDispatcher::to_provider_messages, include reasoning_content in serialized assistant tool-call payload JSON so provider replay contract is satisfied.
Preserved existing TAURI-RUST-7 pairing safeguards (unpaired tool-call cycles are still dropped), and adapted regression testing to valid paired cycles.
Tradeoff: slightly larger serialized assistant tool-call payloads, but only when thinking content exists.

Submission Checklist

If a section does not apply to this change, mark the item as N/A with a one-line reason. Do not delete items.

Tests added or updated (happy path + at least one failure / edge case) per Testing Strategy
Diff coverage ≥ 80% — changed lines (Vitest + cargo-llvm-cov merged via diff-cover) meet the gate enforced by .github/workflows/coverage.yml. Run pnpm test:coverage and pnpm test:rust locally; PRs below 80% on changed lines will not merge.
Coverage matrix updated — added/removed/renamed feature rows in docs/TEST-COVERAGE-MATRIX.md reflect this change (or N/A: behaviour-only change)
All affected feature IDs from the matrix are listed in the PR description under ## Related
No new external network dependencies introduced (mock backend used per Testing Strategy)
Manual smoke checklist updated if this touches release-cut surfaces (docs/RELEASE-MANUAL-SMOKE.md)
Linked issue closed via Closes #NNN in the ## Related section

Impact

Runtime/platform: Rust core + Tauri desktop inference path; affects native tool-calling providers in multi-turn sessions.
Compatibility: backward-compatible transcript/history schema change (reasoning_content is optional + serde default).
Performance: negligible overhead (small optional metadata field/payload extension).
Security: no new permissions, secrets, or network surfaces introduced.

New Features
- Conversation messages now optionally capture and propagate model "reasoning content" alongside tool-call results; this is included in outgoing payloads when present and omitted when absent.
- Token estimation and transcript/history reconstruction include reasoning content when available.
Testing
- Added and updated tests to verify serialization, preservation in history, omission when absent, and related handling.

…pdate related functions

…te serialization logic

coderabbitai · 2026-05-29T07:55:45Z

📝 Walkthrough

Walkthrough

Adds an optional reasoning_content: Option to AssistantToolCalls, captures it from provider responses, persists it in turns, serializes it into dispatcher-emitted provider JSON when present, updates token accounting, and adjusts tests/fixtures and pattern matches for compatibility.

Changes

Reasoning Content Propagation

Layer / File(s)	Summary
Schema field definition `src/openhuman/inference/provider/traits.rs`	`ConversationMessage::AssistantToolCalls` now includes optional `reasoning_content: Option<String>` with Serde attributes to skip when `None` and default when absent.
Response capture and turn persistence `src/openhuman/agent/harness/session/turn.rs`	Extracts `response.reasoning_content`, trims it, and persists it on `AssistantToolCalls` when non-empty.
Provider serialization and regression tests `src/openhuman/agent/dispatcher.rs`, `src/openhuman/agent/dispatcher_tests.rs`	`NativeToolDispatcher::to_provider_messages` serializes `reasoning_content` into the assistant message JSON payload. Adds tests that assert presence when set and omission when None; updates test helpers to set `reasoning_content: None` where appropriate.
Token budget accounting `src/openhuman/agent/harness/token_budget.rs`	Token estimation for `AssistantToolCalls` now includes `reasoning_content` when present; corresponding test fixture updated.
Pattern matching and fixture compatibility `src/openhuman/agent/tests.rs`, `src/openhuman/agent/harness/session/runtime_tests.rs`, `src/openhuman/agent/harness/session/turn_tests.rs`, `src/openhuman/context/*`, `src/openhuman/context/summarizer.rs`, `src/openhuman/context/segment_recap_summarizer.rs`	Updated fixtures to set `reasoning_content: None` where appropriate and adjusted match patterns to use `..` or include the new field to keep existing logic unchanged.

Sequence Diagram

sequenceDiagram
  participant Provider
  participant Turn
  participant History
  participant Dispatcher
  Provider->>Turn: response with reasoning_content
  Turn->>History: construct AssistantToolCalls with reasoning_content
  History->>Dispatcher: serialize message for provider
  Dispatcher->>Provider: send JSON with reasoning_content field

Estimated Code Review Effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly Related PRs

tinyhumansai/openhuman#2818: Related work on capturing and persisting response.reasoning_content from provider responses.
tinyhumansai/openhuman#2100: Related changes to token-budget handling for assistant tool-call messages.

Suggested Reviewers

M3gA-Mind

Poem

🐰 I nibble through the message stream,
Tuck the model's thinking in a seam,
Saved and sent when it is found,
Quiet reasoning, wrapped and sound. ✨

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title accurately and specifically describes the main change: persisting and replaying reasoning_content across native tool-call turns to fix a DeepSeek provider issue.
Docstring Coverage	✅ Passed	Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/openhuman/agent/harness/token_budget.rs`:
- Around line 36-38: The AssistantToolCalls arm in token_budget.rs currently
ignores the new reasoning_content field; update the pattern matching for
ConversationMessage::AssistantToolCalls to capture reasoning_content and include
its token count when estimating tokens for that message in
trim_conversation_history_to_budget by calling
model_token_estimator.tokens_for_text(&reasoning_content) (or the project's
equivalent tokens_for_text method) and adding it to the existing text/tool_calls
token sum so large reasoning payloads are counted in the budget.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: e24d24cf-19fb-40d4-8b6c-d8892c85a83c

📥 Commits

Reviewing files that changed from the base of the PR and between c7f4da7 and b396587.

📒 Files selected for processing (14)

src/openhuman/agent/dispatcher.rs
src/openhuman/agent/dispatcher_tests.rs
src/openhuman/agent/harness/session/runtime_tests.rs
src/openhuman/agent/harness/session/turn.rs
src/openhuman/agent/harness/session/turn_tests.rs
src/openhuman/agent/harness/token_budget.rs
src/openhuman/agent/tests.rs
src/openhuman/context/manager_tests.rs
src/openhuman/context/microcompact.rs
src/openhuman/context/pipeline.rs
src/openhuman/context/segment_recap_summarizer.rs
src/openhuman/context/summarizer.rs
src/openhuman/context/summarizer_tests.rs
src/openhuman/inference/provider/traits.rs

… AssistantToolCalls

graycyrus

@YellowSnnowmann this is a solid fix for the DeepSeek 400s — the approach is right and the backward compat via serde default is the correct call. CI is still pending so holding off on approving until those pass.

One thing to check in the meantime: in NativeToolDispatcher::to_provider_messages, reasoning_content is included directly in serde_json::json!, which means when it's None, the serialized assistant payload includes "reasoning_content": null rather than omitting the key entirely. For DeepSeek and other thinking-capable providers that's fine since the session had no reasoning to replay. But other providers routing through NativeToolDispatcher will now receive a field they've never seen before. Most JSON APIs tolerate unexpected null fields, but worth verifying against each provider's schema — or conditionally including the key only when reasoning_content is Some.

Fix CI and i'll approve.

…e reasoning_content in output and add test for omission

coderabbitai

🧹 Nitpick comments (1)

src/openhuman/agent/dispatcher_tests.rs (1)
436-452: 🏗️ Heavy lift

Split this test module to stay within the repo’s Rust file-size guideline.

This file is now ~545 lines, and this new test adds more surface to an already oversized module. Please extract related test groups (for example, reasoning-content replay/omission cases) into dedicated sibling test files to keep this module maintainable.

As per coding guidelines, "**/*.{ts,tsx,rs}: Keep React component files and Rust modules at ≤ ~500 lines; split growing modules into multiple files" and "{src,app/src,app/src-tauri}/**/*.{rs,ts,tsx}: File size: prefer ≤ ~500 lines per source file; split modules when growing to maintain readability and single responsibility".
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/openhuman/agent/dispatcher_tests.rs` around lines 436 - 452, The test
module has grown over the file-size guideline; extract the reasoning-content
tests into a new sibling test file: create a new tests module (e.g., a
dispatcher_reasoning tests file) and move
native_dispatcher_omits_reasoning_content_when_absent and any paired
replay/omission tests there, keeping the exact test names
(native_dispatcher_omits_reasoning_content_when_absent) and references to
NativeToolDispatcher and to_provider_messages; ensure the new file imports or
re-exports required helpers (assistant_tool_calls, tool_results) and types
(NativeToolDispatcher) from the original module (make helpers pub(crate) or add
appropriate use crate::... paths) so the tests compile, and update module
declarations if needed so the test suite still runs.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Nitpick comments:
In `@src/openhuman/agent/dispatcher_tests.rs`:
- Around line 436-452: The test module has grown over the file-size guideline;
extract the reasoning-content tests into a new sibling test file: create a new
tests module (e.g., a dispatcher_reasoning tests file) and move
native_dispatcher_omits_reasoning_content_when_absent and any paired
replay/omission tests there, keeping the exact test names
(native_dispatcher_omits_reasoning_content_when_absent) and references to
NativeToolDispatcher and to_provider_messages; ensure the new file imports or
re-exports required helpers (assistant_tool_calls, tool_results) and types
(NativeToolDispatcher) from the original module (make helpers pub(crate) or add
appropriate use crate::... paths) so the tests compile, and update module
declarations if needed so the test suite still runs.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 1f773ed6-c38b-4f11-82c6-339e0e52885c

📥 Commits

Reviewing files that changed from the base of the PR and between e2aeba9 and 5c3079b.

📒 Files selected for processing (2)

src/openhuman/agent/dispatcher.rs
src/openhuman/agent/dispatcher_tests.rs

🚧 Files skipped from review as they are similar to previous changes (1)

src/openhuman/agent/dispatcher.rs

graycyrus

@YellowSnnowmann hey! previous changes are addressed — the conditional reasoning_content insertion landed in 5c3079b exactly as requested. code looks good to me, but there are still some CI checks pending (Windows Rust tests, E2E suites, Tauri build). once those are green, i'll come back and approve this.

oxoxDev

LGTM. Clean, correctly-scoped fix for TAURI-RUST-4K9 — reasoning_content is persisted on AssistantToolCalls and replayed so DeepSeek thinking-mode follow-ups stop 400ing.

Verified:

#[serde(default, skip_serializing_if = "Option::is_none")] keeps old transcripts deserializing (None) and never emits null — the exact DeepSeek-reject case @graycyrus flagged, fixed via the conditional payload["reasoning_content"] insert in commit 5c3079b.
turn.rs trims + drops empty reasoning before storing.
token_budget now counts reasoning tokens, so the budget estimate doesn't under-count.
Good coverage: serde roundtrip, serialize-present, omit-absent, token estimate.

Minor (optional, non-blocking)

reasoning_content is replayed for every past AssistantToolCalls in history, but DeepSeek only needs it on the most recent pre-tool-call turn. Long thinking-mode sessions will carry every turn's reasoning blob forward — it's budget-counted and microcompact/summarizer will trim it, so it's bounded, but if token cost shows up consider replaying only the latest turn's reasoning. Approving as-is.

oxoxDev · 2026-05-29T11:45:11Z

Merged ✅ — thanks @graycyrus for the close review here, especially catching the serde_json::json! null-serialization issue ("reasoning_content": null would have been sent to DeepSeek when absent). The switch to conditional insertion in 5c3079b resolved it cleanly.

CI was fully green and the change reviewed sane on a re-read — backward-compatible #[serde(default, skip_serializing_if = "Option::is_none")], reasoning trimmed/filtered before persist, and token-budget accounting updated — so we went ahead and merged. Nice fix for TAURI-RUST-4K9.

YellowSnnowmann added 2 commits May 29, 2026 13:20

feat(conversation): add reasoning_content to AssistantToolCalls and u…

9984d0a

…pdate related functions

feat(tests): add reasoning_content handling in various tests and upda…

b396587

…te serialization logic

YellowSnnowmann marked this pull request as ready for review May 29, 2026 07:57

YellowSnnowmann requested a review from a team May 29, 2026 07:57

coderabbitai Bot added rust-core Core Rust runtime in src/: CLI, core_server, shared infrastructure. agent Built-in agents, prompts, orchestration, and agent runtime in src/openhuman/agent/. bug labels May 29, 2026

coderabbitai Bot requested changes May 29, 2026

View reviewed changes

Comment thread src/openhuman/agent/harness/token_budget.rs

feat(token_budget): include reasoning_content in token estimation for…

e2aeba9

… AssistantToolCalls

coderabbitai Bot previously approved these changes May 29, 2026

View reviewed changes

graycyrus reviewed May 29, 2026

View reviewed changes

Comment thread src/openhuman/agent/dispatcher.rs Outdated

feat(dispatcher): update NativeToolDispatcher to conditionally includ…

5c3079b

…e reasoning_content in output and add test for omission

YellowSnnowmann dismissed coderabbitai[bot]’s stale review via 5c3079b May 29, 2026 08:31

coderabbitai Bot reviewed May 29, 2026

View reviewed changes

coderabbitai Bot approved these changes May 29, 2026

View reviewed changes

graycyrus reviewed May 29, 2026

View reviewed changes

oxoxDev approved these changes May 29, 2026

View reviewed changes

oxoxDev merged commit 9f3e161 into tinyhumansai:main May 29, 2026
33 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(agent): replay reasoning_content across native tool-call turns to prevent DeepSeek 400s#2918

fix(agent): replay reasoning_content across native tool-call turns to prevent DeepSeek 400s#2918
oxoxDev merged 4 commits into
tinyhumansai:mainfrom
YellowSnnowmann:fix/reasoning-content-replay

YellowSnnowmann commented May 29, 2026 •

edited by coderabbitai Bot

Loading

Uh oh!

coderabbitai Bot commented May 29, 2026 •

edited

Loading

Walkthrough

Changes

Sequence Diagram

Estimated Code Review Effort

Possibly Related PRs

Suggested Reviewers

Poem

Uh oh!

coderabbitai Bot left a comment

Uh oh!

Uh oh!

graycyrus left a comment

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Uh oh!

graycyrus left a comment

Uh oh!

oxoxDev left a comment

Uh oh!

Uh oh!

oxoxDev commented May 29, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

YellowSnnowmann commented May 29, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Problem

Solution

Submission Checklist

Impact

Related

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented May 29, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Sequence Diagram

Estimated Code Review Effort

Possibly Related PRs

Suggested Reviewers

Poem

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

graycyrus left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

graycyrus left a comment

Choose a reason for hiding this comment

Uh oh!

oxoxDev left a comment

Choose a reason for hiding this comment

Minor (optional, non-blocking)

Uh oh!

Uh oh!

oxoxDev commented May 29, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

YellowSnnowmann commented May 29, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented May 29, 2026 •

edited

Loading