fix(agent): suppress empty-provider-response from Sentry (TAURI-RUST-4JX)#2790
CodeGhost21 wants to merge 1 commit
`agent::harness::session::turn` returned an anonymous `anyhow::anyhow!(
"The model returned an empty response. Please try again.")` when the
provider's chat completion contained no text, no thinking, and no tool
calls. That bubbled to `run_single`'s catch-all `report_error_or_expected`
arm and shipped to Sentry as TAURI-RUST-4JX.
The latest event shows the typical trigger: a Windows user running LM
Studio locally with a community fine-tune
(`qwen3.6-27b-heretic-uncensored-finetune-neo-code-di-imatrix-max`) that
returned an empty stream. That's a model / user-config outcome, not an
OpenHuman bug — the UI already surfaces the user-facing string, and
there is no developer remediation path through Sentry.
Mirror the existing `MaxIterationsExceeded` pattern:
1. Add `AgentError::EmptyProviderResponse { iteration }` with a `Display`
impl that emits the verbatim user-facing string (so UI / fingerprint
contract is preserved).
2. Replace the anonymous `anyhow!` at `turn.rs:805` with the typed
variant, retaining the warn-level breadcrumb that records the
surfacing decision.
3. Introduce `AgentError::skips_sentry()` as the single source of truth
for which variants get suppressed (`MaxIterationsExceeded` +
`EmptyProviderResponse`), and call it from `run_single` in place of
the inline `MaxIterationsExceeded`-only check.
4. Extend `sanitize_event_error_message` (Sentry error_kind tag) and
`agent_error_to_user_message` (cron job user-facing copy) with arms
for the new variant — required by the non-exhaustive match contract,
and gives cron job failures actionable canned copy.
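
A minimal sketch of steps 1 and 3, assuming a trimmed-down `AgentError`. Only the variant names, the `iteration` field, the `Display` string for the new variant, and the `skips_sentry()` name come from this PR; the `max` and `message` fields and the other two `Display` strings are hypothetical placeholders.

```rust
use std::fmt;

// Hypothetical, reduced stand-in for the real `AgentError`.
// Field names other than `iteration` are assumptions for illustration.
#[derive(Debug)]
pub enum AgentError {
    MaxIterationsExceeded { max: usize },
    EmptyProviderResponse { iteration: usize },
    ProviderError { message: String },
}

impl fmt::Display for AgentError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            // Assumed wording; the real message is not shown in this PR.
            AgentError::MaxIterationsExceeded { max } => {
                write!(f, "Agent exceeded maximum iterations ({max})")
            }
            // Verbatim user-facing string, so UI and fingerprints stay stable.
            AgentError::EmptyProviderResponse { .. } => {
                f.write_str("The model returned an empty response. Please try again.")
            }
            // Assumed wording.
            AgentError::ProviderError { message } => write!(f, "Provider error: {message}"),
        }
    }
}

impl std::error::Error for AgentError {}

impl AgentError {
    /// Single place that decides which variants are user-state outcomes
    /// and therefore should not be reported to Sentry.
    pub fn skips_sentry(&self) -> bool {
        matches!(
            self,
            AgentError::MaxIterationsExceeded { .. }
                | AgentError::EmptyProviderResponse { .. }
        )
    }
}
```

Keeping the suppressed set inside one `matches!` means a future variant only has to be added in one place, plus the negative-coverage test.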
Tests added in `agent::error`:
- `Display` returns the canonical user-facing string (locks the wire
shape against regressions).
- `skips_sentry()` returns true for both suppressed variants and false
for every other AgentError variant (positive + negative coverage).
The user still sees the same error, the `Err` still propagates, and
the `AgentError` domain event / `recoverable` semantics are unchanged.
Sentry just stops getting paged for it.
## Test plan
- [x] `cargo test openhuman::agent::error::tests` — 9 tests pass (4 new)
- [x] `cargo test openhuman::agent` — 751 tests pass, 0 regressions
- [x] `cargo test openhuman::cron::scheduler` — 50 tests pass, 0 regressions
- [x] `cargo check --bin openhuman-core` — passes
- [x] `cargo fmt --check` on touched files — clean
Walkthrough: This change introduces typed error classification for degenerate provider responses (no text, no thinking, no tool calls).
graycyrus left a comment:
@CodeGhost21 hey! the code looks good to me — clean approach to centralising the Sentry suppression policy behind skips_sentry(), the typed variant is the right call over the anonymous anyhow!, and the test coverage is solid (locking the wire string, both suppressed variants, and all 7 real-failure variants). but there are CI failures on this PR (PR Submission Checklist is failing and most checks are still pending), so i'll hold off on approving until those are green. once CI is clean, i'll come back and approve. let me know if you need any help!
## Summary
- Add `AgentError::EmptyProviderResponse { iteration }` and route `agent::harness::session::turn`'s empty-final-response bail through it, so the existing user-facing string is preserved while the typed variant flows up to `run_single`.
- Add an `AgentError::skips_sentry()` method covering both `MaxIterationsExceeded` and the new variant; `run_single` calls it in place of the inline `MaxIterationsExceeded`-only check.
- The user-facing error, `Err` propagation, and `AgentError` / `recoverable` semantics are unchanged.

## Why
Latest event payload: Windows user, LM Studio at `http://localhost:1234`, custom community fine-tune (`qwen3.6-27b-heretic-uncensored-finetune-neo-code-di-imatrix-max`). The provider response is `text_chars=0 thinking_chars=0 tool_calls=0` in 60 ms: completely empty. `agent/harness/session/turn.rs:805` then returned an anonymous `anyhow::anyhow!("The model returned an empty response. Please try again.")`.

That bubbled to `run_single` (`agent/harness/session/runtime.rs:540`), which routes anything other than `AgentError::MaxIterationsExceeded` through `report_error_or_expected("agent", "run_single", …)` → Sentry. The result is the same shape as OPENHUMAN-TAURI-99 / -98 (max-iter cap) before that fix: a user-state outcome shipped to Sentry as a code bug.

It's not actionable from Sentry (the user picked a flaky local model), the UI already surfaces the user-facing message, and the deeper "fix" lives in the user's model / provider config. Same call for suppression as the max-iter cap.
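
The routing decision described above can be sketched with the standard library alone. This is an illustration under stated assumptions, not the actual `run_single` code: the real harness uses `anyhow`, whereas this sketch uses `Box<dyn Error>`, and `finish_turn` and `should_report` are hypothetical helper names. The `downcast_ref` / `is_some_and` expression mirrors the one quoted in this PR.

```rust
use std::error::Error;
use std::fmt;

// Hypothetical, reduced stand-in for the real `AgentError`.
#[allow(dead_code)]
#[derive(Debug)]
enum AgentError {
    EmptyProviderResponse { iteration: usize },
    Other(String), // payload type is an assumption
}

impl fmt::Display for AgentError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            AgentError::EmptyProviderResponse { .. } => {
                f.write_str("The model returned an empty response. Please try again.")
            }
            AgentError::Other(msg) => f.write_str(msg),
        }
    }
}

impl Error for AgentError {}

impl AgentError {
    fn skips_sentry(&self) -> bool {
        matches!(self, AgentError::EmptyProviderResponse { .. })
    }
}

// Hypothetical turn helper: an empty completion becomes the typed variant
// (1-based iteration) instead of an anonymous string error.
fn finish_turn(text: &str, iteration: usize) -> Result<String, Box<dyn Error>> {
    if text.is_empty() {
        return Err(AgentError::EmptyProviderResponse { iteration: iteration + 1 }.into());
    }
    Ok(text.to_string())
}

// Hypothetical caller-side check: report everything except typed
// user-state variants. Untyped errors always get reported.
fn should_report(err: &(dyn Error + 'static)) -> bool {
    !err.downcast_ref::<AgentError>()
        .is_some_and(AgentError::skips_sentry)
}
```

In both cases the `Err` still propagates to the caller; only the reporting side effect is gated.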

## What changed
- `src/openhuman/agent/error.rs`: new variant `EmptyProviderResponse { iteration: usize }` with a `Display` arm that emits the verbatim user-facing string (so UI surfaces and any external grep/fingerprint contracts hold). New `skips_sentry(&self) -> bool` method as the single source of truth for the suppressed set: today `{ MaxIterationsExceeded, EmptyProviderResponse }`.
- `src/openhuman/agent/harness/session/turn.rs:805`: replace the anonymous `anyhow::anyhow!` with `AgentError::EmptyProviderResponse { iteration: iteration + 1 }.into()`. The warn-level breadcrumb that recorded the surfacing decision is preserved.
- `src/openhuman/agent/harness/session/runtime.rs:540`: replace `let is_max_iter = matches!(err.downcast_ref::<AgentError>(), Some(AgentError::MaxIterationsExceeded { .. }))` with `err.downcast_ref::<AgentError>().is_some_and(AgentError::skips_sentry)`. Log message generalised to "user-state agent error". Comment updated.
- `src/openhuman/agent/harness/session/runtime.rs:388` (`sanitize_event_error_message`): new arm returning `"empty_provider_response"` for the Sentry `error_kind` tag (required by the non-exhaustive match).
- `src/openhuman/cron/scheduler.rs:37` (`agent_error_to_user_message`): new arm returning actionable canned copy: "The model returned an empty response. Try a different model or check your local provider in Settings → AI → LLM." Required by the same non-exhaustive contract; gives cron job failures a useful message instead of the generic fallback.

## Tests added (`agent::error::tests`)
- `empty_provider_response_display_matches_user_facing_string`: locks the wire shape against accidental message changes (also a Sentry-fingerprint stability guarantee).
- `skips_sentry_returns_true_for_known_user_state_variants`: `MaxIterationsExceeded` + `EmptyProviderResponse`.
- `skips_sentry_returns_false_for_real_failures`: covers all 7 other variants (`ProviderError`, `ContextLimitExceeded`, `ToolExecutionError`, `CostBudgetExceeded`, `CompactionFailed`, `PermissionDenied`, `Other`); any future variant that should also suppress must be added to `skips_sentry()` and this test, both at once.

## Test plan
- [x] `cargo test openhuman::agent::error::tests`: 9 tests pass (4 new)
- [x] `cargo test openhuman::agent`: 751 tests pass, 0 regressions
- [x] `cargo test openhuman::cron::scheduler`: 50 tests pass, 0 regressions
- [x] `cargo check --manifest-path Cargo.toml --bin openhuman-core`: passes
- [x] `cargo fmt --check` on touched files: clean

Post-merge observation: TAURI-RUST-4JX should drop to ~0 events on the next release. The variant still produces structured `log::info!` lines locally for diagnosability, so a real spike will still be visible in shipped logs (just not in Sentry).
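
For the two extra match arms listed under "What changed", a rough sketch follows. It assumes both functions match on `AgentError` without a wildcard arm, which is why a new variant forces a new arm at compile time. The function names `error_kind_tag` and `cron_user_message` are hypothetical stand-ins for `sanitize_event_error_message` and `agent_error_to_user_message`, whose real signatures are not shown here. The `MaxIterationsExceeded` tag and copy are placeholders; only the `EmptyProviderResponse` strings come from this PR.

```rust
// Hypothetical, reduced stand-in for the real `AgentError`.
#[allow(dead_code)]
#[derive(Debug)]
enum AgentError {
    MaxIterationsExceeded { max: usize }, // field name is an assumption
    EmptyProviderResponse { iteration: usize },
}

// Stand-in for the Sentry `error_kind` tag mapping. No `_` arm, so adding a
// variant to the enum is a compile error until it is handled here.
fn error_kind_tag(err: &AgentError) -> &'static str {
    match err {
        AgentError::MaxIterationsExceeded { .. } => "max_iterations_exceeded", // assumed tag
        AgentError::EmptyProviderResponse { .. } => "empty_provider_response",
    }
}

// Stand-in for the cron job user-facing copy.
fn cron_user_message(err: &AgentError) -> String {
    match err {
        // Assumed copy for illustration only.
        AgentError::MaxIterationsExceeded { max } => {
            format!("The agent stopped after {max} iterations.")
        }
        AgentError::EmptyProviderResponse { .. } => String::from(
            "The model returned an empty response. Try a different model or \
             check your local provider in Settings → AI → LLM.",
        ),
    }
}
```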