fix(inference): distinguish missing-data-field from wrong-type in /models parser (Sentry TAURI-RUST-4Y) #2794

Merged: oxoxDev merged 1 commit into tinyhumansai:main from CodeGhost21:fix/sentry-tauri-rust-4y-list-models-parser-shape on May 28, 2026

@CodeGhost21 (Contributor) commented May 27, 2026

Summary

inference_list_models parses the OpenAI-compatible /models envelope in src/openhuman/inference/provider/ops.rs. The pre-fix let-else at line 154 collapsed two distinct failure modes into one misleading error string — that's the bug behind Sentry TAURI-RUST-4Y (133 events: 131 on stale 0.54.0, 5 on current 0.56.0, ongoing).

| Failure mode | Pre-fix message | New message |
|---|---|---|
| `data` field absent | `provider response missing 'data' array … (got keys: …)` | Unchanged; preserves the Sentry fingerprint for the existing stale-client population |
| `data` field present but wrong type (object/null/string) | Same string; the keys list included `data`, contradicting the "missing" claim | `provider response has 'data' field but it is <kind>, expected array … ("object" = "<value>")`. Names the actual JSON type and surfaces the sibling `object` field for triage |
| Non-object top-level body | Fell through to the "missing" arm with a `<non-object>` placeholder | Explicit `provider response is not a JSON object … (got <kind> at top level)` |

Why the title looked self-contradictory

Look at the latest 4Y event title: "provider response missing 'data' array — endpoint is not OpenAI-compatible (got keys: data, object)". The keys list literally includes data, but the message claims it's missing — because the old let-else matched both "no key" AND "key present but .as_array() returned None". An upstream like LM Studio returning {"object":"error","data":{...error details...}} at HTTP 200 hits the second case but emits the first message.

Triage saw "data is in the keys but message says missing" and stalled. This PR splits the arm so the actual JSON shape lands in Sentry on the next event.
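
To make the collapse concrete, here is a minimal sketch of how a single let-else produces the contradictory title. It is not the code from ops.rs: `Json` is a hypothetical stand-in for `serde_json::Value` (so the sketch builds with std only), `old_parse` and `error_envelope` are illustrative names, and the error string is abbreviated from the Sentry title quoted above.

```rust
use std::collections::BTreeMap;

// Hypothetical stand-in for serde_json::Value so the sketch builds with std only.
#[allow(dead_code)]
#[derive(Debug, Clone)]
enum Json {
    Null,
    Str(String),
    Array(Vec<Json>),
    Object(BTreeMap<String, Json>),
}

impl Json {
    // Mirrors Value::get for string keys: None unless self is an object holding `key`.
    fn get(&self, key: &str) -> Option<&Json> {
        match self {
            Json::Object(map) => map.get(key),
            _ => None,
        }
    }

    // Mirrors Value::as_array: None for every non-array value.
    fn as_array(&self) -> Option<&Vec<Json>> {
        match self {
            Json::Array(items) => Some(items),
            _ => None,
        }
    }
}

// Assumed pre-fix shape: a single let-else, so "key absent" and
// "key present but not an array" both land in the same else branch.
fn old_parse(body: &Json) -> Result<usize, String> {
    let Some(items) = body.get("data").and_then(|d| d.as_array()) else {
        let keys = match body {
            Json::Object(map) => map.keys().cloned().collect::<Vec<_>>().join(", "),
            _ => "<non-object>".to_string(),
        };
        // Message abbreviated from the Sentry title quoted above.
        return Err(format!("provider response missing 'data' array (got keys: {})", keys));
    };
    Ok(items.len())
}

// Builds an error envelope like the LM Studio example, with an empty `data` object.
fn error_envelope() -> Json {
    Json::Object(BTreeMap::from([
        ("object".to_string(), Json::Str("error".to_string())),
        ("data".to_string(), Json::Object(BTreeMap::new())),
    ]))
}
```

Calling `old_parse(&error_envelope())` returns an error that says `data` is missing while listing `data` among the keys, which is the shape of the 4Y title.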

What changed

  • Extracted parse_models_response(&serde_json::Value) -> Result<Vec<ModelInfo>, String> as a pure helper next to the inline call site. The HTTP / status / error-envelope handling above it is untouched.
  • New private json_value_kind(&Value) -> &'static str for the error-message JSON-kind tokens — kept beside the parser so test assertions on the rendered tokens stay in lock-step with the matcher.
  • 3 new tests:
    • parse_models_response_returns_models_for_well_formed_data_array — happy path including owned_by / context_length / context_window projections.
    • parse_models_response_distinguishes_missing_data_field_from_wrong_type — pins the new distinction for object / string / null / bool / number.
    • parse_models_response_handles_non_object_body — covers bare array / string / null top-level bodies.
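
For readers without the diff open, the three arms can be sketched roughly as follows. This is an illustration under stated assumptions, not the merged code: `Json` is a hypothetical stand-in for `serde_json::Value`, the function returns model ids as `Vec<String>` instead of `Vec<ModelInfo>`, the error strings are abbreviated from the table above (the sibling `object` field is omitted), and the `"bool"` / `"number"` kind tokens are guesses.

```rust
use std::collections::BTreeMap;

// Hypothetical stand-in for serde_json::Value so the sketch builds with std only.
#[allow(dead_code)]
#[derive(Debug, Clone)]
enum Json {
    Null,
    Bool(bool),
    Number(f64),
    Str(String),
    Array(Vec<Json>),
    Object(BTreeMap<String, Json>),
}

// Names the JSON kind for error messages; the exact tokens are assumed.
fn json_value_kind(v: &Json) -> &'static str {
    match v {
        Json::Null => "null",
        Json::Bool(_) => "bool",
        Json::Number(_) => "number",
        Json::Str(_) => "string",
        Json::Array(_) => "array",
        Json::Object(_) => "object",
    }
}

// Simplified: returns model ids rather than the PR's Vec<ModelInfo>.
fn parse_models_response(body: &Json) -> Result<Vec<String>, String> {
    // Arm 1: the top-level body is not an object at all.
    let Json::Object(map) = body else {
        return Err(format!(
            "provider response is not a JSON object (got {} at top level)",
            json_value_kind(body)
        ));
    };
    // Arm 2: the `data` key is genuinely absent.
    let Some(data) = map.get("data") else {
        let keys = map.keys().cloned().collect::<Vec<_>>().join(", ");
        return Err(format!("provider response missing 'data' array (got keys: {})", keys));
    };
    // Arm 3: `data` exists but is not an array.
    let Json::Array(items) = data else {
        return Err(format!(
            "provider response has 'data' field but it is {}, expected array",
            json_value_kind(data)
        ));
    };
    // Entries without a string `id` are skipped.
    Ok(items
        .iter()
        .filter_map(|item| match item {
            Json::Object(entry) => match entry.get("id") {
                Some(Json::Str(id)) => Some(id.clone()),
                _ => None,
            },
            _ => None,
        })
        .collect())
}
```

Separating the object check, the key lookup, and the array check into three let-else statements is what gives each failure its own message.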

Related (not blocking)

Test plan

  • cargo test parse_models_response — 3 new tests pass
  • cargo test openhuman::inference::provider::ops — 27 tests pass, 0 regressions
  • cargo check --manifest-path Cargo.toml --bin openhuman-core — passes
  • cargo fmt --check on touched file — clean

Post-merge observation: TAURI-RUST-4Y events on releases ≥ this fix should split into either (a) the unchanged "missing data field" message (genuine wrong-endpoint typos), or (b) a new fingerprint with the actual JSON-kind in the title (provider-shape problems). The latter becomes its own triageable Sentry issue with actionable context.

Summary by CodeRabbit

Release Notes

  • New Features

    • Enhanced model listing from OpenAI-compatible endpoints with improved data validation and clearer error reporting.
  • Bug Fixes

    • Improved error messages when model data is malformed or missing required fields.
  • Tests

    • Added comprehensive unit tests for model response parsing covering success paths, missing/invalid fields, and edge cases.

…dels parser (Sentry TAURI-RUST-4Y)

`inference_list_models` parses the OpenAI-compatible `/models` envelope
at `src/openhuman/inference/provider/ops.rs`. Before this fix the
`let-else` at line 154 collapsed two distinct failure modes into one
misleading error string:

- `data` field absent → `"provider response missing 'data' array
  — endpoint is not OpenAI-compatible (got keys: …)"` (expected).
- `data` field present but wrong type (object/null/string) → SAME
  error string, with the keys list including `"data"` —
  contradicting the "missing" claim. This produced the TAURI-RUST-4Y
  events titled `"... missing 'data' array ... (got keys: data,
  object)"` (133 events: 131 on 0.54.0 stragglers, 5 on current
  0.56.0).

The collapse made the issue look like a parser hallucination at
triage time — the keys list said `data` IS present, but the message
said it was missing. The actual cause is providers returning a
non-array `data` value (e.g. an OpenAI-style error envelope
`{"object":"error","data":{…}}` served at HTTP 200).

Refactor extracts the parsing logic into a pure
`parse_models_response(&serde_json::Value) -> Result<Vec<ModelInfo>,
String>` helper with three distinct error arms:

1. Non-object top-level body — surfaces the actual JSON kind.
2. Missing `data` field — original wire shape, preserved verbatim so
   the existing Sentry fingerprint stays stable for that population.
3. `data` present but wrong type — new arm naming the actual JSON
   type (`object`/`string`/`null`/…) and surfacing the sibling
   `"object"` field value (OpenAI servers set it to `"list"` on
   success, `"error"` on failure — fastest triage signal).

3 new tests in `provider::ops::tests` cover all three arms plus a
happy-path projection check (`owned_by` / `context_length` /
`context_window` aliases). 27/27 module tests pass, 0 regressions.

## Related

PR tinyhumansai#2785 (Sentry TAURI-RUST-28Z) touches the same file but at a
different section (cloud_providers lookup, lines 46-58 + a new
`synthesize_local_runtime_entry` helper). The two fixes are
complementary — tinyhumansai#2785 makes more requests reach the parser by
fixing the lookup-miss case for ollama/lmstudio; this PR makes the
parser produce triageable errors when those requests hit malformed
responses. No merge conflict.

## Test plan
- [x] `cargo test parse_models_response` — 3 new tests pass
- [x] `cargo test openhuman::inference::provider::ops` — 27 tests pass, 0 regressions
- [x] `cargo check --bin openhuman-core` — passes
- [x] `cargo fmt --check` — clean
@CodeGhost21 CodeGhost21 requested a review from a team May 27, 2026 21:41
@coderabbitai (Bot, Contributor) commented May 27, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
📥 Commits

Reviewing files that changed from the base of the PR and between d8696c1 and 7113015.

📒 Files selected for processing (1)
  • src/openhuman/inference/provider/ops.rs

📝 Walkthrough

The PR refactors the OpenAI-compatible /models response parser to validate JSON structure strictly, distinguish a missing data field from an incorrectly typed one with specific error diagnostics, and convert array entries to typed ModelInfo objects. It also integrates the improved parser into the model listing RPC endpoint and adds comprehensive test coverage.

Changes

Models response parser validation and integration

| Layer / File(s) | Summary |
|---|---|
| Parser validation and model extraction (`src/openhuman/inference/provider/ops.rs`) | `parse_models_response` validates the response is a JSON object, requires a `data` field, and produces distinct errors for missing vs incorrectly-typed `data` (including actual JSON kind and optional `object` field context). Array entries are converted to `ModelInfo` by extracting `id` (string; entries without it are skipped), `owned_by`, and `context_window`. New `json_value_kind` helper renders JSON value types for error messages. Unit tests cover well-formed parsing, missing vs wrong-type `data` error messages, and non-object top-level rejection. |
| Integration with model listing endpoint (`src/openhuman/inference/provider/ops.rs`) | `list_configured_models_from_config` delegates /models parsing to `parse_models_response`, logs the parsed model count, and returns an RPC outcome with the models list and a "fetched N models" status message. |

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Suggested labels

bug

Suggested reviewers

  • graycyrus
  • M3gA-Mind
  • senamakel

Poem

🐰 Through JSON fields and types we hop,
Validating data, never stop,
When data missing or type's all wrong,
Error messages sing their song,
Models parsed with care so deep,
Our parser's promises we keep! ✨

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title clearly and specifically summarizes the main change: fixing a bug in the /models parser that distinguishes between missing data fields and wrong-type data fields. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

@coderabbitai coderabbitai Bot added the bug label May 27, 2026
@graycyrus (Contributor) left a comment

Good fix for TAURI-RUST-4Y. The root cause is exactly right: the old let-else chained .get("data").and_then(|d| d.as_array()), so any non-array data value fell through to the missing-field arm and produced the contradiction you were seeing in Sentry (the keys list included data, but the message said "missing").

The extraction into parse_models_response is clean — three explicit failure arms, each with a distinct, actionable error string. json_value_kind is exhaustive over all serde_json::Value variants and returns &'static str, which is the right type here. The tests are solid: the parametric wrong-type loop covers the full data-present-but-not-array space, and the happy path exercises context_length / context_window aliasing and owned_by projection.

One thing worth tightening in a follow-up: parse_models_response_handles_non_object_body only asserts !err.is_empty(). The other two tests assert on specific substrings, which means if the non-object branch message regresses (typo, accidental branch merge), only the other tests would catch it. Not a blocker, but it's the weakest assertion in the suite.
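
A tighter assertion for the non-object case could look something like the sketch below. It assumes the message wording from the PR description table; `Json`, `reject_non_object`, and `non_object_bodies_name_their_kind` are hypothetical names, with `Json` standing in for `serde_json::Value`.

```rust
use std::collections::BTreeMap;

// Hypothetical stand-in for serde_json::Value so the sketch builds with std only.
#[allow(dead_code)]
#[derive(Debug, Clone)]
enum Json {
    Null,
    Str(String),
    Array(Vec<Json>),
    Object(BTreeMap<String, Json>),
}

// Kind tokens used in error messages; only the variants needed here.
fn json_value_kind(v: &Json) -> &'static str {
    match v {
        Json::Null => "null",
        Json::Str(_) => "string",
        Json::Array(_) => "array",
        Json::Object(_) => "object",
    }
}

// Only the non-object arm of the parser; wording assumed from the PR table.
fn reject_non_object(body: &Json) -> Result<(), String> {
    match body {
        Json::Object(_) => Ok(()),
        other => Err(format!(
            "provider response is not a JSON object (got {} at top level)",
            json_value_kind(other)
        )),
    }
}

// Test body sketch: pins both the fixed prefix and the per-case kind token,
// instead of only checking that the error is non-empty.
fn non_object_bodies_name_their_kind() {
    let cases = [
        (Json::Array(vec![]), "array"),
        (Json::Str("oops".to_string()), "string"),
        (Json::Null, "null"),
    ];
    for (body, kind) in cases {
        let err = reject_non_object(&body).unwrap_err();
        assert!(err.contains("is not a JSON object"), "unexpected message: {}", err);
        assert!(
            err.contains(&format!("got {} at top level", kind)),
            "unexpected message: {}",
            err
        );
    }
}
```

Asserting on substrings rather than the whole string leaves room to reword the message without rewriting the test.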

Approved.

@oxoxDev oxoxDev self-assigned this May 28, 2026
@oxoxDev (Contributor) left a comment

Walkthrough

Splits /models parser's let-else into pure parse_models_response + json_value_kind helpers, distinguishing missing-data / wrong-type-data / non-object-body. 3 tests pin all 3 arms. CodeRabbit + graycyrus APPROVED. CI green. State CLEAN.

Nits

  • ops.rs:205: Value::to_string() on the sibling "object" field renders strings quoted ("error" with backslash-quotes). Use v.as_str().map(String::from).unwrap_or_else(|| v.to_string()) so strings come out bare.
  • ops.rs:216-224 — per-entry filter_map silently drops rows lacking string id (pre-existing behavior). Worth a log::debug! with dropped count so partial-shape providers are spotable from logs.
  • json_value_kind returns "object" for Value::Object — collides with the literal "object" envelope field name in the wrong-type error string. Rename to "map" or "json-object" to disambiguate.
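
The quoting nit can be illustrated with a small sketch. `Json` is a hypothetical stand-in for `serde_json::Value` whose `Display` impl mimics JSON serialization (without string escaping, for brevity), and `render_sibling` is an illustrative name for the suggested expression.

```rust
use std::fmt;

// Hypothetical stand-in for serde_json::Value so the sketch builds with std only.
#[allow(dead_code)]
#[derive(Debug, Clone)]
enum Json {
    Null,
    Bool(bool),
    Str(String),
}

impl fmt::Display for Json {
    // Mimics JSON serialization: strings keep their surrounding quotes.
    // Escaping of inner characters is omitted in this sketch.
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Json::Null => write!(f, "null"),
            Json::Bool(b) => write!(f, "{}", b),
            Json::Str(s) => write!(f, "\"{}\"", s),
        }
    }
}

impl Json {
    // Mirrors Value::as_str: Some only for string values.
    fn as_str(&self) -> Option<&str> {
        match self {
            Json::Str(s) => Some(s.as_str()),
            _ => None,
        }
    }
}

// The expression suggested in the nit: bare text for strings,
// serialized form for everything else.
fn render_sibling(v: &Json) -> String {
    v.as_str().map(String::from).unwrap_or_else(|| v.to_string())
}
```

With this, a string value such as `error` is rendered without quotes, so wrapping it in quotes inside the error message no longer doubles them.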

Questions

  • PR body claims the missing-data arm "preserves Sentry fingerprint", but the message changed from "missing data array" to "missing data field". Substring change → new fingerprint. Intent: restore array wording, or update PR body to admit clean re-fingerprint?
  • Wrong-type test always wraps inside {"object": "list", ...} — sibling-"error" triage path (the actual TAURI-RUST-4Y shape) isn't pinned. One fixture with "object": "error" would lock it.
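
A fixture that pins the sibling-"error" path might look like the sketch below. The message wording follows the PR description table, but everything else is assumed: `Json` stands in for `serde_json::Value`, and `wrong_type_error`, `error_envelope_fixture`, and the `<absent>` placeholder are hypothetical.

```rust
use std::collections::BTreeMap;

// Hypothetical stand-in for serde_json::Value so the sketch builds with std only.
#[allow(dead_code)]
#[derive(Debug, Clone)]
enum Json {
    Null,
    Str(String),
    Array(Vec<Json>),
    Object(BTreeMap<String, Json>),
}

// Kind tokens used in error messages; only the variants needed here.
fn json_value_kind(v: &Json) -> &'static str {
    match v {
        Json::Null => "null",
        Json::Str(_) => "string",
        Json::Array(_) => "array",
        Json::Object(_) => "object",
    }
}

// Only the wrong-type arm: Some(message) when `data` exists but is not an array.
fn wrong_type_error(map: &BTreeMap<String, Json>) -> Option<String> {
    let data = map.get("data")?;
    if matches!(data, Json::Array(_)) {
        return None;
    }
    // The "<absent>" placeholder is an assumption for this sketch.
    let sibling = match map.get("object") {
        Some(Json::Str(s)) => s.as_str(),
        _ => "<absent>",
    };
    Some(format!(
        "provider response has 'data' field but it is {}, expected array (\"object\" = \"{}\")",
        json_value_kind(data),
        sibling
    ))
}

// The TAURI-RUST-4Y shape: an error envelope whose `data` is an object.
fn error_envelope_fixture() -> BTreeMap<String, Json> {
    BTreeMap::from([
        ("object".to_string(), Json::Str("error".to_string())),
        ("data".to_string(), Json::Object(BTreeMap::new())),
    ])
}
```

A test over this fixture would assert on both the kind token and the `("object" = "error")` suffix, so the triage signal itself is covered.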

@oxoxDev oxoxDev merged commit 8365e0c into tinyhumansai:main May 28, 2026
39 of 44 checks passed
3 participants