fix(inference): distinguish missing-data-field from wrong-type in /models parser (Sentry TAURI-RUST-4Y)#2794
Conversation
…dels parser (Sentry TAURI-RUST-4Y)
`inference_list_models` parses the OpenAI-compatible `/models` envelope
at `src/openhuman/inference/provider/ops.rs`. Before this fix the
`let-else` at line 154 collapsed two distinct failure modes into one
misleading error string:
- `data` field absent → `"provider response missing 'data' array
— endpoint is not OpenAI-compatible (got keys: …)"` (expected).
- `data` field present but wrong type (object/null/string) → SAME
error string, with the keys list including `"data"` —
contradicting the "missing" claim. This produced the TAURI-RUST-4Y
events titled `"... missing 'data' array ... (got keys: data,
object)"` (133 events: 131 on 0.54.0 stragglers, 5 on current
0.56.0).
The collapse made the issue look like a parser hallucination at
triage time — the keys list said `data` IS present, but the message
said it was missing. The actual cause is providers returning a
non-array `data` value (e.g. an OpenAI-style error envelope
`{"object":"error","data":{…}}` served at HTTP 200).
Refactor extracts the parsing logic into a pure
`parse_models_response(&serde_json::Value) -> Result<Vec<ModelInfo>,
String>` helper with three distinct error arms:
1. Non-object top-level body — surfaces the actual JSON kind.
2. Missing `data` field — original wire shape, preserved verbatim so
the existing Sentry fingerprint stays stable for that population.
3. `data` present but wrong type — new arm naming the actual JSON
type (`object`/`string`/`null`/…) and surfacing the sibling
`"object"` field value (OpenAI servers set it to `"list"` on
success, `"error"` on failure — fastest triage signal).
3 new tests in `provider::ops::tests` cover all three arms plus a
happy-path projection check (`owned_by` / `context_length` /
`context_window` aliases). 27/27 module tests pass, 0 regressions.
## Related
PR tinyhumansai#2785 (Sentry TAURI-RUST-28Z) touches the same file but at a
different section (cloud_providers lookup, lines 46-58 + a new
`synthesize_local_runtime_entry` helper). The two fixes are
complementary — tinyhumansai#2785 makes more requests reach the parser by
fixing the lookup-miss case for ollama/lmstudio; this PR makes the
parser produce triageable errors when those requests hit malformed
responses. No merge conflict.
## Test plan
- [x] `cargo test parse_models_response` — 3 new tests pass
- [x] `cargo test openhuman::inference::provider::ops` — 27 tests pass, 0 regressions
- [x] `cargo check --bin openhuman-core` — passes
- [x] `cargo fmt --check` — clean
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (1)
📝 WalkthroughWalkthroughThe PR refactors the OpenAI-compatible ChangesModels response parser validation and integration
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Suggested labels
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Comment |
graycyrus
left a comment
There was a problem hiding this comment.
Good fix for TAURI-RUST-4Y. The root cause is exactly right — the old let-else chain chained .get("data").and_then(|d| d.as_array()), so any non-array data value fell through to the missing-field arm and produced the contradiction you were seeing in Sentry (keys list included data but message said "missing").
The extraction into parse_models_response is clean — three explicit failure arms, each with a distinct, actionable error string. json_value_kind is exhaustive over all serde_json::Value variants and returns &'static str, which is the right type here. The tests are solid: the parametric wrong-type loop covers the full data-present-but-not-array space, and the happy path exercises context_length / context_window aliasing and owned_by projection.
One thing worth tightening in a follow-up: parse_models_response_handles_non_object_body only asserts !err.is_empty(). The other two tests assert on specific substrings, which means if the non-object branch message regresses (typo, accidental branch merge), only the other tests would catch it. Not a blocker, but it's the weakest assertion in the suite.
Approved.
There was a problem hiding this comment.
Walkthrough
Splits /models parser's let-else into pure parse_models_response + json_value_kind helpers, distinguishing missing-data / wrong-type-data / non-object-body. 3 tests pin all 3 arms. CodeRabbit + graycyrus APPROVED. CI green. State CLEAN.
Nits
ops.rs:205—Value::to_string()on the sibling"object"field renders strings quoted ("error"with backslash-quotes). Usev.as_str().map(String::from).unwrap_or_else(|| v.to_string())so strings come out bare.ops.rs:216-224— per-entryfilter_mapsilently drops rows lacking stringid(pre-existing behavior). Worth alog::debug!with dropped count so partial-shape providers are spotable from logs.json_value_kindreturns"object"forValue::Object— collides with the literal"object"envelope field name in the wrong-type error string. Rename to"map"or"json-object"to disambiguate.
Questions
- PR body claims the missing-
dataarm "preserves Sentry fingerprint" — but the message changed"missingdataarray"→"missingdatafield". Substring change → new fingerprint. Intent: restorearraywording, or update PR body to admit clean re-fingerprint? - Wrong-type test always wraps inside
{"object": "list", ...}— sibling-"error"triage path (the actual TAURI-RUST-4Y shape) isn't pinned. One fixture with"object": "error"would lock it.
Summary
inference_list_modelsparses the OpenAI-compatible/modelsenvelope insrc/openhuman/inference/provider/ops.rs. The pre-fixlet-elseat line 154 collapsed two distinct failure modes into one misleading error string — that's the bug behind Sentry TAURI-RUST-4Y (133 events: 131 on stale0.54.0, 5 on current0.56.0, ongoing).datafield absentprovider response missing 'data' array … (got keys: …)datafield present but wrong type (object/null/string)data, contradicting the "missing" claimprovider response has 'data' field but it is <kind>, expected array … ("object" = "<value>")— names the actual JSON type and surfaces the siblingobjectfield for triage<non-object>placeholderprovider response is not a JSON object … (got <kind> at top level)Why the title looked self-contradictory
Look at the latest 4Y event title:
"provider response missing 'data' array — endpoint is not OpenAI-compatible (got keys: data, object)". The keys list literally includesdata, but the message claims it's missing — because the oldlet-elsematched both "no key" AND "key present but.as_array()returnedNone". An upstream like LM Studio returning{"object":"error","data":{...error details...}}at HTTP 200 hits the second case but emits the first message.Triage saw "data is in the keys but message says missing" and stalled. This PR splits the arm so the actual JSON shape lands in Sentry on the next event.
What changed
parse_models_response(&serde_json::Value) -> Result<Vec<ModelInfo>, String>as a pure helper next to the inline call site. The HTTP / status / error-envelope handling above it is untouched.json_value_kind(&Value) -> &'static strfor the error-message JSON-kind tokens — kept beside the parser so test assertions on the rendered tokens stay in lock-step with the matcher.parse_models_response_returns_models_for_well_formed_data_array— happy path includingowned_by/context_length/context_windowprojections.parse_models_response_distinguishes_missing_data_field_from_wrong_type— pins the new distinction forobject/string/null/bool/number.parse_models_response_handles_non_object_body— covers bare array / string / null top-level bodies.Related (not blocking)
cloud_providerslookup (lines 46-58) + a newsynthesize_local_runtime_entryhelper at line 198. The two PRs are complementary — fix(inference): synth local-runtime entry for list_models when no cloud_providers row (Sentry TAURI-RUST-28Z) #2785 makes moreinference_list_modelsrequests succeed past the lookup gate for ollama/lmstudio; this PR makes the parser produce triageable errors when those requests reach malformed responses. No merge conflict (separate sections of the file).Test plan
cargo test parse_models_response— 3 new tests passcargo test openhuman::inference::provider::ops— 27 tests pass, 0 regressionscargo check --manifest-path Cargo.toml --bin openhuman-core— passescargo fmt --checkon touched file — cleanPost-merge observation: TAURI-RUST-4Y events on releases ≥ this fix should split into either (a) the unchanged "missing
datafield" message (genuine wrong-endpoint typos), or (b) a new fingerprint with the actual JSON-kind in the title (provider-shape problems). The latter becomes its own triageable Sentry issue with actionable context.Summary by CodeRabbit
Release Notes
New Features
Bug Fixes
Tests