_get_by_json_path in src/provably/handoff/eval_modes.py only walks dot-separated dict keys. Any segment evaluated against a list raises expected dict at segment ..., got list, so the three path-based verification modes (field_extraction, schema_type, range_threshold) cannot verify any value that lives inside a JSON array.
Where
# src/provably/handoff/eval_modes.py:115
def _get_by_json_path(obj, path):
    rel = _normalize_json_path(path)
    if not rel: return obj
    cursor = obj
    for segment in rel.split("."):
        segment = segment.strip()
        if not segment: continue
        if not isinstance(cursor, dict):
            raise KeyError(f"expected dict at segment {segment!r}, got {type(cursor).__name__}")
        if segment not in cursor: raise KeyError(segment)
        cursor = cursor[segment]
    return cursor
The function never branches on isinstance(cursor, list), so there is no path syntax that walks into a list.
Repro
Tool response is a list. The LLM tries every reasonable syntax to address the first element:
| json_path | Indexed root | Result |
| --- | --- | --- |
| [0].status | [{"status": "open"}] | KeyError: expected dict at segment [0] |
| 0.status | same | KeyError: expected dict at segment 0 |
| value[0].status | {"value": [{"status": "open"}]} | walks value ok, then KeyError: expected dict at segment [0] |
| value.0.status | same | walks value ok, then KeyError: expected dict at segment 0 |
| items[2].quantity | {"items": [{...},{...},{"quantity": 3}]} | walks items ok, then KeyError on [2] |
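A minimal script to reproduce the failures locally. This is a sketch: the function body is copied from the excerpt under Where, but _normalize_json_path is not shown in this issue, so a hypothetical pass-through stand-in is used, and attempt, ticket and CASES are illustrative names. With that stand-in, value[0] is treated as a single missing dict key, so the exact message for that row may differ from the real verifier; every path still ends in a KeyError.

```python
def _normalize_json_path(path):
    # Hypothetical stand-in: the real helper is not shown in this issue.
    return path.strip()


def _get_by_json_path(obj, path):
    # Body copied from the excerpt under "Where".
    rel = _normalize_json_path(path)
    if not rel:
        return obj
    cursor = obj
    for segment in rel.split("."):
        segment = segment.strip()
        if not segment:
            continue
        if not isinstance(cursor, dict):
            raise KeyError(
                f"expected dict at segment {segment!r}, got {type(cursor).__name__}"
            )
        if segment not in cursor:
            raise KeyError(segment)
        cursor = cursor[segment]
    return cursor


def attempt(obj, path):
    """Return the resolved value, or the KeyError raised while walking."""
    try:
        return _get_by_json_path(obj, path)
    except KeyError as exc:
        return exc


ticket = {"status": "open"}
CASES = [
    ([ticket], "[0].status"),
    ([ticket], "0.status"),
    ({"value": [ticket]}, "value[0].status"),
    ({"value": [ticket]}, "value.0.status"),
]

if __name__ == "__main__":
    for root, path in CASES:
        print(f"{path!r}: {attempt(root, path)!r}")
```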
Impact
field_extraction is the most-used verification mode. With this gap any list-returning tool (a list_* endpoint, a search result, an array of line items) is unverifiable except via verbatim (canonical-JSON equality of the whole payload), which is brittle and useless for partial claims.
Concretely: in our consumer the LLM was asked "What is Acme's open ticket about?". The tool returned a one-element list. Every attempted path was rejected by the verifier; the heal loop exhausted retries; the run ended in failed even though the claim was correct.
Suggested fix
Recognise array indexing inside the segment loop. Two equivalent surfaces:
- Bracket notation: items[0].subject, [0].subject. Standard JSONPath-ish, easy to teach.
- Numeric segment fallback: items.0.subject. Less standard but maps cleanly to the existing dot-split.
Pseudocode:
import re

_BRACKET = re.compile(r"^\[(\d+)\]$")

def _walk(cursor, segment):
    # Bracket segment on its own, e.g. "[0]". A combined "items[0]" must be
    # split into "items" and "[0]" before it reaches this function.
    m = _BRACKET.match(segment)
    if m and isinstance(cursor, list):
        idx = int(m.group(1))
        if idx >= len(cursor):
            raise IndexError(f"index {idx} out of range")
        return cursor[idx]
    # Numeric fallback: a bare "0" indexes a list.
    if isinstance(cursor, list) and segment.isdigit():
        idx = int(segment)
        if idx >= len(cursor):
            raise IndexError(f"index {idx} out of range")
        return cursor[idx]
    # Plain dict key, same as the current behaviour.
    if isinstance(cursor, dict):
        if segment not in cursor:
            raise KeyError(segment)
        return cursor[segment]
    raise KeyError(f"expected dict or list at segment {segment!r}, got {type(cursor).__name__}")
Plus a tokenizer that splits "items[0].subject" into ["items", "[0]", "subject"] (or pre-rewrites it to items.[0].subject so the existing .split(".") works unchanged).
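One possible shape for the pre-rewrite variant, as a sketch rather than a drop-in patch. The names split_segments and walk_path are hypothetical, walk_path stands in for _get_by_json_path, and the call to _normalize_json_path is left out because that helper is not shown here.

```python
import re

# Hypothetical helper names; only the "[N]" syntax comes from the proposal.
_INDEX = re.compile(r"\[(\d+)\]")
_BRACKET = re.compile(r"^\[(\d+)\]$")


def split_segments(path):
    """Rewrite "items[0].subject" to "items.[0].subject", then dot-split."""
    rewritten = _INDEX.sub(r".[\1]", path)
    # A leading "[0]" produces an empty first segment, which is dropped here.
    return [seg.strip() for seg in rewritten.split(".") if seg.strip()]


def walk_path(obj, path):
    """Walk dict keys and list indices; a stand-in for _get_by_json_path."""
    cursor = obj
    for segment in split_segments(path):
        m = _BRACKET.match(segment)
        # isdecimal() rather than isdigit(), so int() cannot fail on
        # characters such as superscript digits.
        if isinstance(cursor, list) and (m or segment.isdecimal()):
            idx = int(m.group(1)) if m else int(segment)
            if idx >= len(cursor):
                raise IndexError(f"index {idx} out of range at segment {segment!r}")
            cursor = cursor[idx]
        elif isinstance(cursor, dict):
            # Dict lookup is tried whenever the cursor is a dict, so a dict
            # with a literal "0" key still resolves as before.
            if segment not in cursor:
                raise KeyError(segment)
            cursor = cursor[segment]
        else:
            raise KeyError(
                f"expected dict or list at segment {segment!r}, "
                f"got {type(cursor).__name__}"
            )
    return cursor
```

With this sketch, walk_path({"items": [{"a": 1}]}, "items[0].a") and walk_path({"items": [{"a": 1}]}, "items.0.a") both resolve to the same element. The rewrite only recognises non-negative integer indices; anything else inside brackets, such as items[x], is left as part of the key and fails as a missing key.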
Tests to add
import pytest

assert _get_by_json_path([{"a": 1}], "[0].a") == 1
assert _get_by_json_path({"items": [{"a": 1}]}, "items[0].a") == 1
assert _get_by_json_path({"items": [{"a": 1}]}, "items.0.a") == 1  # if numeric fallback supported
# Existing dict tests must still pass:
assert _get_by_json_path({"plan": "Enterprise"}, "plan") == "Enterprise"
assert _get_by_json_path({"a": {"b": 1}}, "a.b") == 1
# Error surfaces:
with pytest.raises(IndexError):
    _get_by_json_path([{"a": 1}], "[5].a")
with pytest.raises(KeyError):
    _get_by_json_path({"a": [1, 2]}, "a.b")
Backward compatibility
Pure-dict paths are unaffected: the new branch fires only when cursor is a list, which previously always raised. No registered claim payload changes shape.
Related
{rest:path} in URLs; the equivalent on the verification side closes the symmetry.