json_path: support array indexing in field_extraction / schema_type / range_threshold #23

@aural-psynapse

Description

_get_by_json_path in src/provably/handoff/eval_modes.py only walks dot-separated dict keys. Any segment evaluated against a list raises expected dict at segment ..., got list, so the three path-based verification modes (field_extraction, schema_type, range_threshold) cannot verify any value that lives inside a JSON array.

Where

# src/provably/handoff/eval_modes.py:115
def _get_by_json_path(obj, path):
    rel = _normalize_json_path(path)
    if not rel: return obj
    cursor = obj
    for segment in rel.split("."):
        segment = segment.strip()
        if not segment: continue
        if not isinstance(cursor, dict):
            raise KeyError(f"expected dict at segment {segment!r}, got {type(cursor).__name__}")
        if segment not in cursor: raise KeyError(segment)
        cursor = cursor[segment]
    return cursor

The function never branches on isinstance(cursor, list), so there is no path syntax that walks into a list.

Repro

Tool response is a list. The LLM tries every reasonable syntax to address the first element:

json_path         | Indexed root                             | Result
[0].status        | [{"status": "open"}]                     | KeyError: expected dict at segment [0]
0.status          | same                                     | KeyError: expected dict at segment 0
value[0].status   | {"value": [{"status": "open"}]}          | walks value ok, then KeyError: expected dict at segment [0]
value.0.status    | same                                     | walks value ok, then KeyError: expected dict at segment 0
items[2].quantity | {"items": [{...},{...},{"quantity": 3}]} | walks items ok, then KeyError on [2]
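
A standalone script along these lines should reproduce the failures. It is a sketch, not the project's code: `_normalize_json_path` is a hypothetical stand-in (its real body is not shown in this issue) and is assumed only to strip whitespace and an optional leading `$` or `$.`. With this stand-in, a combined segment such as `value[0]` is looked up as a literal dict key, so the exact KeyError message for the bracketed rows may differ from the table depending on what the real normalizer does.

```python
def _normalize_json_path(path):
    # Hypothetical stand-in: the real helper is not shown in this issue.
    # Assumed to strip whitespace and an optional leading "$" or "$.".
    path = path.strip()
    if path.startswith("$"):
        path = path[1:]
    return path.lstrip(".")


def _get_by_json_path(obj, path):
    # Copy of the current dict-only walker quoted in "Where".
    rel = _normalize_json_path(path)
    if not rel:
        return obj
    cursor = obj
    for segment in rel.split("."):
        segment = segment.strip()
        if not segment:
            continue
        if not isinstance(cursor, dict):
            raise KeyError(
                f"expected dict at segment {segment!r}, got {type(cursor).__name__}"
            )
        if segment not in cursor:
            raise KeyError(segment)
        cursor = cursor[segment]
    return cursor


# (json_path, indexed root) pairs taken from the table above.
CASES = [
    ("[0].status", [{"status": "open"}]),
    ("0.status", [{"status": "open"}]),
    ("value[0].status", {"value": [{"status": "open"}]}),
    ("value.0.status", {"value": [{"status": "open"}]}),
]


def repro():
    """Return {json_path: resolved value or error text} for every case."""
    results = {}
    for path, root in CASES:
        try:
            results[path] = _get_by_json_path(root, path)
        except KeyError as exc:
            results[path] = f"KeyError: {exc}"
    return results


if __name__ == "__main__":
    for path, outcome in repro().items():
        print(f"{path:18} {outcome}")
```

Every case ends in a KeyError; none of the four syntaxes reaches the `status` value.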

Impact

field_extraction is the most-used verification mode. With this gap any list-returning tool (a list_* endpoint, a search result, an array of line items) is unverifiable except via verbatim (canonical-JSON equality of the whole payload), which is brittle and useless for partial claims.

Concretely: in our consumer the LLM was asked "What is Acme's open ticket about?". The tool returned a one-element list. Every attempted path was rejected by the verifier; the heal loop exhausted its retries; the run ended in failed even though the claim was correct.

Suggested fix

Recognise array indexing inside the segment loop. Two equivalent surfaces:

  • Bracket notation: items[0].subject, [0].subject. Standard JSONPath-ish, easy to teach.
  • Numeric segment fallback: items.0.subject. Less standard but maps cleanly to the existing dot-split.

Pseudocode:

import re

# Matches a standalone bracket segment such as "[0]". A combined segment like
# "items[0]" must first be split into "items" and "[0]" by the tokenizer.
_BRACKET = re.compile(r"^\[(\d+)\]$")

def _walk(cursor, segment):
    if isinstance(cursor, list):
        # Accept "[0]" (bracket notation) and "0" (numeric segment fallback).
        m = _BRACKET.match(segment)
        if m:
            idx = int(m.group(1))
        elif segment.isdecimal():  # isdecimal, not isdigit: "²".isdigit() is True but int("²") fails
            idx = int(segment)
        else:
            raise KeyError(f"expected list index at segment {segment!r}, got a key")
        if idx >= len(cursor):
            raise IndexError(f"index {idx} out of range")
        return cursor[idx]
    if isinstance(cursor, dict):
        if segment not in cursor:
            raise KeyError(segment)
        return cursor[segment]
    raise KeyError(f"expected dict or list at segment {segment!r}, got {type(cursor).__name__}")

Plus a tokenizer that splits "items[0].subject" into ["items", "[0]", "subject"] (or pre-rewrites it to items.[0].subject so the existing .split(".") works unchanged).

Tests to add

import pytest

assert _get_by_json_path([{"a": 1}], "[0].a") == 1
assert _get_by_json_path({"items": [{"a": 1}]}, "items[0].a") == 1
assert _get_by_json_path({"items": [{"a": 1}]}, "items.0.a") == 1   # if numeric fallback supported

# Existing dict tests must still pass:
assert _get_by_json_path({"plan": "Enterprise"}, "plan") == "Enterprise"
assert _get_by_json_path({"a": {"b": 1}}, "a.b") == 1

# Error surfaces:
with pytest.raises(IndexError):
    _get_by_json_path([{"a": 1}], "[5].a")
with pytest.raises(KeyError):
    _get_by_json_path({"a": [1, 2]}, "a.b")

Backward compatibility

Pure-dict paths are unaffected: the new branch fires only when cursor is a list, which previously always raised. No registered claim payload changes shape.

Related

Labels: bug (Something isn't working)
