fix: parse Bedrock event-stream framing for non-Claude models #252
safayavatsal wants to merge 2 commits
The previous streaming parser detected event-stream frames by substring-scanning decoded lines for "message-type" and "bytes", then extracted text via a fixed `delta.text` JSONPath. This silently yielded nothing for Bedrock models whose chunks don't carry `delta.text`: Llama-3 emits `generation`, Cohere emits top-level `text`, etc.

Replace with proper AWS event-stream parsing via botocore.eventstream.EventStreamBuffer, fed from response.iter_bytes() / aiter_bytes(). Detect wire format from Content-Type (application/vnd.amazon.eventstream vs text/event-stream) instead of sniffing payload text. Unify sync and async paths through shared helpers; remove the dead is_bedrock parameter and debug print()s.

On a missed JSONPath, log a single structured WARNING with the payload keys observed so the next misconfiguration is debuggable instead of producing a silent empty stream.

Adds botocore as a direct runtime dep, a top-level tests/ directory (the repo previously lacked one), and a deterministic event-stream fixture builder so regression tests run without live Bedrock access.

Closes highflame-ai#146, Closes highflame-ai#147. Backend audit tracked in highflame-ai#251.
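The Content-Type based detection described above can be sketched roughly as follows. This is a minimal illustration, not the PR's `_detect_stream_format`: the function name `detect_stream_format`, its signature, and the returned labels are assumptions made for the example; only the two media types come from the commit message.

```python
from typing import Optional

# Media types named in the commit message.
EVENT_STREAM = "application/vnd.amazon.eventstream"
SSE = "text/event-stream"


def detect_stream_format(content_type: Optional[str]) -> Optional[str]:
    """Map a Content-Type header value to a wire-format label (hypothetical helper)."""
    if not content_type:
        return None
    # Drop parameters such as "; charset=utf-8" and normalise case.
    media_type = content_type.split(";", 1)[0].strip().lower()
    if media_type == EVENT_STREAM:
        return "event-stream"
    if media_type == SSE:
        return "sse"
    return None
```

Deciding from the header means the parser never has to inspect payload bytes to guess the framing, which is the failure mode the substring scan had.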
Code Review
This pull request refactors the streaming response handling to support both AWS Bedrock event-stream and standard SSE formats, utilizing JSONPath for flexible text extraction across different model providers. It introduces robust parsing logic using botocore.eventstream, replaces legacy substring-based extraction methods, and adds comprehensive regression tests with synthetic fixtures. Review feedback identifies opportunities to improve error handling during base64 decoding and suggests refining the extraction logic to correctly handle empty string chunks, thereby avoiding false-positive warnings when a JSONPath match is found but contains no text.
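The base64 error-handling point can be illustrated with a small sketch. It assumes each event-stream payload is a JSON envelope of the form `{"bytes": "<base64-encoded JSON>"}`; the helper name `decode_chunk` is hypothetical and is not taken from the PR.

```python
import base64
import binascii
import json
from typing import Any, Optional


def decode_chunk(envelope: Any) -> Optional[dict]:
    """Decode an assumed {"bytes": "<base64 JSON>"} envelope; None if malformed."""
    if not isinstance(envelope, dict):
        return None
    try:
        # b64decode raises TypeError for None or numbers, and
        # binascii.Error (a ValueError subclass) for badly padded input.
        raw = base64.b64decode(envelope.get("bytes"))
        decoded = json.loads(raw)
    except (binascii.Error, ValueError, TypeError):
        return None
    return decoded if isinstance(decoded, dict) else None
```

Without `TypeError` in the tuple, an envelope whose `bytes` field is missing or numeric would propagate an exception out of the stream loop instead of being skipped.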
Two changes, addressing 5 inline review comments:
1. Add TypeError to the base64.b64decode catch tuple (line 241).
   b64decode raises TypeError on non-string/non-bytes inputs (e.g. None,
   numbers). Including it prevents the parser from crashing on malformed
   event-stream payloads.

2. Distinguish path-miss from empty-string-match (lines 247-272, plus four
   call sites). _extract_text now returns:
   - None when the JSONPath misses, errors, or matches a non-string
     value (debug-logged so misconfiguration is still visible)
   - "" when the path matches an empty string (legitimate keep-alive
     chunk; e.g. OpenAI streaming sometimes emits delta.content
     with an empty string before the real tokens arrive)
   - the value when the path matches a non-empty string

   Callers changed from `if text:` to
   `if text is None: warn elif text: yield`.
   Empty chunks no longer trigger the "did not match" WARNING.

   Diverged from the bot's literal suggestion
   (`return str(val) if val is not None else ""`)
   because str()-coercion would silently stringify dicts, lists, and
   numbers, masking misconfigured stream_response_path values.
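The three-way return contract and the caller pattern can be sketched as below. This is an approximation under stated assumptions: a plain dotted-key lookup stands in for the real JSONPath evaluation, and the names `extract_text` and `stream_text` are hypothetical rather than the PR's `_extract_text` and its call sites.

```python
import logging
from typing import Any, Iterable, Iterator, Optional

logger = logging.getLogger(__name__)


def extract_text(payload: Any, path: str) -> Optional[str]:
    """Return None on a miss or non-string match, "" on an empty match, else the text.

    A dotted-key walk is used here as a stand-in for JSONPath.
    """
    current = payload
    for key in path.split("."):
        if not isinstance(current, dict) or key not in current:
            logger.debug("path %r missed at key %r", path, key)
            return None
        current = current[key]
    if not isinstance(current, str):
        # No str() coercion: dicts, lists, numbers, and null count as a miss.
        logger.debug("path %r matched non-string %s", path, type(current).__name__)
        return None
    return current


def stream_text(payloads: Iterable[dict], path: str) -> Iterator[str]:
    """Yield non-empty text chunks; warn only when the path does not match."""
    for payload in payloads:
        text = extract_text(payload, path)
        if text is None:
            logger.warning(
                "stream_response_path %r did not match; payload keys: %s",
                path,
                sorted(payload),
            )
        elif text:
            yield text
        # text == "" is a keep-alive chunk: no warning, nothing yielded.
```

Keeping `None` and `""` distinct is what lets the caller stay quiet on keep-alive chunks while still flagging a misconfigured path.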
Tests:
- Added test_empty_content_chunk_does_not_warn: SSE stream with one
empty `delta.content: ""` chunk yields no warning and produces only
the non-empty chunks.
- Added test_extract_text_distinguishes_miss_from_empty: unit-level
coverage for the None / "" / "value" / non-string / null cases.
- Added openai_sse_with_empty.txt fixture for the SSE test.
13 tests pass (was 11). Lint clean.
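For orientation, the empty-chunk scenario might look like the sketch below. The SSE text is an invented stand-in, not the contents of `openai_sse_with_empty.txt`, and `iter_sse_payloads` here is a simplified hypothetical parser that handles only single-line `data:` fields.

```python
import json
from typing import Iterable, Iterator


def iter_sse_payloads(lines: Iterable[str]) -> Iterator[dict]:
    """Yield JSON objects from SSE `data:` lines, stopping at `[DONE]`."""
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # blank separators, comments, other SSE fields
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            return
        try:
            payload = json.loads(data)
        except ValueError:
            continue  # skip lines that are not valid JSON
        if isinstance(payload, dict):
            yield payload


# Illustrative stream: one empty delta.content chunk, then real tokens.
SSE_WITH_EMPTY = """\
data: {"choices": [{"delta": {"content": ""}}]}

data: {"choices": [{"delta": {"content": "Hel"}}]}

data: {"choices": [{"delta": {"content": "lo"}}]}

data: [DONE]
"""

contents = [
    payload["choices"][0]["delta"]["content"]
    for payload in iter_sse_payloads(SSE_WITH_EMPTY.splitlines())
]
non_empty = [chunk for chunk in contents if chunk]
```

The first chunk matches the path with an empty string, so a parser following the contract above would drop it silently and emit only the remaining chunks.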
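Addressed all 5 review comments in e614b32:

Comment 1 (line 241, `TypeError` in the `base64.b64decode` catch tuple): Applied.

Comments 2-5 (empty-string vs path-miss): Applied with one deliberate divergence. The four call sites changed from `if text:` to `if text is None: warn elif text: yield`.

Divergence from the literal suggestion: the bot proposed `return str(val) if val is not None else ""`, which would silently stringify dicts, lists, and numbers and mask misconfigured `stream_response_path` values.

Tests: added `test_empty_content_chunk_does_not_warn` and `test_extract_text_distinguishes_miss_from_empty`.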
Summary
- Replaces the substring-scanning stream parser in `route_service.py` with proper AWS event-stream parsing via `botocore.eventstream.EventStreamBuffer`. Fixes silent empty streams on Bedrock models whose chunks don't carry `delta.text` (Llama-3 emits `generation`, Cohere emits top-level `text`).
- Detects wire format from `Content-Type` instead of payload sniffing; unifies sync/async paths; removes dead `is_bedrock` parameter and stray debug `print()`s.
- On a missed JSONPath, logs a single structured `WARNING` naming the path that missed and the payload keys observed, which turns silent empty streams into actionable diagnostics.

Closes #146, Closes #147.
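A short sketch of why one fixed path cannot serve every model family. The three paths are the ones named in this PR; the `STREAM_RESPONSE_PATHS` dict, the `lookup` helper (a dotted-key stand-in for JSONPath), and the trimmed sample payloads are illustrative assumptions.

```python
from typing import Any, Optional

# Family -> response path as named in this PR; the mapping itself is illustrative.
STREAM_RESPONSE_PATHS = {
    "claude": "delta.text",
    "llama3": "generation",
    "cohere": "text",
}

# Trimmed, illustrative chunk payloads (real chunks carry more fields).
SAMPLE_CHUNKS = {
    "claude": {"type": "content_block_delta", "delta": {"text": "Hello"}},
    "llama3": {"generation": "Hello"},
    "cohere": {"text": "Hello"},
}


def lookup(payload: Any, path: str) -> Optional[str]:
    """Dotted-key lookup standing in for JSONPath; None when the path misses."""
    current = payload
    for key in path.split("."):
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current if isinstance(current, str) else None


# With the old fixed path, only the Claude-shaped chunk produces text.
with_fixed_path = {
    family: lookup(chunk, "delta.text") for family, chunk in SAMPLE_CHUNKS.items()
}
# With a per-family path, every chunk produces text.
with_family_path = {
    family: lookup(chunk, STREAM_RESPONSE_PATHS[family])
    for family, chunk in SAMPLE_CHUNKS.items()
}
```

In this sketch the Llama-3 and Cohere chunks resolve to `None` under `delta.text`, which is the silent empty stream the PR describes.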
Backend dependency
The SDK parser is now correct, but each Bedrock route still needs the right `ModelSpec.stream_response_path` for its model family. Tracked separately in #251; please audit existing routes before relying on this fix end-to-end.

What changed
- `highflame/services/route_service.py`: new parser internals (`_detect_stream_format`, `_iter_event_stream_payloads`, `_iter_sse_payloads`, `_stream_text_sync`, `_stream_text_async`). Public method signatures unchanged.
- `pyproject.toml`: adds `botocore ^1.34.0` as a runtime dep; adds `[tool.pytest.ini_options]` (`asyncio_mode = strict`, `testpaths = ["tests"]`).
- `.flake8`: `extend-ignore = E203` (black/flake8 compatibility; recommended by black docs).
- `README.md`: new "Streaming" section with per-model `stream_response_path` table.
- `tests/`: new top-level directory (the repo previously lacked one). Includes a deterministic event-stream fixture builder so regression tests run without live Bedrock access.

Type of change
Testing
- `tests/streaming/test_event_stream_parser.py` covering Llama-3, Cohere, Claude (regression), and OpenAI SSE; sync + async; missing-path warning; invalid-path resilience; sync/async parity.
- `pytest tests/` → 11 passed.
- `black --check` clean.
- `flake8` clean.
- Fixtures are parsed with `botocore.eventstream`, so spec compliance is verified, but live behavior is not.

Impact / Risks
- Adds `botocore` as a direct runtime dep (~5MB). Most users already have it transitively. Vendoring a ~150-line parser is the alternative if this is undesirable.
- `botocore.eventstream`.
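To give a sense of what a vendored parser and a deterministic fixture builder involve, here is a stdlib-only sketch of the AWS event-stream frame layout: a 12-byte prelude (total length, headers length, prelude CRC32), headers, payload, and a trailing message CRC32. The names `build_frame` and `parse_frame` are hypothetical, only string-typed headers (type 7) are handled, and this is not the PR's fixture builder or a replacement for `EventStreamBuffer`.

```python
import struct
import zlib
from typing import Dict, Tuple


def build_frame(headers: Dict[str, str], payload: bytes) -> bytes:
    """Encode one event-stream message with string-typed headers."""
    header_bytes = b""
    for name, value in headers.items():
        n, v = name.encode("utf-8"), value.encode("utf-8")
        # name length (1 byte), name, type 7 = string, value length (2 bytes), value
        header_bytes += struct.pack("!B", len(n)) + n
        header_bytes += struct.pack("!BH", 7, len(v)) + v
    total = 12 + len(header_bytes) + len(payload) + 4
    prelude = struct.pack("!II", total, len(header_bytes))
    prelude += struct.pack("!I", zlib.crc32(prelude))
    body = prelude + header_bytes + payload
    return body + struct.pack("!I", zlib.crc32(body))


def parse_frame(frame: bytes) -> Tuple[Dict[str, str], bytes]:
    """Decode one complete frame; raise ValueError on a CRC mismatch."""
    total, headers_len, prelude_crc = struct.unpack("!III", frame[:12])
    if zlib.crc32(frame[:8]) != prelude_crc:
        raise ValueError("prelude CRC mismatch")
    (message_crc,) = struct.unpack("!I", frame[total - 4:total])
    if zlib.crc32(frame[:total - 4]) != message_crc:
        raise ValueError("message CRC mismatch")
    headers: Dict[str, str] = {}
    pos, end = 12, 12 + headers_len
    while pos < end:
        name_len = frame[pos]
        name = frame[pos + 1:pos + 1 + name_len].decode("utf-8")
        pos += 1 + name_len
        value_type, value_len = struct.unpack("!BH", frame[pos:pos + 3])
        if value_type != 7:
            raise ValueError("only string headers are handled in this sketch")
        headers[name] = frame[pos + 3:pos + 3 + value_len].decode("utf-8")
        pos += 3 + value_len
    return headers, frame[end:total - 4]
```

The sketch expects a whole frame in one buffer. A production parser also has to reassemble frames split across network reads and decode the other header value types, which is part of the trade-off against depending on `botocore` directly.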