fix: parse Bedrock event-stream framing for non-Claude models by safayavatsal · Pull Request #252 · highflame-ai/highflame-python

safayavatsal · 2026-05-13T11:00:55Z

Summary

Replaces the substring-scanning streaming parser in route_service.py with proper AWS event-stream parsing via botocore.eventstream.EventStreamBuffer. Fixes silent empty streams on Bedrock models whose chunks don't carry delta.text (Llama-3 emits generation, Cohere emits top-level text).
Detects wire format from Content-Type instead of payload sniffing; unifies sync/async paths; removes dead is_bedrock parameter and stray debug print()s.
On a missed JSONPath, logs one structured WARNING naming the path that missed and the payload keys observed — turns silent empty streams into actionable diagnostics.

Closes #146, Closes #147.

Backend dependency

The SDK parser is now correct, but each Bedrock route still needs the right ModelSpec.stream_response_path for its model family. Tracked separately in #251 — please audit existing routes before relying on this fix end-to-end.

What changed

highflame/services/route_service.py — new parser internals (_detect_stream_format, _iter_event_stream_payloads, _iter_sse_payloads, _stream_text_sync, _stream_text_async). Public method signatures unchanged.
pyproject.toml — adds botocore ^1.34.0 as a runtime dep; adds [tool.pytest.ini_options] (asyncio_mode = strict, testpaths = [\"tests\"]).
.flake8 — extend-ignore = E203 (black/flake8 compatibility; recommended by black docs).
README.md — new "Streaming" section with per-model stream_response_path table.
tests/ — new top-level directory (the repo previously lacked one). Includes a deterministic event-stream fixture builder so regression tests run without live Bedrock access.

Type of change

Bug fix (non-breaking change which fixes an issue)
This change requires a documentation update (done in this PR)

Testing

Unit tests added — 11 tests in tests/streaming/test_event_stream_parser.py covering Llama-3, Cohere, Claude (regression), and OpenAI SSE; sync + async; missing-path warning; invalid-path resilience; sync/async parity.
Local run: pytest tests/ → 11 passed. black --check clean. flake8 clean.
Manual smoke against a live Bedrock route — needs reviewer with gateway access; fixtures validated against real botocore.eventstream so spec compliance is verified, but live behavior is not.

Impact / Risks

botocore as a direct runtime dep (~5MB). Most users already have it transitively. Vendoring a ~150-line parser is the alternative if this is undesirable.
Behavior change for callers relying on the lenient parser. Anyone whose stream "worked" only because the old string-scan happened to match a malformed frame may now get the structured warning instead. No silent regressions — the warning makes the cause explicit.
Without live repro, "fixes meta.llama3-8b-instruct-v1:0, Interaction type: invoke-with-response-stream #147" is verified against the AWS event-stream spec, not against production Bedrock. Fixtures were generated synthetically and round-trip through real botocore.eventstream.

The previous streaming parser detected event-stream frames by substring-scanning decoded lines for "message-type" and "bytes", then extracted text via a fixed `delta.text` JSONPath. This silently yielded nothing for Bedrock models whose chunks don't carry `delta.text`: Llama-3 emits `generation`, Cohere emits top-level `text`, etc. Replace with proper AWS event-stream parsing via botocore.eventstream.EventStreamBuffer, fed from response.iter_bytes() / aiter_bytes(). Detect wire format from Content-Type (application/vnd.amazon.eventstream vs text/event-stream) instead of sniffing payload text. Unify sync and async paths through shared helpers; remove the dead is_bedrock parameter and debug print()s. On a missed JSONPath, log a single structured WARNING with the payload keys observed so the next misconfiguration is debuggable instead of producing a silent empty stream. Adds botocore as a direct runtime dep, a top-level tests/ directory (the repo previously lacked one), and a deterministic event-stream fixture builder so regression tests run without live Bedrock access. Closes highflame-ai#146, Closes highflame-ai#147. Backend audit tracked in highflame-ai#251.

gemini-code-assist

Code Review

This pull request refactors the streaming response handling to support both AWS Bedrock event-stream and standard SSE formats, utilizing JSONPath for flexible text extraction across different model providers. It introduces robust parsing logic using botocore.eventstream, replaces legacy substring-based extraction methods, and adds comprehensive regression tests with synthetic fixtures. Review feedback identifies opportunities to improve error handling during base64 decoding and suggests refining the extraction logic to correctly handle empty string chunks, thereby avoiding false-positive warnings when a JSONPath match is found but contains no text.

Two changes, addressing 5 inline review comments: 1. Add TypeError to the base64.b64decode catch tuple (line 241). b64decode raises TypeError on non-string/non-bytes inputs (e.g. None, numbers). Including it prevents the parser from crashing on malformed event-stream payloads. 2. Distinguish path-miss from empty-string-match (lines 247-272, plus four call sites). _extract_text now returns: - None when the JSONPath misses, errors, or matches a non-string value (debug-logged so misconfiguration is still visible) - "" when the path matches an empty string (legitimate keep-alive chunk; e.g. OpenAI streaming sometimes emits delta.content with an empty string before the real tokens arrive) - the value when the path matches a non-empty string Callers changed from `if text:` to `if text is None: warn elif text: yield`. Empty chunks no longer trigger the "did not match" WARNING. Diverged from the bot's literal suggestion (`return str(val) if val is not None else ""`) because str()-coercion would silently stringify dicts, lists, and numbers, masking misconfigured stream_response_path values. Tests: - Added test_empty_content_chunk_does_not_warn: SSE stream with one empty `delta.content: ""` chunk yields no warning and produces only the non-empty chunks. - Added test_extract_text_distinguishes_miss_from_empty: unit-level coverage for the None / "" / "value" / non-string / null cases. - Added openai_sse_with_empty.txt fixture for the SSE test. 13 tests pass (was 11). Lint clean.

safayavatsal · 2026-05-13T11:59:03Z

Addressed all 5 review comments in e614b32:

Comment 1 (line 241, TypeError in b64decode catch): Applied. TypeError added to the exception tuple alongside binascii.Error, json.JSONDecodeError, UnicodeDecodeError.

Comments 2-5 (empty-string vs path-miss): Applied with one deliberate divergence. _extract_text now returns:

None when the path misses, errors, or matches a non-string value (debug-logged)
"" when the path matches a literal empty string (legitimate keep-alive chunk)
the value when the path matches a non-empty string

The four call sites changed from if text: to if text is None: warn elif text: yield. Empty chunks no longer trigger the "did not match" warning, which is the bot's core ask.

Divergence from the literal suggestion: the bot proposed return str(val) if val is not None else "". I didn't take the str(val) coercion because it would silently stringify dicts, lists, and numbers — masking misconfigured stream_response_path values that point at non-text nodes. Instead the new code keeps strict type discipline: non-string match returns None with a debug log, so a misconfiguration still surfaces in the once-per-stream WARNING.

Tests: added test_empty_content_chunk_does_not_warn (SSE stream with delta.content: "" in the middle: no warning fires, only non-empty chunks yield) and test_extract_text_distinguishes_miss_from_empty (unit coverage for None / "" / "value" / non-string / null). 13 tests pass (was 11). Lint clean.

gemini-code-assist Bot reviewed May 13, 2026

View reviewed changes

Comment thread highflame/services/route_service.py Outdated

Comment thread highflame/services/route_service.py Outdated

Comment thread highflame/services/route_service.py

Comment thread highflame/services/route_service.py

Comment thread highflame/services/route_service.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix: parse Bedrock event-stream framing for non-Claude models#252

fix: parse Bedrock event-stream framing for non-Claude models#252
safayavatsal wants to merge 2 commits into
highflame-ai:mainfrom
safayavatsal:fix/bedrock-event-stream-parser

safayavatsal commented May 13, 2026

Uh oh!

gemini-code-assist Bot left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

safayavatsal commented May 13, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

safayavatsal commented May 13, 2026

Summary

Backend dependency

What changed

Type of change

Testing

Impact / Risks

Uh oh!

gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

safayavatsal commented May 13, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant