fix: keep StreamingDataSource alive on non-Eof stream errors#168
Draft
kinyoklion wants to merge 3 commits intomainfrom
Draft
fix: keep StreamingDataSource alive on non-Eof stream errors#168kinyoklion wants to merge 3 commits intomainfrom
kinyoklion wants to merge 3 commits intomainfrom
Conversation
The fetch loop matched any error other than `Eof` (and recoverable HTTP responses) with a catch-all `_ => break`, dropping the spawned task and leaving the data source silently dysfunctional. Because no one was polling the eventsource-client's stream anymore, its reconnect logic never ran. The `None` branch also broke without notifying the caller, so callers waiting on initialization had no signal that the stream had closed permanently. Replace the catch-all `break` with `continue` so the eventsource-client gets to reconnect on the next poll, and call `init_complete(false)` in the `None` branch so the caller can observe permanent stream closure. Adds `streaming_source_recovers_from_non_eof_stream_error` test -- sends invalid UTF-8 to trigger `Error::InvalidLine` from the parser and asserts the mock receives at least two requests. Pairs with launchdarkly/rust-eventsource-client#134, which fixes the matching state-transition gap on the eventsource-client side. Together they restore the reconnection contract: the eventsource-client owns reconnection, and the SDK keeps polling and trusts it. Closes #116.
The 401 path (and any other status that `is_http_error_recoverable` rejects) should stop the streaming fetch loop and signal init failure to the caller. The catch-all-error change in this PR sits directly next to that branch, so add a behavioral test exercising the 401 path end-to-end through the fetch loop -- existing coverage was only of the predicate function.
Three follow-ups from review of #168: * Routes `Error::Eof` to `debug!` so healthy stream rotations don't spam `warn!` lines on every reconnect. * Rewords the catch-all comment and log: some non-Eof errors converge via the eventsource-client's `StreamClosed` transition (observed as `None` on the next poll), not in-stream reconnect, so "will retry" was misleading. * Strengthens `streaming_source_recovers_from_non_eof_stream_error` with an `Arc<Mutex<Option<bool>>>` capture asserting `init_complete` is *not* called during the retry window. Permanent-failure init is already covered by `streaming_source_stops_on_unrecoverable_http_status`.
keelerm84
approved these changes
May 9, 2026
| // on the next iteration). Breaking here | ||
| // would drop this task and leave the data | ||
| // source silently dysfunctional. | ||
| warn!("error on event stream, will keep polling: {e:?}"); |
Member
There was a problem hiding this comment.
Might change the language here. Customers will think polling means non-streaming.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
The
StreamingDataSourcefetch loop matched any error other thanEof(and recoverable HTTP responses) with a catch-all_ => break, dropping the spawned task and leaving the data source silently dysfunctional. Because nothing was polling the eventsource-client's stream anymore, its reconnect logic never ran. TheNonebranch also broke without notifying the caller.This PR:
breakwithcontinue, so the eventsource-client gets to reconnect on the next poll. A warning log keeps transient errors visible.init_complete(false)in theNonebranch before breaking, so callers waiting on initialization get a signal when the stream is permanently closed.Context
Surfaced from launchdarkly/rust-server-sdk#116. That report has two contributing bugs; this PR addresses the rust-server-sdk side. The matching rust-eventsource-client side is in launchdarkly/rust-eventsource-client#134 (SDK-2345). Together they restore the reconnection contract: the eventsource-client owns reconnection, and the SDK keeps polling and trusts it.
Tracked in SDK-2346.
Test plan
streaming_source_recovers_from_non_eof_stream_error(launchdarkly-server-sdk/src/data_source.rs) — sends invalid UTF-8 that triggersError::InvalidLinefrom the eventsource-client parser and asserts the mock receives at least two requests, proving the SDK kept polling and the underlying client reconnected.cargo test --package launchdarkly-server-sdk --lib— 300 tests pass; no regressions.cargo fmt --checkclean.Closes #116.