
Add opt-in stream error log throttling#2814

Draft
He-Pin wants to merge 1 commit into apache:main from He-Pin:improve/stream-log-throttle

Conversation


@He-Pin He-Pin commented Mar 28, 2026

Motivation

When a stream stage fails rapidly and repeatedly (e.g., persistent network failure, bad data in a loop), each failure generates a log message. In high-throughput systems, this can result in thousands of log messages per second, overwhelming log aggregation systems and masking other important messages.

Modification

Added a new configuration option:

pekko.stream.materializer.stage-errors-log-throttle-period = off

When set to a positive duration (e.g. 10s), only the first stage error within each time window is logged at ERROR level. Subsequent errors in the window are counted silently, and a summary warning is emitted when the next window opens or the interpreter finishes.
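For illustration, enabling the throttle with a 10-second window would look like this in `application.conf` (the key itself is from this PR; the surrounding layout is just a sketch):

```hocon
pekko.stream.materializer {
  # Log at most one stage error per 10s window per interpreter;
  # further errors are counted and summarized. Default is "off".
  stage-errors-log-throttle-period = 10s
}
```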

Implementation details

| Aspect | Detail |
| --- | --- |
| Scope | Per-interpreter throttle state (not shared across streams) |
| First error | Uses an `errorLogInitialized` flag to guarantee the first error always logs, regardless of `System.nanoTime()` origin |
| Config validation | Rejects negative durations with `require()` |
| Cleanup | Flushes the suppressed count in `finish()` for best-effort reporting |
| Default | `off` — zero behavior change for existing users |
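The windowing logic described above can be sketched in isolation. This is a minimal, self-contained model (the class and field names here are hypothetical, not the actual `GraphInterpreter` code): the first error in each window is logged, later ones are counted silently, and a summary is flushed when a new window opens or `finish()` runs.

```scala
import scala.collection.mutable.ListBuffer

// Hypothetical sketch of a per-interpreter error-log throttle.
// `now` is injectable so the window logic is testable without real time.
final class ErrorLogThrottle(windowNanos: Long, now: () => Long) {
  private var initialized = false // guarantees the very first error logs
  private var windowStart = 0L
  private var suppressed = 0L
  val logged = ListBuffer.empty[String] // stands in for the real logger

  def onError(msg: String): Unit = {
    val t = now()
    if (!initialized || t - windowStart >= windowNanos) {
      flushSummary()                // report errors suppressed in the old window
      initialized = true
      windowStart = t
      logged += s"ERROR: $msg"      // first error in the new window
    } else {
      suppressed += 1               // counted silently within the window
    }
  }

  def finish(): Unit = flushSummary() // best-effort report on interpreter shutdown

  private def flushSummary(): Unit =
    if (suppressed > 0) {
      logged += s"WARN: suppressed $suppressed similar stage errors"
      suppressed = 0
    }
}
```

The `initialized` flag matters because `System.nanoTime()` has an arbitrary origin, so `t - windowStart` cannot be trusted before the first error has seeded `windowStart`.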

Files changed

  • stream/src/main/resources/reference.conf — New config key
  • stream/.../GraphInterpreter.scala — Throttle state fields + modified reportStageError + finish() flush
  • stream-tests/.../StageErrorLogThrottleSpec.scala — 4 tests (enabled throttle with Broadcast fan-out, single error, disabled with fan-out, disabled single error)
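The new `reference.conf` key accepts either `off` or a positive duration. A hypothetical parser (the helper name is not from the PR) showing how the `require()` validation of negative durations could work:

```scala
import scala.concurrent.duration.{ Duration, FiniteDuration }

// Hypothetical helper: "off" disables throttling, anything else must
// parse to a strictly positive finite duration.
def parseThrottlePeriod(raw: String): Option[FiniteDuration] =
  raw.trim.toLowerCase match {
    case "off" => None
    case s =>
      Duration(s) match {
        case f: FiniteDuration =>
          require(f > Duration.Zero,
            s"stage-errors-log-throttle-period must be positive, was $f")
          Some(f)
        case other =>
          throw new IllegalArgumentException(
            s"stage-errors-log-throttle-period must be finite, was $other")
      }
  }
```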

Result

  • Operators can opt in to throttling stage error logs in noisy environments
  • Zero behavior change by default (off)
  • Tests use Broadcast with 5 parallel failing stages to exercise the actual throttle code path (multiple reportStageError calls per interpreter)
  • All new tests pass, all existing ActorGraphInterpreterSpec and InterpreterSpec tests pass (no regression)

References

@He-Pin He-Pin force-pushed the improve/stream-log-throttle branch from b0b4b83 to c712091 Compare March 28, 2026 08:33
@He-Pin He-Pin changed the title from "Validate replica sequence number in external replication" to "Add opt-in stream error log throttling" Mar 28, 2026
@He-Pin He-Pin force-pushed the improve/stream-log-throttle branch 2 times, most recently from bf1d66b to bd4dba9 Compare March 28, 2026 09:44
Add a new configuration option to throttle repeated stage error log
messages in stream interpreters:

  pekko.stream.materializer.stage-errors-log-throttle-period = off

When set to a positive duration (e.g. '10s'), only the first stage error
within each time window is logged at ERROR level. Subsequent errors in
the window are counted silently, and a summary warning is emitted when
the next window opens or the interpreter finishes.

Key implementation details:
- Per-interpreter throttle state (not shared across streams)
- Uses errorLogInitialized flag to ensure first error always logs
  regardless of System.nanoTime() origin (fix for GPT-5.4 review finding)
- Validates negative durations with require() (fix for GPT-5.4 finding)
- Flushes suppressed count in finish() for best-effort cleanup
- Default 'off' preserves existing behavior (zero behavior change)

Tests use Broadcast with 5 parallel failing stages to exercise actual
throttle code paths (fix for Sonnet 4.6 + GPT-5.4 review finding that
original tests only triggered single errors).

Cross-reviewed by GPT-5.4 and Sonnet 4.6.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@He-Pin He-Pin force-pushed the improve/stream-log-throttle branch from bd4dba9 to e662724 Compare March 28, 2026 12:50
@He-Pin He-Pin marked this pull request as ready for review March 28, 2026 13:11
@He-Pin He-Pin marked this pull request as draft March 28, 2026 13:12
