
fix: Catch exceptions from user-provided retry handlers to prevent orchestration crash #193

Merged
YunchuWang merged 2 commits into main from copilot-finds/bug/retry-handler-exception-crashes-orchestration
Mar 24, 2026

@YunchuWang
Member

Fixes #192

Problem

When a user-provided retry handler (AsyncRetryHandler) or handleFailure predicate (RetryPolicy) throws an exception, the error propagates uncaught through the orchestration executor, crashing the entire orchestration instead of just failing the individual task.

Changes

  • Wrap user-provided function calls in tryHandleRetry() with try-catch blocks
  • When an exception is caught: log a warning, return false (don't retry), and let the task fail normally with its original error
  • The orchestrator can then catch the TaskFailedError as usual
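
The guard described in the list above can be sketched roughly as follows. This is a minimal, synchronous illustration for the `handleFailure` predicate case only; the `FailureDetails` and `HandleFailure` types, the `shouldRetry` function, and the `warn` parameter are hypothetical stand-ins, not the actual code inside `tryHandleRetry()`.

```typescript
// Hypothetical, simplified types; the real durabletask-js definitions may differ.
type FailureDetails = { errorType: string; message: string };
type HandleFailure = (failure: FailureDetails) => boolean;

// Returns true when the task should be retried. A predicate that throws is
// treated as "don't retry", so the task can fail with its original error.
function shouldRetry(
  handleFailure: HandleFailure,
  failure: FailureDetails,
  warn: (message: string) => void = console.warn,
): boolean {
  try {
    return handleFailure(failure);
  } catch (e) {
    const reason = e instanceof Error ? e.message : String(e);
    warn(`Retry predicate threw; not retrying: ${reason}`);
    return false;
  }
}

const failure: FailureDetails = { errorType: "Error", message: "task failed" };
const warnings: string[] = [];

// A buggy predicate no longer escapes: it yields false and one warning.
const buggyResult = shouldRetry(
  () => { throw new Error("bug in predicate"); },
  failure,
  (message) => { warnings.push(message); },
);

// A well-behaved predicate is passed through unchanged.
const healthyResult = shouldRetry(() => true, failure);
```

Returning `false` rather than rethrowing keeps the predicate's bug from replacing the task's own failure, which is the behavior the PR description calls for.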

…chestration crashes

When a user-provided retry handler (AsyncRetryHandler) or handleFailure
predicate (RetryPolicy) throws an exception, the error previously
propagated uncaught through tryHandleRetry() -> handleFailedTask() ->
processEvent() -> execute(), crashing the entire orchestration.

The fix wraps these user-provided function calls in try-catch blocks
within tryHandleRetry(). When an exception is caught:
- A warning is logged (EVENT_RETRY_HANDLER_EXCEPTION = 737)
- The method returns false (don't retry)
- The task fails normally with its original error
- The orchestrator can catch the TaskFailedError as usual
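
To illustrate the outcome listed above from the orchestrator's side, here is a simplified sketch for the async handler case. It does not model the executor's replay logic or the `tryHandleRetry()` call chain; `callWithRetry`, the `warn` callback, the handler signature, and the fields on this local `TaskFailedError` are all assumptions made for the example. Only the event ID 737 and the class name come from the commit message.

```typescript
// Stand-in for the library's TaskFailedError; its fields here are assumptions.
class TaskFailedError extends Error {
  readonly taskName: string;
  readonly original: Error;
  constructor(taskName: string, original: Error) {
    super(`Task '${taskName}' failed: ${original.message}`);
    this.name = "TaskFailedError";
    this.taskName = taskName;
    this.original = original;
    // Keeps instanceof working when compiled to older JavaScript targets.
    Object.setPrototypeOf(this, TaskFailedError.prototype);
  }
}

// Hypothetical handler shape: resolve to true to retry, false to give up.
type AsyncRetryHandler = (error: Error, attempt: number) => Promise<boolean>;

// Event ID named in the commit message.
const EVENT_RETRY_HANDLER_EXCEPTION = 737;

async function callWithRetry<T>(
  taskName: string,
  task: () => Promise<T>,
  handler: AsyncRetryHandler,
  warn: (eventId: number, message: string) => void,
): Promise<T> {
  for (let attempt = 1; ; attempt++) {
    try {
      return await task();
    } catch (err) {
      const original = err instanceof Error ? err : new Error(String(err));
      let retry = false;
      try {
        retry = await handler(original, attempt);
      } catch (handlerErr) {
        // A throwing handler means "don't retry"; its error is only logged.
        warn(EVENT_RETRY_HANDLER_EXCEPTION, String(handlerErr));
      }
      if (!retry) throw new TaskFailedError(taskName, original);
    }
  }
}

// Orchestrator-style caller: it sees the task's original error, not the handler's.
async function demo(): Promise<{ caught: string; events: number[] }> {
  const events: number[] = [];
  try {
    await callWithRetry(
      "flakyActivity",
      async () => { throw new Error("activity failed"); },
      async () => { throw new Error("bug in retry handler"); },
      (eventId) => { events.push(eventId); },
    );
    return { caught: "nothing", events };
  } catch (e) {
    const caught = e instanceof TaskFailedError ? e.original.message : "unexpected";
    return { caught, events };
  }
}

const outcome = demo();
```

In this sketch the handler's own exception surfaces only as a warning event, while the caller's `catch` block receives a `TaskFailedError` carrying the original task error.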

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings March 23, 2026 22:52
Contributor

Copilot AI left a comment

Pull request overview

Prevents user-provided retry callbacks (custom AsyncRetryHandler and RetryPolicy.handleFailure) from crashing the entire orchestration when they throw, by safely catching those exceptions in the orchestration executor and allowing the original task failure to flow normally.

Changes:

  • Add try/catch around retry policy delay computation and custom retry handler evaluation in tryHandleRetry().
  • Emit a new warning log event when retry evaluation throws, and treat it as “don’t retry”.
  • Add regression tests covering both retry-handler and handleFailure-predicate exception scenarios (including orchestrator catch behavior).

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| packages/durabletask-js/src/worker/orchestration-executor.ts | Catches exceptions from user retry evaluation logic to avoid failing the whole orchestration with the handler error. |
| packages/durabletask-js/src/worker/logs.ts | Introduces a dedicated log event ID and helper for retry-evaluation exceptions. |
| packages/durabletask-js/test/orchestration_executor.spec.ts | Adds tests ensuring retry callback exceptions don’t bypass normal task-failure semantics. |

YunchuWang merged commit a683e90 into main on Mar 24, 2026
28 checks passed
YunchuWang deleted the copilot-finds/bug/retry-handler-exception-crashes-orchestration branch on March 24, 2026 17:42

Development

Successfully merging this pull request may close these issues.

[copilot-finds] Bug: User-provided retry handler/predicate exception crashes entire orchestration

3 participants