remote capabilities errors are properly handled as caperrors#21746

Open
fernandezlautaro wants to merge 3 commits into develop from fixRemoteErrorsUnwrapping

@fernandezlautaro fernandezlautaro commented Mar 27, 2026

Remote capability errors are not being unmarshalled/handled properly on the workflow (WF) side, losing the correct type (user/system) and the error code, so all of them end up as Unknown.

This PR fixes that.
More context: https://chainlink-core.slack.com/archives/C07GQNPVBB5/p1774552056515259

Requires

Supports

@github-actions

I see you updated files related to core. Please run make gocs in the root directory to add a changeset, and include at least one of the following tags in its text:

  • #added For any new functionality added.
  • #breaking_change For any functionality that requires manual action for the node to boot.
  • #bugfix For bug fixes.
  • #changed For any change to the existing functionality.
  • #db_update For any feature that introduces updates to database schema.
  • #deprecation_notice For any upcoming deprecation functionality.
  • #internal For changesets that need to be excluded from the final changelog.
  • #nops For any feature that is NOP facing and needs to be in the official Release Notes for the release.
  • #removed For any functionality/config that is removed.
  • #updated For any functionality that is updated.
  • #wip For any change that is not ready yet and external communication about it should be held off till it is feature complete.

github-actions bot commented Mar 27, 2026

✅ No conflicts with other open PRs targeting develop

trunk-io bot commented Mar 27, 2026

@fernandezlautaro fernandezlautaro marked this pull request as ready for review March 27, 2026 04:32
@fernandezlautaro fernandezlautaro requested review from a team as code owners March 27, 2026 04:32
Copilot AI review requested due to automatic review settings March 27, 2026 04:32
Copilot AI left a comment

Pull request overview

Risk Rating: MEDIUM

Fixes workflow-side handling of remote capability execution errors so that typed caperrors.Error (origin + code) survive the RPC boundary and can be classified/metric’d correctly.

Changes:

  • Wrap remote execution error responses with an Unwrap() chain containing caperrors.DeserializeErrorFromString(ErrorMsg) while preserving the legacy display string.
  • Add unit tests asserting remote error responses can be errors.As’d into caperrors.Error (including fallback behavior for non-serialized error strings).
  • Adjust workflow v2 capability-executor failure metric labeling for the “non-caperrors” error path.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| core/services/workflows/v2/capability_executor.go | Updates failure metric labeling for errors that don’t implement caperrors.Error. |
| core/capabilities/remote/executable/request/client_request.go | Wraps remote error responses to preserve legacy error strings while enabling errors.As(..., caperrors.Error) post-RPC. |
| core/capabilities/remote/executable/request/client_request_test.go | Adds tests validating correct caperrors unwrapping and fallback behavior. |

  _ = events.EmitCapabilityFinishedEvent(ctx, loggerLabels, c.WorkflowExecutionID, request.Id, meteringRef, store.StatusErrored, request.Method, err)
- c.metrics.With(platform.KeyCapabilityID, request.Id, platform.KeyCapabilityErrorCode, caperrors.Unknown.String()).IncrementCapabilityFailureCounter(ctx)
+ // TODO shouldn't all capabilities *always* return a typed error, and if so shouldn't the following metric alert us there's a bug we need to fix?
+ c.metrics.With(platform.KeyCapabilityID, request.Id, platform.KeyCapabilityErrorCode, "BUG").IncrementCapabilityFailureCounter(ctx)

Copilot AI Mar 27, 2026

The metric label value "BUG" for capabilityErrorCode deviates from the established caperrors code set and will change metric series unexpectedly. Consider keeping the label within caperrors (e.g., Unknown) and separately logging/recording that the returned error did not implement caperrors.Error (or add a dedicated metric for untyped errors).

Suggested change
- c.metrics.With(platform.KeyCapabilityID, request.Id, platform.KeyCapabilityErrorCode, "BUG").IncrementCapabilityFailureCounter(ctx)
+ c.metrics.With(platform.KeyCapabilityID, request.Id, platform.KeyCapabilityErrorCode, "Unknown").IncrementCapabilityFailureCounter(ctx)

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>