Ignore unsupported and Nexus events #355

Open
jeffschoner wants to merge 4 commits into coinbase:master from jeffschoner:no-nexus

@jeffschoner
Contributor

Summary

  • Add graceful handling for event types the Ruby SDK doesn't support (Nexus operations, WORKFLOW_EXECUTION_OPTIONS_UPDATED). These events are logged and skipped rather than crashing the worker.
  • Honor the worker_may_ignore bit on history events, allowing the server to introduce new event types without breaking older workers.
  • Upgrade Temporal API proto definitions to v1.62.9, which includes enum values for Nexus and other newer event types.
  • Add a replay test verifying workflows replay correctly when histories contain ignored events.
  • Fix pre-existing test failures: GRPC poll mock keyword arg mismatch and a flaky clock-skew assertion in the metadata integration spec.

Details

When a workflow history contains event types the SDK doesn't handle (e.g. Nexus operations or WORKFLOW_EXECUTION_OPTIONS_UPDATED), the state manager now:

  1. Checks against a known list of unsupported-but-ignorable event types and skips them with a warning log.
  2. For truly unknown event types not in the list, checks the worker_may_ignore flag from the server — if set, logs and skips; otherwise raises UnsupportedEvent.
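
The two-step check above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the PR's actual code: the `Event` struct, the `apply_event` method, the return symbols, and the contents of both type lists are hypothetical stand-ins. Only the `UnsupportedEvent` name and the `worker_may_ignore` flag come from the description.

```ruby
require 'logger'
require 'set'
require 'stringio'

# Hypothetical stand-in for a history event; the real SDK works with protobuf messages.
Event = Struct.new(:id, :type, :worker_may_ignore, keyword_init: true)

class UnsupportedEvent < StandardError; end

# Illustrative subsets only; the PR's real lists are not shown in this conversation.
HANDLED_TYPES = Set['WORKFLOW_EXECUTION_STARTED', 'TIMER_STARTED'].freeze
IGNORABLE_TYPES = Set[
  'NEXUS_OPERATION_SCHEDULED',
  'WORKFLOW_EXECUTION_OPTIONS_UPDATED'
].freeze

# Returns :applied or :skipped, or raises UnsupportedEvent.
def apply_event(event, logger)
  if HANDLED_TYPES.include?(event.type)
    :applied # the real state manager would update workflow state here
  elsif IGNORABLE_TYPES.include?(event.type)
    # Step 1: known unsupported-but-ignorable type, so log and skip.
    logger.warn("Skipping unsupported event #{event.id} (#{event.type})")
    :skipped
  elsif event.worker_may_ignore
    # Step 2: unknown type, but the server marked it as safe to ignore.
    logger.warn("Skipping event #{event.id} (#{event.type}): worker_may_ignore is set")
    :skipped
  else
    raise UnsupportedEvent, "Unsupported event type: #{event.type}"
  end
end
```

Checking the known list before the flag means the listed Nexus and options-updated events are skipped even when a history does not carry `worker_may_ignore` for them, while the flag covers event types introduced after the worker was built.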

The proto upgrade to v1.62.9 ensures these event types are properly deserialized (previously they mapped to EVENT_TYPE_UNSPECIFIED since the enum didn't include them).

Test plan

  • bundle exec rspec — 662 examples, 0 failures
  • examples/spec/replay/signal_with_start_ignored_event_spec.rb — replay test passes with ignored events

Add support for ignoring events that the worker doesn't handle, including
Nexus operation events and WORKFLOW_EXECUTION_OPTIONS_UPDATED. Events with
the worker_may_ignore bit set are also gracefully skipped with a warning log.

Committed-By-Agent: claude
Allow -1s lower bound for clock drift between Docker dev server and host.

Committed-By-Agent: claude
Contributor

@jazev-stripe jazev-stripe left a comment

The Ruby changes all match what we are running at Stripe in our fork (and have successfully deployed to our services).

The Protobuf generation changes should be good; in particular the add_serialized_file API used by more modern protoc is compatible going back to google-protobuf v3.18: source
