[oadp-1.4] Fix IBU delay on SNO by waiting for CRDs and disabling leader election #2086
openshift#2082)

During Image-Based Upgrade (IBU) on Single Node OpenShift (SNO) clusters, the OADP controller was experiencing an ~8 minute delay before reaching DPA Reconciled=True. This was caused by:

1. Controller crashing when Route/SCC CRDs weren't available during cluster initialization (2 min cache sync timeout)
2. New instance waiting for leader lease to expire (4.5 min on SNO)

This commit implements three complementary fixes:

1. CRD availability wait: Before starting the controller, poll the discovery API for Route and SCC CRDs. This prevents the crash by waiting until external OpenShift operators have registered their CRDs.
2. LeaderElectionReleaseOnCancel: Release the leader lease when the controller crashes, allowing immediate restart without waiting for lease expiry.
3. Disable leader election on SNO: Since SNO has only one node and the operator runs with replicas=1, leader election provides no benefit and only adds overhead.

Combined effect: SNO IBU delay reduced from ~8 min to < 1 min.

Fixes: https://issues.redhat.com/browse/OADP-7419

Co-authored-by: Claude <noreply@anthropic.com>
|
[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: kaovilai, shubham-pampattiwar
|
@shubham-pampattiwar: all tests passed!
|
@shubham-pampattiwar noting oadp-1.4 is currently under test for release. If the current QE build fails and this merges we MAY need to revert... |
|
@weshayutin sounds good ! |
Summary
Cherry-pick of #2082 to oadp-1.4 branch.
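Fixes the ~8 minute delay during Image-Based Upgrade (IBU) on SNO clusters by:

- Waiting for the Route and SCC CRDs to be available before starting the controller
- Setting LeaderElectionReleaseOnCancel so the leader lease is released when the controller crashes
- Disabling leader election on SNO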
See https://issues.redhat.com/browse/OADP-7419
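The leader-election part of the change can be illustrated with a small decision function. This is a hedged sketch, not the PR's implementation: `leaderElectionSettings` and `chooseLeaderElection` are hypothetical names, the struct only mirrors the idea of controller-runtime's `LeaderElection` and `LeaderElectionReleaseOnCancel` manager options, and detecting SNO through a control plane topology of `SingleReplica` is an assumption, since the PR text does not say how SNO is detected.

```go
package main

// leaderElectionSettings is a hypothetical struct mirroring the two
// manager options discussed in this PR.
type leaderElectionSettings struct {
	Enabled         bool // whether to run leader election at all
	ReleaseOnCancel bool // give up the lease voluntarily on shutdown
}

// singleReplicaTopology is the control plane topology value assumed
// here to identify a Single Node OpenShift cluster.
const singleReplicaTopology = "SingleReplica"

// chooseLeaderElection disables leader election on single-node
// clusters; elsewhere it keeps the requested setting and enables
// releasing the lease on shutdown whenever election is on.
func chooseLeaderElection(controlPlaneTopology string, requested bool) leaderElectionSettings {
	if controlPlaneTopology == singleReplicaTopology {
		// One node and replicas=1: there is no second candidate to elect.
		return leaderElectionSettings{Enabled: false, ReleaseOnCancel: false}
	}
	return leaderElectionSettings{Enabled: requested, ReleaseOnCancel: requested}
}
```

With these settings a restarted instance on SNO never waits on a lease, while multi-node clusters keep election and hand the lease over promptly on shutdown. In controller-runtime, releasing on cancel is only safe when the process exits right after the manager stops.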
Test plan
🤖 Generated with Claude Code