
reconcile: removed nested retry logic, which led to parallel pods rollout #1790

Open
AndrewChubatiuk wants to merge 2 commits into master from fixed-parallel-pods-rollout

Conversation

@AndrewChubatiuk AndrewChubatiuk commented Feb 5, 2026

fixes #1693


Summary by cubic

Fixes swallowed reconcile timeouts that caused parallel pod rollouts. Pods now roll out sequentially for StatefulSet, Deployment, and DaemonSet. Addresses #1693.

  • Bug Fixes
    • Moved readiness waits outside retryOnConflict so errors are surfaced and handled.
    • Deployment/DaemonSet: perform create/update, then wait for readiness.
    • StatefulSet: simplified update flow, correct recreate detection, rolling update honors MaxUnavailable; PVC resizing runs after update.

Written for commit 7166e39. Summary will update on new commits.

@cubic-dev-ai cubic-dev-ai bot left a comment

2 issues found across 4 files

Prompt for AI agents (all issues)

Check if these issues are valid — if so, understand the root cause of each and fix them.


<file name="internal/controller/operator/factory/reconcile/deploy.go">

<violation number="1" location="internal/controller/operator/factory/reconcile/deploy.go:94">
P2: The new unconditional wait triggers a second readiness poll when the no-change branch already waits inside retryOnConflict, causing redundant wait/poll cycles. Consider returning nil in the no-change branch and relying on the outer wait, or tracking a flag to avoid double waiting.</violation>
</file>
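
One way to read the first suggestion in the violation above is sketched below. This is an assumption-laden illustration, not the code in `deploy.go`: the `object` and `reconciler` types and their methods are hypothetical. The no-change branch returns nil without waiting, and the caller performs the only readiness wait.

```go
package main

import "fmt"

// object is a hypothetical stand-in for a Deployment or DaemonSet spec.
type object struct{ spec string }

// reconciler counts readiness waits so the sketch can show one wait per call.
type reconciler struct {
	current object
	waits   int
}

// update is the part that would run inside the retry loop. It only mutates
// state and returns nil early when nothing changed, without waiting.
func (r *reconciler) update(desired object) error {
	if r.current == desired {
		return nil // no-change branch: leave the wait to the caller
	}
	r.current = desired
	return nil
}

// waitReady is a placeholder for a readiness poll.
func (r *reconciler) waitReady() error {
	r.waits++
	return nil
}

// reconcile runs update, then performs the single readiness wait.
func (r *reconciler) reconcile(desired object) error {
	if err := r.update(desired); err != nil {
		return err
	}
	return r.waitReady()
}

func main() {
	r := &reconciler{current: object{spec: "v1"}}
	_ = r.reconcile(object{spec: "v1"}) // unchanged spec
	_ = r.reconcile(object{spec: "v2"}) // changed spec
	fmt.Println(r.waits, r.current.spec)
}
```

Each call to `reconcile` waits exactly once, whether or not the spec changed.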

<file name="docs/CHANGELOG.md">

<violation number="1" location="docs/CHANGELOG.md:32">
P3: Use the correct Kubernetes kind capitalization (`DaemonSet`) in the changelog entry to avoid confusion and match official naming.</violation>
</file>

Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.

@AndrewChubatiuk force-pushed the fixed-parallel-pods-rollout branch 3 times, most recently from fe8227b to 041a71c, on February 5, 2026 21:06
@VictoriaMetrics deleted a comment from cubic-dev-ai bot on Feb 5, 2026

cubic-dev-ai bot commented Feb 5, 2026

@cubic-dev-ai review this PR

@AndrewChubatiuk I have started the AI code review. It will take a few minutes to complete.

@AndrewChubatiuk

@cubic-dev-ai review this PR

cubic-dev-ai bot commented Feb 5, 2026

@cubic-dev-ai review this PR

@AndrewChubatiuk I have started the AI code review. It will take a few minutes to complete.

@cubic-dev-ai cubic-dev-ai bot left a comment

2 issues found across 4 files

Prompt for AI agents (all issues)

Check if these issues are valid — if so, understand the root cause of each and fix them.


<file name="internal/controller/operator/factory/reconcile/statefulset.go">

<violation number="1" location="internal/controller/operator/factory/reconcile/statefulset.go:163">
P0: Nil pointer dereference: `cr.UpdateBehavior` is accessed without a nil check. When `UpdateBehavior` is nil (as in callers like `vmalertmanager`), this will panic. The old code guarded this with `if cr.UpdateBehavior != nil`.</violation>
</file>
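
The guard the violation above asks for can be sketched as follows. The types here are hypothetical simplifications made for this example (the real `UpdateBehavior` type and its fields are not shown in this conversation), and the fallback value of 1 is likewise an assumption.

```go
package main

import "fmt"

// rollingUpdate is a hypothetical stand-in for the optional update-behavior block.
type rollingUpdate struct {
	MaxUnavailable int
}

// updateOpts mirrors the shape described in the review: UpdateBehavior is
// optional and may be nil for some callers.
type updateOpts struct {
	UpdateBehavior *rollingUpdate
}

// maxUnavailable returns the configured value, or 1 when UpdateBehavior is
// unset. The default of 1 is an assumption made for this sketch.
func maxUnavailable(cr updateOpts) int {
	if cr.UpdateBehavior == nil {
		return 1
	}
	return cr.UpdateBehavior.MaxUnavailable
}

func main() {
	fmt.Println(maxUnavailable(updateOpts{}))
	fmt.Println(maxUnavailable(updateOpts{UpdateBehavior: &rollingUpdate{MaxUnavailable: 3}}))
}
```

Keeping the nil check in one small accessor means callers that never set the field cannot trigger a panic in the update path.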

<file name="docs/CHANGELOG.md">

<violation number="1" location="docs/CHANGELOG.md:32">
P2: Rule violated: **Changelog Review Agent**

Changelog entry includes internal implementation details (“timeout errors…during reconcile were just swallowed”), which violates the rule’s requirement to avoid implementation details in user-facing explanations. Rewrite to describe only the user-visible rollout behavior change.</violation>
</file>

Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.

@AndrewChubatiuk force-pushed the fixed-parallel-pods-rollout branch from 041a71c to 60bc81f on February 5, 2026 21:12

@cubic-dev-ai cubic-dev-ai bot left a comment

1 issue found across 4 files

Prompt for AI agents (all issues)

Check if these issues are valid — if so, understand the root cause of each and fix them.


<file name="internal/controller/operator/factory/reconcile/statefulset.go">

<violation number="1" location="internal/controller/operator/factory/reconcile/statefulset.go:161">
P1: Guard `cr.UpdateBehavior` before dereferencing it in the OnDelete update strategy; it is optional and the current code will panic when it is nil.</violation>
</file>

Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.

@cubic-dev-ai cubic-dev-ai bot left a comment

No issues found across 4 files

@AndrewChubatiuk force-pushed the fixed-parallel-pods-rollout branch 2 times, most recently from eded31d to 146c20e, on February 5, 2026 21:59
@AndrewChubatiuk

@cubic-dev-ai review this PR

cubic-dev-ai bot commented Feb 5, 2026

@cubic-dev-ai review this PR

@AndrewChubatiuk I have started the AI code review. It will take a few minutes to complete.

@cubic-dev-ai cubic-dev-ai bot left a comment

1 issue found across 6 files

Prompt for AI agents (all issues)

Check if these issues are valid — if so, understand the root cause of each and fix them.


<file name="internal/controller/operator/factory/reconcile/statefulset_pvc_expand.go">

<violation number="1" location="internal/controller/operator/factory/reconcile/statefulset_pvc_expand.go:29">
P3: The updated comment is now inaccurate: this function no longer performs the recreate; it only reports whether a recreate (and pod recreation) is required. Clarify the comment to match the new behavior.</violation>
</file>

Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.

@AndrewChubatiuk force-pushed the fixed-parallel-pods-rollout branch from 146c20e to d306572 on February 5, 2026 22:10
@AndrewChubatiuk

@cubic-dev-ai review this PR

cubic-dev-ai bot commented Feb 6, 2026

@cubic-dev-ai review this PR

@AndrewChubatiuk I have started the AI code review. It will take a few minutes to complete.

@cubic-dev-ai cubic-dev-ai bot left a comment

No issues found across 7 files

Development

Successfully merging this pull request may close these issues.

bug: vmcluster parallel rollout of components

1 participant