Honor retry after#10421

Open
taliandre49 wants to merge 8 commits into openshift:main from taliandre49:honor-retry-after

Conversation

@taliandre49
Honor Retry-After Header During PowerVS Cluster Destruction

This PR adds support for honoring the Retry-After response header when IBM Cloud API calls are rate limited (HTTP 429) during PowerVS cluster destruction.

Changes include:

  • Added waitForRetryAfter() helper function in powervs.go that reads the Retry-After header from a rate-limited response and sleeps for the specified duration before retrying, while also respecting context cancellation.
  • Updated loadSDKServices() to wrap ListZonesWithContext and ListDnszones calls in retry loops that invoke waitForRetryAfter() on 429 responses instead of immediately returning an error.
  • Added unit tests in powervs_test.go covering waitForRetryAfter() behavior (nil response, missing header, valid wait, context cancellation) and an integration-style test for the ListZones retry loop using a mock HTTP server.

Corresponding Jira ticket: https://redhat.atlassian.net/browse/MULTIARCH-5854

…ubnet destruction

Signed-off-by: Natalia Jordan <natalia.jordan@ibm.com>
…on error that failed before destroyPowerSubnets() logic ran

Signed-off-by: Natalia Jordan <natalia.jordan@ibm.com>
Signed-off-by: Natalia Jordan <natalia.jordan@ibm.com>
…guards for when to trigger logic

Signed-off-by: Natalia Jordan <natalia.jordan@ibm.com>
…roy powervs

Signed-off-by: Natalia Jordan <natalia.jordan@ibm.com>
@coderabbitai

coderabbitai bot commented Mar 23, 2026

Walkthrough

Adds PowerVS destroy-time recovery for resources blocked by attached network interfaces (DHCP servers and subnets), and implements Retry-After-driven retry handling for SDK List APIs returning HTTP 429; includes unit tests for the Retry-After behavior.

Changes

Cohort / File(s) Summary
DHCP network deletion & attachment handling
pkg/destroy/powervs/power-dhcp.go
Introduce helpers: isDHCPNetworkAttachedError, extractNetworkIDFromError, findNetworkIDByName. Update destroyDHCPNetwork to fetch DHCP server object, proactively delete NICs for the DHCP server's network before deletion, attempt NIC cleanup when deletion fails due to attachments, and remove the prior subnet-existence backoff.
Subnet NIC cleanup & error classification
pkg/destroy/powervs/power-subnet.go
Add deleteNetworkInterfaces(subnetID string) error to enumerate/delete NICs and isNetworkInterfaceError(err error) bool to detect NIC-blocking errors. Modify deletePowerSubnet to delete subnet-attached NICs on deletion failures caused by attached interfaces and return a wrapped error to trigger existing retry/backoff.
Rate-limit retry / Retry-After handling
pkg/destroy/powervs/powervs.go
Add ClusterUninstaller.waitForRetryAfter(ctx, response) to parse and wait on Retry-After (seconds). Wrap SDK list calls (zonesSvc.ListZonesWithContext, dnsZonesSvc.ListDnszones) with retry logic that, on HTTP 429, uses waitForRetryAfter to decide whether to wait and retry. Introduce local *core.DetailedResponse variables and import strconv.
Unit tests for Retry-After and retry loop
pkg/destroy/powervs/powervs_test.go
Add TestWaitForRetryAfter and TestListZonesRetryLogic to validate waitForRetryAfter behavior (nil response, missing header, positive wait, context cancellation) and to exercise a retry loop against an httptest server returning two 429+Retry-After responses followed by a 200, asserting timing and retry count.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes


Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 golangci-lint (2.11.3)

Error: can't load config: unsupported version of the configuration: "" See https://golangci-lint.run/docs/product/migration-guide for migration instructions
The command is terminated due to an error: can't load config: unsupported version of the configuration: "" See https://golangci-lint.run/docs/product/migration-guide for migration instructions



@openshift-ci openshift-ci bot requested review from Prashanth684 and mkumatag March 23, 2026 18:55
@openshift-ci
Contributor

openshift-ci bot commented Mar 23, 2026

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign prashanth684 for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@pkg/destroy/powervs/power-dhcp.go`:
- Around line 292-296: The current error return in the DHCP server deletion
block causes wait.ExponentialBackoffWithContext to abort because it receives
(false, error) — change the closure to signal a retry by returning (false, nil)
after attempting NIC deletion; keep the o.Logger.Warnf call and remove/replace
the fmt.Errorf(...) return with a retry signal so the backoff loop continues
(see the closure around wait.ExponentialBackoffWithContext and the call to
o.deleteNetworkInterfaces and the DHCP server deletion logic).

In `@pkg/destroy/powervs/power-subnet.go`:
- Around line 100-104: The current callback used with
ExponentialBackoffWithContext returns a non-nil error after calling
deleteNetworkInterfaces (via the fmt.Errorf(...) return), which causes the
backoff to stop; change the callback so that when NIC cleanup is attempted but
subnet is still blocked you return the retriable signal instead (i.e., return
(false, nil) from the callback) rather than a non-nil error — or alternatively
return a sentinel error value that your backoff wrapper maps to (false, nil).
Update the code path around deleteNetworkInterfaces and the fmt.Errorf(...)
return so the backoff receives the retriable indication and will retry the
subnet deletion.

In `@pkg/destroy/powervs/powervs.go`:
- Around line 414-419: The select that waits on ctx.Done() vs time.After(d)
treats a cancelled context as a successful wait by returning true; change the
ctx.Done() branch to return false (or otherwise propagate the cancellation) so
callers (e.g., the DNS zones loop that calls ListDnszones) stop retrying when
the context is cancelled. Locate the boolean wait helper (the select using
ctx.Done() and time.After(d)) and update it to return false on ctx.Done()
(optionally include ctx.Err() if you want to surface why), and update callers to
stop retrying when the helper returns false.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 30bb2d50-314b-4f3c-b28f-7655fbbc6b16

📥 Commits

Reviewing files that changed from the base of the PR and between ac9a370 and c1c68b0.

📒 Files selected for processing (4)
  • pkg/destroy/powervs/power-dhcp.go
  • pkg/destroy/powervs/power-subnet.go
  • pkg/destroy/powervs/powervs.go
  • pkg/destroy/powervs/powervs_test.go

Comment on lines +292 to +296
if nicErr := o.deleteNetworkInterfaces(networkID); nicErr != nil {
o.Logger.Warnf("Failed to delete network interfaces for DHCP subnet %q: %v", item.name, nicErr)
}
// Return error to trigger retry after NIC deletion
return fmt.Errorf("DHCP server deletion blocked by attached network interfaces: %w", err)

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

# First, let's examine the DHCP code around lines 292-296
cd /tmp && find . -name "power-dhcp.go" -type f 2>/dev/null | head -5

Repository: openshift/installer

Length of output: 45


🏁 Script executed:

# List relevant files
git ls-files | grep -E "(power-dhcp|power-subnet)" | head -10

Repository: openshift/installer

Length of output: 132


🏁 Script executed:

# Read the DHCP code around the flagged lines
cat -n pkg/destroy/powervs/power-dhcp.go | sed -n '285,305p'

Repository: openshift/installer

Length of output: 1179


🏁 Script executed:

# Get more context around destroyDHCPNetwork function
cat -n pkg/destroy/powervs/power-dhcp.go | sed -n '250,300p'

Repository: openshift/installer

Length of output: 2773


🏁 Script executed:

# Get the full destroyDHCPNetwork function
cat -n pkg/destroy/powervs/power-dhcp.go | sed -n '260,390p'

Repository: openshift/installer

Length of output: 5369


🏁 Script executed:

# Check the subnet code for comparison
cat -n pkg/destroy/powervs/power-subnet.go | grep -A 50 "deleteNetworkInterfaces\|ExponentialBackoff" | head -80

Repository: openshift/installer

Length of output: 3405


🏁 Script executed:

# Check imports in power-dhcp.go to confirm which wait package is used
head -50 pkg/destroy/powervs/power-dhcp.go | grep -E "^import|wait"

Repository: openshift/installer

Length of output: 108


🏁 Script executed:

# Get all imports from power-dhcp.go
cat -n pkg/destroy/powervs/power-dhcp.go | sed -n '1,30p'

Repository: openshift/installer

Length of output: 1064


🌐 Web query:

k8s.io/apimachinery/pkg/util/wait ExponentialBackoffWithContext return value semantics bool error

💡 Result:

The function ExponentialBackoffWithContext in k8s.io/apimachinery/pkg/util/wait has the signature: func ExponentialBackoffWithContext(ctx context.Context, backoff Backoff, condition ConditionWithContextFunc) error It returns a single error value, not (bool, error). The condition function (ConditionWithContextFunc) is called repeatedly with exponential backoff until: - the condition returns (true, nil): function returns nil - the condition returns (false, err): function returns that err - context is cancelled: returns ctx.Err - backoff.Steps exhausted: returns ErrWaitTimeout There is no bool return value from the function itself; success is indicated by nil error, timeout by ErrWaitTimeout, condition error by the condition's error, or context error by ctx.Err.


🏁 Script executed:

# Verify the actual k8s source code to double-check the semantics
web_search "kubernetes wait.ExponentialBackoffWithContext condition function false error exit immediately"

Repository: openshift/installer

Length of output: 113


🌐 Web query:

k8s.io/apimachinery/pkg/util/wait ConditionWithContextFunc (false, error) behavior retry

💡 Result:

In the k8s.io/apimachinery/pkg/util/wait package, ConditionWithContextFunc is defined as func(context.Context) (done bool, err error). It returns true if the condition is satisfied, or an error if the loop should be aborted. This type was added in v0.22.0. When used in polling functions like ExponentialBackoffWithContext(ctx context.Context, backoff Backoff, condition ConditionWithContextFunc) error (or similar functions like PollImmediateUntilWithContext), the behavior for returning (false, error) is: the error is immediately propagated and returned by the polling function, terminating the retry loop without further attempts. Returning (false, nil) continues retrying according to the backoff policy until success (true, nil), timeout, context cancellation, or max steps. This mirrors the non-context ConditionFunc behavior: errors abort immediately, false continues polling. Context variants respect ctx.Done for cancellation. Example usage pattern from client-go/util/retry: err := wait.ExponentialBackoff(backoff, func (bool, error) { err := fn if err == nil { return true, nil } if retriable(err) { return false, nil } // retry return false, err // abort }) Returning (false, error) aborts retry immediately with that error.


Return value exits retry loop immediately instead of triggering retry

Lines 295-296 return an error expecting the retry loop to continue, but wait.ExponentialBackoffWithContext aborts on (false, error) instead of retrying. The error propagates immediately, bypassing the intended retry after NIC deletion. Use (false, nil) to signal retry continuation.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/destroy/powervs/power-dhcp.go` around lines 292 - 296, The current error
return in the DHCP server deletion block causes
wait.ExponentialBackoffWithContext to abort because it receives (false, error) —
change the closure to signal a retry by returning (false, nil) after attempting
NIC deletion; keep the o.Logger.Warnf call and remove/replace the
fmt.Errorf(...) return with a retry signal so the backoff loop continues (see
the closure around wait.ExponentialBackoffWithContext and the call to
o.deleteNetworkInterfaces and the DHCP server deletion logic).

Comment on lines +100 to +104
if nicErr := o.deleteNetworkInterfaces(item.id); nicErr != nil {
o.Logger.Warnf("Failed to delete network interfaces for subnet %q: %v", item.name, nicErr)
}
// Return error to trigger retry after NIC deletion
return fmt.Errorf("subnet deletion blocked by network interfaces: %w", err)

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

version=$(sed -nE 's/^[[:space:]]*k8s\.io\/apimachinery[[:space:]]+(v[^[:space:]]+).*/\1/p' go.mod | head -n1)

if [[ -z "${version}" ]]; then
  echo "Could not determine k8s.io/apimachinery version from go.mod" >&2
  exit 1
fi

for path in pkg/util/wait/backoff.go pkg/util/wait/wait.go; do
  url="https://raw.githubusercontent.com/kubernetes/apimachinery/${version}/${path}"
  if curl -fsSL "$url" | rg -n -C2 'func ExponentialBackoffWithContext|err != nil \|\| ok'; then
    exit 0
  fi
done

echo "Unable to locate ExponentialBackoffWithContext source for ${version}" >&2
exit 1

Repository: openshift/installer

Length of output: 740


Fix retry semantics: returning a non-nil error terminates the backoff instead of retrying

ExponentialBackoffWithContext stops immediately when the condition returns a non-nil error. Line 104 returns fmt.Errorf(...) after attempting NIC cleanup, which will exit the backoff loop and proceed directly to the fatal error handler on line 155-157, skipping any retry. To make this state retriable, signal retry intent by returning (false, nil) instead—use a sentinel value or error type that the callback maps to this return pair.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/destroy/powervs/power-subnet.go` around lines 100 - 104, The current
callback used with ExponentialBackoffWithContext returns a non-nil error after
calling deleteNetworkInterfaces (via the fmt.Errorf(...) return), which causes
the backoff to stop; change the callback so that when NIC cleanup is attempted
but subnet is still blocked you return the retriable signal instead (i.e.,
return (false, nil) from the callback) rather than a non-nil error — or
alternatively return a sentinel error value that your backoff wrapper maps to
(false, nil). Update the code path around deleteNetworkInterfaces and the
fmt.Errorf(...) return so the backoff receives the retriable indication and will
retry the subnet deletion.

Signed-off-by: Natalia Jordan <natalia.jordan@ibm.com>
@taliandre49
Author

update:

  • resolved golint test failures
  • resolving CodeRabbit review comments
  • TODO: squash commits

Signed-off-by: Natalia Jordan <natalia.jordan@ibm.com>
@taliandre49
Author

taliandre49 commented Mar 24, 2026

Note

Current code limitation: if the API keeps returning 429 indefinitely, this loops forever; there is no maximum number of retries. Capping the retry loop would be a better design. I will dig into what a reasonable cap would be.

Other TODOs to consider: code duplication, and the sleep logic being tangled with the retry signal.

…tley does not retry after

Signed-off-by: Natalia Jordan <natalia.jordan@ibm.com>

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🧹 Nitpick comments (2)
pkg/destroy/powervs/powervs.go (2)

617-621: Use the context-aware DNS zones call if the SDK provides one.

Right now ctx only governs the backoff sleep. If ListDnszones itself blocks, cancellation/deadlines still won't interrupt the HTTP call. If the generated client has ListDnszonesWithContext, prefer it here so the request and the wait share the same cancellation behavior.

Possible fix if the SDK exposes a context-aware variant
-			listZonesResponse, detailedResponse, err = o.dnsZonesSvc.ListDnszones(options)
+			listZonesResponse, detailedResponse, err = o.dnsZonesSvc.ListDnszonesWithContext(ctx, options)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/destroy/powervs/powervs.go` around lines 617 - 621, The call to
o.dnsZonesSvc.ListDnszones should use a context-aware variant so the HTTP
request honors cancellation/deadlines; replace the invoke with
o.dnsZonesSvc.ListDnszonesWithContext(ctx, options) (or call that overload when
it exists) and keep assigning to listZonesResponse, detailedResponse, err;
retain the same rate-limit handling with o.waitForRetryAfter(ctx,
detailedResponse) and the Debugf log. If the SDK doesn't expose
ListDnszonesWithContext, add a small wrapper that uses the context-aware method
when available, otherwise fall back to the existing ListDnszones to preserve
behavior.

558-568: Deduplicate the 429 retry policy.

These branches already need the same wait/log/exit behavior, and the next policy change (cap, jitter, or clearer cancellation errors) will have to be kept in sync twice. A small shared helper would keep this flow consistent.

As per coding guidelines, Focus on major issues impacting performance, readability, maintainability and security. Avoid nitpicks and avoid verbosity.

Also applies to: 616-625

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/destroy/powervs/powervs.go` around lines 558 - 568, The 429
retry/log/wait logic in loadSDKServices around o.zonesSvc.ListZonesWithContext
is duplicated; extract that policy into a small shared helper (e.g., a method
like o.handleRateLimitRetry(ctx, detailedResponse, err, "ListZones") or
o.retryOnTooManyRequests(ctx, func() (res any, resp *http.Response, err error) {
... })) and replace both blocks (the ListZonesWithContext block and the similar
block at lines ~616-625) to call the helper; the helper should inspect
detailedResponse.StatusCode for StatusTooManyRequests, call
o.waitForRetryAfter(ctx, detailedResponse), log via o.Logger.Debugf with a clear
operation name, and return a boolean or wrapped error so the caller can either
continue the loop or return the formatted error consistently from
loadSDKServices.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@pkg/destroy/powervs/powervs.go`:
- Around line 401-413: The Retry-After handling in waitForRetryAfter incorrectly
rejects valid HTTP-date values and treats 0 as invalid; update the logic that
reads headers.Get("Retry-After") so it first tries strconv.Atoi and accepts 0
(remove the secs <= 0 rejection), and if Atoi fails then attempt to parse the
header as an HTTP-date (use time.Parse with RFC1123/RFC1123Z as appropriate) and
compute secs as the max of 0 and int(parsedTime.Sub(time.Now()).Seconds()); then
use secs to build the duration (d := time.Duration(secs) * time.Second) and log
the header and computed wait. Reference symbols: retryAfterHeader, headers.Get,
strconv.Atoi, time.Parse, secs, d, and the waitForRetryAfter logging calls.

---

Nitpick comments:
In `@pkg/destroy/powervs/powervs.go`:
- Around line 617-621: The call to o.dnsZonesSvc.ListDnszones should use a
context-aware variant so the HTTP request honors cancellation/deadlines; replace
the invoke with o.dnsZonesSvc.ListDnszonesWithContext(ctx, options) (or call
that overload when it exists) and keep assigning to listZonesResponse,
detailedResponse, err; retain the same rate-limit handling with
o.waitForRetryAfter(ctx, detailedResponse) and the Debugf log. If the SDK
doesn't expose ListDnszonesWithContext, add a small wrapper that uses the
context-aware method when available, otherwise fall back to the existing
ListDnszones to preserve behavior.
- Around line 558-568: The 429 retry/log/wait logic in loadSDKServices around
o.zonesSvc.ListZonesWithContext is duplicated; extract that policy into a small
shared helper (e.g., a method like o.handleRateLimitRetry(ctx, detailedResponse,
err, "ListZones") or o.retryOnTooManyRequests(ctx, func() (res any, resp
*http.Response, err error) { ... })) and replace both blocks (the
ListZonesWithContext block and the similar block at lines ~616-625) to call the
helper; the helper should inspect detailedResponse.StatusCode for
StatusTooManyRequests, call o.waitForRetryAfter(ctx, detailedResponse), log via
o.Logger.Debugf with a clear operation name, and return a boolean or wrapped
error so the caller can either continue the loop or return the formatted error
consistently from loadSDKServices.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: aad51eef-6a47-4fb7-a285-4583ee2ad366

📥 Commits

Reviewing files that changed from the base of the PR and between b2251b5 and d487dc2.

📒 Files selected for processing (1)
  • pkg/destroy/powervs/powervs.go

Comment on lines +401 to +413
retryAfterHeader := headers.Get("Retry-After")

if retryAfterHeader == "" {
o.Logger.Debugf("waitForRetryAfter: Retry-After header is not present, or could not be retrieved")
return false
}
secs, err := strconv.Atoi(retryAfterHeader)
if err != nil || secs <= 0 {
o.Logger.Debugf("waitForRetryAfter: Invalid Retry-After header value: %s", retryAfterHeader)
return false
}
d := time.Duration(secs) * time.Second
o.Logger.Debugf("waitForRetryAfter: Rate limited, honoring Retry-After header, waiting for %v", secs)

@coderabbitai coderabbitai bot Mar 24, 2026


⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

cat -n pkg/destroy/powervs/powervs.go | sed -n '390,420p'

Repository: openshift/installer

Length of output: 1218


🌐 Web query:

RFC 7231 Retry-After header format specification delta-seconds HTTP-date

💡 Result:

The RFC 7231 Retry-After header format specification defines it as either an HTTP-date or delta-seconds (number of seconds to delay). ABNF: Retry-After = HTTP-date / delay-seconds delay-seconds = 1*DIGIT HTTP-date is defined in RFC 7231 Section 7.1.1.1 as IMF-fixdate / obs-date (preferred fixed-length format or obsolete formats for compatibility). delta-seconds is a non-negative decimal integer (1 or more digits) representing seconds after the response is received. It matches the general delta-seconds rule used across HTTP/1.1 specs (defined in RFC 7230 Section 3.2.6 or equivalent, but locally in Retry-After context). Examples from RFC: Retry-After: Fri, 31 Dec 1999 23:59:59 GMT Retry-After: 120 (2 minutes) Note: RFC 7231 is obsoleted by RFC 9110 (HTTP Semantics, 2022), but the format remains unchanged.


🏁 Script executed:

# Check if net/http provides ParseTime function
python3 << 'EOF'
import subprocess
result = subprocess.run(['go', 'doc', 'net/http', 'ParseTime'], 
                       capture_output=True, text=True, cwd='.')
print(result.stdout)
print(result.stderr)
EOF

Repository: openshift/installer

Length of output: 319


🏁 Script executed:

head -30 pkg/destroy/powervs/powervs.go

Repository: openshift/installer

Length of output: 964


🏁 Script executed:

rg "gohttp|import" pkg/destroy/powervs/powervs.go | head -40

Repository: openshift/installer

Length of output: 549


Support the full Retry-After header format per RFC 7231.

The Retry-After header can be sent as either an HTTP-date or delta-seconds format, and 0 is a valid delay value. The current implementation using strconv.Atoi() followed by secs <= 0 check only handles delta-seconds. This causes valid HTTP-date responses (e.g., Retry-After: Fri, 31 Dec 1999 23:59:59 GMT) to be silently dropped as errors, breaking HTTP specification compliance.

Suggested fix
 	retryAfterHeader := headers.Get("Retry-After")
 
 	if retryAfterHeader == "" {
 		o.Logger.Debugf("waitForRetryAfter: Retry-After header is not present, or could not be retrieved")
 		return false
 	}
-	secs, err := strconv.Atoi(retryAfterHeader)
-	if err != nil || secs <= 0 {
-		o.Logger.Debugf("waitForRetryAfter: Invalid Retry-After header value: %s", retryAfterHeader)
-		return false
-	}
-	d := time.Duration(secs) * time.Second
-	o.Logger.Debugf("waitForRetryAfter: Rate limited, honoring Retry-After header, waiting for %v", secs)
+	retryAfterHeader = strings.TrimSpace(retryAfterHeader)
+	var d time.Duration
+	if secs, err := strconv.Atoi(retryAfterHeader); err == nil {
+		if secs < 0 {
+			o.Logger.Debugf("waitForRetryAfter: Invalid Retry-After header value: %s", retryAfterHeader)
+			return false
+		}
+		d = time.Duration(secs) * time.Second
+	} else if t, err := gohttp.ParseTime(retryAfterHeader); err == nil {
+		d = time.Until(t)
+		if d < 0 {
+			d = 0
+		}
+	} else {
+		o.Logger.Debugf("waitForRetryAfter: Invalid Retry-After header value: %s", retryAfterHeader)
+		return false
+	}
+	o.Logger.Debugf("waitForRetryAfter: Rate limited, honoring Retry-After header, waiting for %v", d)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/destroy/powervs/powervs.go` around lines 401 - 413, The Retry-After
handling in waitForRetryAfter incorrectly rejects valid HTTP-date values and
treats 0 as invalid; update the logic that reads headers.Get("Retry-After") so
it first tries strconv.Atoi and accepts 0 (remove the secs <= 0 rejection), and
if Atoi fails then attempt to parse the header as an HTTP-date (use time.Parse
with RFC1123/RFC1123Z as appropriate) and compute secs as the max of 0 and
int(parsedTime.Sub(time.Now()).Seconds()); then use secs to build the duration
(d := time.Duration(secs) * time.Second) and log the header and computed wait.
Reference symbols: retryAfterHeader, headers.Get, strconv.Atoi, time.Parse,
secs, d, and the waitForRetryAfter logging calls.

Author

I agree; we need to incorporate something along these lines:

 retryTime, err := time.Parse(time.RFC1123, Retryheader)


@taliandre49, great — that's the right direction. One suggestion: since the file already imports gohttp "net/http", you can use gohttp.ParseTime() instead of time.Parse(time.RFC1123, ...) directly. net/http.ParseTime handles all three HTTP date formats (RFC1123, RFC850, and ANSI C asctime) per the HTTP spec, making it more robust:

retryAfterHeader = strings.TrimSpace(retryAfterHeader)
var d time.Duration
if secs, err := strconv.Atoi(retryAfterHeader); err == nil {
    if secs < 0 {
        o.Logger.Debugf("waitForRetryAfter: Invalid Retry-After header value: %s", retryAfterHeader)
        return false
    }
    d = time.Duration(secs) * time.Second
} else if t, err := gohttp.ParseTime(retryAfterHeader); err == nil {
    d = time.Until(t)
    if d < 0 {
        d = 0
    }
} else {
    o.Logger.Debugf("waitForRetryAfter: Invalid Retry-After header value: %s", retryAfterHeader)
    return false
}
o.Logger.Debugf("waitForRetryAfter: Rate limited, honoring Retry-After header, waiting for %v", d)
select {
case <-ctx.Done():
    return false
case <-time.After(d):
    return true
}

Key points vs. the original:

  • secs < 0 replaces secs <= 0 — a Retry-After: 0 is valid and means "retry immediately" (the time.After(0) fires right away and returns true).
  • gohttp.ParseTime already imports cleanly since gohttp is aliased in this file.
