USHIFT-6400: Rebase SR-IOV to v4.21 and re-enable RHEL 10 tests #6385

Merged
openshift-merge-bot[bot] merged 3 commits into openshift:main from ggiguash:fix-USHIFT-6400
Mar 23, 2026

Conversation

ggiguash (Contributor) commented Mar 20, 2026

Summary

  • Rebases the SR-IOV network operator from v4.20 to v4.21
  • Re-enables SR-IOV tests on RHEL 10 test scenarios

Root Cause

The sriov-cni init container reads /host/etc/os-release to detect the host OS and select
the appropriate CNI binary (rhel8 vs rhel9). The v4.20 images did not recognize RHEL 10,
causing the init container to crash (CrashLoopBackOff). This prevented CNI binary installation
to /run/cni/bin, breaking all SR-IOV functionality on RHEL 10.

The v4.21 bundle includes sriov-cni images with RHEL 10 support (using rhel9-compiled binaries).
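
The detection step described above can be sketched roughly as follows. This is an illustrative Python sketch, not the actual sriov-cni init container code: the helper names `parse_os_release` and `select_cni_binary_dir`, the two mapping tables, and the sample os-release content are assumptions made for illustration. Only the rhel8/rhel9 split and the reuse of rhel9-compiled binaries on RHEL 10 come from the description.

```python
# Hypothetical version-to-directory tables; the real image logic is not shown in this PR.
V420_LIKE = {"8": "rhel8", "9": "rhel9"}                 # no entry for RHEL 10
V421_LIKE = {"8": "rhel8", "9": "rhel9", "10": "rhel9"}  # RHEL 10 reuses rhel9 binaries

# Sample content standing in for /host/etc/os-release on a RHEL 10 host (assumed values).
SAMPLE_OS_RELEASE = 'NAME="Red Hat Enterprise Linux"\nID="rhel"\nVERSION_ID="10.2"\n'


def parse_os_release(text):
    """Parse KEY=value lines of an os-release file into a dict, stripping quotes."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        fields[key] = value.strip().strip('"').strip("'")
    return fields


def select_cni_binary_dir(fields, mapping):
    """Return the binary directory for the host's major version, or fail loudly."""
    major = fields.get("VERSION_ID", "").split(".")[0]
    if major not in mapping:
        # An unrecognized host OS aborts the init step, which is the kind of
        # failure that surfaces as CrashLoopBackOff.
        raise RuntimeError(f"unsupported host OS major version: {major!r}")
    return mapping[major]


fields = parse_os_release(SAMPLE_OS_RELEASE)
print(select_cni_binary_dir(fields, V421_LIKE))
```

With the v4.20-like table the same call raises `RuntimeError`, which mirrors the crash described in the root cause.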

Changes

  1. SR-IOV rebase to v4.21: Updates all SR-IOV operator manifests, CRDs, RBAC, and container
    image references from the v4.21 operator bundle
  2. Test scenarios: Removes the USHIFT-6400 workaround that skipped SR-IOV tests and disabled
    the sriov network interface on RHEL 10 (el102-src@optional.sh and el102-lrel@optional.sh)

Test plan

  • Verify make verify-assets passes (done locally)
  • Verify make build succeeds (done locally)
  • Verify make test passes (done locally)
  • CI: el102-src@optional scenario runs SR-IOV tests successfully on RHEL 10
  • CI: el102-lrel@optional scenario runs SR-IOV tests successfully on RHEL 10
  • CI: el98-src@optional scenario continues to pass on RHEL 9 (no regression)

🤖 Generated with Claude Code via /jira:solve [USHIFT-6400](https://redhat.atlassian.net/browse/USHIFT-6400)

ggiguash and others added 2 commits March 20, 2026 20:00
The v4.21 SR-IOV operator bundle includes sriov-cni images that
properly support RHEL 10 hosts. The sriov-cni init container reads
/etc/os-release from the host to select the appropriate binary, and
older v4.20 images did not recognize RHEL 10, causing the init
container to crash with CrashLoopBackOff. This prevented CNI binary
installation to /run/cni/bin and broke all SR-IOV functionality on
RHEL 10.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
With the SR-IOV operator rebased to v4.21, the sriov-cni init
container now supports RHEL 10 hosts. Remove the USHIFT-6400
workaround that skipped SR-IOV tests and disabled the sriov network
interface on RHEL 10 test scenarios.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
openshift-ci-robot added the jira/valid-reference label Mar 20, 2026

openshift-ci-robot commented Mar 20, 2026

@ggiguash: This pull request references USHIFT-6400 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the bug to target either version "4.22." or "openshift-4.22.", but it targets "openshift-4.21" instead.


openshift-ci bot added the do-not-merge/work-in-progress label Mar 20, 2026

openshift-ci bot (Contributor) commented Mar 20, 2026

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

coderabbitai bot commented Mar 20, 2026

Walkthrough

Bumps SR-IOV operator release from 4.20 → 4.21, updates many pinned image digests, adds an optional failMode enum field to OVS bridge entries in SR-IOV CRD schemas, tightens spec.vlan max from 4096 → 4094, adjusts operator deployment env ordering, updates rebase scripts, and enables SR-IOV networks/tests by default except on aarch64.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| CRD Schema updates: assets/optional/sriov/crd/sriovnetwork.openshift.io_sriovnetworknodepolicies.yaml, assets/optional/sriov/crd/sriovnetwork.openshift.io_sriovnetworknodestates.yaml, assets/optional/sriov/crd/sriovnetwork.openshift.io_sriovnetworks.yaml | Added optional failMode field (string, enum: secure, standalone) under ovs.bridge in SriovNetworkNodePolicy and under both spec.bridges[] and status.bridges[] in SriovNetworkNodeState. Updated SriovNetwork.spec.vlan maximum from 4096 → 4094. |
| Operator deployment manifest: assets/optional/sriov/deploy/operator.yaml | Bumped RELEASE_VERSION value 4.20.0 → 4.21.0. Reordered SRIOV_CNI_BIN_PATH env entry position relative to CLUSTER_TYPE (value unchanged). |
| Kustomize patches (image digests): assets/optional/sriov/kustomization.aarch64.yaml, assets/optional/sriov/kustomization.x86_64.yaml | Replaced pinned operator image mapping and updated @sha256 digests for multiple SR-IOV component env var image references in the kustomize patches. No name/path changes. |
| Release metadata (image digests & base): assets/optional/sriov/release-sriov-aarch64.json, assets/optional/sriov/release-sriov-x86_64.json | Updated release.base 4.20.0-... → 4.21.0-... and refreshed container digests for all listed SR-IOV images. |
| Rebase automation scripts: scripts/auto-rebase/last_rebase_sriov.sh, scripts/auto-rebase/rebase_job_entrypoint.sh, scripts/auto-rebase/rebase_sriov.sh | Incremented bundle reference from ...:v4.20 → ...:v4.21 in scripts. rebase_sriov.sh yq expression changed to update existing CLUSTER_TYPE env entry instead of appending a new one. |
| Test scenarios: test/scenarios-bootc/el10/periodics/el102-src@optional.sh, test/scenarios-bootc/el10/releases/el102-lrel@optional.sh | Default VM networks now include sriov and default skip_args no longer excludes SR-IOV tests; aarch64 branch still excludes sriov (and tls-scanner) for that platform. |
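
The CRD schema constraints listed in the summary above (the failMode enum and the tightened spec.vlan maximum) can be illustrated with a small check. This is a hedged Python sketch of the constraints as described, not the operator's validation code: the function names are hypothetical, and the lower bound of 0 for the VLAN ID is an assumption, since the summary only states the maximum.

```python
# Values taken from the walkthrough: enum members for failMode and the new vlan maximum.
ALLOWED_FAIL_MODES = {"secure", "standalone"}
VLAN_MAX = 4094
VLAN_MIN = 0  # assumed lower bound; not stated in the summary


def validate_vlan(vlan):
    """Return True if the VLAN ID falls inside the schema's allowed range."""
    return isinstance(vlan, int) and VLAN_MIN <= vlan <= VLAN_MAX


def validate_fail_mode(fail_mode=None):
    """failMode is optional: absent is fine, otherwise it must be an enum member."""
    return fail_mode is None or fail_mode in ALLOWED_FAIL_MODES


for candidate in (100, 4094, 4095, 4096):
    print(candidate, validate_vlan(candidate))
```

The old maximum of 4096 admitted values that do not fit in the 12-bit 802.1Q VLAN ID field, whose top value 4095 is reserved, so 4094 is the highest usable ID.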

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes


openshift-ci bot added the approved label Mar 20, 2026

ggiguash (Contributor, Author) commented

/test ?
@coderabbitai perform full review

coderabbitai bot commented Mar 20, 2026

@ggiguash Sure, triggering a full review now!

✅ Actions performed

Full review triggered.

ggiguash (Contributor, Author) commented

/test e2e-aws-tests-bootc-periodic-arm-el10
/test e2e-aws-tests-bootc-periodic-arm-el9
/test e2e-aws-tests-bootc-periodic-el10
/test e2e-aws-tests-bootc-periodic-el9

coderabbitai bot left a comment

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
assets/optional/sriov/deploy/operator.yaml (1)

58-65: ⚠️ Potential issue | 🟠 Major

Duplicate CLUSTER_TYPE env var — second occurrence wins.

Lines 60–61 define CLUSTER_TYPE=openshift, but lines 64–65 redefine it as CLUSTER_TYPE=kubernetes. Kubernetes uses the last definition, so the operator will run with kubernetes.

If openshift is the intended value (given that MicroShift is derived from OpenShift), remove the duplicate at lines 64–65. The kustomization patches only add new variables, so they will not resolve the conflict.

Proposed fix
            - name: RELEASE_VERSION
              value: 4.21.0
            - name: CLUSTER_TYPE
              value: openshift
            - name: SRIOV_CNI_BIN_PATH
              value: /run/cni/bin
-            - name: CLUSTER_TYPE
-              value: kubernetes
            image: quay.io/openshift/sriov-network-operator:latest
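
A duplicate like this can be caught with a small lint over the container's env list. The following Python sketch assumes the env entries have already been loaded into a list of dicts; `find_duplicate_env` and `effective_env` are hypothetical helpers for illustration and are not part of the MicroShift tooling. The last-definition-wins behavior is modeled as stated in the review comment.

```python
from collections import Counter


def find_duplicate_env(env):
    """Return the sorted names that appear more than once in an env list."""
    counts = Counter(entry["name"] for entry in env)
    return sorted(name for name, count in counts.items() if count > 1)


def effective_env(env):
    """Resolve an env list to name -> value, letting later entries override earlier ones."""
    resolved = {}
    for entry in env:
        resolved[entry["name"]] = entry.get("value")
    return resolved


# The env block from the finding, transcribed as data.
env = [
    {"name": "RELEASE_VERSION", "value": "4.21.0"},
    {"name": "CLUSTER_TYPE", "value": "openshift"},
    {"name": "SRIOV_CNI_BIN_PATH", "value": "/run/cni/bin"},
    {"name": "CLUSTER_TYPE", "value": "kubernetes"},
]

print(find_duplicate_env(env))
print(effective_env(env)["CLUSTER_TYPE"])
```

Run against the block above, the lint reports CLUSTER_TYPE as duplicated and resolves its effective value to kubernetes.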
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@assets/optional/sriov/deploy/operator.yaml` around lines 58 - 65, The
manifest contains a duplicate environment variable CLUSTER_TYPE (first set to
"openshift", later overwritten to "kubernetes"); remove the incorrect duplicate
entry so CLUSTER_TYPE=openshift remains (look for the env block containing
RELEASE_VERSION, SRIOV_CNI_BIN_PATH and the two CLUSTER_TYPE entries) or, if k8s
is intended, change the first occurrence instead—ensure only one CLUSTER_TYPE
env var exists to avoid the last-wins overwrite.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository YAML (base), Organization UI (inherited)

Review profile: CHILL

Plan: Pro

Run ID: 02f35df7-e49c-43cf-8ce8-ed5686ff951b

📥 Commits

Reviewing files that changed from the base of the PR and between 3270ee8 and a18a02b.

📒 Files selected for processing (12)
  • assets/optional/sriov/crd/sriovnetwork.openshift.io_sriovnetworknodepolicies.yaml
  • assets/optional/sriov/crd/sriovnetwork.openshift.io_sriovnetworknodestates.yaml
  • assets/optional/sriov/crd/sriovnetwork.openshift.io_sriovnetworks.yaml
  • assets/optional/sriov/deploy/operator.yaml
  • assets/optional/sriov/kustomization.aarch64.yaml
  • assets/optional/sriov/kustomization.x86_64.yaml
  • assets/optional/sriov/release-sriov-aarch64.json
  • assets/optional/sriov/release-sriov-x86_64.json
  • scripts/auto-rebase/last_rebase_sriov.sh
  • scripts/auto-rebase/rebase_job_entrypoint.sh
  • test/scenarios-bootc/el10/periodics/el102-src@optional.sh
  • test/scenarios-bootc/el10/releases/el102-lrel@optional.sh

ggiguash (Contributor, Author) commented

/test rebase
/test verify

              value: openshift
            - name: SRIOV_CNI_BIN_PATH
              value: /run/cni/bin
            - name: CLUSTER_TYPE
ggiguash (Contributor, Author) commented Mar 20, 2026

@pacevedom, what's the right cluster type here?
Should we patch our rebase scripts to select one of those?

See this AI comment.

ggiguash (Contributor, Author) commented

Attempting to fix it in 85dd996

ggiguash (Contributor, Author) commented

/test e2e-aws-tests-bootc-periodic-arm-el10
/test e2e-aws-tests-bootc-periodic-arm-el9
/test e2e-aws-tests-bootc-periodic-el10
/test e2e-aws-tests-bootc-periodic-el9

/test rebase
/test verify

ggiguash marked this pull request as ready for review March 21, 2026 08:41
openshift-ci bot removed the do-not-merge/work-in-progress label Mar 21, 2026
openshift-ci bot requested review from copejon and pacevedom March 21, 2026 08:41

coderabbitai bot left a comment

🧹 Nitpick comments (1)
scripts/auto-rebase/rebase_sriov.sh (1)

342-346: Use upsert pattern for CLUSTER_TYPE to handle upstream manifest changes

The select-based update at line 342 is update-only. If upstream manifests ever omit CLUSTER_TYPE from the env list, this silently no-ops without error, leaving the automation incomplete. Use an explicit check and append pattern instead to ensure the environment variable is always present.
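
The update-or-append behavior suggested here can be sketched as follows. This is a Python illustration that works on an already-parsed env list, not the yq expression used in rebase_sriov.sh; `upsert_env` is a hypothetical helper name.

```python
def upsert_env(env, name, value):
    """Set `name` to `value` in an env list, appending the entry if it is missing.

    Every existing entry with that name is updated, so a stray duplicate
    cannot keep a stale value.
    """
    matches = [entry for entry in env if entry.get("name") == name]
    if matches:
        for entry in matches:
            entry["value"] = value
    else:
        env.append({"name": name, "value": value})
    return env


# Case 1: the variable exists upstream, so it is updated in place.
present = [{"name": "CLUSTER_TYPE", "value": "openshift"}]
print(upsert_env(present, "CLUSTER_TYPE", "kubernetes"))

# Case 2: upstream dropped the variable, so it is appended rather than silently skipped.
missing = [{"name": "RELEASE_VERSION", "value": "4.21.0"}]
print(upsert_env(missing, "CLUSTER_TYPE", "kubernetes"))
```

A select-only update handles the first case but leaves the second list unchanged, which is the silent no-op the comment warns about.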

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scripts/auto-rebase/rebase_sriov.sh` around lines 342 - 346, The current yq
update uses select(.name == "CLUSTER_TYPE") which only updates existing entries
and silently no-ops if CLUSTER_TYPE is missing; change the operation on
.spec.template.spec.containers[0].env to an upsert: if
any(.name=="CLUSTER_TYPE") then update that object's .value to "kubernetes" else
add {name:"CLUSTER_TYPE", value:"kubernetes"} to the env array. Locate the yq
expression that targets .spec.template.spec.containers[0].env and replace the
select-based assignment with a conditional that performs the update-or-append
upsert for CLUSTER_TYPE so the environment variable is always present regardless
of upstream manifest changes.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository YAML (base), Organization UI (inherited)

Review profile: CHILL

Plan: Pro

Run ID: 8d78f2b7-272d-4871-bcd3-ec4f04a995ae

📥 Commits

Reviewing files that changed from the base of the PR and between a18a02b and 85dd996.

📒 Files selected for processing (2)
  • assets/optional/sriov/deploy/operator.yaml
  • scripts/auto-rebase/rebase_sriov.sh
🚧 Files skipped from review as they are similar to previous changes (1)
  • assets/optional/sriov/deploy/operator.yaml

pacevedom (Contributor) left a comment

/lgtm

openshift-ci bot added the lgtm label Mar 23, 2026

openshift-ci bot (Contributor) commented Mar 23, 2026

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: ggiguash, pacevedom


ggiguash (Contributor, Author) commented

The auto-recovery test failures are not related to the current change
/override ci/prow/e2e-aws-tests ci/prow/e2e-aws-tests-arm ci/prow/e2e-aws-tests-bootc-arm-el10 ci/prow/e2e-aws-tests-bootc-el10
/verfied by ci

openshift-ci bot (Contributor) commented Mar 23, 2026

@ggiguash: Overrode contexts on behalf of ggiguash: ci/prow/e2e-aws-tests, ci/prow/e2e-aws-tests-arm, ci/prow/e2e-aws-tests-bootc-arm-el10, ci/prow/e2e-aws-tests-bootc-el10


openshift-ci bot (Contributor) commented Mar 23, 2026

@ggiguash: all tests passed!


ggiguash (Contributor, Author) commented

/verified by ci

openshift-ci-robot added the verified label Mar 23, 2026

openshift-ci-robot commented

@ggiguash: This PR has been marked as verified by ci.


openshift-merge-bot bot merged commit c68798e into openshift:main Mar 23, 2026
20 checks passed

Labels

  • approved: Indicates a PR has been approved by an approver from all required OWNERS files.
  • jira/valid-reference: Indicates that this PR references a valid Jira ticket of any type.
  • lgtm: Indicates that a PR is ready to be merged.
  • verified: Signifies that the PR passed pre-merge verification criteria.

3 participants