USHIFT-6400: Rebase SR-IOV to v4.21 and re-enable RHEL 10 tests#6385
USHIFT-6400: Rebase SR-IOV to v4.21 and re-enable RHEL 10 tests#6385openshift-merge-bot[bot] merged 3 commits intoopenshift:mainfrom
Conversation
The v4.21 SR-IOV operator bundle includes sriov-cni images that properly support RHEL 10 hosts. The sriov-cni init container reads /etc/os-release from the host to select the appropriate binary, and older v4.20 images did not recognize RHEL 10, causing the init container to crash with CrashLoopBackOff. This prevented CNI binary installation to /run/cni/bin and broke all SR-IOV functionality on RHEL 10. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
With the SR-IOV operator rebased to v4.21, the sriov-cni init container now supports RHEL 10 hosts. Remove the USHIFT-6400 workaround that skipped SR-IOV tests and disabled the sriov network interface on RHEL 10 test scenarios. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
@ggiguash: This pull request references USHIFT-6400 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the bug to target either version "4.22." or "openshift-4.22.", but it targets "openshift-4.21" instead. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
Skipping CI for Draft Pull Request. |
|
@ggiguash: This pull request references USHIFT-6400 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the bug to target either version "4.22." or "openshift-4.22.", but it targets "openshift-4.21" instead. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
WalkthroughBumps SR-IOV operator release from 4.20 → 4.21, updates many pinned image digests, adds an optional Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
/test ? |
|
✅ Actions performedFull review triggered. |
|
/test e2e-aws-tests-bootc-periodic-arm-el10 |
There was a problem hiding this comment.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
assets/optional/sriov/deploy/operator.yaml (1)
58-65:⚠️ Potential issue | 🟠 MajorDuplicate
CLUSTER_TYPEenv var — second occurrence wins.Lines 60–61 define
CLUSTER_TYPE=openshift, but lines 64–65 redefine it asCLUSTER_TYPE=kubernetes. Kubernetes uses the last definition, so the operator will run withkubernetes.If
openshiftis intended (per MicroShift's basis), remove the duplicate at lines 64–65. The kustomization patches only add new variables and won't resolve this.Proposed fix
- name: RELEASE_VERSION value: 4.21.0 - name: CLUSTER_TYPE value: openshift - name: SRIOV_CNI_BIN_PATH value: /run/cni/bin - - name: CLUSTER_TYPE - value: kubernetes image: quay.io/openshift/sriov-network-operator:latest🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@assets/optional/sriov/deploy/operator.yaml` around lines 58 - 65, The manifest contains a duplicate environment variable CLUSTER_TYPE (first set to "openshift", later overwritten to "kubernetes"); remove the incorrect duplicate entry so CLUSTER_TYPE=openshift remains (look for the env block containing RELEASE_VERSION, SRIOV_CNI_BIN_PATH and the two CLUSTER_TYPE entries) or, if k8s is intended, change the first occurrence instead—ensure only one CLUSTER_TYPE env var exists to avoid the last-wins overwrite.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Outside diff comments:
In `@assets/optional/sriov/deploy/operator.yaml`:
- Around line 58-65: The manifest contains a duplicate environment variable
CLUSTER_TYPE (first set to "openshift", later overwritten to "kubernetes");
remove the incorrect duplicate entry so CLUSTER_TYPE=openshift remains (look for
the env block containing RELEASE_VERSION, SRIOV_CNI_BIN_PATH and the two
CLUSTER_TYPE entries) or, if k8s is intended, change the first occurrence
instead—ensure only one CLUSTER_TYPE env var exists to avoid the last-wins
overwrite.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository YAML (base), Organization UI (inherited)
Review profile: CHILL
Plan: Pro
Run ID: 02f35df7-e49c-43cf-8ce8-ed5686ff951b
📒 Files selected for processing (12)
assets/optional/sriov/crd/sriovnetwork.openshift.io_sriovnetworknodepolicies.yamlassets/optional/sriov/crd/sriovnetwork.openshift.io_sriovnetworknodestates.yamlassets/optional/sriov/crd/sriovnetwork.openshift.io_sriovnetworks.yamlassets/optional/sriov/deploy/operator.yamlassets/optional/sriov/kustomization.aarch64.yamlassets/optional/sriov/kustomization.x86_64.yamlassets/optional/sriov/release-sriov-aarch64.jsonassets/optional/sriov/release-sriov-x86_64.jsonscripts/auto-rebase/last_rebase_sriov.shscripts/auto-rebase/rebase_job_entrypoint.shtest/scenarios-bootc/el10/periodics/el102-src@optional.shtest/scenarios-bootc/el10/releases/el102-lrel@optional.sh
|
/test rebase |
| value: openshift | ||
| - name: SRIOV_CNI_BIN_PATH | ||
| value: /run/cni/bin | ||
| - name: CLUSTER_TYPE |
There was a problem hiding this comment.
@pacevedom , what's the right cluster type here?
Should we patch our rebase scripts to select one of those?
See this AI comment.
|
/test e2e-aws-tests-bootc-periodic-arm-el10 /test rebase |
There was a problem hiding this comment.
🧹 Nitpick comments (1)
scripts/auto-rebase/rebase_sriov.sh (1)
342-346: Use upsert pattern forCLUSTER_TYPEto handle upstream manifest changesThe select-based update at line 342 is update-only. If upstream manifests ever omit
CLUSTER_TYPEfrom the env list, this silently no-ops without error, leaving the automation incomplete. Use an explicit check and append pattern instead to ensure the environment variable is always present.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@scripts/auto-rebase/rebase_sriov.sh` around lines 342 - 346, The current jq update uses select(.name == "CLUSTER_TYPE") which only updates existing entries and silently no-ops if CLUSTER_TYPE is missing; change the operation on .spec.template.spec.containers[0].env to an upsert: if any(.name=="CLUSTER_TYPE") then update that object's .value to "kubernetes" else add {name:"CLUSTER_TYPE", value:"kubernetes"} to the env array. Locate the jq expression that targets .spec.template.spec.containers[0].env and replace the select-based assignment with a conditional that performs the update-or-append upsert for CLUSTER_TYPE so the environment variable is always present regardless of upstream manifest changes.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@scripts/auto-rebase/rebase_sriov.sh`:
- Around line 342-346: The current jq update uses select(.name ==
"CLUSTER_TYPE") which only updates existing entries and silently no-ops if
CLUSTER_TYPE is missing; change the operation on
.spec.template.spec.containers[0].env to an upsert: if
any(.name=="CLUSTER_TYPE") then update that object's .value to "kubernetes" else
add {name:"CLUSTER_TYPE", value:"kubernetes"} to the env array. Locate the jq
expression that targets .spec.template.spec.containers[0].env and replace the
select-based assignment with a conditional that performs the update-or-append
upsert for CLUSTER_TYPE so the environment variable is always present regardless
of upstream manifest changes.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository YAML (base), Organization UI (inherited)
Review profile: CHILL
Plan: Pro
Run ID: 8d78f2b7-272d-4871-bcd3-ec4f04a995ae
📒 Files selected for processing (2)
assets/optional/sriov/deploy/operator.yamlscripts/auto-rebase/rebase_sriov.sh
🚧 Files skipped from review as they are similar to previous changes (1)
- assets/optional/sriov/deploy/operator.yaml
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: ggiguash, pacevedom The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
|
The auto-recovery test failures are not related to the current change |
|
@ggiguash: Overrode contexts on behalf of ggiguash: ci/prow/e2e-aws-tests, ci/prow/e2e-aws-tests-arm, ci/prow/e2e-aws-tests-bootc-arm-el10, ci/prow/e2e-aws-tests-bootc-el10 DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
|
@ggiguash: all tests passed! Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
|
/verified by ci |
|
@ggiguash: This PR has been marked as verified by DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
Summary
Root Cause
The sriov-cni init container reads
/host/etc/os-releaseto detect the host OS and selectthe appropriate CNI binary (rhel8 vs rhel9). The v4.20 images did not recognize RHEL 10,
causing the init container to crash (CrashLoopBackOff). This prevented CNI binary installation
to
/run/cni/bin, breaking all SR-IOV functionality on RHEL 10.The v4.21 bundle includes sriov-cni images with RHEL 10 support (using rhel9-compiled binaries).
Changes
image references from the v4.21 operator bundle
the sriov network interface on RHEL 10 (
el102-src@optional.shandel102-lrel@optional.sh)Test plan
make verify-assetspasses (done locally)make buildsucceeds (done locally)make testpasses (done locally)🤖 Generated with Claude Code via
/jira:solve [USHIFT-6400](https://redhat.atlassian.net/browse/USHIFT-6400)