HIVE:3148: [WIP]hive extension binary test #78242
miyadav wants to merge 1 commit into openshift:main
Note: Reviews paused
It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review.
Walkthrough
Adds a new
Changes
Sequence Diagram(s)
sequenceDiagram
participant Prow
participant CI as ci-operator
participant Image as hive-tests
participant CloudAPI as Cloud Provider API
participant Cluster
Prow->>CI: trigger presubmit (/test ...-ote)
CI->>CI: prepare job (select `from: hive-tests`, mount creds, set envs)
CI->>CloudAPI: run provider workflow (ipi-aws/gcp/azure/vsphere)
CloudAPI->>Cluster: provision cluster (claims/profiles)
CI->>Image: execute `openshift-tests-extension run-suite openshift/hive`
Image->>Cluster: run e2e tests
Image->>CI: return test results
CI->>Prow: post job result
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes
🚥 Pre-merge checks | ✅ 12
✅ Passed checks (12 passed)
Actionable comments posted: 1
🧹 Nitpick comments (3)
ci-operator/jobs/openshift/hive/openshift-hive-master-presubmits.yaml (1)
249-332: Missing timeout for e2e-azure-ote job.
The original e2e-azure job (lines 164-248) specifies timeout: 4h30m0s in decoration_config:
    decoration_config:
      skip_cloning: true
      timeout: 4h30m0s
The new e2e-azure-ote job doesn't include this timeout, which may cause premature job termination for Azure's longer provisioning times.
Proposed fix
     decoration_config:
       skip_cloning: true
    +  timeout: 4h30m0s
     labels:
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@ci-operator/jobs/openshift/hive/openshift-hive-master-presubmits.yaml` around lines 249 - 332, The new job pull-ci-openshift-hive-master-e2e-azure-ote is missing the decoration_config timeout, which can cause premature termination during long Azure provisioning; add a timeout: 4h30m0s entry under the job's decoration_config (next to skip_cloning: true) so the job uses the same 4h30m0s timeout as the original e2e-azure job.
ci-operator/config/openshift/hive/openshift-hive-master.yaml (2)
393-411: Azure OTE test is missing timeout and BASE_DOMAIN configuration.
Compared to the original e2e-azure test (lines 205-226):
- Missing timeout: The original test specifies timeout: 4h30m0s at the test level. Long-running Azure provisioning may exceed the default timeout.
- Missing BASE_DOMAIN: The original test exports BASE_DOMAIN=$(<"${SHARED_DIR}"/basedomain.txt).
Proposed fix
     - as: e2e-azure-ote
       run_if_changed: azure
       steps:
         cluster_profile: openshift-org-azure
         test:
         - as: test
           cli: latest
           commands: |
    +        export BASE_DOMAIN=$(<"${SHARED_DIR}"/basedomain.txt)
             /usr/bin/openshift-tests-extension run-suite openshift/hive
           dependencies:
           - env: HIVE_IMAGE
             name: hive
           - env: RELEASE_IMAGE
             name: release:latest
           from: hive-tests
           resources:
             requests:
               cpu: 100m
    +      timeout: 2h30m0s
         workflow: ipi-azure
    +  timeout: 4h30m0s
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@ci-operator/config/openshift/hive/openshift-hive-master.yaml` around lines 393 - 411, The e2e-azure-ote job (as: e2e-azure-ote, run_if_changed: azure) lacks the test-level timeout and BASE_DOMAIN export present in the original e2e-azure job; update the test stanza (the test: - as: test block and its commands) to add timeout: 4h30m0s at the same level as test and ensure BASE_DOMAIN is exported before running openshift-tests-extension (e.g., add an env/dependency or prepend a command to set BASE_DOMAIN=$(<"${SHARED_DIR}"/basedomain.txt) in the commands section) so the job uses the same timeout and base domain configuration as the original e2e-azure job.
374-392: GCP OTE test lacks cloud-specific environment setup.
The original e2e-gcp test sets CLOUD=gcp:
    commands: GO_COMPLIANCE_INFO=0 CLOUD=gcp make test-e2e
The OTE version doesn't set any cloud indicator. If the test suite needs to know which cloud provider to target, consider adding the CLOUD environment variable or verify the OTE framework handles this through the cluster profile.
Proposed fix if CLOUD env is needed
     commands: |
    +  export CLOUD=gcp
       /usr/bin/openshift-tests-extension run-suite openshift/hive
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@ci-operator/config/openshift/hive/openshift-hive-master.yaml` around lines 374 - 392, The OTE test's test step is missing the CLOUD=gcp indicator so the suite may not target GCP; update the test step's commands (the commands key under the test block that currently runs "/usr/bin/openshift-tests-extension run-suite openshift/hive") to set CLOUD=gcp (e.g. prepend "CLOUD=gcp" or add an env: CLOUD entry under the same test step) so the Hive tests run with the GCP cloud context; alternatively, if the cluster_profile or workflow already injects cloud info, explicitly verify that before leaving the commands unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@ci-operator/config/openshift/hive/openshift-hive-master.yaml`:
- Around line 412-433: The e2e-vsphere-ote job is missing the vSphere
environment setup; update the job (as: e2e-vsphere-ote) to source the same
shared scripts and export the same env vars as the original e2e-vsphere job
before running tests: source $SHARED_DIR/vsphere_context.sh, $SHARED_DIR/govc.sh
and $SHARED_DIR/additional_cluster.sh, then export
CLUSTER_NAME=$ADDITIONAL_CLUSTER_NAME,
VSPHERE_API_VIP=$ADDITIONAL_CLUSTER_API_VIP,
VSPHERE_INGRESS_VIP=$ADDITIONAL_CLUSTER_INGRESS_VIP,
VSPHERE_MACHINE_NETWORK=$(<"${SHARED_DIR}/machinecidr.txt"),
BASE_DOMAIN=$(<"${SHARED_DIR}/basedomain.txt") and
VSPHERE_INSTALLER_PLATFORM_SPEC_JSON=$(<"${SHARED_DIR}/platform.json") (place
these commands at the start of the test.commands block that runs
/usr/bin/openshift-tests-extension run-suite openshift/hive so the hive tests
receive the required vSphere configuration).
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml
Review profile: CHILL
Plan: Pro Plus
Run ID: af59f246-1537-4e66-b160-bf870b89f616
📒 Files selected for processing (2)
ci-operator/config/openshift/hive/openshift-hive-master.yaml
ci-operator/jobs/openshift/hive/openshift-hive-master-presubmits.yaml
2uasimojo left a comment
As discussed here, we're going to need to figure out whether OTE suites should be defined as presubmits, periodics, or both. Summarizing, we should consider our ability to trigger these tests:
- On demand on a big/risky PR for more aggressive regression coverage.
- On demand on the tip of master, e.g. as pre-release/pre-promotion validation.
- Periodically/automatically for ongoing regression peace-of-mind.
Also as indicated inline, I need to understand how platform- (or other-characteristic-) specific test behavior is triggered. Even if a single invocation can exercise all platforms, I reckon we have a requirement to at least be able to filter down to a subset. This could be where @newtonheath's tagging suggestion comes into play. For example, "This PR adds a knob to GCP MachinePools, so run all tests with platform=gcp and scope=machinepool."
        requests:
          cpu: 100m
    workflow: generic-claim
- as: e2e-pool-ote
As written, this is not different than the above e2e-ote. The presubmits for e2e and e2e-pool actually run different test scripts. I don't think such a split is necessary or even possible here unless the OTE suite supports some kind of differentiator input.
        requests:
          cpu: 100m
    workflow: generic-claim
- as: e2e-gcp-ote
As written, this and e2e-azure-ote and e2e-vsphere-ote appear to differ from e2e-ote only in the platform of the hub cluster.
In general for hive we care much less about the platform of the hub -- which technically is allowed to be any k8s and runs almost identically -- and more about that of the spoke, as that informs:
- A completely different provisioning code path
- Platform-specific knobs in hive CRDs (primarily ClusterDeployment/ClusterPool & MachinePool) and corresponding behavior.
Is the test suite somehow detecting the cloud provider of the hub and creating spokes accordingly? And also filtering platform-specific test cases?
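One way a test step could answer the hub-detection part of this question is to read the hub's Infrastructure resource and derive a cloud indicator from it. The following is a minimal sketch under stated assumptions, not what the PR does: `hub_platform_to_cloud` is a hypothetical helper, only the `gcp` value is confirmed by the existing e2e-gcp job (`CLOUD=gcp`), and whether the OTE suite reads `CLOUD` at all is unverified.

```shell
#!/bin/bash
# Hypothetical helper: map an OpenShift Infrastructure platform type
# (for example "GCP") to a lowercase cloud indicator.
# Only "gcp" is confirmed by the existing e2e-gcp job; the other
# output values are assumptions made for illustration.
hub_platform_to_cloud() {
  case "$1" in
    AWS)     echo "aws" ;;
    GCP)     echo "gcp" ;;
    Azure)   echo "azure" ;;
    VSphere) echo "vsphere" ;;
    *)       echo "unsupported hub platform: $1" >&2; return 1 ;;
  esac
}

# Usage inside a test step (needs a hub kubeconfig, so it is left commented out):
#   platform="$(oc get infrastructure cluster -o jsonpath='{.status.platformStatus.type}')"
#   CLOUD="$(hub_platform_to_cloud "${platform}")" && export CLOUD
```

This only identifies the hub. Choosing the spoke platform and filtering platform-specific test cases would still need a separate input to the suite.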
/pj-rehearse pull-ci-openshift-hive-master-e2e-ote
@2uasimojo: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.
/pj-rehearse pull-ci-openshift-hive-master-e2e-ote
@miyadav: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@ci-operator/config/openshift/hive/openshift-hive-master.yaml`:
- Around line 382-403: The e2e-vsphere-ote job sources govc.sh/govc-related
scripts but is missing the vSphere credential mount (vsphere-ibmcloud-ci) so
govc.sh cannot authenticate; restore the secret mount by adding the same vsphere
credential dependency/secret used in the working e2e-vsphere job to the
e2e-vsphere-ote job (where the scripts are sourced), ensuring the secret name
(vsphere-ibmcloud-ci) is mounted/available to govc.sh and the sourced scripts.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: d0b8e4da-af2f-45c1-9ed6-9c303eb00664
📒 Files selected for processing (2)
ci-operator/config/openshift/hive/openshift-hive-master.yaml
ci-operator/jobs/openshift/hive/openshift-hive-master-presubmits.yaml
    test:
    - as: test
      cli: latest
      commands: |
        source $SHARED_DIR/vsphere_context.sh
        source $SHARED_DIR/govc.sh
        source $SHARED_DIR/additional_cluster.sh

        export CLUSTER_NAME=$ADDITIONAL_CLUSTER_NAME
        export VSPHERE_API_VIP=$ADDITIONAL_CLUSTER_API_VIP
        export VSPHERE_INGRESS_VIP=$ADDITIONAL_CLUSTER_INGRESS_VIP
        export VSPHERE_MACHINE_NETWORK=$(<"${SHARED_DIR}"/machinecidr.txt)
        export BASE_DOMAIN=$(<"${SHARED_DIR}"/basedomain.txt)
        export VSPHERE_INSTALLER_PLATFORM_SPEC_JSON=$(<"${SHARED_DIR}"/platform.json)

        /usr/bin/openshift-tests-extension run-suite openshift/hive
      dependencies:
      - env: HIVE_IMAGE
        name: hive
      - env: RELEASE_IMAGE
        name: release:latest
      from: hive-tests
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "Compare existing e2e-vsphere with e2e-vsphere-ote:"
sed -n '227,263p;375,407p' ci-operator/config/openshift/hive/openshift-hive-master.yaml
echo
echo "Locate the shared vSphere setup scripts referenced by the jobs:"
fd -i 'govc\.sh$|vsphere_context\.sh$|additional_cluster\.sh$' -t f | while read -r f; do
echo "--- $f ---"
sed -n '1,200p' "$f"
done
Repository: openshift/release
Length of output: 2427
Restore the vSphere secret mount in e2e-vsphere-ote.
The e2e-vsphere-ote job sources the same vSphere setup scripts (vsphere_context.sh, govc.sh, additional_cluster.sh) as the existing e2e-vsphere job, but it omits the vsphere-ibmcloud-ci credential mount that the working job includes. The govc.sh script requires vSphere credentials for authentication, so this job will fail during the sourcing phase without that secret.
Suggested fix
     - as: test
       cli: latest
       commands: |
         source $SHARED_DIR/vsphere_context.sh
         source $SHARED_DIR/govc.sh
         source $SHARED_DIR/additional_cluster.sh
         export CLUSTER_NAME=$ADDITIONAL_CLUSTER_NAME
         export VSPHERE_API_VIP=$ADDITIONAL_CLUSTER_API_VIP
         export VSPHERE_INGRESS_VIP=$ADDITIONAL_CLUSTER_INGRESS_VIP
         export VSPHERE_MACHINE_NETWORK=$(<"${SHARED_DIR}"/machinecidr.txt)
         export BASE_DOMAIN=$(<"${SHARED_DIR}"/basedomain.txt)
         export VSPHERE_INSTALLER_PLATFORM_SPEC_JSON=$(<"${SHARED_DIR}"/platform.json)
         /usr/bin/openshift-tests-extension run-suite openshift/hive
    +  credentials:
    +  - mount_path: /var/run/vault/vsphere-ibmcloud-ci
    +    name: vsphere-ibmcloud-ci
    +    namespace: test-credentials
       dependencies:
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@ci-operator/config/openshift/hive/openshift-hive-master.yaml` around lines
382 - 403, The e2e-vsphere-ote job sources govc.sh/govc-related scripts but is
missing the vSphere credential mount (vsphere-ibmcloud-ci) so govc.sh cannot
authenticate; restore the secret mount by adding the same vsphere credential
dependency/secret used in the working e2e-vsphere job to the e2e-vsphere-ote job
(where the scripts are sourced), ensuring the secret name (vsphere-ibmcloud-ci)
is mounted/available to govc.sh and the sourced scripts.
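Because the failure described in this comment would only surface when `govc.sh` is sourced, a step could fail fast with a clearer log message by checking its inputs first. The following is a minimal sketch, assuming the mount path from the suggested fix and the script names used in the job's commands; `require_paths` is a hypothetical helper and is not part of the step registry or of the PR.

```shell
#!/bin/bash
# Hypothetical helper: succeed only if every given path exists and is readable.
# Each missing path is printed to stderr so the job log shows what was absent.
require_paths() {
  local missing=0 path
  for path in "$@"; do
    if [ ! -r "${path}" ]; then
      echo "missing required path: ${path}" >&2
      missing=1
    fi
  done
  return "${missing}"
}

# Usage at the top of the test commands (paths taken from the review thread):
#   require_paths \
#     /var/run/vault/vsphere-ibmcloud-ci \
#     "${SHARED_DIR}/vsphere_context.sh" \
#     "${SHARED_DIR}/govc.sh" \
#     "${SHARED_DIR}/additional_cluster.sh" || exit 1
```

The check does not validate the credentials themselves; it only reports a missing mount or script before anything is sourced.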
♻️ Duplicate comments (1)
ci-operator/config/openshift/hive/openshift-hive-master.yaml (1)
378-407: ⚠️ Potential issue | 🟠 Major
e2e-vsphere-ote is missing the vSphere credential mount.
This job sources vsphere_context.sh and govc.sh but does not mount vsphere-ibmcloud-ci; vSphere auth bootstrap is likely to fail before the suite starts.
Suggested patch
     - as: e2e-vsphere-ote
       run_if_changed: vsphere
       steps:
    @@
         - as: test
           cli: latest
           commands: |
             export KUBECONFIG="${SHARED_DIR}/kubeconfig"
             source $SHARED_DIR/vsphere_context.sh
             source $SHARED_DIR/govc.sh
             source $SHARED_DIR/additional_cluster.sh
    @@
             export VSPHERE_INSTALLER_PLATFORM_SPEC_JSON=$(<"${SHARED_DIR}"/platform.json)
             /usr/bin/openshift-tests-extension run-suite openshift/hive
    +      credentials:
    +      - mount_path: /var/run/vault/vsphere-ibmcloud-ci
    +        name: vsphere-ibmcloud-ci
    +        namespace: test-credentials
           dependencies:
           - env: HIVE_IMAGE
             name: hive
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@ci-operator/config/openshift/hive/openshift-hive-master.yaml` around lines 378 - 407, The e2e-vsphere-ote job is missing the vSphere credential mount causing vsphere_context.sh/govc.sh to fail; add a secrets mount for the vsphere-ibmcloud-ci secret to the job's steps so those scripts can access credentials at runtime. Specifically, update the e2e-vsphere-ote job definition to include the vsphere-ibmcloud-ci secret (mounting it under the shared directory or the expected path the scripts use) so that sourcing vsphere_context.sh and govc.sh succeeds when the test commands run. Ensure the secret name matches vsphere-ibmcloud-ci and is referenced in the job's steps alongside the existing dependencies.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: 24ceabe1-07db-46ff-86e8-692be9c41214
📒 Files selected for processing (1)
ci-operator/config/openshift/hive/openshift-hive-master.yaml
/pj-rehearse pull-ci-openshift-hive-master-e2e-ote
@miyadav: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.
/pj-rehearse pull-ci-openshift-hive-master-e2e-ote
@miyadav: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.
/pj-rehearse pull-ci-openshift-hive-master-e2e-ote
@miyadav: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.
/pj-rehearse pull-ci-openshift-hive-master-e2e-ote
@miyadav: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.
/pj-rehearse pull-ci-openshift-hive-master-e2e-ote
@miyadav: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.
/pj-rehearse pull-ci-openshift-hive-master-e2e-ote
@miyadav: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.
/pj-rehearse pull-ci-openshift-hive-master-e2e-ote
@miyadav: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.
@miyadav: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.
/pj rehearse abort
/pj-rehearse pull-ci-openshift-hive-master-e2e-ote
@miyadav: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.
/retest
/pj-rehearse pull-ci-openshift-hive-master-e2e-ote
@miyadav: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.
[APPROVALNOTIFIER] This PR is NOT APPROVED
This pull-request has been approved by: miyadav
The full list of commands accepted by this bot can be found here.
/pj-rehearse pull-ci-openshift-hive-master-e2e-ote
@miyadav: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.
/pj-rehearse pull-ci-openshift-hive-master-e2e-ote
@miyadav: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.
/pj-rehearse pull-ci-openshift-hive-master-e2e-ote
@miyadav: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.
/pj-rehearse pull-ci-openshift-hive-master-e2e-ote
@miyadav: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.
Add e2e-ote, e2e-gcp-ote, e2e-azure-ote, and e2e-vsphere-ote test jobs using the OpenShift Tests Extension framework.
Temporarily uses dockerfile_literal to build hive-tests from the miyadav/hive:fix-ote-init branch while the hive PR is validated.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
/pj-rehearse pull-ci-openshift-hive-master-e2e-ote
@miyadav: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.
@miyadav: The following tests failed, say
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.
/close
/close
@miyadav: Closed this PR.
@miyadav: Closed this PR.
/hold
depends on - openshift/origin#31059