[WIP] feat(ci): add KubeVirt OADP backup/restore e2e test workflow #76534

Open
mgencur wants to merge 5 commits into openshift:main from mgencur:backup_restore_kubevirt

Conversation


@mgencur mgencur commented Mar 19, 2026

mgencur and others added 4 commits March 10, 2026 15:57
Create a new hypershift-mce-agent-oadp-v2-setup step that provisions OADP
infrastructure (DPA with noDefaultBackupLocation, separate BSL/VSL objects)
without creating Backup/Restore objects inline. The backup/restore lifecycle
is now handled by the hypershift-e2e-backuprestore chain.

Changes:
- New step: hypershift-mce-agent-oadp-v2-setup (setup-only, 20m timeout)
- Update periodic e2e-agent-connected-ovn-ipv4-metal-oadp to use the new
  setup step + hypershift-e2e-backuprestore chain
- Add on-demand presubmit e2e-agent-connected-ovn-ipv4-metal-oadp
- Make E2E_HOSTED_CLUSTER_NAMESPACE configurable in e2e-backuprestore chain
  (defaults to "clusters", overridden to "local-cluster" for agent metal)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Create hypershift-kubevirt-e2e-backuprestore workflow that provisions
an AWS nested management cluster with KubeVirt, creates a hosted
cluster, and runs OADP backup/restore E2E tests with S3 storage.

Adds presubmit (optional, not always_run) on main and a weekly
periodic on release-4.22 (Wednesday 4:00 UTC). Reuses existing
hypershift-aws-oadp-setup/destroy steps since the S3 backup storage
backend is the same regardless of hosted cluster platform.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…e2e test

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@openshift-ci openshift-ci bot added the do-not-merge/work-in-progress label (indicates that a PR should not merge because it is a work in progress) Mar 19, 2026
@openshift-ci openshift-ci bot requested review from bear-redhat and csrwng March 19, 2026 10:49

openshift-ci bot commented Mar 19, 2026

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: mgencur
Once this PR has been reviewed and has the lgtm label, please assign jparrill for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment


mgencur commented Mar 19, 2026

/pj-rehearse periodic-ci-openshift-hypershift-release-4.22-periodics-e2e-v2-aws-backuprestore

@openshift-ci-robot

@mgencur: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.

@openshift-ci-robot

[REHEARSALNOTIFIER]
@mgencur: the pj-rehearse plugin accommodates running rehearsal tests for the changes in this PR. Expand 'Interacting with pj-rehearse' for usage details. The following rehearsable tests have been affected by this change:

| Test name | Repo | Type | Reason |
|---|---|---|---|
| pull-ci-openshift-hypershift-main-e2e-v2-aws-backuprestore | openshift/hypershift | presubmit | Presubmit changed |
| pull-ci-openshift-hypershift-main-e2e-v2-kubevirt-aws-backuprestore | openshift/hypershift | presubmit | Presubmit changed |
| pull-ci-openshift-hypershift-release-4.21-e2e-agent-connected-ovn-ipv4-metal-oadp | openshift/hypershift | presubmit | Presubmit changed |
| pull-ci-openshift-hypershift-release-4.22-e2e-v2-aws-backuprestore | openshift/hypershift | presubmit | Presubmit changed |
| pull-ci-openshift-hypershift-release-4.23-e2e-v2-aws-backuprestore | openshift/hypershift | presubmit | Presubmit changed |
| pull-ci-openshift-origin-release-4.16-e2e-agent-connected-ovn-ipv4-metal3 | openshift/origin | presubmit | Registry content changed |
| pull-ci-openshift-origin-release-4.16-e2e-agent-connected-ovn-dualstack-metal3 | openshift/origin | presubmit | Registry content changed |
| periodic-ci-openshift-hypershift-release-4.19-periodics-mce-e2e-agent-connected-ovn-dualstack-metal-conformance | N/A | periodic | Registry content changed |
| periodic-ci-openshift-hypershift-release-4.20-periodics-mce-e2e-agent-connected-ovn-ipv4-metal-compact-conformance | N/A | periodic | Registry content changed |
| periodic-ci-openshift-hypershift-release-4.16-periodics-mce-e2e-agent-connected-ovn-ipv4-metal-compact-conformance | N/A | periodic | Registry content changed |
| periodic-ci-openshift-hypershift-release-4.21-periodics-mce-e2e-agent-connected-ovn-ipv4-metal-compact-conformance | N/A | periodic | Registry content changed |
| periodic-ci-openshift-hypershift-release-4.17-periodics-mce-e2e-agent-connected-ovn-dualstack-metal-conformance | N/A | periodic | Registry content changed |
| periodic-ci-openshift-hypershift-release-4.19-periodics-mce-e2e-agent-connected-ovn-ipv4-metal-compact-conformance | N/A | periodic | Registry content changed |
| periodic-ci-openshift-hypershift-release-4.17-periodics-mce-e2e-agent-connected-ovn-ipv4-metal-oadp | N/A | periodic | Registry content changed |
| periodic-ci-mgencur-release-backup_restore_kubevirt-periodics-e2e-v2-kubevirt-aws-backuprestore | N/A | periodic | Periodic changed |
| periodic-ci-openshift-hypershift-release-4.18-periodics-mce-e2e-agent-connected-ovn-ipv4-metal-oadp | N/A | periodic | Registry content changed |
| periodic-ci-openshift-hypershift-release-4.19-periodics-mce-e2e-agent-connected-ovn-ipv4-metal-conformance | N/A | periodic | Registry content changed |
| periodic-ci-openshift-hypershift-release-4.20-periodics-mce-e2e-agent-connected-ovn-ipv4-metal-oadp | N/A | periodic | Registry content changed |
| periodic-ci-openshift-hypershift-release-4.16-periodics-mce-e2e-agent-connected-ovn-dualstack-metal-conformance | N/A | periodic | Registry content changed |
| periodic-ci-openshift-hypershift-release-4.17-periodics-mce-e2e-agent-critical | N/A | periodic | Registry content changed |
| periodic-ci-openshift-hypershift-release-4.16-periodics-mce-e2e-agent-critical | N/A | periodic | Registry content changed |
| periodic-ci-openshift-hypershift-release-4.17-periodics-mce-e2e-agent-connected-ovn-ipv4-metal-conformance | N/A | periodic | Registry content changed |
| periodic-ci-openshift-hypershift-release-4.21-periodics-mce-e2e-agent-connected-ovn-ipv4-metal-oadp | N/A | periodic | Ci-operator config changed |
| periodic-ci-openshift-hypershift-release-4.21-periodics-mce-e2e-agent-critical | N/A | periodic | Registry content changed |
| periodic-ci-openshift-hypershift-release-4.19-periodics-mce-e2e-agent-connected-ovn-ipv4-metal-oadp | N/A | periodic | Registry content changed |

A total of 40 jobs have been affected by this change. The above listing is non-exhaustive and limited to 25 jobs.

A full list of affected jobs can be found here

Interacting with pj-rehearse

Comment: /pj-rehearse to run up to 5 rehearsals
Comment: /pj-rehearse skip to opt-out of rehearsals
Comment: /pj-rehearse {test-name}, with each test separated by a space, to run one or more specific rehearsals
Comment: /pj-rehearse more to run up to 10 rehearsals
Comment: /pj-rehearse max to run up to 25 rehearsals
Comment: /pj-rehearse auto-ack to run up to 5 rehearsals, and add the rehearsals-ack label on success
Comment: /pj-rehearse list to get an up-to-date list of affected jobs
Comment: /pj-rehearse abort to abort all active rehearsals
Comment: /pj-rehearse network-access-allowed to allow rehearsals of tests that have the restrict_network_access field set to false. This must be executed by an openshift org member who is not the PR author

Once you are satisfied with the results of the rehearsals, comment: /pj-rehearse ack to unblock merge. When the rehearsals-ack label is present on your PR, merge will no longer be blocked by rehearsals.
If you would like the rehearsals-ack label removed, comment: /pj-rehearse reject to re-block merging.


mgencur commented Mar 19, 2026

/pj-rehearse periodic-ci-mgencur-release-backup_restore_kubevirt-periodics-e2e-v2-kubevirt-aws-backuprestore

@openshift-ci-robot

@mgencur: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.


openshift-ci bot commented Mar 19, 2026

@mgencur: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
|---|---|---|---|---|
| ci/prow/check-gh-automation | d18a553 | link | true | /test check-gh-automation |
| ci/rehearse/periodic-ci-openshift-hypershift-release-4.22-periodics-e2e-v2-aws-backuprestore | d18a553 | link | unknown | /pj-rehearse periodic-ci-openshift-hypershift-release-4.22-periodics-e2e-v2-aws-backuprestore |
| ci/prow/owners | d18a553 | link | true | /test owners |
| ci/prow/step-registry-shellcheck | d18a553 | link | true | /test step-registry-shellcheck |

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.


mgencur commented Mar 19, 2026

At the moment, the backup doesn't succeed. The Backup object remains in the WaitingForPluginOperations phase:

ᐅ oc get backup 023f63366fafcd85280c-clusters-wtm64r -n openshift-adp -oyaml
apiVersion: velero.io/v1
kind: Backup
metadata:
  annotations:
    velero.io/resource-timeout: 2h0m0s
    velero.io/source-cluster-k8s-gitversion: v1.34.2
    velero.io/source-cluster-k8s-major-version: "1"
    velero.io/source-cluster-k8s-minor-version: "34"
  creationTimestamp: "2026-03-19T12:23:40Z"
  generation: 9
  labels:
    velero.io/storage-location: 023f63366fafcd85280c
  name: 023f63366fafcd85280c-clusters-wtm64r
  namespace: openshift-adp
  resourceVersion: "38460"
  uid: 32f7bbe7-7ab6-4ee3-b9a8-f6d699bd040d
spec:
  csiSnapshotTimeout: 10m0s
  defaultVolumesToFsBackup: false
  includedNamespaces:
  - clusters
  - clusters-023f63366fafcd85280c
  includedResources:
  - serviceaccounts
  - roles
  - rolebindings
  - pods
  - persistentvolumeclaims
  - persistentvolumes
  - configmaps
  - priorityclasses
  - poddisruptionbudgets
  - hostedclusters.hypershift.openshift.io
  - nodepools.hypershift.openshift.io
  - secrets
  - services
  - deployments
  - statefulsets
  - hostedcontrolplanes.hypershift.openshift.io
  - clusters.cluster.x-k8s.io
  - machinedeployments.cluster.x-k8s.io
  - machinesets.cluster.x-k8s.io
  - machines.cluster.x-k8s.io
  - routes.route.openshift.io
  - clusterdeployments.hive.openshift.io
  - kubevirtclusters.infrastructure.cluster.x-k8s.io
  - kubevirtmachinetemplates.infrastructure.cluster.x-k8s.io
  - datavolumes.cdi.kubevirt.io
  itemOperationTimeout: 4h0m0s
  labelSelector:
    matchExpressions:
    - key: hypershift.openshift.io/is-kubevirt-rhcos
      operator: DoesNotExist
  snapshotMoveData: true
  snapshotVolumes: true
  storageLocation: 023f63366fafcd85280c
  ttl: 2h0m0s
  volumeSnapshotLocations:
  - 023f63366fafcd85280c
status:
  backupItemOperationsAttempted: 4
  expiration: "2026-03-19T14:23:40Z"
  formatVersion: 1.1.0
  hookStatus:
    hooksAttempted: 6
  phase: WaitingForPluginOperations
  progress:
    itemsBackedUp: 369
    totalItems: 369
  startTimestamp: "2026-03-19T12:23:40Z"
  version: 1
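
One way to read the status above: all 369 items were backed up and 4 item operations were attempted, but no completed or failed operation count is reported, so the backup is still waiting on its asynchronous plugin operations. The Python sketch below summarizes that, assuming the `.status` block has already been loaded into a dict. The helper name `summarize_backup_status` is hypothetical, and treating absent counters as zero is an assumption.

```python
def summarize_backup_status(status: dict) -> str:
    """Summarize a Velero Backup .status dict (hypothetical helper)."""
    phase = status.get("phase", "")
    progress = status.get("progress", {})
    backed_up = progress.get("itemsBackedUp", 0)
    total = progress.get("totalItems", 0)
    attempted = status.get("backupItemOperationsAttempted", 0)
    # Assumption: counters missing from the status are treated as zero.
    completed = status.get("backupItemOperationsCompleted", 0)
    failed = status.get("backupItemOperationsFailed", 0)
    pending = attempted - completed - failed
    return (
        f"phase={phase} items={backed_up}/{total} "
        f"operations_pending={pending}/{attempted}"
    )


# Values copied from the status block above.
status = {
    "backupItemOperationsAttempted": 4,
    "phase": "WaitingForPluginOperations",
    "progress": {"itemsBackedUp": 369, "totalItems": 369},
}

print(summarize_backup_status(status))
```

With the values shown, all four attempted operations count as pending, which matches a backup that has finished item collection but not its plugin operations.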

And the Velero logs show this:

time="2026-03-19T12:27:32Z" level=info msg="Setting up backup store to persist the backup" backup=openshift-adp/023f63366fafcd85280c-clusters-n7hsg4-20260319122708 logSource="/workspace/pkg/controller/backup_controller.go:747"
time="2026-03-19T12:27:32Z" level=debug msg="looking for plugin in registry" backup=openshift-adp/023f63366fafcd85280c-clusters-n7hsg4-20260319122708 kind=ObjectStore logSource="/workspace/pkg/plugin/clientmgmt/manager.go:141" name=velero.io/aws
time="2026-03-19T12:27:32Z" level=debug msg="found preexisting restartable plugin process" backup=openshift-adp/023f63366fafcd85280c-clusters-n7hsg4-20260319122708 command=/plugins/velero-plugin-for-aws kind=ObjectStore logSource="/workspace/pkg/plugin/clientmgmt/manager.go:152" name=velero.io/aws
time="2026-03-19T12:27:32Z" level=info msg="Initial backup processing complete, moving to WaitingForPluginOperations" backup=openshift-adp/023f63366fafcd85280c-clusters-n7hsg4-20260319122708 logSource="/workspace/pkg/controller/backup_controller.go:761"
time="2026-03-19T12:27:32Z" level=debug msg="received EOF, stopping recv loop" backup=openshift-adp/023f63366fafcd85280c-clusters-n7hsg4-20260319122708 cmd=/plugins/velero-plugins err="rpc error: code = Unavailable desc = error reading from server: EOF" logSource="/workspace/pkg/plugin/clientmgmt/process/logrus_adapter.go:75" pluginName=stdio
time="2026-03-19T12:27:32Z" level=info msg="plugin process exited" backup=openshift-adp/023f63366fafcd85280c-clusters-n7hsg4-20260319122708 cmd=/plugins/velero-plugins id=2674 logSource="/workspace/pkg/plugin/clientmgmt/process/logrus_adapter.go:80" plugin=/plugins/velero-plugins
time="2026-03-19T12:27:32Z" level=debug msg="plugin exited" backup=openshift-adp/023f63366fafcd85280c-clusters-n7hsg4-20260319122708 cmd=/plugins/velero-plugins logSource="/workspace/pkg/plugin/clientmgmt/process/logrus_adapter.go:75"
time="2026-03-19T12:27:32Z" level=debug msg="plugin exited" backup=openshift-adp/023f63366fafcd85280c-clusters-n7hsg4-20260319122708 cmd=/plugins/kubevirt-velero-plugin logSource="/workspace/pkg/plugin/clientmgmt/process/logrus_adapter.go:75"
time="2026-03-19T12:27:32Z" level=info msg="Updating backup's status" backuprequest=openshift-adp/023f63366fafcd85280c-clusters-n7hsg4-20260319122708 controller=backup logSource="/workspace/pkg/controller/backup_controller.go:317"
time="2026-03-19T12:27:32Z" level=debug msg="Getting backup" backuprequest=openshift-adp/023f63366fafcd85280c-clusters-n7hsg4-20260319122708 controller=backup logSource="/workspace/pkg/controller/backup_controller.go:216"
time="2026-03-19T12:27:32Z" level=debug msg="Getting Backup" backup=openshift-adp/023f63366fafcd85280c-clusters-n7hsg4-20260319122708 controller=backup-finalizer logSource="/workspace/pkg/controller/backup_finalizer_controller.go:96"
time="2026-03-19T12:27:32Z" level=debug msg="Backup is not handled" backup=openshift-adp/023f63366fafcd85280c-clusters-n7hsg4-20260319122708 logSource="/workspace/pkg/controller/backup_controller.go:244" phase=WaitingForPluginOperations
time="2026-03-19T12:27:32Z" level=debug msg="Backup is not awaiting finalizing, skipping" backup=openshift-adp/023f63366fafcd85280c-clusters-n7hsg4-20260319122708 controller=backup-finalizer logSource="/workspace/pkg/controller/backup_finalizer_controller.go:112"
time="2026-03-19T12:27:32Z" level=debug msg="Getting backup" backuprequest=openshift-adp/023f63366fafcd85280c-clusters-n7hsg4-20260319122708 controller=backup logSource="/workspace/pkg/controller/backup_controller.go:216"
time="2026-03-19T12:27:32Z" level=debug msg="Backup is not handled" backup=openshift-adp/023f63366fafcd85280c-clusters-n7hsg4-20260319122708 logSource="/workspace/pkg/controller/backup_controller.go:244" phase=WaitingForPluginOperations
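
The log lines above use a `key=value` layout with double-quoted values, so they can be filtered by level to cut through the debug noise. The sketch below is a minimal Python example of that, not part of the PR. The helper name `parse_logfmt` is hypothetical, and the two sample lines are shortened copies of entries above with the `backup` and `logSource` fields dropped.

```python
import shlex


def parse_logfmt(line: str) -> dict:
    """Split one key=value log line into a dict (hypothetical helper)."""
    fields = {}
    # shlex honours the double quotes around values that contain spaces.
    for token in shlex.split(line):
        key, sep, value = token.partition("=")
        if sep:
            fields[key] = value
    return fields


# Shortened copies of two log entries shown above.
sample = [
    'time="2026-03-19T12:27:32Z" level=info '
    'msg="Initial backup processing complete, moving to WaitingForPluginOperations"',
    'time="2026-03-19T12:27:32Z" level=debug '
    'msg="Backup is not handled" phase=WaitingForPluginOperations',
]

# Keep only the messages logged at info level.
info_messages = [
    entry["msg"]
    for entry in map(parse_logfmt, sample)
    if entry.get("level") == "info"
]
print(info_messages)
```

Filtering on `level=info` keeps the phase transition to WaitingForPluginOperations and drops the repeated "Backup is not handled" debug entries.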
