From 2042841141baab7d55962f9706caca2406b4fbff Mon Sep 17 00:00:00 2001
From: barbacbd
Date: Fri, 8 May 2026 13:53:43 -0400
Subject: [PATCH] OCPBUGS-85346: Revert 4.22 and 4.23 from C4A to T2A instances to avoid hyperdisk costs
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

This reverts 4.22 and 4.23 multi-arch GCP jobs from C4A (Axion) instances
back to T2A (Tau) instances to avoid the cost increase associated with
hyperdisk-balanced storage.

## Root Cause of StatefulSet Failures

C4A instances only support Hyperdisk storage types and do NOT support
Persistent Disk types (pd-standard, pd-balanced, pd-ssd). When StatefulSet
tests create PVCs using the cluster's default StorageClass (standard-csi
with pd-standard), volume attach fails on C4A nodes:

```
googleapi: Error 400: pd-standard disk type cannot be used by c4a-standard-2 machine type., badRequest
```

## The Fix: Revert to T2A

Instead of adding hyperdisk-balanced StorageClass (which increases costs),
revert 4.22 and 4.23 to T2A instances which support pd-standard.

### Cost Considerations

- **pd-standard:** ~$0.04/GB/month (T2A compatible)
- **hyperdisk-balanced:** ~$0.10-0.12/GB/month + IOPS charges (C4A required)

For CI jobs with many PVCs (monitoring, logging, registry, StatefulSets),
using hyperdisk-balanced across all test runs would significantly increase
costs. T2A with pd-standard is more cost-effective.

### Quota Mitigation

While reverting to T2A means 4.21, 4.22, and 4.23 will compete for the same
T2A quota, PR #77809 included other mitigations that remain in place:

1. **Zone randomization** - Distributes instances across zones
2. **Interval scheduling (168h)** - Prevents simultaneous execution
3. **Smaller instances** - Uses t2a-standard-2 (not standard-4)
4. **Balanced worker layout** - 2+2 workers instead of 3+2

These mitigations should reduce quota pressure even with multiple releases
using T2A.
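
The per-GB prices listed under Cost Considerations can be turned into a rough
capacity-only comparison. The sketch below is illustrative only: the prices are
the approximate figures quoted in this message, the 1000 GB total is a
hypothetical value and not a measurement from CI, and the IOPS and throughput
charges for hyperdisk-balanced are ignored, so the real gap is likely larger.

```python
from typing import Tuple

# Approximate capacity prices quoted in this commit message (USD/GB/month).
PD_STANDARD_PER_GB = 0.04
HYPERDISK_BALANCED_PER_GB = (0.10, 0.12)  # low and high end of the quoted range


def monthly_cost(total_gb: float, price_per_gb: float) -> float:
    """Capacity-only monthly cost in USD (no IOPS or throughput charges)."""
    return total_gb * price_per_gb


def cost_multiplier() -> Tuple[float, float]:
    """How many times more hyperdisk-balanced costs per GB, as (low, high)."""
    low, high = HYPERDISK_BALANCED_PER_GB
    return low / PD_STANDARD_PER_GB, high / PD_STANDARD_PER_GB


if __name__ == "__main__":
    total_gb = 1000  # hypothetical provisioned PVC capacity, for illustration
    low, high = cost_multiplier()
    print(f"pd-standard:        ${monthly_cost(total_gb, PD_STANDARD_PER_GB):.2f}/month")
    print(f"hyperdisk-balanced: ${monthly_cost(total_gb, HYPERDISK_BALANCED_PER_GB[0]):.2f}"
          f" to ${monthly_cost(total_gb, HYPERDISK_BALANCED_PER_GB[1]):.2f}/month")
    print(f"multiplier:         {low:.1f}x to {high:.1f}x")
```

Under these assumptions, capacity alone costs roughly 2.5 to 3 times more on
hyperdisk-balanced than on pd-standard.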

## Changes Made

**4.22 nightly config (6 jobs):**
- ocp-e2e-gcp-ovn-multi-a-a: c4a-standard-2 → t2a-standard-2
- ocp-e2e-gcp-ovn-multi-x-x-to-a-x: c4a-standard-2 → t2a-standard-2
- ocp-e2e-gcp-ovn-multi-a-a-to-x-a: c4a-standard-2 → t2a-standard-2
- ocp-e2e-upgrade-gcp-ovn-multi-a-a: c4a-standard-2 → t2a-standard-2
- ocp-e2e-gcp-ovn-multi-x-ax: c4a-standard-2 → t2a-standard-2
- ocp-e2e-upgrade-gcp-ovn-multi-x-ax: c4a-standard-2 → t2a-standard-2

**4.23 nightly config (5 jobs):**
- ocp-e2e-gcp-ovn-multi-a-a: c4a-standard-2 → t2a-standard-2
- ocp-e2e-gcp-ovn-multi-x-x-to-a-x: c4a-standard-2 → t2a-standard-2
- ocp-e2e-gcp-ovn-multi-a-a-to-x-a: c4a-standard-2 → t2a-standard-2
- ocp-e2e-upgrade-gcp-ovn-multi-a-a: c4a-standard-2 → t2a-standard-2
- Heterogeneous jobs: c4a-standard-2 → t2a-standard-2

**4.22 upgrade configs (2 files):**
- nightly-4.22-upgrade-from-nightly-4.21: c4a-standard-4 → t2a-standard-4
- nightly-4.22-upgrade-from-stable-4.21: c4a-standard-4 → t2a-standard-4

**4.23 upgrade configs (2 files):**
- nightly-4.23-upgrade-from-nightly-4.22: c4a-standard-2 → t2a-standard-2
- nightly-4.23-upgrade-from-stable-4.22: c4a-standard-2 → t2a-standard-2

Removed `ADDITIONAL_WORKER_DISK_TYPE: hyperdisk-balanced` from all
heterogeneous jobs (no longer needed as T2A supports pd-standard).

## Release Distribution After This Change

- **4.21:** T2A standard-2
- **4.22:** T2A standard-2 (reverted from C4A)
- **4.23:** T2A standard-2 (reverted from C4A)
- **5.0:** T2A standard-4

## References

- JIRA: https://redhat.atlassian.net/browse/OCPBUGS-85346
- Failed job: periodic-ci-openshift-multiarch-main-nightly-4.22-ocp-e2e-gcp-ovn-multi-x-ax
- Original PR #77809: https://github.com/openshift/release/pull/77809
- GCP C4A disk requirements: https://cloud.google.com/blog/products/compute/first-google-axion-processor-c4a-now-ga-with-titanium-ssd
---
 ...ightly-4.22-upgrade-from-nightly-4.21.yaml |  2 +-
 ...nightly-4.22-upgrade-from-stable-4.21.yaml |  3 +--
 ...penshift-multiarch-main__nightly-4.22.yaml | 22 +++++++++----------
 ...ightly-4.23-upgrade-from-nightly-4.22.yaml |  3 +--
 ...nightly-4.23-upgrade-from-stable-4.22.yaml |  3 +--
 ...penshift-multiarch-main__nightly-4.23.yaml | 22 +++++++++----------
 6 files changed, 24 insertions(+), 31 deletions(-)

diff --git a/ci-operator/config/openshift/multiarch/openshift-multiarch-main__nightly-4.22-upgrade-from-nightly-4.21.yaml b/ci-operator/config/openshift/multiarch/openshift-multiarch-main__nightly-4.22-upgrade-from-nightly-4.21.yaml
index 145f2cdd88476..4ce5a7ea2dc81 100644
--- a/ci-operator/config/openshift/multiarch/openshift-multiarch-main__nightly-4.22-upgrade-from-nightly-4.21.yaml
+++ b/ci-operator/config/openshift/multiarch/openshift-multiarch-main__nightly-4.22-upgrade-from-nightly-4.21.yaml
@@ -101,7 +101,7 @@ tests:
     cluster_profile: openshift-org-gcp
     env:
       ADDITIONAL_WORKER_ARCHITECTURE: aarch64
-      ADDITIONAL_WORKER_VM_TYPE: c4a-standard-4
+      ADDITIONAL_WORKER_VM_TYPE: t2a-standard-4
       COMPUTE_NODE_REPLICAS: "2"
       OCP_ARCH: amd64
       TEST_SUITE: upgrade-conformance
diff --git a/ci-operator/config/openshift/multiarch/openshift-multiarch-main__nightly-4.22-upgrade-from-stable-4.21.yaml b/ci-operator/config/openshift/multiarch/openshift-multiarch-main__nightly-4.22-upgrade-from-stable-4.21.yaml
index 45784bebeb5e6..aa9d10cd9ec03 100644
--- a/ci-operator/config/openshift/multiarch/openshift-multiarch-main__nightly-4.22-upgrade-from-stable-4.21.yaml
+++ b/ci-operator/config/openshift/multiarch/openshift-multiarch-main__nightly-4.22-upgrade-from-stable-4.21.yaml
@@ -108,8 +108,7 @@ tests:
     cluster_profile: openshift-org-gcp
     env:
       ADDITIONAL_WORKER_ARCHITECTURE: aarch64
-      ADDITIONAL_WORKER_DISK_TYPE: hyperdisk-balanced
-      ADDITIONAL_WORKER_VM_TYPE: c4a-standard-4
+      ADDITIONAL_WORKER_VM_TYPE: t2a-standard-4
       COMPUTE_NODE_REPLICAS: "2"
       OCP_ARCH: amd64
       TEST_SUITE: upgrade-conformance
diff --git a/ci-operator/config/openshift/multiarch/openshift-multiarch-main__nightly-4.22.yaml b/ci-operator/config/openshift/multiarch/openshift-multiarch-main__nightly-4.22.yaml
index 9795518d998c4..b1068dffd4497 100644
--- a/ci-operator/config/openshift/multiarch/openshift-multiarch-main__nightly-4.22.yaml
+++ b/ci-operator/config/openshift/multiarch/openshift-multiarch-main__nightly-4.22.yaml
@@ -777,8 +777,8 @@ tests:
   steps:
     cluster_profile: openshift-org-gcp
     env:
-      COMPUTE_NODE_TYPE: c4a-standard-2
-      CONTROL_PLANE_NODE_TYPE: c4a-standard-2
+      COMPUTE_NODE_TYPE: t2a-standard-2
+      CONTROL_PLANE_NODE_TYPE: t2a-standard-2
       OCP_ARCH: arm64
       ZONES_EXCLUSION_PATTERN: (asia-southeast1-a|us-central1-c)
     workflow: openshift-e2e-gcp-ovn
@@ -787,8 +787,8 @@ tests:
   steps:
     cluster_profile: openshift-org-gcp
     env:
-      MIGRATION_CP_MACHINE_TYPE: c4a-standard-2
-      MIGRATION_INFRA_MACHINE_TYPE: c4a-standard-2
+      MIGRATION_CP_MACHINE_TYPE: t2a-standard-2
+      MIGRATION_INFRA_MACHINE_TYPE: t2a-standard-2
       TEST_SKIPS: deploymentconfigs\| should expose cluster services outside the cluster\|
         FIPS TestFIPS\| Multi-stage image builds should succeed\| Optimized image builds
         should succeed\| build can reference a cluster service\| custom build
@@ -812,8 +812,8 @@ tests:
   steps:
     cluster_profile: openshift-org-gcp
     env:
-      COMPUTE_NODE_TYPE: c4a-standard-2
-      CONTROL_PLANE_NODE_TYPE: c4a-standard-2
+      COMPUTE_NODE_TYPE: t2a-standard-2
+      CONTROL_PLANE_NODE_TYPE: t2a-standard-2
       MIGRATION_ARCHITECTURE: x86_64
       MIGRATION_CP_MACHINE_TYPE: n2-standard-4
       MIGRATION_INFRA_MACHINE_TYPE: n2-standard-4
@@ -841,8 +841,8 @@ tests:
   steps:
     cluster_profile: openshift-org-gcp
     env:
-      COMPUTE_NODE_TYPE: c4a-standard-2
-      CONTROL_PLANE_NODE_TYPE: c4a-standard-2
+      COMPUTE_NODE_TYPE: t2a-standard-2
+      CONTROL_PLANE_NODE_TYPE: t2a-standard-2
       OCP_ARCH: arm64
       TEST_TYPE: upgrade-conformance
       ZONES_EXCLUSION_PATTERN: (asia-southeast1-a|us-central1-c)
@@ -853,8 +853,7 @@ tests:
     cluster_profile: openshift-org-gcp
     env:
       ADDITIONAL_WORKER_ARCHITECTURE: aarch64
-      ADDITIONAL_WORKER_DISK_TYPE: hyperdisk-balanced
-      ADDITIONAL_WORKER_VM_TYPE: c4a-standard-2
+      ADDITIONAL_WORKER_VM_TYPE: t2a-standard-2
       COMPUTE_NODE_REPLICAS: "2"
       OCP_ARCH: amd64
       TEST_SKIPS: deploymentconfigs\| should expose cluster services outside the cluster\|
@@ -904,8 +903,7 @@ tests:
     cluster_profile: openshift-org-gcp
     env:
       ADDITIONAL_WORKER_ARCHITECTURE: aarch64
-      ADDITIONAL_WORKER_DISK_TYPE: hyperdisk-balanced
-      ADDITIONAL_WORKER_VM_TYPE: c4a-standard-2
+      ADDITIONAL_WORKER_VM_TYPE: t2a-standard-2
       COMPUTE_NODE_REPLICAS: "2"
       OCP_ARCH: amd64
       TEST_SUITE: upgrade-conformance
diff --git a/ci-operator/config/openshift/multiarch/openshift-multiarch-main__nightly-4.23-upgrade-from-nightly-4.22.yaml b/ci-operator/config/openshift/multiarch/openshift-multiarch-main__nightly-4.23-upgrade-from-nightly-4.22.yaml
index 40c8293ec38d8..a2751865082db 100644
--- a/ci-operator/config/openshift/multiarch/openshift-multiarch-main__nightly-4.23-upgrade-from-nightly-4.22.yaml
+++ b/ci-operator/config/openshift/multiarch/openshift-multiarch-main__nightly-4.23-upgrade-from-nightly-4.22.yaml
@@ -101,8 +101,7 @@ tests:
     cluster_profile: openshift-org-gcp
     env:
       ADDITIONAL_WORKER_ARCHITECTURE: aarch64
-      ADDITIONAL_WORKER_DISK_TYPE: hyperdisk-balanced
-      ADDITIONAL_WORKER_VM_TYPE: c4a-standard-2
+      ADDITIONAL_WORKER_VM_TYPE: t2a-standard-2
       COMPUTE_NODE_REPLICAS: "2"
       OCP_ARCH: amd64
       TEST_SUITE: upgrade-conformance
diff --git a/ci-operator/config/openshift/multiarch/openshift-multiarch-main__nightly-4.23-upgrade-from-stable-4.22.yaml b/ci-operator/config/openshift/multiarch/openshift-multiarch-main__nightly-4.23-upgrade-from-stable-4.22.yaml
index 8d5e645b6477a..5482b52a65218 100644
--- a/ci-operator/config/openshift/multiarch/openshift-multiarch-main__nightly-4.23-upgrade-from-stable-4.22.yaml
+++ b/ci-operator/config/openshift/multiarch/openshift-multiarch-main__nightly-4.23-upgrade-from-stable-4.22.yaml
@@ -108,8 +108,7 @@ tests:
     cluster_profile: openshift-org-gcp
     env:
       ADDITIONAL_WORKER_ARCHITECTURE: aarch64
-      ADDITIONAL_WORKER_DISK_TYPE: hyperdisk-balanced
-      ADDITIONAL_WORKER_VM_TYPE: c4a-standard-2
+      ADDITIONAL_WORKER_VM_TYPE: t2a-standard-2
       COMPUTE_NODE_REPLICAS: "2"
       OCP_ARCH: amd64
       TEST_SUITE: upgrade-conformance
diff --git a/ci-operator/config/openshift/multiarch/openshift-multiarch-main__nightly-4.23.yaml b/ci-operator/config/openshift/multiarch/openshift-multiarch-main__nightly-4.23.yaml
index dc73211e9920f..d9d47f82bb605 100644
--- a/ci-operator/config/openshift/multiarch/openshift-multiarch-main__nightly-4.23.yaml
+++ b/ci-operator/config/openshift/multiarch/openshift-multiarch-main__nightly-4.23.yaml
@@ -746,8 +746,8 @@ tests:
   steps:
     cluster_profile: openshift-org-gcp
     env:
-      COMPUTE_NODE_TYPE: c4a-standard-2
-      CONTROL_PLANE_NODE_TYPE: c4a-standard-2
+      COMPUTE_NODE_TYPE: t2a-standard-2
+      CONTROL_PLANE_NODE_TYPE: t2a-standard-2
       OCP_ARCH: arm64
       ZONES_EXCLUSION_PATTERN: (asia-southeast1-a|us-central1-c)
     workflow: openshift-e2e-gcp-ovn
@@ -756,8 +756,8 @@ tests:
   steps:
     cluster_profile: openshift-org-gcp
     env:
-      MIGRATION_CP_MACHINE_TYPE: c4a-standard-2
-      MIGRATION_INFRA_MACHINE_TYPE: c4a-standard-2
+      MIGRATION_CP_MACHINE_TYPE: t2a-standard-2
+      MIGRATION_INFRA_MACHINE_TYPE: t2a-standard-2
       TEST_SKIPS: deploymentconfigs\| should expose cluster services outside the cluster\|
         FIPS TestFIPS\| Multi-stage image builds should succeed\| Optimized image builds
         should succeed\| build can reference a cluster service\| custom build
@@ -781,8 +781,8 @@ tests:
   steps:
     cluster_profile: openshift-org-gcp
     env:
-      COMPUTE_NODE_TYPE: c4a-standard-2
-      CONTROL_PLANE_NODE_TYPE: c4a-standard-2
+      COMPUTE_NODE_TYPE: t2a-standard-2
+      CONTROL_PLANE_NODE_TYPE: t2a-standard-2
       MIGRATION_ARCHITECTURE: x86_64
       MIGRATION_CP_MACHINE_TYPE: n2-standard-4
       MIGRATION_INFRA_MACHINE_TYPE: n2-standard-4
@@ -810,8 +810,8 @@ tests:
   steps:
     cluster_profile: openshift-org-gcp
     env:
-      COMPUTE_NODE_TYPE: c4a-standard-2
-      CONTROL_PLANE_NODE_TYPE: c4a-standard-2
+      COMPUTE_NODE_TYPE: t2a-standard-2
+      CONTROL_PLANE_NODE_TYPE: t2a-standard-2
       OCP_ARCH: arm64
       TEST_TYPE: upgrade-conformance
       ZONES_EXCLUSION_PATTERN: (asia-southeast1-a|us-central1-c)
@@ -822,8 +822,7 @@ tests:
     cluster_profile: openshift-org-gcp
     env:
       ADDITIONAL_WORKER_ARCHITECTURE: aarch64
-      ADDITIONAL_WORKER_DISK_TYPE: hyperdisk-balanced
-      ADDITIONAL_WORKER_VM_TYPE: c4a-standard-2
+      ADDITIONAL_WORKER_VM_TYPE: t2a-standard-2
       COMPUTE_NODE_REPLICAS: "2"
       OCP_ARCH: amd64
       TEST_SKIPS: deploymentconfigs\| should expose cluster services outside the cluster\|
@@ -873,8 +872,7 @@ tests:
     cluster_profile: openshift-org-gcp
     env:
       ADDITIONAL_WORKER_ARCHITECTURE: aarch64
-      ADDITIONAL_WORKER_DISK_TYPE: hyperdisk-balanced
-      ADDITIONAL_WORKER_VM_TYPE: c4a-standard-2
+      ADDITIONAL_WORKER_VM_TYPE: t2a-standard-2
       COMPUTE_NODE_REPLICAS: "2"
       OCP_ARCH: amd64
       TEST_SUITE: upgrade-conformance