trusted-execution-clusters: Intro Azure upstream CI by Jakob-Naucke · Pull Request #79035 · openshift/release

Jakob-Naucke · 2026-05-08T09:11:11Z

Create a Kind VM (also on Azure) as container platform. Run integration tests on Azure.

Azure Upstream CI for trusted-execution-clusters Operator

This PR introduces Azure upstream CI infrastructure for the trusted-execution-clusters/operator repository, enabling integration testing on Azure cloud infrastructure.

CI Configuration Changes

Added ci-operator/config/trusted-execution-clusters/operator/azure.yaml to define the main test job operator-lifecycle-azure-verify. This job:

Runs the trusted-execution-clusters-operator-azure-lifecycle test chain
Executes the trusted-execution-clusters-operator-azure-cleanup cleanup chain afterward
Requires intranet capability
Skips execution for documentation-only changes
Uses telco-runner:latest container image for test execution

Test Infrastructure

Added two main CI step chains:

Azure Lifecycle Chain (trusted-execution-clusters-operator-azure-lifecycle): Provisioning and testing flow that:

Provisions an ephemeral Azure VM (Standard_D2s_v3, in eastus region)
Creates a Kind Kubernetes cluster on the VM
Builds and pushes the operator images to a local registry on the VM
Runs the operator's integration tests against the Kind cluster
Authenticates to Azure using service principal credentials

Azure Cleanup Chain (trusted-execution-clusters-operator-azure-cleanup): Resource cleanup that:

Removes the provisioned Azure resource group after testing completes

Implementation Details

The test step (test.sh) handles the full provisioning workflow:

Clones the operator repository if not already present
Installs required tooling (azure-cli, cargo, jq, rustfmt)
Creates Azure resource group and VM using subscription credentials
Configures networking (opens ports 6443, 8000, 8080)
Sets up Kind cluster with Docker registry
Builds and deploys operator images to the local registry
Executes integration tests with proper environment configuration

The deprovision step (deprovision.sh) performs cleanup:

Authenticates to Azure using service principal
Deletes the Azure resource group created during testing

Governance

Added OWNERS files specifying alicefr, Jakob-Naucke, yalzhang, and fangge1212 as reviewers/approvers for the Azure test infrastructure components, along with corresponding metadata JSON files for tracking.

coderabbitai · 2026-05-08T09:11:20Z

Warning

Rate limit exceeded

@Jakob-Naucke has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 31 minutes and 50 seconds before requesting another review.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Repository YAML (base), Central YAML (inherited)

Review profile: CHILL

Plan: Enterprise

Run ID: 6da3c054-9a4b-4f96-ae98-0162a61604ab

📥 Commits

Reviewing files that changed from the base of the PR and between 904bd57 and c6d2529.

⛔ Files ignored due to path filters (1)

ci-operator/jobs/trusted-execution-clusters/operator/trusted-execution-clusters-operator-main-presubmits.yaml is excluded by !ci-operator/jobs/**

📒 Files selected for processing (16)

ci-operator/config/trusted-execution-clusters/operator/azure.yaml
ci-operator/step-registry/trusted-execution-clusters/operator-azure/OWNERS
ci-operator/step-registry/trusted-execution-clusters/operator-azure/cleanup/OWNERS
ci-operator/step-registry/trusted-execution-clusters/operator-azure/cleanup/trusted-execution-clusters-operator-azure-cleanup-chain.metadata.json
ci-operator/step-registry/trusted-execution-clusters/operator-azure/cleanup/trusted-execution-clusters-operator-azure-cleanup-chain.yaml
ci-operator/step-registry/trusted-execution-clusters/operator-azure/lifecycle/OWNERS
ci-operator/step-registry/trusted-execution-clusters/operator-azure/lifecycle/trusted-execution-clusters-operator-azure-lifecycle-chain.metadata.json
ci-operator/step-registry/trusted-execution-clusters/operator-azure/lifecycle/trusted-execution-clusters-operator-azure-lifecycle-chain.yaml
ci-operator/step-registry/trusted-execution-clusters/ref/operator/azure-deprovision/OWNERS
ci-operator/step-registry/trusted-execution-clusters/ref/operator/azure-deprovision/trusted-execution-clusters-ref-operator-azure-deprovision-commands.sh
ci-operator/step-registry/trusted-execution-clusters/ref/operator/azure-deprovision/trusted-execution-clusters-ref-operator-azure-deprovision-ref.metadata.json
ci-operator/step-registry/trusted-execution-clusters/ref/operator/azure-deprovision/trusted-execution-clusters-ref-operator-azure-deprovision-ref.yaml
ci-operator/step-registry/trusted-execution-clusters/ref/operator/azure-test/OWNERS
ci-operator/step-registry/trusted-execution-clusters/ref/operator/azure-test/trusted-execution-clusters-ref-operator-azure-test-commands.sh
ci-operator/step-registry/trusted-execution-clusters/ref/operator/azure-test/trusted-execution-clusters-ref-operator-azure-test-ref.metadata.json
ci-operator/step-registry/trusted-execution-clusters/ref/operator/azure-test/trusted-execution-clusters-ref-operator-azure-test-ref.yaml

Walkthrough

This PR adds Azure-based CI infrastructure for the trusted-execution-clusters operator, including step registry entries for provisioning an Azure VM with Kind Kubernetes cluster and running integration tests, along with cleanup automation and ownership metadata.

Changes

Azure Operator CI Infrastructure

| Layer / File(s) | Summary |
| --- | --- |
| Step Registry - Test Ref: `ci-operator/step-registry/trusted-execution-clusters/ref/operator/azure-test/{OWNERS, test.metadata.json, test.yaml}`, `ci-operator/step-registry/trusted-execution-clusters/ref/operator/azure-test/test.sh` | Defines `trusted-execution-clusters-ref-operator-azure-test` step ref that provisions an ephemeral Azure VM, creates a Kind cluster, builds/pushes operator images to an intra-VM registry, and executes integration tests. The test.sh script handles repository checkout, tooling installation, Azure resource provisioning, SSH networking, Kind cluster setup, and test execution. |
| Step Registry - Deprovision Ref: `ci-operator/step-registry/trusted-execution-clusters/ref/operator/azure-deprovision/{OWNERS, deprovision.metadata.json, deprovision.yaml, deprovision.sh}` | Defines `trusted-execution-clusters-ref-operator-azure-deprovision` step ref for cleanup. The deprovision.sh script validates environment, authenticates to Azure via service principal credentials, and deletes the provisioned resource group. |
| Chain Definitions: `ci-operator/step-registry/trusted-execution-clusters/operator-azure/lifecycle/{OWNERS, azure.metadata.json, azure.yaml}`, `ci-operator/step-registry/trusted-execution-clusters/operator-azure/cleanup/{OWNERS, trusted-execution-clusters-operator-azure-cleanup.metadata.json, trusted-execution-clusters-operator-azure-cleanup.yaml}` | Defines `trusted-execution-clusters-operator-lifecycle-azure` chain (runs test step) and `trusted-execution-clusters-operator-azure-cleanup` chain (runs deprovision step) with ownership metadata. |
| Operator Configuration & Ownership: `ci-operator/config/trusted-execution-clusters/operator/azure.yaml`, `ci-operator/step-registry/trusted-execution-clusters/operator-azure/OWNERS` | Main Azure CI operator config that sets base image stream (rhel-9-golang-1.25), resource limits (4Gi memory), and registers the lifecycle-azure verification test with cleanup. Top-level OWNERS file assigns reviewers and approvers. |

Sequence Diagram

sequenceDiagram
    participant Prow as Prow CI
    participant Test as Test Step
    participant Azure as Azure Cloud
    participant VM as Azure VM
    participant Registry as Local Registry
    participant Kind as Kind Cluster
    participant Cleanup as Cleanup Step

    Prow->>Test: Trigger test step
    Test->>Test: Validate SHARED_DIR
    Test->>Test: Checkout repository
    Test->>Test: Install tooling (azure-cli, kind, jq)
    Test->>Azure: Authenticate with service principal
    Test->>Azure: Create resource group & VM
    Test->>Azure: Write resource group name to SHARED_DIR
    Test->>VM: Wait for VM readiness
    Test->>VM: Configure SSH & NSG inbound rules
    Test->>VM: Transfer source & provisioning script
    VM->>VM: Install Docker & extend LVM
    VM->>VM: Install kind, kubectl, yq
    Test->>Kind: Create Kind cluster on VM
    Kind->>Kind: Wait for cluster readiness
    Test->>Test: Retrieve kubeconfig & update API endpoint
    Test->>Registry: Build & push operator images
    Registry->>Registry: Store images on VM
    Test->>Kind: Run integration tests
    Kind->>Kind: Execute test suite

    Prow->>Cleanup: Trigger cleanup step
    Cleanup->>Cleanup: Validate SHARED_DIR & resource group
    Cleanup->>Cleanup: Authenticate to Azure
    Cleanup->>Azure: Delete resource group
    Azure->>VM: Deprovision VM & resources

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

🚥 Pre-merge checks | ✅ 11 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (11 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately summarizes the main change: adding Azure upstream CI infrastructure for the trusted-execution-clusters operator. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Stable And Deterministic Test Names | ✅ Passed | This PR adds CI infrastructure and shell scripts only, not Ginkgo test definitions. Actual test code resides in the external operator repository and is invoked via make. The check is not applicable. |
| Test Structure And Quality | ✅ Passed | Custom check for Ginkgo test quality is not applicable to this PR. The PR adds CI configuration, shell scripts, and metadata files only; no Ginkgo test code is present. |
| Microshift Test Compatibility | ✅ Passed | No new Ginkgo e2e tests are added in this PR. The PR only adds CI infrastructure (config, metadata, OWNERS, and provisioning scripts). The custom check is not applicable to CI configuration files. |
| Single Node Openshift (Sno) Test Compatibility | ✅ Passed | This PR adds CI infrastructure (shell scripts, YAML configs, OWNERS files) for Azure upstream CI. No Ginkgo e2e test definitions are added. The custom check is not applicable. |
| Topology-Aware Scheduling Compatibility | ✅ Passed | PR adds only CI/CD configuration and test scripts. No deployment manifests, operator code, or controllers are present. Topology-aware scheduling check is not applicable. |
| Ote Binary Stdout Contract | ✅ Passed | Check not applicable. PR adds CI config (YAML, JSON, shell scripts) to openshift/release, not OTE binaries. No Go process-level entry points are added. |
| Ipv6 And Disconnected Network Test Compatibility | ✅ Passed | No new Ginkgo e2e tests are added in this PR. PR adds CI infrastructure (YAML, shell scripts) to run existing tests on Azure. Check not applicable. |

openshift-merge-bot · 2026-05-08T09:16:24Z

@Jakob-Naucke, pj-rehearse: unable to determine affected jobs. This could be due to a branch that needs to be rebased. ERROR:

could not determine changed registry steps: could not load step registry: invalid file name: /var/tmp/gitrepo131959482/ci-operator/step-registry/trusted-execution-clusters/operator/cleanup/trusted-execution-clusters-operator-cleanup-azure.yaml
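The rejected file name does not follow from the directory it sits in. Judging only by the file names CodeRabbit lists for this PR (for example `operator-azure/cleanup/trusted-execution-clusters-operator-azure-cleanup-chain.yaml`), the loader appears to expect the directory path joined with dashes plus a suffix for the component kind. The sketch below encodes that inferred rule. The helper name `expected_registry_file` is hypothetical, and the rule is an assumption read off those file names rather than taken from the step registry documentation.

```shell
#!/bin/bash
# Sketch: derive the file name a step-registry component would be expected to
# carry, given its directory (relative to ci-operator/step-registry) and kind.
# ASSUMPTION: the "path joined with dashes + kind suffix" rule is inferred from
# the file names listed in this PR; expected_registry_file is hypothetical.
expected_registry_file() {
  local rel_dir="$1" kind="$2" base
  # Turn every path separator into a dash.
  base=$(printf '%s' "$rel_dir" | tr '/' '-')
  case "$kind" in
    ref|chain) printf '%s-%s.yaml\n' "$base" "$kind" ;;
    commands)  printf '%s-commands.sh\n' "$base" ;;
    *) echo "unknown kind: $kind" >&2; return 1 ;;
  esac
}
```

Under that assumption, `expected_registry_file trusted-execution-clusters/operator-azure/cleanup chain` prints `trusted-execution-clusters-operator-azure-cleanup-chain.yaml`, which carries the `-chain` suffix that the rejected name lacks.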

Interacting with pj-rehearse

Comment: /pj-rehearse to run up to 5 rehearsals
Comment: /pj-rehearse skip to opt-out of rehearsals
Comment: /pj-rehearse {test-name}, with each test separated by a space, to run one or more specific rehearsals
Comment: /pj-rehearse more to run up to 10 rehearsals
Comment: /pj-rehearse max to run up to 25 rehearsals
Comment: /pj-rehearse auto-ack to run up to 5 rehearsals, and add the rehearsals-ack label on success
Comment: /pj-rehearse list to get an up-to-date list of affected jobs
Comment: /pj-rehearse abort to abort all active rehearsals
Comment: /pj-rehearse network-access-allowed to allow rehearsals of tests that have the restrict_network_access field set to false. This must be executed by an openshift org member who is not the PR author

Once you are satisfied with the results of the rehearsals, comment: /pj-rehearse ack to unblock merge. When the rehearsals-ack label is present on your PR, merge will no longer be blocked by rehearsals.
If you would like the rehearsals-ack label removed, comment: /pj-rehearse reject to re-block merging.

openshift-merge-bot · 2026-05-08T09:22:10Z

@Jakob-Naucke, pj-rehearse: unable to determine affected jobs. This could be due to a branch that needs to be rebased. ERROR:

could not determine changed registry steps: could not load step registry: invalid file name: /var/tmp/gitrepo1196305233/ci-operator/step-registry/trusted-execution-clusters/operator/cleanup/trusted-execution-clusters-operator-cleanup-azure.yaml

openshift-ci · 2026-05-08T13:51:22Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: Jakob-Naucke

Needs approval from an approver in each of these files:

~~ci-operator/config/trusted-execution-clusters/operator/OWNERS~~ [Jakob-Naucke]
~~ci-operator/jobs/trusted-execution-clusters/operator/OWNERS~~ [Jakob-Naucke]
~~ci-operator/step-registry/trusted-execution-clusters/OWNERS~~ [Jakob-Naucke]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

openshift-merge-bot · 2026-05-08T13:53:51Z

@Jakob-Naucke, pj-rehearse: unable to determine affected jobs. This could be due to a branch that needs to be rebased. ERROR:

could not determine changed registry steps: could not load step registry: invalid file name: /var/tmp/gitrepo3321177835/ci-operator/step-registry/trusted-execution-clusters/operator-azure/cleanup/trusted-execution-clusters-operator-azure-cleanup.yaml

coderabbitai

Actionable comments posted: 7

🧹 Nitpick comments (1)

ci-operator/step-registry/trusted-execution-clusters/ref/operator/azure-test/test.sh (1)
48-48: 💤 Low value

Brittle parse of KIND_HOST_URN from the Makefile.

grep KIND_HOST_URN operator/Makefile | cut -d= -f2 | tr -d ' ' will silently misbehave if the variable is referenced anywhere else (a comment, $(KIND_HOST_URN) usage, a derived variable, etc.) — the result becomes a multiline blob fed straight to az vm create --image. Same caveat applies to the awk '/kindest/ {print $NF}' Cargo.toml on line 142.

Consider asking make to print the resolved value, e.g. make -s -C operator print-KIND_HOST_URN with a small print-%: ; @echo $($*) rule, or anchor the grep to the start of the line (grep '^KIND_HOST_URN[[:space:]]*=').
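The anchored-grep variant could look roughly like the sketch below. It keeps the original `cut`/`tr` pipeline but matches only lines that start with the variable name, and takes the first hit so the result is always a single line. `read_make_var` is a hypothetical helper, and the sketch assumes the variable is assigned on one line with `=`, `:=` or `?=`.

```shell
#!/bin/bash
# Sketch: read one Makefile variable with a line-anchored match, so comments
# and $(VAR) references elsewhere in the file are ignored.
# ASSUMPTIONS: read_make_var is a hypothetical helper; the variable is assigned
# on a single line with =, := or ?=; only the first assignment is used.
read_make_var() {
  local name="$1" file="$2"
  grep -E "^${name}[[:space:]]*[:?]?=" "$file" \
    | head -n 1 \
    | cut -d= -f2- \
    | tr -d ' '
}
```

A call such as `kind_vm_image=$(read_make_var KIND_HOST_URN operator/Makefile)` would then yield at most one line; an empty result still needs to be checked by the caller before it reaches `az vm create --image`.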
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@ci-operator/step-registry/trusted-execution-clusters/ref/operator/azure-test/test.sh`
at line 48, The current extraction of KIND_HOST_URN into kind_vm_image using a
loose grep is brittle; update the script to obtain the resolved Makefile
variable instead of grepping (e.g. invoke make -s -C operator
print-KIND_HOST_URN relying on a simple print-%: ; `@echo` $($*) rule added to the
Makefile) or at minimum anchor the grep to the line start (grep
'^KIND_HOST_URN[[:space:]]*=') to avoid matching comments/usages; likewise
replace the fragile awk '/kindest/ {print $NF}' Cargo.toml usage with a more
robust query or anchored grep/Make print target that returns the single intended
kindest image value so that kind_vm_image and the Cargo.toml-derived image are
always single, resolved strings passed to az vm create.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In
`@ci-operator/step-registry/trusted-execution-clusters/operator-azure/lifecycle/azure.yaml`:
- Line 2: The chain name in the YAML uses
"trusted-execution-clusters-operator-lifecycle-azure" which doesn't match the
consumer's expected chain name
"trusted-execution-clusters-operator-azure-lifecycle"; update the `as:` value to
the consumer-facing identifier
"trusted-execution-clusters-operator-azure-lifecycle" (or alternatively update
the consumer to use the current name) so chain resolution succeeds—look for the
`as:` entry in this file and replace the existing string with the exact
consumer-config name.

In
`@ci-operator/step-registry/trusted-execution-clusters/ref/operator/azure-deprovision/deprovision.sh`:
- Around line 4-7: The guard that checks SHARED_DIR uses direct expansion under
nounset mode (set -u), which can cause an unbound-variable error; update the if
condition to use a safe default expansion like ${SHARED_DIR:-} (e.g., if [ -z
"${SHARED_DIR:-}" ]; then ...) so the check works even when SHARED_DIR is unset,
and leave the existing error/exit behavior unchanged.
- Around line 9-27: Make the cleanup script idempotent: if the marker file
"${SHARED_DIR}/az-resource-group" is missing, exit 0 instead of failing; if
present, read az_resource_group and call "az group exists --name
\"$az_resource_group\"" and use its stdout ("true"/"false") to decide whether to
delete; only run "az group delete --name \"$az_resource_group\" --yes" when
exists returns true and print a success message for both "deleted" and "already
absent" cases; keep the existing az login (secret_base/client-id, client-secret,
tenant-id) flow but ensure missing marker file is treated as no-op and use "az
group exists" to guard deletion.

In
`@ci-operator/step-registry/trusted-execution-clusters/ref/operator/azure-test/test.sh`:
- Around line 43-61: The script writes the resource group name to
$SHARED_DIR/az-resource-group before authentication and creation, which can
cause cleanup to try deleting a non-existent RG; move the write so it happens
only after az login and a successful az group create (i.e., after the az group
create call that uses az_resource_group), or alternatively make the downstream
deprovision tolerant of a missing group (check az group exists or handle "not
found" in azure-deprovision/deprovision.sh). Update the sequence around az
login, az group create and the SHARED_DIR/az-resource-group write so the file is
persisted only on success.
- Around line 20-23: The guard checking SHARED_DIR fails under nounset (-u)
because ${SHARED_DIR} is expanded when unset; change the conditional to use a
safe default expansion such as if [ -z "${SHARED_DIR:-}" ]; then so the test can
run even if SHARED_DIR is unset, keep the existing log_error and exit 1
behavior, and update any other checks that reference SHARED_DIR in this script
(e.g., replace bare ${SHARED_DIR} expansions with ${SHARED_DIR:-} or
${SHARED_DIR:+...} as appropriate) so the script respects the #!/bin/bash -eu
shebang.
- Around line 186-193: The ssh-agent is started by eval "$(ssh-agent)" but no
key is added, so either remove the ssh-agent startup or add an ssh-add step
before calling make integration-tests; update the test script around eval
"$(ssh-agent)" and the make integration-tests invocation to either (1) drop the
eval line entirely (matching
trusted-execution-clusters-ref-operator-test-commands.sh) if agent auth is not
required, or (2) call ssh-add with the correct private key (e.g., ssh-add
"$SSH_KEY_PATH" or ssh-add "$secret_base/id_rsa") and ensure SSH_AUTH_SOCK is
set so make integration-tests can use agent auth. Ensure the change touches the
block containing eval "$(ssh-agent)" and the make integration-tests invocation
so the agent is either populated or removed.
- Around line 91-104: The NSG rule loop currently opens port 6443 to the
internet by using --source-address-prefixes "*" in the az network nsg rule
create call; change the logic in the ports/loop (ports=(6443 8000 8080) and the
for loop that calls az network nsg rule create) to scope
--source-address-prefixes for the kube-apiserver port (6443) to a specific
origin (e.g., fetch the CI host or Prow egress public IP and use it as CIDR/32)
while keeping the other ports as-is; locate the az network nsg rule create
invocation and add conditional handling so when port == 6443 you substitute the
fetched public IP (or a configured allowed CIDR) instead of "*" for
--source-address-prefixes.
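The last item above (scoping port 6443) can be sketched as a small decision helper whose result would be passed to `--source-address-prefixes` in the existing `az network nsg rule create` call. The function names and the allowed-CIDR argument are hypothetical, and how the CI egress address is discovered is deliberately left open. The sketch prints a plan instead of calling `az`.

```shell
#!/bin/bash
# Sketch: pick the NSG source prefix per port so that only the kube-apiserver
# port is restricted to a known origin.
# ASSUMPTIONS: source_prefix_for_port, print_nsg_plan and the allowed CIDR
# argument are hypothetical; discovering the CI egress IP is out of scope.
source_prefix_for_port() {
  local port="$1" allowed_cidr="$2"
  if [ "$port" = "6443" ]; then
    printf '%s\n' "$allowed_cidr"   # kube-apiserver: one origin only
  else
    printf '%s\n' '*'               # other ports stay as they are today
  fi
}

# Dry run: print "<port> <prefix>" for each rule instead of calling az.
print_nsg_plan() {
  local allowed_cidr="$1" port
  for port in 6443 8000 8080; do
    printf '%s %s\n' "$port" "$(source_prefix_for_port "$port" "$allowed_cidr")"
  done
}
```

For example, `print_nsg_plan 203.0.113.7/32` (a documentation-range placeholder address) lists 6443 with that CIDR and the two remaining ports with `*`.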

ℹ️ Review info

⚙️ Run configuration

Configuration used: Repository YAML (base), Central YAML (inherited)

Review profile: CHILL

Plan: Enterprise

Run ID: 1f4cff19-27c8-448b-97ac-a00ce96abb6c

📥 Commits

Reviewing files that changed from the base of the PR and between be1c46c and 904bd57.

⛔ Files ignored due to path filters (1)

ci-operator/jobs/trusted-execution-clusters/operator/trusted-execution-clusters-operator-main-presubmits.yaml is excluded by !ci-operator/jobs/**

📒 Files selected for processing (16)

ci-operator/config/trusted-execution-clusters/operator/azure.yaml
ci-operator/step-registry/trusted-execution-clusters/operator-azure/OWNERS
ci-operator/step-registry/trusted-execution-clusters/operator-azure/cleanup/OWNERS
ci-operator/step-registry/trusted-execution-clusters/operator-azure/cleanup/trusted-execution-clusters-operator-azure-cleanup.metadata.json
ci-operator/step-registry/trusted-execution-clusters/operator-azure/cleanup/trusted-execution-clusters-operator-azure-cleanup.yaml
ci-operator/step-registry/trusted-execution-clusters/operator-azure/lifecycle/OWNERS
ci-operator/step-registry/trusted-execution-clusters/operator-azure/lifecycle/azure.metadata.json
ci-operator/step-registry/trusted-execution-clusters/operator-azure/lifecycle/azure.yaml
ci-operator/step-registry/trusted-execution-clusters/ref/operator/azure-deprovision/OWNERS
ci-operator/step-registry/trusted-execution-clusters/ref/operator/azure-deprovision/deprovision.metadata.json
ci-operator/step-registry/trusted-execution-clusters/ref/operator/azure-deprovision/deprovision.sh
ci-operator/step-registry/trusted-execution-clusters/ref/operator/azure-deprovision/deprovision.yaml
ci-operator/step-registry/trusted-execution-clusters/ref/operator/azure-test/OWNERS
ci-operator/step-registry/trusted-execution-clusters/ref/operator/azure-test/test.metadata.json
ci-operator/step-registry/trusted-execution-clusters/ref/operator/azure-test/test.sh
ci-operator/step-registry/trusted-execution-clusters/ref/operator/azure-test/test.yaml

coderabbitai · 2026-05-08T13:57:06Z

+if [ -z "${SHARED_DIR}" ]; then
+  echo "[ERROR] SHARED_DIR is not set. This script must run in Prow CI environment."
+  exit 1
+fi


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Verify the pattern is nounset-safe
rg -n '^\#!/bin/bash -eu|SHARED_DIR' ci-operator/step-registry/trusted-execution-clusters/ref/operator/azure-deprovision/deprovision.sh
# Repro of current behavior with nounset:
env -i bash -eu -c '[ -z "${SHARED_DIR}" ]' || true

Repository: openshift/release

Length of output: 444

Guard SHARED_DIR safely under set -u.

Line 4 expands ${SHARED_DIR} directly under nounset mode (-u), causing an unbound-variable error before the intended guard check runs. Use a default expansion to allow the condition to evaluate properly when the variable is unset.

Suggested fix

-if [ -z "${SHARED_DIR}" ]; then
+if [ -z "${SHARED_DIR:-}" ]; then
   echo "[ERROR] SHARED_DIR is not set. This script must run in Prow CI environment."
   exit 1
 fi
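The difference between the two guards can be reproduced locally with a small sketch that runs each one in a child `bash -eu` with `SHARED_DIR` unset. `run_unset` is a hypothetical helper used only for this demonstration.

```shell
#!/bin/bash
# Sketch: run a snippet in a child bash started with -eu and SHARED_DIR unset,
# then report whether the child finished or was aborted by nounset.
# ASSUMPTION: run_unset is a hypothetical helper for this demo only.
run_unset() {
  if bash -euc "unset SHARED_DIR; $1" 2>/dev/null; then
    echo "ran to completion"
  else
    echo "aborted"
  fi
}

# Bare expansion: the child exits on the unbound variable before the test runs.
run_unset '[ -z "${SHARED_DIR}" ] && echo "guard fired"'
# prints: aborted

# Default expansion: the test sees an empty string and the guard fires.
run_unset '[ -z "${SHARED_DIR:-}" ] && echo "guard fired"'
# prints: guard fired
#         ran to completion
```

In the real script both paths end in a non-zero exit, so the practical gain from `${SHARED_DIR:-}` is that the intended `[ERROR]` message is printed instead of bash's generic unbound-variable error.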

coderabbitai · 2026-05-08T13:57:06Z

+if [ ! -f "${SHARED_DIR}/az-resource-group" ]; then
+  echo "[ERROR] az-resource-group was not placed in SHARED_DIR"
+  exit 1
+fi
+
+rpm --import https://packages.microsoft.com/keys/microsoft.asc
+dnf install -y https://packages.microsoft.com/config/rhel/9.0/packages-microsoft-prod.rpm
+dnf install -y azure-cli
+
+secret_base=/var/run/azure-upstream-ci
+az login --service-principal \
+  --username "$(cat $secret_base/client-id)" \
+  --password "$(cat $secret_base/client-secret)" \
+  --tenant "$(cat $secret_base/tenant-id)"
+
+az_resource_group=$(cat "${SHARED_DIR}/az-resource-group")
+echo "[INFO] Delete Kind VM resource group $az_resource_group"
+az group delete --name "$az_resource_group" --yes
+echo "[SUCCESS] Deleted Kind VM resource group $az_resource_group"


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

🧩 Analysis chain

🌐 Web query:

`For Azure CLI, what are the documented outputs/exit semantics of:

az group exists --name <group>

az group delete --name <group> --yes
especially when the resource group does not exist?`

💡 Result:

1) az group exists --name <group>
- Documented output: emits the string true or false (default output format is JSON, but for human use it prints true/false).[1][2]
- Exit semantics when the resource group does not exist: the command can return exit code 0 even when the output is false (i.e., scripts should not rely on exit code; they should inspect stdout).[3]
- Note: if you’re using --output json then you’ll get a JSON boolean, but the “false vs true” meaning remains the same.[1]

2) az group delete --name <group> --yes
- Documented behavior: --yes only suppresses the interactive confirmation prompt; it doesn’t document any special handling for “group does not exist”.[1]
- Exit semantics when the resource group does not exist: the authoritative behavior (specific exit code) is not clearly stated in the official command docs we found; instead, the underlying Azure Resource Manager delete operation treats HTTP 404 responses as successful for deletions (Resource Manager issues GET and expects 404; it considers deletion successful when it gets 404).[4]
- Practically, this means: if Azure returns “not found” for the delete, it is treated as successful by ARM deletion logic, so you should expect the CLI not to fail the operation purely due to “already missing”, but the exact CLI exit code isn’t explicitly documented in the sources above.[4][5]

Bottom line for scripting
- Use stdout (true/false) for az group exists rather than $?/exit code.[3]
- For az group delete --yes, treat “resource group not found” as a successful/no-op at the ARM level (404 is treated as success), but if you need an exact exit code guarantee, you’ll have to test in your environment since the CLI docs we located don’t spell out the specific exit code for the not-found case.[4][1]

Citations:

1: https://learn.microsoft.com/en-us/cli/azure/group?view=azure-cli-latest

2: https://stackoverflow.com/questions/46458034/azure-cli-how-to-check-if-a-resource-exists

3: az group exists does not return false Azure/azure-cli#8594

4: https://learn.microsoft.com/en-us/azure/azure-resource-manager/management/delete-resource-group?tabs=azure-powershell

5: https://learn.microsoft.com/en-us/azure/azure-resource-manager/management/delete-resource-group

Make the cleanup script idempotent to avoid failures in post-steps.

The script hard-fails when the marker file is missing (exits with code 1) and deletes unconditionally without checking if the resource group still exists. In post-step execution, this causes noisy failures when these conditions aren't met. The script should treat "nothing to delete" and "already deleted" as success states.

The suggested changes are valid:

Exit 0 when the marker file is missing (the operation has nothing to do, which is not an error in post-cleanup)

Check if the resource group exists before deletion using az group exists, which reliably returns true or false to stdout (exit code cannot be used reliably for this check)

Handle the case where the resource group is already absent with a success message

This ensures the post-step completes without error whether resources were previously provisioned or already cleaned up.
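The stdout-based check described above can be sketched roughly as follows. The helper name `group_reported_present` is hypothetical and not part of the PR; it only interprets the text that `az group exists` prints, and the commented usage assumes an installed, logged-in `az` CLI.

```shell
# Hypothetical helper (not from the PR): interpret the stdout of
# `az group exists`, which prints "true" or "false".
group_reported_present() {
  # Strip whitespace and newlines so "true\n" still compares equal.
  answer=$(printf '%s' "$1" | tr -d '[:space:]')
  [ "$answer" = "true" ]
}

# Assumed usage inside the cleanup step (requires a logged-in az CLI):
#   if group_reported_present "$(az group exists --name "$az_resource_group")"; then
#     az group delete --name "$az_resource_group" --yes
#   else
#     echo "[INFO] Resource group already absent; nothing to delete."
#   fi
```

Keeping the decision in a small function makes it easy to exercise both branches without touching Azure.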

Suggested fix

-if [ ! -f "${SHARED_DIR}/az-resource-group" ]; then
-  echo "[ERROR] az-resource-group was not placed in SHARED_DIR"
-  exit 1
+if [ ! -f "${SHARED_DIR}/az-resource-group" ]; then
+  echo "[WARN] az-resource-group not found; skipping Azure cleanup."
+  exit 0
 fi
@@
-az_resource_group=$(cat "${SHARED_DIR}/az-resource-group")
+az_resource_group="$(<"${SHARED_DIR}/az-resource-group")"
+if [ -z "${az_resource_group}" ]; then
+  echo "[WARN] Empty az-resource-group; skipping Azure cleanup."
+  exit 0
+fi
 echo "[INFO] Delete Kind VM resource group $az_resource_group"
-az group delete --name "$az_resource_group" --yes
+if [ "$(az group exists --name "$az_resource_group")" = "true" ]; then
+  az group delete --name "$az_resource_group" --yes
+else
+  echo "[INFO] Resource group $az_resource_group already absent; nothing to delete."
+fi
 echo "[SUCCESS] Deleted Kind VM resource group $az_resource_group"

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

-if [ ! -f "${SHARED_DIR}/az-resource-group" ]; then
-  echo "[ERROR] az-resource-group was not placed in SHARED_DIR"
-  exit 1
-fi
-rpm --import https://packages.microsoft.com/keys/microsoft.asc
-dnf install -y https://packages.microsoft.com/config/rhel/9.0/packages-microsoft-prod.rpm
-dnf install -y azure-cli
-secret_base=/var/run/azure-upstream-ci
-az login --service-principal \
-  --username "$(cat $secret_base/client-id)" \
-  --password "$(cat $secret_base/client-secret)" \
-  --tenant "$(cat $secret_base/tenant-id)"
-az_resource_group=$(cat "${SHARED_DIR}/az-resource-group")
-echo "[INFO] Delete Kind VM resource group $az_resource_group"
-az group delete --name "$az_resource_group" --yes
-echo "[SUCCESS] Deleted Kind VM resource group $az_resource_group"
+if [ ! -f "${SHARED_DIR}/az-resource-group" ]; then
+  echo "[WARN] az-resource-group not found; skipping Azure cleanup."
+  exit 0
+fi
+rpm --import https://packages.microsoft.com/keys/microsoft.asc
+dnf install -y https://packages.microsoft.com/config/rhel/9.0/packages-microsoft-prod.rpm
+dnf install -y azure-cli
+secret_base=/var/run/azure-upstream-ci
+az login --service-principal \
+  --username "$(cat $secret_base/client-id)" \
+  --password "$(cat $secret_base/client-secret)" \
+  --tenant "$(cat $secret_base/tenant-id)"
+az_resource_group="$(<"${SHARED_DIR}/az-resource-group")"
+if [ -z "${az_resource_group}" ]; then
+  echo "[WARN] Empty az-resource-group; skipping Azure cleanup."
+  exit 0
+fi
+echo "[INFO] Delete Kind VM resource group $az_resource_group"
+if [ "$(az group exists --name "$az_resource_group")" = "true" ]; then
+  az group delete --name "$az_resource_group" --yes
+else
+  echo "[INFO] Resource group $az_resource_group already absent; nothing to delete."
+fi
+echo "[SUCCESS] Deleted Kind VM resource group $az_resource_group"

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@ci-operator/step-registry/trusted-execution-clusters/ref/operator/azure-deprovision/deprovision.sh` around lines 9 - 27, Make the cleanup script idempotent: if the marker file "${SHARED_DIR}/az-resource-group" is missing, exit 0 instead of failing; if present, read az_resource_group and call "az group exists --name \"$az_resource_group\"" and use its stdout ("true"/"false") to decide whether to delete; only run "az group delete --name \"$az_resource_group\" --yes" when exists returns true and print a success message for both "deleted" and "already absent" cases; keep the existing az login (secret_base/client-id, client-secret, tenant-id) flow but ensure missing marker file is treated as no-op and use "az group exists" to guard deletion.

coderabbitai · 2026-05-08T13:57:06Z

+if [ -z "${SHARED_DIR}" ]; then
+  log_error "SHARED_DIR is not set. This script must run in Prow CI environment."
+  exit 1
+fi


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

SHARED_DIR guard fails under set -u.

Because the shebang is #!/bin/bash -eu, ${SHARED_DIR} is expanded under nounset. If SHARED_DIR is unset, this line aborts with SHARED_DIR: unbound variable before the helpful log_error message ever runs, defeating the purpose of the check.

🛡️ Use a default expansion so the guard can actually fire

-if [ -z "${SHARED_DIR}" ]; then
+if [ -z "${SHARED_DIR:-}" ]; then
   log_error "SHARED_DIR is not set. This script must run in Prow CI environment."
   exit 1
 fi
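A minimal sketch of why the default expansion matters under nounset. The function name `require_shared_dir` is hypothetical, and a plain `echo` stands in for the script's `log_error` helper so the snippet is self-contained.

```shell
set -eu

# Hypothetical guard (not from the PR): with nounset active, "${SHARED_DIR}"
# would abort the shell when the variable is unset, while "${SHARED_DIR:-}"
# expands to an empty string so the check can report the problem itself.
require_shared_dir() {
  if [ -z "${SHARED_DIR:-}" ]; then
    echo "[ERROR] SHARED_DIR is not set. This script must run in Prow CI environment." >&2
    return 1
  fi
}
```

Returning a status instead of calling `exit` directly leaves the caller free to decide how to fail, for example `require_shared_dir || exit 1`.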

📝 Committable suggestion

Suggested change

-if [ -z "${SHARED_DIR}" ]; then
-  log_error "SHARED_DIR is not set. This script must run in Prow CI environment."
-  exit 1
-fi
+if [ -z "${SHARED_DIR:-}" ]; then
+  log_error "SHARED_DIR is not set. This script must run in Prow CI environment."
+  exit 1
+fi

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@ci-operator/step-registry/trusted-execution-clusters/ref/operator/azure-test/test.sh` around lines 20 - 23, The guard checking SHARED_DIR fails under nounset (-u) because ${SHARED_DIR} is expanded when unset; change the conditional to use a safe default expansion such as if [ -z "${SHARED_DIR:-}" ]; then so the test can run even if SHARED_DIR is unset, keep the existing log_error and exit 1 behavior, and update any other checks that reference SHARED_DIR in this script (e.g., replace bare ${SHARED_DIR} expansions with ${SHARED_DIR:-} or ${SHARED_DIR:+...} as appropriate) so the script respects the #!/bin/bash -eu shebang.

coderabbitai · 2026-05-08T13:57:07Z

+az_region=eastus
+az_resource_group=upstream-ci-$test_id
+echo "$az_resource_group" > "$SHARED_DIR/az-resource-group"
+kind_vm_user=ci
+kind_vm_name=kind-vm
+kind_vm_image=$(grep KIND_HOST_URN operator/Makefile | cut -d= -f2 | tr -d ' ')
+vm_size=Standard_D2s_v3
+
+AZURE_SUBSCRIPTION_ID=$(cat $secret_base/subscription-id)
+export AZURE_SUBSCRIPTION_ID
+az login --service-principal \
+  --username "$(cat $secret_base/client-id)" \
+  --password "$(cat $secret_base/client-secret)" \
+  --tenant "$(cat $secret_base/tenant-id)"
+
+log_info "Create Azure resource group $az_resource_group"
+az group create \
+  --location $az_region \
+  --resource-group "$az_resource_group"


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Resource group name is recorded before the group actually exists.

echo "$az_resource_group" > "$SHARED_DIR/az-resource-group" (line 45) runs before az login and az group create (lines 53–61). If login or group creation fails, the cleanup step downstream will attempt to delete a resource group that was never created. Depending on how azure-deprovision/deprovision.sh is implemented, this can either be a benign 404 or a failing cleanup that masks the original error.

Recommend persisting the RG name only after a successful az group create, or making the deprovision step explicitly tolerate a missing group (e.g. az group exists first, or az group delete --no-wait and ignore "not found").
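One way to express the "persist only after success" ordering is a small wrapper, sketched below. The helper name `record_after_success` is hypothetical and not part of the PR; the commented usage reuses the variable names from the snippet under review.

```shell
# Hypothetical helper (not from the PR): run a command and write a marker
# file only if that command succeeded.
record_after_success() {
  marker_file="$1"
  marker_value="$2"
  shift 2
  if "$@"; then
    printf '%s\n' "$marker_value" > "$marker_file"
  else
    status=$?
    echo "[WARN] '$1' failed; not writing $marker_file" >&2
    return "$status"
  fi
}

# Assumed usage in test.sh (requires a logged-in az CLI):
#   record_after_success "$SHARED_DIR/az-resource-group" "$az_resource_group" \
#     az group create --location "$az_region" --resource-group "$az_resource_group"
```

With this ordering the cleanup step only sees a marker file when the group was actually created, at the cost of losing the marker if creation succeeds on the Azure side but the CLI call reports failure.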

♻️ Move the SHARED_DIR write to after group creation

 az_resource_group=upstream-ci-$test_id
-echo "$az_resource_group" > "$SHARED_DIR/az-resource-group"
 kind_vm_user=ci
@@
 log_info "Create Azure resource group $az_resource_group"
 az group create \
   --location $az_region \
   --resource-group "$az_resource_group"
+echo "$az_resource_group" > "$SHARED_DIR/az-resource-group"

📝 Committable suggestion

Suggested change

-az_region=eastus
-az_resource_group=upstream-ci-$test_id
-echo "$az_resource_group" > "$SHARED_DIR/az-resource-group"
-kind_vm_user=ci
-kind_vm_name=kind-vm
-kind_vm_image=$(grep KIND_HOST_URN operator/Makefile | cut -d= -f2 | tr -d ' ')
-vm_size=Standard_D2s_v3
-AZURE_SUBSCRIPTION_ID=$(cat $secret_base/subscription-id)
-export AZURE_SUBSCRIPTION_ID
-az login --service-principal \
-  --username "$(cat $secret_base/client-id)" \
-  --password "$(cat $secret_base/client-secret)" \
-  --tenant "$(cat $secret_base/tenant-id)"
-log_info "Create Azure resource group $az_resource_group"
-az group create \
-  --location $az_region \
-  --resource-group "$az_resource_group"
+az_region=eastus
+az_resource_group=upstream-ci-$test_id
+kind_vm_user=ci
+kind_vm_name=kind-vm
+kind_vm_image=$(grep KIND_HOST_URN operator/Makefile | cut -d= -f2 | tr -d ' ')
+vm_size=Standard_D2s_v3
+AZURE_SUBSCRIPTION_ID=$(cat $secret_base/subscription-id)
+export AZURE_SUBSCRIPTION_ID
+az login --service-principal \
+  --username "$(cat $secret_base/client-id)" \
+  --password "$(cat $secret_base/client-secret)" \
+  --tenant "$(cat $secret_base/tenant-id)"
+log_info "Create Azure resource group $az_resource_group"
+az group create \
+  --location $az_region \
+  --resource-group "$az_resource_group"
+echo "$az_resource_group" > "$SHARED_DIR/az-resource-group"

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@ci-operator/step-registry/trusted-execution-clusters/ref/operator/azure-test/test.sh` around lines 43 - 61, The script writes the resource group name to $SHARED_DIR/az-resource-group before authentication and creation, which can cause cleanup to try deleting a non-existent RG; move the write so it happens only after az login and a successful az group create (i.e., after the az group create call that uses az_resource_group), or alternatively make the downstream deprovision tolerant of a missing group (check az group exists or handle "not found" in azure-deprovision/deprovision.sh). Update the sequence around az login, az group create and the SHARED_DIR/az-resource-group write so the file is persisted only on success.

coderabbitai · 2026-05-08T13:57:07Z

+ports=(6443 8000 8080)
+for i in "${!ports[@]}"; do
+  port=${ports[$i]}
+  az network nsg rule create \
+    --resource-group "$az_resource_group" \
+    --nsg-name "$nsg" \
+    --name "allow-$port" \
+    --priority $((1001 + i)) \
+    --source-address-prefixes "*" \
+    --destination-port-ranges "$port" \
+    --protocol Tcp \
+    --access Allow \
+    --direction Inbound
+done


⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

🧩 Analysis chain

🌐 Web query:

Azure CLI az network nsg rule create source-address-prefixes scope to single IP example

💡 Result:

In Azure CLI, you set the source match for an NSG rule with --source-address-prefixes. It takes a space-separated list of CIDR prefixes or IP ranges, so a single IP works when you pass just one value [1].

Single IP example:

az network nsg rule create \ --resource-group \ --nsg-name \ --name \ --priority \ --direction Inbound \ --access Allow \ --protocol Tcp \ --source-address-prefixes 203.0.113.10 \ --destination-address-prefixes '*' \ --destination-port-ranges 22 [1]

Multiple IPs/prefixes: pass multiple values as separate CLI arguments (space-separated) [1].

Common pitfall (variables): if you’re using a shell that passes the variable as a single token, ensure the CLI receives multiple tokens (e.g., an array expansion / separate arguments). If you pass a single string that contains embedded commas/spaces, it may be treated as one invalid address prefix and fail validation [2].

Citations:

1: https://learn.microsoft.com/en-us/cli/azure/network/nsg/rule?view=azure-cli-latest

2: https://stackoverflow.com/questions/62654936/azure-network-security-group-add-source-ip-prefixes-using-variable

🏁 Script executed:

# Read the test.sh file to understand the context
head -110 ci-operator/step-registry/trusted-execution-clusters/ref/operator/azure-test/test.sh | tail -40

Repository: openshift/release

Length of output: 1168

🏁 Script executed:

# Get more context around the suspicious lines and understand what this script does
head -20 ci-operator/step-registry/trusted-execution-clusters/ref/operator/azure-test/test.sh

Repository: openshift/release

Length of output: 274

🏁 Script executed:

# Check if there's any documentation about this script
ls -la ci-operator/step-registry/trusted-execution-clusters/ref/operator/azure-test/

Repository: openshift/release

Length of output: 413

🏁 Script executed:

# Read the test.yaml to understand the test setup
cat ci-operator/step-registry/trusted-execution-clusters/ref/operator/azure-test/test.yaml

Repository: openshift/release

Length of output: 588

🏁 Script executed:

# Check more of the test.sh to understand what runs on ports 8000/8080
wc -l ci-operator/step-registry/trusted-execution-clusters/ref/operator/azure-test/test.sh

Repository: openshift/release

Length of output: 150

🏁 Script executed:

# Read the full test.sh to understand what services run on these ports
sed -n '105,150p' ci-operator/step-registry/trusted-execution-clusters/ref/operator/azure-test/test.sh

Repository: openshift/release

Length of output: 1328

🏁 Script executed:

# Search for what uses ports 8000 and 8080 in the test script
grep -n "8000\|8080" ci-operator/step-registry/trusted-execution-clusters/ref/operator/azure-test/test.sh

Repository: openshift/release

Length of output: 87

🏁 Script executed:

# Check if there's any authentication configuration mentioned
grep -n "anonymous\|auth\|insecure" ci-operator/step-registry/trusted-execution-clusters/ref/operator/azure-test/test.sh

Repository: openshift/release

Length of output: 151

Restrict Kubernetes API server access from the public internet.

Port 6443 (kube-apiserver) is exposed to the entire internet via --source-address-prefixes "*". The Kind cluster configuration explicitly binds the API server to 0.0.0.0 on this port, creating an unnecessary attack surface. Even for ephemeral CI VMs, this risks exposure to API CVEs, authn/authz misconfigurations, and probing attacks.

Scope --source-address-prefixes to known origins such as the Prow build cluster egress IP range or the public IP of the host running this script (e.g., curl -fsS https://api.ipify.org). Access from the CI host using kubeconfig is sufficient for cluster setup and testing without requiring world-open ingress.
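A rough sketch of the per-port scoping suggested above. The helper name `source_prefix_for_port` and the variable `allowed_cidr` are hypothetical; how the allowed CIDR is obtained (Prow egress range or the runner's public IP) is left as an assumption.

```shell
# Hypothetical helper (not from the PR): pick the NSG source prefix per port.
# The kube-apiserver port is limited to one CIDR; other ports stay open.
source_prefix_for_port() {
  port="$1"
  allowed_cidr="$2"
  case "$port" in
    6443) printf '%s\n' "$allowed_cidr" ;;
    *)    printf '%s\n' '*' ;;
  esac
}

# Assumed usage inside the existing rule loop (requires a logged-in az CLI):
#   az network nsg rule create \
#     --resource-group "$az_resource_group" \
#     --nsg-name "$nsg" \
#     --name "allow-$port" \
#     --priority $((1001 + i)) \
#     --source-address-prefixes "$(source_prefix_for_port "$port" "$allowed_cidr")" \
#     --destination-port-ranges "$port" \
#     --protocol Tcp --access Allow --direction Inbound
```

Whether ports 8000 and 8080 should also be restricted is a separate decision that this sketch does not make.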

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@ci-operator/step-registry/trusted-execution-clusters/ref/operator/azure-test/test.sh` around lines 91 - 104, The NSG rule loop currently opens port 6443 to the internet by using --source-address-prefixes "*" in the az network nsg rule create call; change the logic in the ports/loop (ports=(6443 8000 8080) and the for loop that calls az network nsg rule create) to scope --source-address-prefixes for the kube-apiserver port (6443) to a specific origin (e.g., fetch the CI host or Prow egress public IP and use it as CIDR/32) while keeping the other ports as-is; locate the az network nsg rule create invocation and add conditional handling so when port == 6443 you substitute the fetched public IP (or a configured allowed CIDR) instead of "*" for --source-address-prefixes.

coderabbitai · 2026-05-08T13:57:07Z

+log_info "Run integration tests"
+eval "$(ssh-agent)"
+PLATFORM=kind_public VIRT_PROVIDER=azure REGISTRY=localhost:5000 \
+  TEST_NAMESPACE_PREFIX="$az_resource_group-" \
+  TEST_IMAGE=$(cat $secret_base/test-image) \
+  CLUSTER_URL="$kind_vm_ip" \
+  make integration-tests
+log_success "Ran integration tests"


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Look at the operator's integration-tests target to see whether it relies on ssh-agent.
fd -i -t f 'Makefile|\.mk$' --exec rg -nP -C5 '\b(integration-tests|ssh-add|SSH_AUTH_SOCK)\b' {}

Repository: openshift/release

Length of output: 43

🏁 Script executed:

# First, verify the test.sh file exists and see context
cd ci-operator/step-registry/trusted-execution-clusters/ref/operator/azure-test 2>/dev/null && ls -la test.sh 2>/dev/null || echo "File not found in expected path"

Repository: openshift/release

Length of output: 118

🏁 Script executed:

# Search for the test.sh file more broadly
find . -name "test.sh" -path "*/trusted-execution-clusters/*" -type f 2>/dev/null | head -20

Repository: openshift/release

Length of output: 148

🏁 Script executed:

# Look for any Makefile in the repo
fd -i -t f '(Makefile|makefile)' 2>/dev/null | head -20

Repository: openshift/release

Length of output: 366

🏁 Script executed:

# Search for ssh-add and ssh-agent usage in the repository
rg -i 'ssh-add|ssh-agent' --type-list | head -5
rg -i 'ssh-add|ssh-agent' -l 2>/dev/null | head -20

Repository: openshift/release

Length of output: 836

🏁 Script executed:

# Read the test.sh file to see the context around lines 186-193
cat -n ci-operator/step-registry/trusted-execution-clusters/ref/operator/azure-test/test.sh | sed -n '170,200p'

Repository: openshift/release

Length of output: 1213

🏁 Script executed:

# Check if ssh-add is called anywhere in test.sh
rg -n 'ssh-add' ci-operator/step-registry/trusted-execution-clusters/ref/operator/azure-test/test.sh

Repository: openshift/release

Length of output: 43

🏁 Script executed:

# Check for SSH_AUTH_SOCK or other ssh-agent references in test.sh
rg -n 'SSH_AUTH_SOCK|ssh-agent' ci-operator/step-registry/trusted-execution-clusters/ref/operator/azure-test/test.sh

Repository: openshift/release

Length of output: 85

🏁 Script executed:

# Find where the integration-tests make target is defined - check the root Makefile first
rg -n 'integration-tests:' Makefile

Repository: openshift/release

Length of output: 43

🏁 Script executed:

# Search for integration-tests in all files to understand what this target does
rg -n 'integration-tests' ci-operator/step-registry/trusted-execution-clusters/ --type-list=shell 2>/dev/null | head -20

Repository: openshift/release

Length of output: 43

🏁 Script executed:

# Better search for integration-tests
rg -n 'integration-tests' ci-operator/step-registry/trusted-execution-clusters/ | head -30

Repository: openshift/release

Length of output: 920

🏁 Script executed:

# Look at the test-commands.sh file around line 551 where it calls make integration-tests
sed -n '530,570p' ci-operator/step-registry/trusted-execution-clusters/ref/operator/test/trusted-execution-clusters-ref-operator-test-commands.sh

Repository: openshift/release

Length of output: 1804

🏁 Script executed:

# Check if ssh-agent is used in test-commands.sh at all
rg -n 'ssh-agent\|ssh-add' ci-operator/step-registry/trusted-execution-clusters/ref/operator/test/trusted-execution-clusters-ref-operator-test-commands.sh

Repository: openshift/release

Length of output: 43

🏁 Script executed:

# Get more context from the beginning of test.sh to understand the overall flow
head -50 ci-operator/step-registry/trusted-execution-clusters/ref/operator/azure-test/test.sh

Repository: openshift/release

Length of output: 1281

🏁 Script executed:

# Look for any documentation or comments about ssh-agent usage
rg -B5 -A5 'ssh-agent' ci-operator/step-registry/trusted-execution-clusters/ref/operator/azure-test/test.sh

Repository: openshift/release

Length of output: 554

ssh-agent is started but no key is added.

eval "$(ssh-agent)" launches an agent but nothing ever calls ssh-add, so SSH_AUTH_SOCK points at an empty agent. If make integration-tests relies on agent-based auth (e.g., to ssh into the Kind VM), it will fail; if it doesn't, this line is dead code that just leaks a process. Notably, the comparable test suite in trusted-execution-clusters-ref-operator-test-commands.sh runs make integration-tests without any ssh-agent setup.

Please confirm whether the make target needs an agent and either add the corresponding ssh-add call with the appropriate key, or drop the line.
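To check whether the agent actually holds a key before running the tests, one option is to inspect the exit status of `ssh-add -l`. The sketch below assumes the exit codes described in the OpenSSH `ssh-add` manual (0 on success, 1 when the command fails, which includes an agent with no identities, and 2 when the agent cannot be contacted); the helper name `describe_agent_status` is hypothetical.

```shell
# Hypothetical helper (not from the PR): map the exit status of `ssh-add -l`
# to a readable label, following the exit codes in the OpenSSH manual.
describe_agent_status() {
  case "$1" in
    0) echo "keys-loaded" ;;
    1) echo "no-identities" ;;
    2) echo "agent-unreachable" ;;
    *) echo "unknown" ;;
  esac
}

# Assumed usage before `make integration-tests` (requires OpenSSH tools):
#   rc=0
#   ssh-add -l >/dev/null 2>&1 || rc=$?
#   echo "[INFO] ssh-agent state: $(describe_agent_status "$rc")"
```

A "no-identities" result right after `eval "$(ssh-agent)"` would confirm the situation described above: an agent is running but nothing has been added to it.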

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@ci-operator/step-registry/trusted-execution-clusters/ref/operator/azure-test/test.sh` around lines 186 - 193, The ssh-agent is started by eval "$(ssh-agent)" but no key is added, so either remove the ssh-agent startup or add an ssh-add step before calling make integration-tests; update the test script around eval "$(ssh-agent)" and the make integration-tests invocation to either (1) drop the eval line entirely (matching trusted-execution-clusters-ref-operator-test-commands.sh) if agent auth is not required, or (2) call ssh-add with the correct private key (e.g., ssh-add "$SSH_KEY_PATH" or ssh-add "$secret_base/id_rsa") and ensure SSH_AUTH_SOCK is set so make integration-tests can use agent auth. Ensure the change touches the block containing eval "$(ssh-agent)" and the make integration-tests invocation so the agent is either populated or removed.

openshift-merge-bot · 2026-05-08T13:57:52Z

@Jakob-Naucke, pj-rehearse: unable to determine affected jobs. This could be due to a branch that needs to be rebased. ERROR:

could not determine changed registry steps: could not load step registry: invalid file name: /var/tmp/gitrepo3990203622/ci-operator/step-registry/trusted-execution-clusters/operator-azure/cleanup/trusted-execution-clusters-operator-azure-cleanup.yaml

Interacting with pj-rehearse

Comment: /pj-rehearse to run up to 5 rehearsals
Comment: /pj-rehearse skip to opt-out of rehearsals
Comment: /pj-rehearse {test-name}, with each test separated by a space, to run one or more specific rehearsals
Comment: /pj-rehearse more to run up to 10 rehearsals
Comment: /pj-rehearse max to run up to 25 rehearsals
Comment: /pj-rehearse auto-ack to run up to 5 rehearsals, and add the rehearsals-ack label on success
Comment: /pj-rehearse list to get an up-to-date list of affected jobs
Comment: /pj-rehearse abort to abort all active rehearsals
Comment: /pj-rehearse network-access-allowed to allow rehearsals of tests that have the restrict_network_access field set to false. This must be executed by an openshift org member who is not the PR author

Once you are satisfied with the results of the rehearsals, comment: /pj-rehearse ack to unblock merge. When the rehearsals-ack label is present on your PR, merge will no longer be blocked by rehearsals.
If you would like the rehearsals-ack label removed, comment: /pj-rehearse reject to re-block merging.

Create a Kind VM (also on Azure) as container platform. Run integration tests on Azure. Signed-off-by: Jakob Naucke <jnaucke@redhat.com>

openshift-merge-bot · 2026-05-08T14:22:07Z

[REHEARSALNOTIFIER]
@Jakob-Naucke: the pj-rehearse plugin accommodates running rehearsal tests for the changes in this PR. Expand 'Interacting with pj-rehearse' for usage details. The following rehearsable tests have been affected by this change:

Test name	Repo	Type	Reason
pull-ci-trusted-execution-clusters-operator-main-operator-lifecycle-azure-verify	trusted-execution-clusters/operator	presubmit	Presubmit changed

Jakob-Naucke · 2026-05-08T14:24:41Z

/pj-rehearse

openshift-merge-bot · 2026-05-08T14:24:44Z

@Jakob-Naucke: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.

Jakob-Naucke · 2026-05-08T14:27:03Z

/pj-rehearse

openshift-merge-bot · 2026-05-08T14:27:06Z

@Jakob-Naucke: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.

Jakob-Naucke · 2026-05-08T15:32:23Z

/pj-rehearse

openshift-merge-bot · 2026-05-08T15:32:26Z

@Jakob-Naucke: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.

Jakob-Naucke · 2026-05-08T18:25:55Z

last attempt today (none are scheduling)

/pj-rehearse

openshift-merge-bot · 2026-05-08T18:25:58Z

@Jakob-Naucke: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.

Jakob-Naucke mentioned this pull request May 8, 2026

trusted-execution-clusters: Intro Azure upstream CI #75842

Closed

openshift-ci Bot requested review from alicefr and danilo-gemoli May 8, 2026 09:12

openshift-ci Bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label May 8, 2026

Jakob-Naucke marked this pull request as draft May 8, 2026 09:12

openshift-ci Bot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. do-not-merge/invalid-owners-file Indicates that a PR should not merge because it has an invalid OWNERS file in it. labels May 8, 2026

Jakob-Naucke force-pushed the tec-azure branch from 26d6b61 to d45be2a Compare May 8, 2026 09:18

openshift-ci Bot removed the do-not-merge/invalid-owners-file Indicates that a PR should not merge because it has an invalid OWNERS file in it. label May 8, 2026

Jakob-Naucke force-pushed the tec-azure branch from d45be2a to 904bd57 Compare May 8, 2026 13:50

openshift-ci Bot added the do-not-merge/invalid-owners-file Indicates that a PR should not merge because it has an invalid OWNERS file in it. label May 8, 2026

Jakob-Naucke force-pushed the tec-azure branch from 904bd57 to c9276e7 Compare May 8, 2026 13:54

openshift-ci Bot removed the do-not-merge/invalid-owners-file Indicates that a PR should not merge because it has an invalid OWNERS file in it. label May 8, 2026

coderabbitai Bot reviewed May 8, 2026

trusted-execution-clusters: Intro Azure upstream CI

c6d2529

Create a Kind VM (also on Azure) as container platform. Run integration tests on Azure. Signed-off-by: Jakob Naucke <jnaucke@redhat.com>

Jakob-Naucke force-pushed the tec-azure branch from c9276e7 to c6d2529 Compare May 8, 2026 14:18
