31 changes: 31 additions & 0 deletions ci-operator/config/trusted-execution-clusters/operator/azure.yaml
@@ -0,0 +1,31 @@
base_images:
  telco-runner:
    name: telco-runner
    namespace: ci
    tag: latest
build_root:
  image_stream_tag:
    name: builder
    namespace: ocp
    tag: rhel-9-golang-1.25-openshift-4.21
resources:
  '*':
    limits:
      memory: 4Gi
    requests:
      cpu: 100m
      memory: 200Mi
tests:
- as: operator-lifecycle-azure-verify
  capabilities:
  - intranet
  skip_if_only_changed: ^(\.github|LICENSES|bundle|docs|examples)/|^(README\.md|\.gitignore)$
  steps:
    test:
    - chain: trusted-execution-clusters-operator-azure-lifecycle
    post:
    - chain: trusted-execution-clusters-operator-azure-cleanup
zz_generated_metadata:
  branch: main
  org: trusted-execution-clusters
  repo: operator
@@ -75,3 +75,68 @@ presubmits:
        secret:
          secretName: result-aggregator
    trigger: (?m)^/test( | .* )operator-lifecycle-verify,?($|\s.*)
  - agent: kubernetes-azure
    always_run: false
    branches:
    - ^main$
    - ^main-
    cluster: build07
    context: ci/prow/operator-lifecycle-azure-verify
    decorate: true
    decoration_config:
      skip_cloning: true
    labels:
      capability/intranet: intranet
      ci.openshift.io/generator: prowgen
      pj-rehearse.openshift.io/can-be-rehearsed: "true"
    name: pull-ci-trusted-execution-clusters-operator-main-operator-lifecycle-azure-verify
    rerun_command: /test operator-lifecycle-azure-verify
    skip_if_only_changed: ^(\.github|LICENSES|bundle|docs|examples)/|^(README\.md|\.gitignore)$
    spec:
      containers:
      - args:
        - --gcs-upload-secret=/secrets/gcs/service-account.json
        - --image-import-pull-secret=/etc/pull-secret/.dockerconfigjson
        - --report-credentials-file=/etc/report/credentials
        - --target=operator-lifecycle-azure-verify
        command:
        - ci-operator
        env:
        - name: HTTP_SERVER_IP
          valueFrom:
            fieldRef:
              fieldPath: status.podIP
        image: quay-proxy.ci.openshift.org/openshift/ci:ci_ci-operator_latest
        imagePullPolicy: Always
        name: ""
        ports:
        - containerPort: 8080
          name: http
        resources:
          requests:
            cpu: 10m
        volumeMounts:
        - mountPath: /secrets/gcs
          name: gcs-credentials
          readOnly: true
        - mountPath: /secrets/manifest-tool
          name: manifest-tool-local-pusher
          readOnly: true
        - mountPath: /etc/pull-secret
          name: pull-secret
          readOnly: true
        - mountPath: /etc/report
          name: result-aggregator
          readOnly: true
      serviceAccountName: ci-operator
      volumes:
      - name: manifest-tool-local-pusher
        secret:
          secretName: manifest-tool-local-pusher
      - name: pull-secret
        secret:
          secretName: registry-pull-credentials
      - name: result-aggregator
        secret:
          secretName: result-aggregator
    trigger: (?m)^/test( | .* )operator-lifecycle-azure-verify,?($|\s.*)
@@ -0,0 +1,6 @@
reviewers:
- alicefr
- Jakob-Naucke
approvers:
- alicefr
- Jakob-Naucke
@@ -0,0 +1,6 @@
reviewers:
- alicefr
- Jakob-Naucke
approvers:
- alicefr
- Jakob-Naucke
@@ -0,0 +1,13 @@
{
  "path": "trusted-execution-clusters/operator-azure/cleanup/trusted-execution-clusters-operator-azure-cleanup-chain.yaml",
  "owners": {
    "approvers": [
      "alicefr",
      "Jakob-Naucke"
    ],
    "reviewers": [
      "alicefr",
      "Jakob-Naucke"
    ]
  }
}
@@ -0,0 +1,6 @@
chain:
  as: trusted-execution-clusters-operator-azure-cleanup
  steps:
  - ref: trusted-execution-clusters-ref-operator-azure-deprovision
  documentation: |-
    Azure tests create a Kind VM. Remove its resource group.
@@ -0,0 +1,6 @@
reviewers:
- alicefr
- Jakob-Naucke
approvers:
- alicefr
- Jakob-Naucke
@@ -0,0 +1,13 @@
{
  "path": "trusted-execution-clusters/operator-azure/lifecycle/trusted-execution-clusters-operator-azure-lifecycle-chain.yaml",
  "owners": {
    "approvers": [
      "alicefr",
      "Jakob-Naucke"
    ],
    "reviewers": [
      "alicefr",
      "Jakob-Naucke"
    ]
  }
}
@@ -0,0 +1,6 @@
chain:
  as: trusted-execution-clusters-operator-azure-lifecycle
  steps:
  - ref: trusted-execution-clusters-ref-operator-azure-test
  documentation: |-
    Create a VM for Kind on Azure. Run integration tests with Azure VMs, testing against the operator on that Kind cluster.
@@ -0,0 +1,6 @@
reviewers:
- alicefr
- Jakob-Naucke
approvers:
- alicefr
- Jakob-Naucke
@@ -0,0 +1,27 @@
#!/bin/bash -eu
set -o pipefail

if [ -z "${SHARED_DIR}" ]; then
  echo "[ERROR] SHARED_DIR is not set. This script must run in Prow CI environment."
  exit 1
fi
Comment on lines +4 to +7

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Verify the pattern is nounset-safe
rg -n '^\#!/bin/bash -eu|SHARED_DIR' ci-operator/step-registry/trusted-execution-clusters/ref/operator/azure-deprovision/deprovision.sh
# Repro of current behavior with nounset:
env -i bash -eu -c '[ -z "${SHARED_DIR}" ]' || true

Repository: openshift/release

Length of output: 444


Guard SHARED_DIR safely under set -u.

Line 4 expands ${SHARED_DIR} directly under nounset mode (-u), causing an unbound-variable error before the intended guard check runs. Use a default expansion to allow the condition to evaluate properly when the variable is unset.

Suggested fix
-if [ -z "${SHARED_DIR}" ]; then
+if [ -z "${SHARED_DIR:-}" ]; then
   echo "[ERROR] SHARED_DIR is not set. This script must run in Prow CI environment."
   exit 1
 fi
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@ci-operator/step-registry/trusted-execution-clusters/ref/operator/azure-deprovision/deprovision.sh`
around lines 4 - 7, The guard that checks SHARED_DIR uses direct expansion under
nounset mode (set -u), which can cause an unbound-variable error; update the if
condition to use a safe default expansion like ${SHARED_DIR:-} (e.g., if [ -z
"${SHARED_DIR:-}" ]; then ...) so the check works even when SHARED_DIR is unset,
and leave the existing error/exit behavior unchanged.
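
The nounset behavior behind this finding can be reproduced outside of CI. The following is a minimal sketch and not part of the PR; `DEMO_UNSET_VAR` and `check_guard` are hypothetical names chosen for illustration, and the child shell is started with `bash -eu` to mirror the script's shebang.

```shell
#!/bin/bash
# Sketch: compare the two guard styles in a child shell running with -eu.
# DEMO_UNSET_VAR is a hypothetical variable name; it is unset on purpose.
unset DEMO_UNSET_VAR

# check_guard runs `[ -z "<expr>" ]` in a fresh `bash -eu` and reports
# whether the guard body was reached or the shell aborted before it.
check_guard() {
  local expr="$1"
  if bash -eu -c "if [ -z \"$expr\" ]; then echo guard-reached; fi" 2>/dev/null; then
    :
  else
    echo "aborted-before-guard"
  fi
}

# Single quotes keep the expansions literal so the child shell evaluates them.
plain=$(check_guard '${DEMO_UNSET_VAR}')
defaulted=$(check_guard '${DEMO_UNSET_VAR:-}')

echo "plain expansion:   $plain"
echo "default expansion: $defaulted"
```

With the plain expansion the child shell should stop on the unbound variable before the guard body runs, while the `:-` form lets the guard evaluate and print its own message.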


if [ ! -f "${SHARED_DIR}/az-resource-group" ]; then
  echo "[ERROR] az-resource-group was not placed in SHARED_DIR"
  exit 1
fi

rpm --import https://packages.microsoft.com/keys/microsoft.asc
dnf install -y https://packages.microsoft.com/config/rhel/9.0/packages-microsoft-prod.rpm
dnf install -y azure-cli

secret_base=/var/run/azure-upstream-ci
az login --service-principal \
  --username "$(cat $secret_base/client-id)" \
  --password "$(cat $secret_base/client-secret)" \
  --tenant "$(cat $secret_base/tenant-id)"

az_resource_group=$(cat "${SHARED_DIR}/az-resource-group")
echo "[INFO] Delete Kind VM resource group $az_resource_group"
az group delete --name "$az_resource_group" --yes
echo "[SUCCESS] Deleted Kind VM resource group $az_resource_group"
Comment on lines +9 to +27

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

🧩 Analysis chain

🌐 Web query:

`For Azure CLI, what are the documented outputs/exit semantics of:

  1. az group exists --name <group>
  2. az group delete --name <group> --yes
    especially when the resource group does not exist?`

💡 Result:

  1. az group exists --name <group>
    • Documented output: emits the string true or false (default output format is JSON, but for human use it prints true/false).[1][2]
    • Exit semantics when the resource group does not exist: the command can return exit code 0 even when the output is false (i.e., scripts should not rely on exit code; they should inspect stdout).[3]
    • Note: if you’re using --output json then you’ll get a JSON boolean, but the “false vs true” meaning remains the same.[1]
  2. az group delete --name <group> --yes
    • Documented behavior: --yes only suppresses the interactive confirmation prompt; it doesn’t document any special handling for “group does not exist”.[1]
    • Exit semantics when the resource group does not exist: the authoritative behavior (specific exit code) is not clearly stated in the official command docs we found; instead, the underlying Azure Resource Manager delete operation treats HTTP 404 responses as successful for deletions (Resource Manager issues GET and expects 404; it considers deletion successful when it gets 404).[4]
    • Practically, this means: if Azure returns “not found” for the delete, it is treated as successful by ARM deletion logic, so you should expect the CLI not to fail the operation purely due to “already missing”, but the exact CLI exit code isn’t explicitly documented in the sources above.[4][5]

Bottom line for scripting

  • Use stdout (true/false) for az group exists rather than $?/exit code.[3]
  • For az group delete --yes, treat “resource group not found” as a successful/no-op at the ARM level (404 is treated as success), but if you need an exact exit code guarantee, you’ll have to test in your environment since the CLI docs we located don’t spell out the specific exit code for the not-found case.[4][1]


Make the cleanup script idempotent to avoid failures in post-steps.

The script hard-fails when the marker file is missing (exits with code 1) and deletes unconditionally without checking if the resource group still exists. In post-step execution, this causes noisy failures when these conditions aren't met. The script should treat "nothing to delete" and "already deleted" as success states.

The suggested changes are valid:

  • Exit 0 when the marker file is missing (the operation has nothing to do, which is not an error in post-cleanup)
  • Check if the resource group exists before deletion using az group exists, which reliably returns true or false to stdout (exit code cannot be used reliably for this check)
  • Handle the case where the resource group is already absent with a success message

This ensures the post-step completes without error whether resources were previously provisioned or already cleaned up.

Suggested fix
-if [ ! -f "${SHARED_DIR}/az-resource-group" ]; then
-  echo "[ERROR] az-resource-group was not placed in SHARED_DIR"
-  exit 1
+if [ ! -f "${SHARED_DIR}/az-resource-group" ]; then
+  echo "[WARN] az-resource-group not found; skipping Azure cleanup."
+  exit 0
 fi
@@
-az_resource_group=$(cat "${SHARED_DIR}/az-resource-group")
+az_resource_group="$(<"${SHARED_DIR}/az-resource-group")"
+if [ -z "${az_resource_group}" ]; then
+  echo "[WARN] Empty az-resource-group; skipping Azure cleanup."
+  exit 0
+fi
 echo "[INFO] Delete Kind VM resource group $az_resource_group"
-az group delete --name "$az_resource_group" --yes
+if [ "$(az group exists --name "$az_resource_group")" = "true" ]; then
+  az group delete --name "$az_resource_group" --yes
+else
+  echo "[INFO] Resource group $az_resource_group already absent; nothing to delete."
+fi
 echo "[SUCCESS] Deleted Kind VM resource group $az_resource_group"
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@ci-operator/step-registry/trusted-execution-clusters/ref/operator/azure-deprovision/deprovision.sh`
around lines 9 - 27, Make the cleanup script idempotent: if the marker file
"${SHARED_DIR}/az-resource-group" is missing, exit 0 instead of failing; if
present, read az_resource_group and call "az group exists --name
\"$az_resource_group\"" and use its stdout ("true"/"false") to decide whether to
delete; only run "az group delete --name \"$az_resource_group\" --yes" when
exists returns true and print a success message for both "deleted" and "already
absent" cases; keep the existing az login (secret_base/client-id, client-secret,
tenant-id) flow but ensure missing marker file is treated as no-op and use "az
group exists" to guard deletion.

@@ -0,0 +1,13 @@
{
  "path": "trusted-execution-clusters/ref/operator/azure-deprovision/trusted-execution-clusters-ref-operator-azure-deprovision-ref.yaml",
  "owners": {
    "approvers": [
      "alicefr",
      "Jakob-Naucke"
    ],
    "reviewers": [
      "alicefr",
      "Jakob-Naucke"
    ]
  }
}
@@ -0,0 +1,19 @@
ref:
  as: trusted-execution-clusters-ref-operator-azure-deprovision
  from_image:
    namespace: ci
    name: telco-runner
    tag: latest
  commands: trusted-execution-clusters-ref-operator-azure-deprovision-commands.sh
  credentials:
  - namespace: test-credentials
    name: azure-upstream-ci
    mount_path: /var/run/azure-upstream-ci
  resources:
    requests:
      cpu: 500m
      memory: 500Mi
    limits:
      memory: 1Gi
  documentation: |-
    Azure tests create a Kind VM. Remove its resource group.
@@ -0,0 +1,6 @@
reviewers:
- alicefr
- Jakob-Naucke
approvers:
- alicefr
- Jakob-Naucke