OpenShift: improve deployment reliability and egress access #443

Open
TomasTomecek wants to merge 7 commits into packit:main from TomasTomecek:depl-3

@TomasTomecek (Member) commented May 4, 2026

Summary

Improve OpenShift deployment reliability and enable LLM access via Vertex AI.

Changes

Egress & Networking:

  • Allow egress to oauth2.googleapis.com, vertexai.googleapis.com, and googleapis.com for Vertex AI/Claude authentication

Deployment Script:

  • Add retry logic to oc import-image with exponential backoff (2s, 4s, 8s, 16s delays, max 5 attempts) to handle transient OpenShift conflicts

Documentation:

  • Update OpenShift README with deployment location (jotnar-ymir--jotnar-ymir on GPC cluster), list of all deployed components, and current deployment procedure

@gemini-code-assist (Bot, Contributor) left a comment

Code Review

This pull request updates the OpenShift deployment configuration by introducing a new deployment script with retry logic, enabling the Phoenix observability stack, and renaming the DRY_RUN environment variable to SILENT_RUN. It also updates the README with deployment locations and adds egress rules for Google OAuth and Vertex AI. Feedback includes addressing shell portability issues in the deployment script, removing redundant namespace flags, and ensuring the retry logic matches the intended exponential backoff. Additionally, it is recommended to reference environment variables from ConfigMaps for better maintainability and to refine broad egress rules for Google APIs.

Comment thread openshift/deploy.sh Outdated
    retry=$((retry + 1))
    if [ $retry -lt $max_retries ]; then
        echo "Import failed, retrying in 2 seconds... (attempt $retry/$max_retries)"
        sleep 2
medium

The implementation uses a fixed 2-second delay, which contradicts the 'exponential backoff' mentioned in the pull request description. If exponential backoff is intended, the sleep duration should increase with each retry attempt.
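One way the delay could grow between attempts is sketched below. This is not the code from openshift/deploy.sh: the function name `retry_with_backoff`, its argument order, and the `SLEEP_CMD` override (included only so the delays can be checked without waiting) are hypothetical. With 5 attempts and a 2-second base delay it waits 2s, 4s, 8s, and 16s between attempts, which matches the schedule in the pull request description.

```shell
#!/bin/sh
# Hypothetical helper, not taken from openshift/deploy.sh.
# Usage: retry_with_backoff MAX_ATTEMPTS BASE_DELAY COMMAND [ARGS...]
# Runs COMMAND until it succeeds or MAX_ATTEMPTS is reached,
# doubling the delay after every failed attempt.
retry_with_backoff() {
    max_attempts=$1
    delay=$2
    shift 2
    attempt=1
    while :; do
        # Success on any attempt ends the loop immediately.
        if "$@"; then
            return 0
        fi
        # No sleep after the final attempt; report failure instead.
        if [ "$attempt" -ge "$max_attempts" ]; then
            echo "Giving up after $attempt attempts" >&2
            return 1
        fi
        echo "Attempt $attempt/$max_attempts failed, retrying in ${delay}s..." >&2
        # SLEEP_CMD is an assumed hook for tests; it defaults to sleep.
        ${SLEEP_CMD:-sleep} "$delay"
        delay=$((delay * 2))
        attempt=$((attempt + 1))
    done
}
```

A call could look like `retry_with_backoff 5 2 oc import-image "$image_stream"`, where `$image_stream` is a placeholder; the arguments the deployment script actually passes to `oc import-image` are not shown in this thread.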

Comment thread openshift/deployment-mcp-gateway.yml
Comment thread openshift/tenant-egress.yml
TomasTomecek and others added 7 commits May 4, 2026 19:06
Retries up to 5 times with exponential backoff (2s, 4s, 8s, 16s) when
oc import-image fails, handling transient conflicts during concurrent
ImageStream updates.

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
Signed-off-by: Tomas Tomecek <ttomecek@redhat.com>

Assisted-by: Claude
Signed-off-by: Tomas Tomecek <ttomecek@redhat.com>

Signed-off-by: Tomas Tomecek <ttomecek@redhat.com>

that I just shamelessly copy-pasted from Anton's repo 🙈

we'll figure this out longterm

Related https://github.com/packit/ai-workflows/pull/409/changes

Signed-off-by: Tomas Tomecek <ttomecek@redhat.com>

Added oauth2.googleapis.com, vertexai.googleapis.com, and googleapis.com
to egress rules to enable authentication with Vertex AI for Claude API calls.

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
Signed-off-by: Tomas Tomecek <ttomecek@redhat.com>