docs: improve GCP UPI install guide based on real-world verification#10515
docs: improve GCP UPI install guide based on real-world verification#10515rochacbruno wants to merge 2 commits intoopenshift:mainfrom
Conversation
Add practical guidance discovered during hands-on OCPSTRAT-2830 verification testing of the Terraform/Infrastructure Manager templates: - Increase signed URL duration from 1h to 2h to survive deployment retries - Add warning about manifest regeneration changing the Infrastructure ID - Explain ingress degradation when masters are schedulable - Add notes about zone capacity fallback and machine type alternatives - Warn that Infrastructure Manager does not recreate VMs on metadata changes - Add batch CSR approval command and background approval loop - Add troubleshooting section covering common failure scenarios - Add IAP tunnel firewall rule for debugging nodes without public IPs Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
Skipping CI for Draft Pull Request. |
WalkthroughDocumentation for GCP UPI installation updated: documents ingress degradation symptoms, warns against regenerating manifests/ignition after extracting INFRA_ID, extends bootstrap signed URL lifetime from 1h to 2h, adds zone capacity and redeploy guidance, expands CSR approval commands, and adds a Troubleshooting section including IAP SSH tunneling. Changes
Estimated code review effort🎯 2 (Simple) | ⏱️ ~10 minutes 🚥 Pre-merge checks | ✅ 12✅ Passed checks (12 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (1)
docs/user/gcp/install_upi.md (1)
906-947: Troubleshooting section is strong; consider adding firewall-rule cleanup guidance.Nice addition overall. Optional hardening: note that the temporary IAP SSH firewall rule should be removed after debugging to reduce long-lived administrative exposure.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/user/gcp/install_upi.md` around lines 906 - 947, Add a brief note after the IAP SSH firewall-rule creation and usage that instructs operators to remove the temporary firewall rule when finished (reference the created rule name pattern `${INFRA_ID}-allow-iap-ssh` and the `gcloud compute firewall-rules` command used to create it) and to remove the target tags from instances if they were added for debugging, showing the cleanup action as a single step to reduce long-lived administrative exposure.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@docs/user/gcp/install_upi.md`:
- Around line 505-513: The two adjacent blockquotes starting with "**Note**: If
a zone..." and "**Warning**: If you need to redeploy..." are broken by a blank
line inside the blockquote; remove the empty line so the blockquote remains
continuous (or ensure each paragraph keeps the leading ">" on every line) so
markdownlint MD028 is satisfied and the "**Note**" and "**Warning**" lines
remain inside a single valid blockquote.
- Around line 737-751: The current bulk CSR approval commands approve every
pending CSR; change them to only select CSRs where the requestor username
matches known node bootstrap/node identities before piping to oc adm certificate
approve. Update the oc get csr invocation (keep using oc get csr and oc adm
certificate approve) to filter on the username field (e.g. via -o go-template or
-o jsonpath) by matching .spec.username (or the JSONPath expression that selects
usernames) against your expected node requestor list (for example
system:node:<name> or system:node-bootstrap) and only emit those .metadata.name
values; apply the same filtered selection inside your background loop so only
approved CSRs come from known node requestors.
---
Nitpick comments:
In `@docs/user/gcp/install_upi.md`:
- Around line 906-947: Add a brief note after the IAP SSH firewall-rule creation
and usage that instructs operators to remove the temporary firewall rule when
finished (reference the created rule name pattern `${INFRA_ID}-allow-iap-ssh`
and the `gcloud compute firewall-rules` command used to create it) and to remove
the target tags from instances if they were added for debugging, showing the
cleanup action as a single step to reduce long-lived administrative exposure.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml
Review profile: CHILL
Plan: Pro Plus
Run ID: 7a8fb579-f0ef-4bb5-957a-fde6523efa2a
📒 Files selected for processing (1)
docs/user/gcp/install_upi.md
Fix blockquote continuity (MD028) between Note and Warning blocks, and filter CSR approval commands to only approve CSRs from known node requestors (node-bootstrapper and system:node:*) instead of all pending CSRs indiscriminately. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
docs/user/gcp/install_upi.md (1)
11-20:⚠️ Potential issue | 🟡 MinorAdd
jqto the prerequisites list.The document uses
jqextensively for JSON processing (extracting metadata at lines 236, 260-266, parsing CSR output at lines 740-748, 756-763, and many other locations), butjqis not listed in the required binaries. Users withoutjqinstalled will encounter command failures.📋 Suggested prerequisite addition
Add
jqto the binaries list:* the following binaries installed and in $PATH: * gcloud * python + * jq🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/user/gcp/install_upi.md` around lines 11 - 20, The prerequisites list is missing the jq binary which the guide repeatedly uses for JSON processing (e.g., alongside gcloud and python entries and the PyYAML bullet); update the prerequisites bullet list that currently lists "gcloud", "python", and "PyYAML" to also include "jq" so users are informed to install it before running the documented commands.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Outside diff comments:
In `@docs/user/gcp/install_upi.md`:
- Around line 11-20: The prerequisites list is missing the jq binary which the
guide repeatedly uses for JSON processing (e.g., alongside gcloud and python
entries and the PyYAML bullet); update the prerequisites bullet list that
currently lists "gcloud", "python", and "PyYAML" to also include "jq" so users
are informed to install it before running the documented commands.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: 4399d398-1f31-4322-9ebe-fa20914a5bfc
📒 Files selected for processing (1)
docs/user/gcp/install_upi.md
Summary
mastersSchedulable: falseis critical (ingress LB cannot reach router pods on masters, causingconnection refused)e2-standard-4as alternative machine typeContext
These improvements come from hands-on verification testing of OCPSTRAT-2830 (GCP UPI migration from Deployment Manager to Terraform/Infrastructure Manager). All 8 Terraform template stages were verified successfully, but the documentation gaps led to several avoidable issues during the process.
Test plan
Generated with Claude Code
Summary by CodeRabbit