feat(gatewayapi): mirror tigera-ca-bundle into each Gateway namespace #4822
Closed
electricjesus wants to merge 9 commits into
- Swap the checked-in gateway_api_resources.yaml for the embedded gateway-helm.tgz rendered via the helm SDK at startup; K8SGatewayAPICRDs/GatewayAPICRDs now take a runtime.Scheme and return an error (istio_controller updated for the new signature)
- Deploy two envoy-gateway controllers: legacy in tigera-gateway (user-declared classes via Spec.GatewayClasses) and a new one in calico-system with deploy.type=GatewayNamespace; auto-provision the tigera-gateway-class-ns GatewayClass bound to the new controller
- Group the tigera-gateway install behind legacyObjects/legacyTeardownObjects so the eventual deprecation is a single delete
- HasLegacyGateways classifier in the controller: build a className -> controllerName map seeded from Spec.GatewayClasses + existing GatewayClass resources, classify every live Gateway; when no Gateway targets the tigera-gateway controller, the install is torn down; during the teardown-then-redeploy race the legacy render is deferred to avoid a "Namespace is terminating, skipping creation" log flood
- Legacy teardown queues only the Namespace + cluster-scoped objects + the Deployment (for status.RemoveDeployments); in-namespace RBAC/Secrets ride the cascade to avoid the tigera-operator-secrets RoleBinding race
- Move the shared waf-http-filter ClusterRoles out of the legacy bundle so the calico-system-side proxies keep their cluster-scoped perms after tigera-gateway is retired
- Per-namespace Enterprise resources (SA, RoleBindings, pull secret, shared CRB subject) for namespaces hosting a namespaced-class Gateway; reserved namespaces skip shared resource create/delete; Secret goes before RoleBinding on cleanup to avoid 403
- Gate v3 NetworkPolicies on the calico-system Tier; render calico-system.envoy-gateway allow for the controller and certgen
- Update unit tests and Makefile/docs accordingly

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
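The HasLegacyGateways classification described in the commit above can be illustrated with a minimal sketch. The types, the `legacyControllerName` value, and the choice to let existing GatewayClass resources override declared ones are all assumptions made for illustration; the operator's real code works on the Gateway API types and may resolve precedence differently.

```go
package main

import "fmt"

// GatewayClass and Gateway are hypothetical, simplified stand-ins for the
// Gateway API types; only the fields needed for classification are kept.
type GatewayClass struct {
	Name           string
	ControllerName string
}

type Gateway struct {
	Namespace string
	Name      string
	ClassName string
}

// legacyControllerName is an assumed placeholder, not the real controller name.
const legacyControllerName = "example.io/legacy-gateway-controller"

// hasLegacyGateways builds a className -> controllerName map from the
// declared classes and the classes already present in the cluster, then
// reports whether any live Gateway resolves to the legacy controller.
func hasLegacyGateways(declared map[string]string, existing []GatewayClass, gateways []Gateway) bool {
	classToController := make(map[string]string, len(declared)+len(existing))
	for name, controller := range declared {
		classToController[name] = controller
	}
	// Assumption: what exists in the cluster wins over what is declared.
	for _, gc := range existing {
		classToController[gc.Name] = gc.ControllerName
	}
	for _, gw := range gateways {
		if classToController[gw.ClassName] == legacyControllerName {
			return true
		}
	}
	return false
}

func main() {
	declared := map[string]string{"user-class": legacyControllerName}
	existing := []GatewayClass{{Name: "ns-class", ControllerName: "example.io/namespaced-controller"}}
	gateways := []Gateway{{Namespace: "app-ns", Name: "gw", ClassName: "ns-class"}}

	// No Gateway targets the legacy controller, so teardown would be allowed.
	fmt.Println(hasLegacyGateways(declared, existing, gateways))
}
```

A Gateway whose class is unknown to the map is treated as non-legacy here; whether that is the right default is a design decision this sketch does not settle.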
- Cover the calico-system envoy-gateway controller lifecycle, per-namespace resource provisioning and cleanup, custom EnvoyProxy and EnvoyGateway ConfigMap watches, owning-gateway env vars in l7-log-collector, and the legacy-class teardown path
- Teardown sequencing for tigera-gateway cascading

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…lico-system

- Render one envoy-gateway controller in calico-system with deploy.type=GatewayNamespace
- Auto-provision tigera-gateway-class; honour user overrides if redeclared in Spec.GatewayClasses
- Enumerate every operator-owned object from the legacy tigera-gateway install for cleanup (pull Secrets before tigera-operator-secrets); keep the Namespace itself in case users placed their own resources there
- Point GatewayAPI finalizer at the calico-system envoy-gateway Deployment
- Drop dual-controller fixtures and the legacy-undeploy test; consolidate FV tests to the calico-system layout
Upstream envoy-gateway rejects the combination of mergeGateways: true and GatewayNamespaceMode, so any user-supplied EnvoyProxy with merging enabled would cause its referenced Gateways to silently stop being programmed after the switch to GatewayNamespace (https://gateway.envoyproxy.io/docs/tasks/operations/gateway-namespace-mode/).

In the GatewayAPI reconciler, when a Spec.GatewayClasses[].EnvoyProxyRef points at an EnvoyProxy with Spec.MergeGateways == true, force the field to false in our managed copy and log a warning naming the EnvoyProxy and GatewayClass. The user's source CR is not mutated.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
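As a rough illustration of the override described in the commit above, the following sketch forces the merge flag off in a copy while leaving the source object untouched. `EnvoyProxy`, `EnvoyProxySpec`, and `managedCopy` are hypothetical stand-ins, not the operator's or envoy-gateway's actual types or functions.

```go
package main

import (
	"fmt"
	"log"
)

// EnvoyProxySpec and EnvoyProxy are simplified, hypothetical stand-ins
// for the envoy-gateway CRD types.
type EnvoyProxySpec struct {
	MergeGateways *bool
}

type EnvoyProxy struct {
	Namespace string
	Name      string
	Spec      EnvoyProxySpec
}

// managedCopy returns the copy the operator would manage. If the source has
// merging enabled, the copy gets a fresh pointer set to false so the user's
// object is never mutated, and a warning names both resources.
func managedCopy(src EnvoyProxy, gatewayClass string) EnvoyProxy {
	out := src
	if src.Spec.MergeGateways != nil && *src.Spec.MergeGateways {
		disabled := false
		out.Spec.MergeGateways = &disabled
		log.Printf("warning: EnvoyProxy %s/%s (GatewayClass %s) sets mergeGateways=true; forcing false in managed copy",
			src.Namespace, src.Name, gatewayClass)
	}
	return out
}

func main() {
	enabled := true
	src := EnvoyProxy{Namespace: "app-ns", Name: "custom", Spec: EnvoyProxySpec{MergeGateways: &enabled}}
	out := managedCopy(src, "my-class")
	fmt.Println("source:", *src.Spec.MergeGateways, "managed:", *out.Spec.MergeGateways)
}
```

Allocating a new bool for the copy, rather than writing through the shared pointer, is what keeps the user's source CR unchanged in this sketch.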
- remove controllerName param (never set by callers)
- inline ReleaseName and GatewayNamespace deploy type
- add DeploymentNamespace constant for the install namespace
- drop now-unused helmGateway type
- parseManifest now errors on kinds it doesn't recognize so a chart bump that emits a new kind trips the existing render tests
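The fail-on-unknown-kind behaviour can be sketched as below. This is not the real parseManifest: `object`, `bundle`, `sortByKind`, and the set of recognized kinds are assumptions chosen to show the pattern of a `switch` whose default case returns an error.

```go
package main

import "fmt"

// object is a hypothetical, already-decoded manifest entry.
type object struct {
	Kind string
	Name string
}

// bundle groups rendered objects by the kinds the renderer knows about.
type bundle struct {
	deployments []object
	services    []object
	configMaps  []object
}

// sortByKind files each object under its kind and fails on anything it does
// not recognize, so a chart bump that emits a new kind surfaces as an error
// instead of being dropped silently.
func sortByKind(objs []object) (*bundle, error) {
	b := &bundle{}
	for _, o := range objs {
		switch o.Kind {
		case "Deployment":
			b.deployments = append(b.deployments, o)
		case "Service":
			b.services = append(b.services, o)
		case "ConfigMap":
			b.configMaps = append(b.configMaps, o)
		default:
			return nil, fmt.Errorf("unrecognized kind %q (object %q)", o.Kind, o.Name)
		}
	}
	return b, nil
}

func main() {
	_, err := sortByKind([]object{
		{Kind: "Deployment", Name: "envoy-gateway"},
		{Kind: "PodDisruptionBudget", Name: "envoy-gateway"},
	})
	fmt.Println(err)
}
```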
Under deploy.type=GatewayNamespace (tigera#4690), envoy-proxy pods land in the Gateway's own namespace and mount the operator trust bundle at /etc/pki/tls/certs (added by tigera#4796). The mount references a ConfigMap in the proxy pod's own namespace, but tigera#4796 only writes the ConfigMap into calico-system (the controller's namespace), so the proxy Pod stops at Init:0/2 with:

  Warning FailedMount MountVolume.SetUp failed for volume "tigera-ca-bundle": configmap not found

Mirror the trust bundle into each Gateway namespace alongside the existing per-namespace propagation of tigera-pull-secret and the waf-http-filter SA / RoleBindings. Reuses the existing reserved-NS guard and follows the same delete-before-RoleBinding ordering as the pull-secret cleanup.

Reproduced live on seth-ez-a3b5 2026-05-19 with operator walter-merge-2026-05-18 (has both tigera#4690 and tigera#4796): fresh Gateway namespace -> everything else propagates but tigera-ca-bundle does not, proxy Pod stuck Init:0/2.

Brief: tigera/gateway-extensions-controller/docs/planning/briefs/2026-05-19-ca-bundle-propagation-brief.md
Walter-supplied positive test: configure two Gateway namespaces
("default" and "app-ns") with a TrustedBundle, render, assert the
trust bundle ConfigMap (TrustedCertConfigMapName) lands in each
Gateway namespace.
Companion to the per-NS ConfigMap copy added in the previous commit.
Closing: Walter asked us to push directly to #4690 instead of stacking a follow-up. The two commits (mirroring the trust bundle in the per-NS loop, plus the positive test Walter supplied) are now on
Summary
Under `deploy.type=GatewayNamespace` (#4690), envoy-proxy pods land in each Gateway's own namespace and mount the operator trust bundle at `/etc/pki/tls/certs` (added by #4796). The mount references a ConfigMap in the proxy pod's own namespace, but #4796 only writes the ConfigMap into `calico-system` (the controller's namespace), so the proxy Pod stops at `Init:0/2` with: `Warning FailedMount MountVolume.SetUp failed for volume "tigera-ca-bundle": configmap not found`

This PR mirrors the trust bundle into each Gateway namespace alongside the existing per-namespace propagation of `tigera-pull-secret` and the `waf-http-filter` SA / RoleBindings. It reuses the existing reserved-NS guard and follows the same delete-before-RoleBinding ordering as the pull-secret cleanup.

Changes
- `pkg/render/gatewayapi/gateway_api.go`: in both the create and delete per-NS loops, append `pr.cfg.TrustedBundle.ConfigMap(ns)` alongside the existing pull-secret propagation. Gated by the existing `!isReservedOperatorNamespace(ns)` check plus a nil-check on `TrustedBundle`.
- `pkg/render/gatewayapi/gateway_api_test.go`: positive test (`should copy the trust bundle ConfigMap into each Gateway namespace`) covering the create path with two Gateway namespaces.

Verification
End-to-end reproducer + fix verified live on `seth-ez-a3b5`:
- Without the fix (`radixo:gatewayapi-deployment-enterprise` HEAD): fresh Gateway namespace → proxy pod stuck `Init:0/2` with `FailedMount tigera-ca-bundle not found`. Manually cloning the CM unblocks.
- With the fix: `tigera-ca-bundle` ConfigMap auto-created in NS, proxy pod reaches `4/4 Running`, Gateway `Accepted=True`. No manual cloning.

Brief with full reproducer + observed-vs-expected table: `tigera/gateway-extensions-controller/docs/planning/briefs/2026-05-19-ca-bundle-propagation-brief.md`.

Test plan
- `go test ./pkg/render/gatewayapi/... -count=1`: all green
- `go build ./...`: clean
- `seth-ez-a3b5`: see Verification above

Release Note
```release-note
Operator now mirrors the trusted CA bundle ConfigMap into every Gateway-hosting namespace under namespaced-mode (`deploy.type=GatewayNamespace`), so envoy-proxy pods in user namespaces can mount the bundle and successfully validate TLS to public upstreams (wasm OCI registries, OIDC providers).
```
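The per-namespace mirroring described in this PR can be sketched roughly as follows. The names `TrustedBundle.ConfigMap(ns)` and `isReservedOperatorNamespace` come from the description above, but their bodies, the `ConfigMap` type, and the list of reserved namespaces are assumptions made for illustration and are not the operator's actual implementation.

```go
package main

import "fmt"

// ConfigMap is a simplified, hypothetical stand-in for corev1.ConfigMap.
type ConfigMap struct {
	Namespace string
	Name      string
	Data      map[string]string
}

// TrustedBundle is a hypothetical stand-in for the operator's trust bundle.
type TrustedBundle struct {
	name string
	data map[string]string
}

// ConfigMap returns a copy of the bundle targeted at the given namespace.
func (tb *TrustedBundle) ConfigMap(ns string) ConfigMap {
	data := make(map[string]string, len(tb.data))
	for k, v := range tb.data {
		data[k] = v
	}
	return ConfigMap{Namespace: ns, Name: tb.name, Data: data}
}

// isReservedOperatorNamespace uses an assumed list of reserved namespaces.
func isReservedOperatorNamespace(ns string) bool {
	return ns == "calico-system" || ns == "tigera-gateway"
}

// mirrorTrustBundle returns one ConfigMap per non-reserved Gateway namespace.
// A nil bundle yields nothing, mirroring the nil-check in the description.
func mirrorTrustBundle(tb *TrustedBundle, gatewayNamespaces []string) []ConfigMap {
	if tb == nil {
		return nil
	}
	var out []ConfigMap
	for _, ns := range gatewayNamespaces {
		if isReservedOperatorNamespace(ns) {
			continue
		}
		out = append(out, tb.ConfigMap(ns))
	}
	return out
}

func main() {
	tb := &TrustedBundle{name: "tigera-ca-bundle", data: map[string]string{"ca-bundle.crt": "<PEM data>"}}
	for _, cm := range mirrorTrustBundle(tb, []string{"default", "app-ns", "calico-system"}) {
		fmt.Printf("%s/%s\n", cm.Namespace, cm.Name)
	}
}
```

With the inputs in `main`, the sketch emits a ConfigMap for `default` and `app-ns` and skips `calico-system`, which is the same shape as the two-namespace positive test mentioned in the Changes list.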