Problem
Sei pods today are reachable for HTTP (RPC/EVM/REST) via the existing Istio L7 path → shared NLB → private pod, but they are not publicly dialable as P2P peers over TCP/26656. This blocks validators (in a deterministic set without sentries) and advertised full nodes from being addressable by ecosystem peers, and prevents internal SNDs from peering with each other across clusters.
The design (platform/docs/designs/sei-publishable-p2p-nlb.md, merged in sei-protocol/platform#714) closes this gap with an opt-in networking.tcp: {} sub-struct: when set, the SeiNode reconciler stamps a per-pod Service type=LoadBalancer (AWS NLB, target-type: ip, scheme=internet-facing, cross-zone=true), reads Service.status.loadBalancer.ingress[0].hostname, and writes SeiNode.Status.ExternalAddress = "<hostname>:26656". The planner already reads ExternalAddress into p2p.external_address at planner.go:697; this issue makes that pipeline reachable.
Impact
Unblocks:
- Validator P2P publishability: the deterministic set peers with each other over stable NLB DNS endpoints; combined with unconditional_peer_ids cross-config (ops, not this issue), this gives a stable cross-cluster validator mesh.
- Cross-SND peer discovery: a new PeerSource.seinodeDeployment variant resolves sibling SND children's Status.ExternalAddress into persistent_peers, replacing the EC2-tag-only discovery for K8s-resident publishers.
- Ecosystem parity for advertised full nodes: partner peers can list us as a persistent peer.
Harbor stays excluded until the Cilium overlay is replaced (CGNAT pod IPs aren't VPC-routable → target-type: ip fails); controller fails closed via a VPC-CIDR routability check.
Proposed approach
Six independently-shippable PRs. Sequence dependencies: PR-1 → PR-2 → PR-3 (the chain that makes the field gateable and propagated). PR-4 lands in parallel with PR-3 (independent code path; needs PR-1's types only). PR-5 lands any time before PR-6. PR-6 waits on PR-3 + PR-4 + PR-5.
PR-1 — Types layer (contract-only, no behavior change)
- api/v1alpha1/networking_types.go: add HTTP *HTTPConfig + TCP *TCPConfig to NetworkingConfig; add HTTPEnabled() accessor (legacy networking: {} → treated as http: {} for backcompat).
- api/v1alpha1/seinode_types.go: add Networking *NetworkingConfig to SeiNodeSpec; flip ExternalAddress doc-comment from "SeiNodeDeployment controller" → "SeiNode controller".
- api/v1alpha1/common_types.go: add SeiNodeDeploymentPeerSource to PeerSource union; bump XValidation CEL sum.
- CRD regen + smoke test: existing manifests (Networking-nil and legacy networking: {}) still validate.
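As a rough illustration of the presence-signal contract in PR-1, the accessor could look like the sketch below. The field and method names come from this issue; the empty struct bodies, the JSON tags, and the behavior for a nil NetworkingConfig and for a tcp-only config are assumptions, not the merged design.

```go
package main

// HTTPConfig and TCPConfig are presence signals: a non-nil pointer means
// the path is enabled. The empty bodies are an assumption for this sketch.
type HTTPConfig struct{}
type TCPConfig struct{}

// NetworkingConfig mirrors the sub-struct layout named in PR-1.
type NetworkingConfig struct {
	HTTP *HTTPConfig `json:"http,omitempty"`
	TCP  *TCPConfig  `json:"tcp,omitempty"`
}

// HTTPEnabled reports whether the HTTP path should be provisioned.
// Assumptions: a nil config enables nothing, and a config that sets only
// tcp does not implicitly enable HTTP.
func (n *NetworkingConfig) HTTPEnabled() bool {
	if n == nil {
		return false
	}
	if n.HTTP != nil {
		return true
	}
	// Legacy form: networking: {} with neither sub-struct set is treated
	// as networking: { http: {} } for backcompat.
	return n.TCP == nil
}
```

With this shape, existing manifests that carry networking: {} keep their HTTP behavior, while networking: { tcp: {} } opts into P2P only.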
PR-2 — SND template propagation
- internal/controller/seinodedeployment/: propagate Spec.Template.Networking into child.Spec.Networking at child-creation time (single line in the existing template-merge path).
- Envtest: SND with networking.tcp: {} in template → child SeiNodes carry Spec.Networking.TCP != nil.
PR-3 — LB Service + ExternalAddress + Service-ready gate (the heavy PR)
- internal/controller/node/external_address.go (NEW):
  - Stamp Service type=LoadBalancer (<seinode>-p2p) when Spec.Networking.TCP != nil and pod IP is inside SEI_VPC_CIDR env (VPC-CIDR routability gate).
  - Annotations: aws-load-balancer-type=nlb, scheme=internet-facing, nlb-target-type=ip, cross-zone-load-balancing-enabled=true; externalTrafficPolicy: Local; only TCP/26656.
  - Watch Service status; on loadBalancer.ingress[0].hostname populated → write Status.ExternalAddress = "<host>:26656" (bare host:port, no nodeId@).
  - Empty-host guard: if newHost == "" { return nil }, so the address is not cleared on transient absence.
  - Service-ready gate: when tcp != nil, hold STS creation until Status.ExternalAddress is written. (Without this, the first sidecar render has empty p2p.external_address; planner doesn't detect runtime override drift at planner.go:706.)
  - Opt-out cleanup: when Spec.Networking.TCP == nil, delete the Service inline.
- internal/controller/node/controller.go: confirm existing Owns(&corev1.Service{}) at :204 covers the new Service (predicate audit); add RBAC verbs if missing (create/update/delete on services).
- Manager env: SEI_VPC_CIDR plumbed via the controller Deployment (set in platform/clusters/prod/sei-k8s-controller/manager-patch.yaml; same pattern as SEI_GATEWAY_PUBLIC_DOMAIN).
- Envtest: bootstrap with tcp: {} → STS doesn't appear until Service has a (fake) hostname; opt-out path deletes the Service.
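The three decision points in PR-3 (routability gate, empty-host guard, Service-ready gate) can be sketched as pure functions, which is one way to keep them unit-testable outside envtest. The function names podIPRoutable, nextExternalAddress, and statefulSetReady are hypothetical, and so is the choice to fail closed on an unparsable CIDR or IP; only the behaviors described in the PR-3 list come from this issue.

```go
package main

import (
	"net"
	"net/netip"
	"strconv"
)

// p2pPort is the CometBFT P2P port named throughout this issue.
const p2pPort = 26656

// podIPRoutable is the VPC-CIDR routability gate: it returns true only if
// podIP parses and falls inside vpcCIDR (the SEI_VPC_CIDR value).
// Any parse failure fails closed (assumption for this sketch).
func podIPRoutable(vpcCIDR, podIP string) bool {
	prefix, err := netip.ParsePrefix(vpcCIDR)
	if err != nil {
		return false
	}
	ip, err := netip.ParseAddr(podIP)
	if err != nil {
		return false
	}
	// Unmap so an IPv4-mapped IPv6 address still matches an IPv4 prefix.
	return prefix.Contains(ip.Unmap())
}

// nextExternalAddress applies the empty-host guard: a transiently empty
// hostname keeps the current value instead of clearing it.
func nextExternalAddress(current, newHost string) string {
	if newHost == "" {
		return current
	}
	// Bare host:port, no nodeId@ prefix.
	return net.JoinHostPort(newHost, strconv.Itoa(p2pPort))
}

// statefulSetReady is the Service-ready gate: with tcp enabled, the STS is
// held back until Status.ExternalAddress has been written.
func statefulSetReady(tcpEnabled bool, externalAddress string) bool {
	return !tcpEnabled || externalAddress != ""
}
```

For example, a CGNAT pod IP such as 100.64.1.5 checked against a VPC CIDR of 10.0.0.0/16 is rejected by the gate, which matches the Harbor exclusion described under Impact.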
PR-4 — seinodeDeployment peer-source variant
- internal/controller/node/peers.go: extend reconcilePeers switch with the new variant; add resolveSNDPeers that lists matching SNDs → lists child SeiNodes → emits each child's Status.ExternalAddress as host:port into ResolvedPeers. The sidecar's existing CollectAndSetPeers task at internal/planner/group.go:59 prepends nodeID@ at task-build time (:26657/status query), the same path the label variant feeds. Not a new sidecar capability.
- internal/controller/node/controller.go: Watches(&SeiNodeDeployment{}, mapperFn); naive mapper (re-reconcile all SeiNodes in namespace on any SND change) is acceptable at the ~20-publishable scale; tighten to selector-matching scope when fanout becomes a concern.
- RBAC: cluster-scoped seinodedeployments get/list/watch.
- Envtest: consumer SND with peers: [{ seinodeDeployment: ... }] → resolves to producer SND children's ExternalAddress.
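The core of the peer resolution in PR-4 reduces to collecting host:port values from child status. The sketch below shows only that step, with the Kubernetes list calls left out. The childNode type and both function names are hypothetical, and skipping children without an address, de-duplicating, and sorting for a stable ResolvedPeers value are assumptions rather than requirements stated in the design.

```go
package main

import "sort"

// childNode is a stand-in for the fields of a child SeiNode that the
// resolver reads (hypothetical type for this sketch).
type childNode struct {
	Name            string
	ExternalAddress string // Status.ExternalAddress, bare host:port
}

// resolvedPeersFromChildren emits each child's ExternalAddress as host:port.
// Children whose address is not yet populated are skipped, duplicates are
// dropped, and the result is sorted so repeated reconciles produce the
// same slice.
func resolvedPeersFromChildren(children []childNode) []string {
	seen := make(map[string]struct{}, len(children))
	peers := make([]string, 0, len(children))
	for _, c := range children {
		if c.ExternalAddress == "" {
			continue
		}
		if _, dup := seen[c.ExternalAddress]; dup {
			continue
		}
		seen[c.ExternalAddress] = struct{}{}
		peers = append(peers, c.ExternalAddress)
	}
	sort.Strings(peers)
	return peers
}

// dialablePeer shows the form the sidecar produces when it prepends the
// node ID to a resolved host:port entry.
func dialablePeer(nodeID, hostPort string) string {
	return nodeID + "@" + hostPort
}
```

The controller side stays at bare host:port; only the sidecar step turns an entry into the <node_id>@<host>:26656 form used for dialing.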
PR-5 — Node SG opening (in sei-protocol/platform)
- Terraform: tcp/26656 from 0.0.0.0/0 ingress on the Sei-node SG. Lands before the first publishable SND.
PR-6 — First publishable rollout (in sei-protocol/platform)
- Enable networking.tcp: {} on one advertised full-node SND.
- Verify: external nc <nlb-host> 26656; :26657/net_info shows DNS form in listen_addr; a remote node connects with <node_id>@<host>:26656.
- Validator rollout follows the same shape, one at a time across the deterministic set, with unconditional_peer_ids cross-config landed first (runbook).
Out of scope
- Config-hash STS-template annotation for automatic runtime tcp: {} add: v2. Runtime adds require a manual pod delete after Status.ExternalAddress populates (NLB hostnames don't change once allocated, so this doesn't recur).
- EIP-attached NLBs (stable IP, not just DNS): DNS form only.
- NLB-attached SG (defense in depth): relies on node-SG.
- Dedicated publishable EC2NodeClass / scoped SG: opens 26656 on shared node SG.
- Orphan-cleanup walker for SND scale-down events: opt-out (tcp: {} → nil) cleanup IS in PR-3; scale-down (replica drop) orphans wait for v2.
- Harbor publishability: gated on Cilium overlay → VPC-routable CNI migration. Controller fails closed.
- Validator unconditional_peer_ids cross-config: operational runbook, not a controller change.
- CometBFT sentry architecture: explicitly not adopted; deterministic validator set publishes directly.
Relevant experts
- kubernetes-specialist — PR-2, PR-3, PR-4 (controller-runtime, CRD types, envtest)
- platform-engineer — PR-3 manager env wiring, PR-5 (Terraform), PR-6 (manifest)
- sei-network-specialist — PR-6 verification (CometBFT NodeInfo + dial semantics)
References
- Design: sei-protocol/platform, docs/designs/sei-publishable-p2p-nlb.md (merged in #714)
- Planner integration point: internal/planner/planner.go:697 (commonOverrides already reads Status.ExternalAddress)
- Existing Service watch: internal/controller/node/controller.go:204
- Drift-detection limit acknowledged: internal/planner/planner.go:706 (buildRunningPlan only checks image + sidecar reapproval; ExternalAddress changes after Running don't propagate)
- AWS LBC annotation precedent: existing Istio gateway in platform/clusters/prod/gateway/gateway.yaml
One-way doors (already approved in design)
- Status.ExternalAddress format: "<host>:26656" (bare host:port; host is raw NLB *.elb.<region>.amazonaws.com, no vanity domain)
- NetworkingConfig.HTTP / TCP sub-struct field names (presence-signals)
- Legacy networking: {} interpreted as networking: { http: {} }
- SeiNodeSpec.Networking field add (additive; propagated from SND template)
- PeerSource.seinodeDeployment variant (additive; bump union CEL)
- Keep peers field name (not renamed)