Add --node-external-ip=$${CP_PUBLIC_IPV4} to the k3s server install in
infra/hetzner/cloudinit-control-plane.tftpl so every control-plane (CP)
node publishes both InternalIP=10.0.1.2 and ExternalIP=<public ipv4>
in node.status.addresses.
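
For illustration only, a minimal sketch of how the two address flags
could be combined on the k3s server command line. The helper name
build_k3s_server_args and the example public address 203.0.113.10 are
hypothetical; the real template reads the public address from
$${CP_PUBLIC_IPV4} (the doubled $ is Terraform's templatefile() escape
for a literal shell ${...}) and carries other server flags not shown
here.

```shell
#!/bin/sh
# Hypothetical helper, not taken from the template: composes the two
# address-related k3s server flags from a private and a public IPv4.
build_k3s_server_args() {
  private_ip="$1"   # stays the InternalIP (--node-ip)
  public_ip="$2"    # published as the ExternalIP (--node-external-ip)
  printf '%s' "server --node-ip=${private_ip} --node-external-ip=${public_ip}"
}

# Example values: 10.0.1.2 is the private address named in this
# message; 203.0.113.10 is a documentation-range placeholder.
K3S_ARGS="$(build_k3s_server_args 10.0.1.2 203.0.113.10)"
echo "INSTALL_K3S_EXEC=${K3S_ARGS}"

# An install step would then pass the flags to the k3s installer,
# for example (assumed form, not copied from the template):
#   curl -sfL https://get.k3s.io | INSTALL_K3S_EXEC="${K3S_ARGS}" sh -
```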
Bug evidence (Wave 28-E, t22-omantel-biz 2026-05-18):
hel/fsn/sin all advertise InternalIP=10.0.1.2 with NO ExternalIP.
After the 2026-05-15 per-region-network refactor every region's CP
sits in its OWN isolated hcloud_network, so 10.0.1.2 is locally
scoped on each VPS and NOT routable cross-region. Cilium picks the
InternalIP as its tunnel endpoint by default → cross-region VXLAN
tunnels resolve to 10.0.1.2 on every peer → inter-region pod traffic
blackholes (pod-to-pod 0/6 across regions).
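
The advertised addresses can be inspected per cluster with a small
kubectl wrapper along these lines. This is a sketch, not the command
used to collect the evidence above; the function name and the use of
kubeconfig contexts named after the regions are assumptions.

```shell
#!/bin/sh
# print_node_addresses CONTEXT
# Prints each node's name followed by every TYPE=ADDRESS pair found in
# node.status.addresses. CONTEXT is a kubeconfig context name; which
# contexts exist is an assumption, not something taken from the repo.
print_node_addresses() {
  kubectl --context "$1" get nodes -o \
    jsonpath='{range .items[*]}{.metadata.name}{"\t"}{range .status.addresses[*]}{.type}={.address}{" "}{end}{"\n"}{end}'
}
```

Running print_node_addresses hel (and likewise for fsn and sin) on an
affected cluster should show InternalIP=10.0.1.2 with no ExternalIP=
pair; on a fresh provision with this change, both pairs would be
expected.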
docs/SOVEREIGN-MULTI-REGION-DOD.md A2 mandate:
"inter-region link = DMZ WireGuard over PUBLIC IPs ALWAYS
(never any provider's private network)".
Publishing the public IPv4 as ExternalIP lets Cilium promote it to the
tunnel endpoint when peer addresses include both External and Internal,
which restores cross-region pod reachability without breaking
intra-cluster paths. InternalIP stays primary for the kube-apiserver
advertise address and pod-to-CP dials (the original reason --node-ip
was pinned to the private address in the PR #62 era; the comment at
lines 1370-1378 still holds and is preserved).
Effect:
- Only takes effect on FRESH provisions (t23+). The already-deployed
t22 cannot be remediated by a cloud-init change, since this
template is applied only when a node is provisioned.
- The primary CP and all secondary CPs go through this same template
(the templatefile() calls in main.tf, at line 636 for the primary
and at line 1187 for each secondary), so a single template edit
covers all regions.
- This is Approach A (smaller, immediate). Approach B (a DMZ WireGuard
overlay DaemonSet per platform/bp-dmz-vcluster/) follows as an
architectural follow-up if A alone doesn't fully resolve
cross-region pod traffic on t23+.
Co-authored-by: hatiyildiz <hatiyildiz@users.noreply.github.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>