From 9774d7ba53d52ea6acd29aa91641c4a31a339925 Mon Sep 17 00:00:00 2001 From: hatiyildiz Date: Wed, 20 May 2026 04:08:17 +0200 Subject: [PATCH] =?UTF-8?q?feat(self-sovereign-cutover):=20add=20step=2010?= =?UTF-8?q?=20=E2=80=94=20pivot=20vCluster=20HelmReleases=20to=20Sovereign?= =?UTF-8?q?=20Harbor=20(Refs=20#2034)?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The chart's own comment at platform/bp-mgmt-vcluster/chart/values.yaml:77-79 promised "post-handover, the per-Sovereign overlay rewrites to `harbor./proxy-ghcr/...`" — but the rewrite step never existed anywhere in the cutover sequence. As a result, every Sovereign post-handover keeps pulling vCluster control-plane images from `harbor.openova.io` indefinitely, a direct violation of Principle #11 (no tether to harbor.openova.io after handover). Caught by the TBD-V24 tether audit on 2026-05-20. Why step 04 (containerd registries.yaml pivot) doesn't catch it: registries.yaml.v2 only mirrors the 7 canonical UPSTREAMS (ghcr.io, docker.io, registry.k8s.io, gcr.io, quay.io, xpkg.upbound.io, public.ecr.aws). The host `harbor.openova.io` is treated as a literal endpoint, not an upstream, so containerd routes those image pulls direct to mothership Harbor regardless of mirror config. This step adds: - Phase 1: live `kubectl patch helmrelease` against each of {bp-mgmt-vcluster, bp-rtz-vcluster, bp-dmz-vcluster} in flux-system, patching BOTH `spec.values.Vcluster.image.repository` (umbrella) AND `spec.values.vcluster.controlPlane.statefulSet.image. {registry,repository}` (loft-sh subchart). Topology-aware: secondaries skip MGMT (not present), primary skips RTZ (not present). Idempotent: re-runs no-op when already pivoted. 
- Phase 2: git push to local Gitea injecting the same override blocks into clusters/_template/bootstrap-kit/{54,58,59}-bp-*-vcluster.yaml so the bootstrap-kit Kustomization doesn't revert the live patch on next reconcile (same pattern as step 06 Phase 2 + Phase 2.5). Coordination with chart 0.1.34 (TBD-V25, PR #2036, already merged): totalSteps bumped from "9" → "10" in 09-cutover-status-configmap.yaml. Contract test (tests/cutover-contract.sh) asserts shift from 9 → 10 step ConfigMaps and from 8 → 9 job-mode ConfigMaps. New Case 21 verifies Step 10's wrapper + subchart patches are wired correctly. RBAC: ClusterRole gains helm.toolkit.fluxcd.io.helmreleases {update,patch}. Step-06 Phase-1.6 (the openova-catalog HR patch shipped in chart 0.1.31) was silently relying on this verb already — chart 0.1.31's RBAC change was missed, so this bump ALSO closes a latent permission gap that would have surfaced on any cluster where the prior patch attempt happened to require it. Operator note: pivoting the image host changes the StatefulSet's `spec.template.spec.containers[].image`, so expect a rolling update of the vCluster control-plane Pods once helm-controller reconciles the patched HelmReleases (see the restart-safety note in 10-vcluster-registry-pivot-job.yaml). The image tag itself is unchanged; the new pull is served by the Sovereign-local Harbor proxy-ghcr project, and every later image-pull (chart bump, Pod restart, region add) routes through the Sovereign-local Harbor as well. Refs #2034 (NOT Closes — operator-walk on fresh prov + screenshot required per CLAUDE.md §4 anti-theater discipline). 
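Illustration (a minimal sketch, not the shipped Job script): the merge-patch body that Phase 1 sends for one HelmRelease can be reproduced offline as below. The helper name `build_patch` and the FQDN `sovereign.example.test` are hypothetical and exist only for this example; the values keys and the `proxy-ghcr/loft-sh/vcluster` path are the ones this patch uses.

```shell
#!/bin/sh
# Sketch only: mirrors how step 10 derives the two image knobs.
# build_patch and the example FQDN are assumptions for illustration.
set -eu

# Print the merge-patch body for one HelmRelease.
#   $1 = role key (mgmtVcluster | rtzVcluster | dmzVcluster)
#   $2 = Sovereign FQDN (placeholder value in this sketch)
build_patch() {
  host="harbor.$2"
  # Wrapper chart takes HOST + PATH in one field; the loft-sh subchart
  # takes HOST (registry) and PATH (repository) separately.
  printf '{"spec":{"values":{"%s":{"image":{"repository":"%s/proxy-ghcr/loft-sh/vcluster"}},"vcluster":{"controlPlane":{"statefulSet":{"image":{"registry":"%s","repository":"proxy-ghcr/loft-sh/vcluster"}}}}}}}\n' \
    "$1" "$host" "$host"
}

build_patch dmzVcluster sovereign.example.test
```

On a live cluster the same body could be applied by hand with `kubectl patch helmrelease bp-dmz-vcluster -n flux-system --type=merge --patch "$(build_patch dmzVcluster "$SOVEREIGN_FQDN")"`, assuming the HelmRelease lives in flux-system as the chart default expects.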
Co-Authored-By: Claude Opus 4.7 (1M context) --- .../self-sovereign-cutover/chart/Chart.yaml | 57 ++- .../09-cutover-status-configmap.yaml | 10 +- .../10-vcluster-registry-pivot-job.yaml | 436 ++++++++++++++++++ .../chart/templates/rbac.yaml | 3 + .../chart/tests/cutover-contract.sh | 104 ++++- .../self-sovereign-cutover/chart/values.yaml | 28 ++ 6 files changed, 616 insertions(+), 22 deletions(-) create mode 100644 platform/self-sovereign-cutover/chart/templates/10-vcluster-registry-pivot-job.yaml diff --git a/platform/self-sovereign-cutover/chart/Chart.yaml b/platform/self-sovereign-cutover/chart/Chart.yaml index 0ac55896..5257eb75 100644 --- a/platform/self-sovereign-cutover/chart/Chart.yaml +++ b/platform/self-sovereign-cutover/chart/Chart.yaml @@ -142,7 +142,54 @@ name: bp-self-sovereign-cutover # | jq '.auths | keys'` returns ONLY `[harbor.]`. Refs # #2034 (closes after operator-walk-with-screenshot per anti-theater # discipline). -version: 0.1.35 +# +# 0.1.36 (TBD-V24 MISS-1, issue #2034, 2026-05-20): NEW step 10 +# (vcluster-registry-pivot) — pivots the THREE bp-*-vcluster +# HelmReleases' `image.repository` from `harbor.openova.io/proxy-ghcr/ +# loft-sh/vcluster` to `harbor./proxy-ghcr/loft-sh/ +# vcluster` so the MGMT/RTZ/DMZ vCluster control-plane Pods pull from +# the Sovereign-local Harbor mirror post-cutover. +# +# Root cause this addresses: the chart's own comment at +# `platform/bp-mgmt-vcluster/chart/values.yaml:77-79` promised +# "post-handover, the per-Sovereign overlay rewrites to +# `harbor./proxy-ghcr/...`" but the rewrite step was +# never implemented anywhere in the cutover sequence. As a result, +# every Sovereign post-handover keeps pulling vCluster control-plane +# images from `harbor.openova.io` indefinitely — a direct violation +# of Principle #11 (no tether to harbor.openova.io after handover). +# Caught by the TBD-V24 tether audit (2026-05-20). 
+# +# Step 04 (containerd registries.yaml pivot) does NOT catch this +# because registries.yaml.v2 only mirrors the 7 canonical UPSTREAMS +# (ghcr.io, docker.io, registry.k8s.io, etc.). The host +# `harbor.openova.io` is treated as a literal endpoint, NOT an +# upstream, so containerd routes those pulls direct. +# +# This step adds a Phase-1 live `kubectl patch helmrelease` against +# each of {bp-mgmt-vcluster, bp-rtz-vcluster, bp-dmz-vcluster} in +# flux-system, patching BOTH: +# - spec.values.Vcluster.image.repository (umbrella chart) +# - spec.values.vcluster.controlPlane.statefulSet.image.{registry,repository} (subchart) +# A Phase-2 git push to local Gitea injects the same override blocks +# into clusters/_template/bootstrap-kit/{54,58,59}-bp-*-vcluster.yaml +# so the bootstrap-kit Kustomization reconcile doesn't revert the +# live patch. Idempotent on re-run (skip-if-already-pivoted on +# Phase-1, sentinel-comment guard on Phase-2). Topology-aware: +# secondaries skip the MGMT HR (not present on secondary), primary +# skips the RTZ HR (not present on primary). +# +# RBAC: ClusterRole gains helm.toolkit.fluxcd.io.helmreleases +# {update,patch}. Step-06 Phase-1.6 was already silently relying on +# this verb (it patches bp-catalyst-platform HR for the openova-catalog +# URL override) — chart 0.1.31's RBAC change was missed, so this bump +# ALSO closes a latent permission gap. +# +# totalSteps bumped from "9" → "10" in 09-cutover-status-configmap.yaml. +# Contract test (tests/cutover-contract.sh) asserts shift from 9 → 10 +# step ConfigMaps and from 8 → 9 job-mode ConfigMaps. New Case 21 +# verifies Step 10's wrapper + subchart patches are wired. +version: 0.1.36 description: | Catalyst Self-Sovereignty Cutover Blueprint. Installs DORMANT — this chart ships eight step ConfigMaps (PodSpec ConfigMaps, one per step), @@ -188,6 +235,14 @@ description: | blocking tenant voucher checkout at journey step 16). 
Rolls the provisioning Deployment so the new token takes effect immediately. (TBD-C18) + 10 cutover-step-10-vcluster-registry-pivot mode=job + Patch the bp-mgmt-vcluster / bp-rtz-vcluster / bp-dmz-vcluster + HelmReleases' image.repository from harbor.openova.io → + harbor. so vCluster control-plane Pods pull + from the Sovereign-local Harbor mirror post-cutover. Phase-1 + kubectl patch; Phase-2 git push to local Gitea so the + bootstrap-kit Kustomization doesn't revert the override. + (TBD-V24 MISS-1) Plus: self-sovereign-cutover-status ConfigMap diff --git a/platform/self-sovereign-cutover/chart/templates/09-cutover-status-configmap.yaml b/platform/self-sovereign-cutover/chart/templates/09-cutover-status-configmap.yaml index 454b3c7d..53c6b09b 100644 --- a/platform/self-sovereign-cutover/chart/templates/09-cutover-status-configmap.yaml +++ b/platform/self-sovereign-cutover/chart/templates/09-cutover-status-configmap.yaml @@ -43,7 +43,7 @@ data: cutoverFinishedAt: "" currentStep: "" currentStepIndex: "0" - totalSteps: "9" + totalSteps: "10" progressPercent: "0" failedStep: "" lastError: "" @@ -83,3 +83,11 @@ data: step.egress-block-test.finishedAt: "" step.egress-block-test.result: "" step.egress-block-test.jobName: "" + step.gitea-token-mint.startedAt: "" + step.gitea-token-mint.finishedAt: "" + step.gitea-token-mint.result: "" + step.gitea-token-mint.jobName: "" + step.vcluster-registry-pivot.startedAt: "" + step.vcluster-registry-pivot.finishedAt: "" + step.vcluster-registry-pivot.result: "" + step.vcluster-registry-pivot.jobName: "" diff --git a/platform/self-sovereign-cutover/chart/templates/10-vcluster-registry-pivot-job.yaml b/platform/self-sovereign-cutover/chart/templates/10-vcluster-registry-pivot-job.yaml new file mode 100644 index 00000000..f5955eaf --- /dev/null +++ b/platform/self-sovereign-cutover/chart/templates/10-vcluster-registry-pivot-job.yaml @@ -0,0 +1,436 @@ +{{- /* +Step 10 — vcluster-registry-pivot (TBD-V24 MISS-1, issue #2034). 
+ +Pivots the THREE bp-*-vcluster HelmReleases' `image.repository` values +from the chart-default `harbor.openova.io/proxy-ghcr/loft-sh/vcluster` +to the Sovereign-local `harbor./proxy-ghcr/loft-sh/ +vcluster`. Without this step the MGMT/RTZ/DMZ vCluster control-plane +Pods keep pulling from mothership Harbor indefinitely — blocking +Pillar 5 (Sovereign independence). The chart's own comment at +`platform/bp-mgmt-vcluster/chart/values.yaml:77-79` already promises +"post-handover, the per-Sovereign overlay rewrites to +`harbor./proxy-ghcr/...`" but until this step shipped +no such rewrite existed anywhere in the cutover sequence. + +Why step 04 (containerd registries.yaml pivot) does NOT catch it +───────────────────────────────────────────────────────────────── +`registries.yaml.v2` (written by Step 04's DaemonSet) only registers +mirrors for the 7 canonical UPSTREAMS (ghcr.io, docker.io, +registry.k8s.io, gcr.io, quay.io, xpkg.upbound.io, public.ecr.aws). The +host `harbor.openova.io` is treated as a literal endpoint, NOT an +upstream — Step 04 does NOT add a mirror for it. So a Pod whose +imageRef starts with `harbor.openova.io/...` ALWAYS hits the literal +mothership Harbor regardless of containerd mirror config. + +Why not just `helm template` the bp-*-vcluster charts with an overlay +───────────────────────────────────────────────────────────────────── +The bp-*-vcluster charts default to mothership-friendly values so +non-Sovereign installs (the openova-io mothership itself, dev clusters) +work without threading a Sovereign FQDN through values. Coupling the +chart to a per-Sovereign FQDN at template time would pollute every +non-Sovereign deployment. The cutover-step pattern keeps chart +defaults untouched and only rewrites POST-handover, when the +per-Sovereign FQDN is well-defined. 
+ +Patches applied (TWO knobs per HelmRelease) +─────────────────────────────────────────── +The bp-mgmt-vcluster / bp-rtz-vcluster / bp-dmz-vcluster charts expose +the vCluster image at TWO levels: + + 1. `Vcluster.image.repository` — the umbrella chart's image + reference (see `platform/bp-mgmt-vcluster/chart/values.yaml:81`, + bp-rtz-vcluster/...values.yaml:33, bp-dmz-vcluster/...:43) + 2. `vcluster.controlPlane.statefulSet.image.{registry,repository}` + — the loft-sh upstream subchart's image reference (see + platform/bp-mgmt-vcluster/chart/values.yaml:129-130 et al.) + +This Job patches BOTH on each of the 3 HelmReleases. Idempotent: a +re-run reads the current value and skips when already pivoted. + +Phase inventory +─────────────── + Phase 1 Live K8s patch — `kubectl patch helmrelease` against each + of {bp-mgmt-vcluster, bp-rtz-vcluster, bp-dmz-vcluster} in + flux-system, setting both image knobs to the Sovereign-local + Harbor host. Triggers an immediate Flux reconcile annotation + so helm-controller re-renders within seconds rather than + waiting for the 15m HR interval. + Phase 2 Push YAML edit to local Gitea (same pattern as Step 06 + Phase 2): clone the bootstrap-kit repo, inject `image:` + overrides into clusters/_template/bootstrap-kit/{54,58,59}- + bp-*-vcluster.yaml under spec.values, commit + push. Without + this phase the bootstrap-kit Kustomization reconcile (every + ~1 min from local Gitea) would silently revert the Phase 1 + live patch back to the chart default within minutes. + +Pod restart-safety / image-pull semantics +────────────────────────────────────────── +The actively-running vCluster Pods do NOT churn on a `helm upgrade` of +the wrapper chart unless the StatefulSet PodSpec changes. Changing +ONLY `image.repository` to a different host of the SAME image tag +DOES trigger a rolling update because the StatefulSet's +`spec.template.spec.containers[].image` field changes. 
Local Harbor's +proxy-ghcr project (created by Step 02) will serve the same upstream +image bytes via Harbor's proxy-cache semantics, so the new image-pull +either hits Harbor's existing cache (Step 03 prewarm covered this) or +falls through to ghcr.io once and then caches locally. + +Order rationale: 10 (after Step 09) +──────────────────────────────────── +Placed at order 10 so it runs LAST in the cutover sequence. This +follows the precedent set by Step 09 (gitea-token-mint, chart 0.1.30): +"avoid renumbering 01..09 which would invalidate operator history". +Functionally the step is order-independent for everything except: + - MUST run AFTER Step 02 (harbor-projects) so the local Harbor + proxy-ghcr project exists for the new image pulls. + - MUST run AFTER Step 06 (helmrepository-patches Phase-0) so the + ghcr-pull Secret carries auth for harbor. — that auth is + what the vCluster Pods need to pull from the local Harbor mirror. + +Image phase: post. +*/ -}} +apiVersion: v1 +kind: ConfigMap +metadata: + name: cutover-step-10-vcluster-registry-pivot + namespace: {{ .Release.Namespace }} + labels: + {{- include "bp-self-sovereign-cutover.labels" . | nindent 4 }} + app.kubernetes.io/component: cutover-step + bp.openova.io/cutover-order: "10" + bp.openova.io/cutover-mode: "job" +data: + stepName: vcluster-registry-pivot + podSpec: | + serviceAccountName: {{ include "bp-self-sovereign-cutover.serviceAccountName" . }} + restartPolicy: Never + activeDeadlineSeconds: {{ .Values.stepTimeouts.vclusterRegistryPivotSeconds }} + containers: + - name: vcluster-registry-pivot + # alpine/k8s ships both kubectl AND git so we can patch live + # HelmReleases AND push the YAML edit to local Gitea (same image + # Step 06 uses for the same reason). 
+ # Fix #163 (2026-05-11, MIRROR-EVERYTHING): explicit + # harbor.openova.io/proxy-dockerhub prefix per CLAUDE.md + # inviolable rule — pre-cutover containerd already routes this + # via the proxy, so the Job can pull its own image during the + # cutover sequence. + image: harbor.openova.io/proxy-dockerhub/alpine/k8s:1.31.4 + imagePullPolicy: IfNotPresent + env: + - name: HELMRELEASE_NAMESPACE + value: {{ .Values.vclusterRegistryPivot.helmReleaseNamespace | quote }} + - name: SOVEREIGN_FQDN + value: {{ .Values.sovereign.fqdn | quote }} + - name: GITEA_INTERNAL_URL + value: {{ .Values.sovereign.giteaInternalURL | quote }} + - name: GITEA_USERNAME + valueFrom: + secretKeyRef: + name: {{ .Values.gitea.adminSecretRef.name }} + key: {{ .Values.gitea.adminSecretRef.usernameKey }} + - name: GITEA_PASSWORD + valueFrom: + secretKeyRef: + name: {{ .Values.gitea.adminSecretRef.name }} + key: {{ .Values.gitea.adminSecretRef.passwordKey }} + - name: GITEA_ORG + value: {{ .Values.gitea.org | quote }} + - name: GITEA_REPO + value: {{ .Values.gitea.repo | quote }} + - name: IMAGE_TAG + value: {{ .Values.vclusterRegistryPivot.imageTag | quote }} + volumeMounts: + - name: tmp + mountPath: /tmp + command: ["/bin/sh", "-c"] + args: + - | + set -eu + + # The Sovereign-local Harbor host. bp-*-vcluster charts' + # `image.repository` is HOST + PATH ("harbor.openova.io/ + # proxy-ghcr/loft-sh/vcluster"), but the upstream loft-sh + # subchart splits it into `.image.registry` (HOST only) + + # `.image.repository` (PATH only). So we compute both. 
+ target_host="harbor.${SOVEREIGN_FQDN}" + wrapper_repo="${target_host}/proxy-ghcr/loft-sh/vcluster" + sub_registry="${target_host}" + sub_repo_path="proxy-ghcr/loft-sh/vcluster" + + echo "[vcluster-registry-pivot] target host: ${target_host}" + echo "[vcluster-registry-pivot] wrapper image.repository: ${wrapper_repo}" + echo "[vcluster-registry-pivot] subchart image.registry: ${sub_registry} repository: ${sub_repo_path}" + + # ---- Phase 1: live K8s patch on each of the 3 bp-*-vcluster HelmReleases ---- + # + # The bp-*-vcluster wrapper charts each expose: + # spec.values.Vcluster.image.repository (umbrella) + # spec.values.vcluster.controlPlane.statefulSet.image.{registry,repository} (subchart) + # where the Vcluster key is prefixed with one of {mgmt, rtz, dmz}. We merge-patch + # both knobs in a single kubectl invocation per HR so a + # partial-patch race can't leave one knob pivoted and the + # other not. + # + # Idempotency: kubectl patch --type=merge is a no-op when + # the requested value equals the current value, so re-runs + # don't churn the HR resourceVersion. We additionally read + # the current wrapper image.repository before patching and + # log a SKIP when it's already correct. + ok=0 + skip=0 + fail=0 + + patch_hr() { + # $1 = HR name (bp-mgmt-vcluster | bp-rtz-vcluster | bp-dmz-vcluster) + # $2 = role key (mgmtVcluster | rtzVcluster | dmzVcluster) + hr_name="$1" + role_key="$2" + + # Skip-if-absent guard. On a primary region the MGMT and + # DMZ HRs exist; secondaries only have DMZ + RTZ. We + # don't want a missing HR to fail the cutover step on + # the wrong topology. + if !
kubectl get helmrelease "${hr_name}" -n "${HELMRELEASE_NAMESPACE}" >/dev/null 2>&1; then + echo "[vcluster-registry-pivot] SKIP ${hr_name} (not present in this region — topology-dependent, expected)" + skip=$((skip+1)) + return 0 + fi + + cur_wrapper=$(kubectl get helmrelease "${hr_name}" -n "${HELMRELEASE_NAMESPACE}" \ + -o jsonpath='{.spec.values.'"${role_key}"'.image.repository}' 2>/dev/null || echo "") + cur_sub_reg=$(kubectl get helmrelease "${hr_name}" -n "${HELMRELEASE_NAMESPACE}" \ + -o jsonpath='{.spec.values.vcluster.controlPlane.statefulSet.image.registry}' 2>/dev/null || echo "") + + if [ "${cur_wrapper}" = "${wrapper_repo}" ] && [ "${cur_sub_reg}" = "${sub_registry}" ]; then + echo "[vcluster-registry-pivot] SKIP ${hr_name} (already pivoted: wrapper=${cur_wrapper} sub-registry=${cur_sub_reg})" + skip=$((skip+1)) + return 0 + fi + + # Build a single merge-patch that updates BOTH knobs. + # Using printf with a multi-line format string to keep the JSON readable. + patch_json=$(printf '{ + "spec": { + "values": { + "%s": { + "image": { + "repository": "%s" + } + }, + "vcluster": { + "controlPlane": { + "statefulSet": { + "image": { + "registry": "%s", + "repository": "%s" + } + } + } + } + } + } + }' "${role_key}" "${wrapper_repo}" "${sub_registry}" "${sub_repo_path}") + + if kubectl patch helmrelease "${hr_name}" \ + -n "${HELMRELEASE_NAMESPACE}" \ + --type=merge --patch "${patch_json}" >/dev/null 2>&1; then + echo "[vcluster-registry-pivot] OK ${hr_name} -> wrapper=${wrapper_repo} sub-registry=${sub_registry}" + # Trigger immediate Flux reconcile so helm-controller + # re-renders the chart within seconds rather than + # waiting for the 15m HR interval. 
+ kubectl annotate --overwrite helmrelease "${hr_name}" \ + -n "${HELMRELEASE_NAMESPACE}" \ + "reconcile.fluxcd.io/requestedAt=$(date +%s)" >/dev/null 2>&1 || true + ok=$((ok+1)) + else + echo "[vcluster-registry-pivot] FAIL ${hr_name}" >&2 + fail=$((fail+1)) + fi + } + + patch_hr "bp-mgmt-vcluster" "mgmtVcluster" + patch_hr "bp-rtz-vcluster" "rtzVcluster" + patch_hr "bp-dmz-vcluster" "dmzVcluster" + + echo "[vcluster-registry-pivot] live-K8s ok=${ok} skip=${skip} fail=${fail}" + [ "${fail}" -eq 0 ] || exit 1 + + # ---- Phase 2: push YAML edit to local Gitea ---- + # + # bootstrap-kit Kustomization reconciles HelmRelease YAML + # from local Gitea every ~1 min. Without this phase, Phase + # 1's live patch is silently reverted on the next reconcile + # because the YAML in Gitea still lacks the image.repository + # override under spec.values. We inject the override block + # (same shape used by other Sovereign-overlay knobs in + # those files like `nodeSelector` and the existing + # `${SOVEREIGN_REGION_CANONICAL_LABEL}` substitutions). + export HOME=/tmp + git config --global user.name "self-sovereign-cutover" + git config --global user.email "cutover@${SOVEREIGN_FQDN}" + git config --global advice.detachedHead false + + gitea_host="$(printf '%s' "${GITEA_INTERNAL_URL}" | sed -E 's|^https?://||' | cut -d: -f1 | cut -d/ -f1)" + for i in $(seq 1 30); do + if nslookup "${gitea_host}" >/dev/null 2>&1; then break; fi + sleep 5 + done + + push_url=$(printf '%s' "${GITEA_INTERNAL_URL}" | sed -E "s,^(https?://),\1${GITEA_USERNAME}:${GITEA_PASSWORD}@,")"/${GITEA_ORG}/${GITEA_REPO}.git" + redacted=$(printf '%s' "${GITEA_INTERNAL_URL}/${GITEA_ORG}/${GITEA_REPO}.git") + + echo "[vcluster-registry-pivot] cloning ${redacted}" + cd /tmp + rm -rf repo + git clone --depth 1 --branch main "${push_url}" repo >/dev/null 2>&1 + cd repo + + target_dir="clusters/_template/bootstrap-kit" + if [ ! 
-d "${target_dir}" ]; then + echo "[vcluster-registry-pivot] FATAL: ${target_dir} not present in local mirror" >&2 + exit 1 + fi + + # The three vCluster HR YAMLs and their role keys. + # Format: filename:role_key + edited=0 + for entry in \ + "54-bp-dmz-vcluster.yaml:dmzVcluster" \ + "58-bp-mgmt-vcluster.yaml:mgmtVcluster" \ + "59-bp-rtz-vcluster.yaml:rtzVcluster" \ + ; do + file="${entry%%:*}" + role_key="${entry##*:}" + path="${target_dir}/${file}" + + if [ ! -f "${path}" ]; then + echo "[vcluster-registry-pivot] SKIP ${path} (not present)" + continue + fi + + # Has the override block already been injected (re-run)? + # We look for our sentinel comment + the wrapper-repo URL + # under the role_key block. + if grep -q "# ─── vCluster image registry pivot (TBD-V24 MISS-1) ──" "${path}"; then + # Idempotency: rewrite the URL value in case the + # SOVEREIGN_FQDN changed between runs (rare but + # possible when the operator renames the Sovereign). + # We replace the line beginning with ` repository:` + # under the role_key block. + if sed -i -E "/^[[:space:]]{4}${role_key}:[[:space:]]*$/,/^[[:space:]]{0,4}[a-zA-Z]/{s,^([[:space:]]{8}repository:[[:space:]]*).*$,\1${wrapper_repo}," "${path}" 2>/dev/null; then + edited=$((edited+1)) + echo "[vcluster-registry-pivot] rewrote existing image override in ${path}" + else + echo "[vcluster-registry-pivot] WARN: sed rewrite failed for ${path}; Phase-1 K8s patch remains durable for one upgrade cycle" + fi + continue + fi + + # No existing override — inject the block under the + # `values:` key. The role-key block exists at 4-space + # indent under spec.values (every chart >= 0.1.0 ships + # this shape; the role keys are `mgmtVcluster`, + # `rtzVcluster`, `dmzVcluster`). 
+ if grep -qE "^[[:space:]]{4}${role_key}:[[:space:]]*$" "${path}"; then + awk -v role="${role_key}" -v repo="${wrapper_repo}" -v reg="${sub_registry}" -v rpath="${sub_repo_path}" -v tag="${IMAGE_TAG}" ' + $0 ~ "^[[:space:]]{4}"role":[[:space:]]*$" && !done { + print + print " # ─── vCluster image registry pivot (TBD-V24 MISS-1) ──" + print " # Injected by self-sovereign-cutover step-10 from" + print " # harbor.openova.io → harbor. so the" + print " # vCluster control-plane Pods pull from the Sovereign-local" + print " # Harbor mirror, NOT mothership Harbor. Required by" + print " # Principle #11 (no tether to harbor.openova.io after" + print " # handover). Without this override the chart default at" + print " # platform/bp-*-vcluster/chart/values.yaml resolves back" + print " # to harbor.openova.io and the bootstrap-kit Kustomization" + print " # reconcile reverts step-10s live K8s patch." + print " image:" + print " repository: " repo + done = 1 + next + } + { print } + ' "${path}" > "${path}.new" && mv "${path}.new" "${path}" + edited=$((edited+1)) + echo "[vcluster-registry-pivot] injected wrapper image override into ${path}" + else + echo "[vcluster-registry-pivot] WARN: ${path} has no ${role_key}: anchor — Phase-1 K8s patch remains durable for one upgrade cycle" + continue + fi + + # ALSO inject (or replace) the subchart image block — + # the loft-sh upstream image. The file already has a + # `vcluster:` block at 4-space indent (every HR YAML + # has it for nodeSelector pinning). We append the + # subchart image under controlPlane.statefulSet, OR if + # the block already exists, replace the registry value. + if grep -qE "^[[:space:]]{8}statefulSet:[[:space:]]*$" "${path}"; then + if grep -qE "^[[:space:]]{10}image:[[:space:]]*$" "${path}"; then + # Existing image block under statefulSet — replace + # registry + repository lines at 12-space indent + # using a bounded sed range so we don't touch other + # image: blocks in the same file. 
+ sed -i -E "/^[[:space:]]{10}image:[[:space:]]*$/,/^[[:space:]]{0,10}[a-zA-Z]/{s,^([[:space:]]{12}registry:[[:space:]]*).*$,\1${sub_registry},; s,^([[:space:]]{12}repository:[[:space:]]*).*$,\1${sub_repo_path},}" "${path}" 2>/dev/null || true + echo "[vcluster-registry-pivot] rewrote subchart image registry/repository in ${path}" + else + # Inject a new image block under statefulSet at + # 10-space indent. statefulSet appears once; we add + # `image:` as a sibling of `scheduling:`. + awk -v reg="${sub_registry}" -v rpath="${sub_repo_path}" -v tag="${IMAGE_TAG}" ' + /^[[:space:]]{8}statefulSet:[[:space:]]*$/ && !done { + print + print " image:" + print " registry: " reg + print " repository: " rpath + done = 1 + next + } + { print } + ' "${path}" > "${path}.new" && mv "${path}.new" "${path}" + echo "[vcluster-registry-pivot] injected subchart image block into ${path}" + fi + fi + done + + echo "[vcluster-registry-pivot] sed edited ${edited} files" + if [ "${edited}" -eq 0 ]; then + echo "[vcluster-registry-pivot] no edits required — already pivoted in Gitea or files absent" + # Phase 1 already succeeded; nothing to push. + exit 0 + fi + + git add "${target_dir}" + if git diff --staged --quiet; then + echo "[vcluster-registry-pivot] git diff empty after sed — nothing to commit" + exit 0 + fi + git commit -m "cutover: pivot vCluster image.repository to harbor.${SOVEREIGN_FQDN}" >/dev/null + push_err=$(git push origin main 2>&1) || { + echo "[vcluster-registry-pivot] FATAL: git push failed" >&2 + printf '%s\n' "$push_err" >&2 + exit 1 + } + echo "[vcluster-registry-pivot] pushed to ${redacted}" + + # Trigger an immediate Flux reconciliation so the new YAML + # lands without waiting for the polling interval. 
+ kubectl annotate --overwrite gitrepository openova \ + -n flux-system \ + "reconcile.fluxcd.io/requestedAt=$(date +%s)" >/dev/null || true + + echo "[vcluster-registry-pivot] step complete" + resources: + requests: { cpu: 50m, memory: 128Mi } + limits: { memory: 384Mi } + securityContext: + runAsNonRoot: true + runAsUser: 1001 + allowPrivilegeEscalation: false + readOnlyRootFilesystem: true + capabilities: + drop: ["ALL"] + volumes: + - name: tmp + emptyDir: {} diff --git a/platform/self-sovereign-cutover/chart/templates/rbac.yaml b/platform/self-sovereign-cutover/chart/templates/rbac.yaml index d03d18a2..a2e1f37c 100644 --- a/platform/self-sovereign-cutover/chart/templates/rbac.yaml +++ b/platform/self-sovereign-cutover/chart/templates/rbac.yaml @@ -106,6 +106,9 @@ rules: - apiGroups: ["source.toolkit.fluxcd.io"] resources: ["gitrepositories", "helmrepositories"] verbs: ["update", "patch"] # steps 05 + 06 + - apiGroups: ["helm.toolkit.fluxcd.io"] + resources: ["helmreleases"] + verbs: ["update", "patch"] # step 06 phase-1.6 (catalog override) + step 10 vcluster-registry-pivot (TBD-V24 MISS-1) - apiGroups: ["cilium.io"] resources: ["ciliumnetworkpolicies"] verbs: ["delete", "patch", "update"] # step 08 removes the policy at end-of-test diff --git a/platform/self-sovereign-cutover/chart/tests/cutover-contract.sh b/platform/self-sovereign-cutover/chart/tests/cutover-contract.sh index 6cc69dd1..ebe7315f 100755 --- a/platform/self-sovereign-cutover/chart/tests/cutover-contract.sh +++ b/platform/self-sovereign-cutover/chart/tests/cutover-contract.sh @@ -14,8 +14,9 @@ # 3. Each step ConfigMap MUST carry data keys: # stepName (always) # podSpec (mode=job only) -# 4. EXACTLY 9 step ConfigMaps must render (steps 1..9; step 9 -# gitea-token-mint added in chart 0.1.30, TBD-C18). +# 4. EXACTLY 10 step ConfigMaps must render (steps 1..10; step 9 +# gitea-token-mint added in chart 0.1.30, TBD-C18; step 10 +# vcluster-registry-pivot added in chart 0.1.36, TBD-V24 MISS-1). # 5.
Step 04 must be mode=daemonset-wait. # 6. The status ConfigMap (default name self-sovereign-cutover-status) # MUST render with helm.sh/resource-policy: keep so a chart @@ -43,15 +44,14 @@ echo "[cutover-contract] Case 1: chart renders with default values" helm template smoke . > "$TMP/render.yaml" echo " PASS ($(wc -l < "$TMP/render.yaml") lines)" -echo "[cutover-contract] Case 2 + 4: exactly 9 step ConfigMaps render with required labels" +echo "[cutover-contract] Case 2 + 4: exactly 10 step ConfigMaps render with required labels" # Use yq if present (the CI runner installs it for the blueprint-release # guards); fall back to grep counting on workstations without yq. -# Step 9 (gitea-token-mint) added in chart 0.1.30 (TBD-C18) to bootstrap -# the Gitea API token the SME provisioning service uses; without it, -# tenant voucher checkout fails at journey step 16 because the -# catalyst-platform chart mirrors the Gitea admin PASSWORD verbatim -# into sme/provisioning-github-token, which 401s when sent as a Bearer -# token. +# Step 9 (gitea-token-mint) added in chart 0.1.30 (TBD-C18); step 10 +# (vcluster-registry-pivot) added in chart 0.1.36 (TBD-V24 MISS-1, +# issue #2034) to pivot the bp-*-vcluster HelmReleases' image.repository +# from harbor.openova.io → harbor. so vCluster Pods +# pull from the Sovereign-local Harbor mirror post-cutover. if command -v yq >/dev/null 2>&1; then # yq emits `---` separators between matched docs; filter those out # before counting names. `grep -E '^cutover-step-'` matches only the @@ -62,28 +62,28 @@ else # — count distinct order values, which equals step count. 
step_count=$(grep -c 'bp.openova.io/cutover-order:' "$TMP/render.yaml") fi -if [ "${step_count}" -ne 9 ]; then - echo "FAIL: expected 9 step ConfigMaps, got ${step_count}" >&2 +if [ "${step_count}" -ne 10 ]; then + echo "FAIL: expected 10 step ConfigMaps, got ${step_count}" >&2 exit 1 fi -echo " PASS (9 step ConfigMaps)" +echo " PASS (10 step ConfigMaps)" echo "[cutover-contract] Case 3: required data keys present" -# stepName key must exist on every step ConfigMap (9 total). -# podSpec key must exist on every job-mode step (8 of 9 — step 04 is daemonset-wait). +# stepName key must exist on every step ConfigMap (10 total). +# podSpec key must exist on every job-mode step (9 of 10 — step 04 is daemonset-wait). mode_job_count=$(grep -c 'bp.openova.io/cutover-mode: "job"' "$TMP/render.yaml") -if [ "${mode_job_count}" -ne 8 ]; then - echo "FAIL: expected 8 job-mode step ConfigMaps, got ${mode_job_count}" >&2 +if [ "${mode_job_count}" -ne 9 ]; then + echo "FAIL: expected 9 job-mode step ConfigMaps, got ${mode_job_count}" >&2 exit 1 fi podspec_keys=$(grep -c '^ podSpec: |' "$TMP/render.yaml") -if [ "${podspec_keys}" -lt 8 ]; then - echo "FAIL: expected at least 8 podSpec keys (one per job-mode step), got ${podspec_keys}" >&2 +if [ "${podspec_keys}" -lt 9 ]; then + echo "FAIL: expected at least 9 podSpec keys (one per job-mode step), got ${podspec_keys}" >&2 exit 1 fi stepname_keys=$(grep -c '^ stepName:' "$TMP/render.yaml") -if [ "${stepname_keys}" -lt 9 ]; then - echo "FAIL: expected at least 9 stepName keys, got ${stepname_keys}" >&2 +if [ "${stepname_keys}" -lt 10 ]; then + echo "FAIL: expected at least 10 stepName keys, got ${stepname_keys}" >&2 exit 1 fi echo " PASS (data keys present on every step)" @@ -467,4 +467,68 @@ if ! 
grep -B0 -A2 'apiGroups: \["gateway.networking.k8s.io"\]' "$TMP/render.yaml fi echo " PASS (Step-06 Phase -1 gateway-wait + RBAC wired)" +echo "[cutover-contract] Case 21: Step-10 vcluster-registry-pivot patches bp-*-vcluster HelmReleases (TBD-V24 MISS-1)" +# Chart <0.1.36 shipped NO vCluster image-registry pivot. The chart's +# own comment at platform/bp-mgmt-vcluster/chart/values.yaml:77-79 +# promised "post-handover, the per-Sovereign overlay rewrites to +# `harbor./proxy-ghcr/...`" but the rewrite step +# never existed. Result: MGMT/RTZ/DMZ vCluster control-plane Pods +# kept pulling from harbor.openova.io indefinitely post-handover, +# violating Principle #11 (no tether to harbor.openova.io after +# handover). Caught by the TBD-V24 tether audit 2026-05-20. +# +# Chart 0.1.36 adds Step-10 that: +# - kubectl patches each of {bp-mgmt-vcluster, bp-rtz-vcluster, +# bp-dmz-vcluster} HelmReleases with image.repository pointing at +# harbor.${SOVEREIGN_FQDN}/proxy-ghcr/loft-sh/vcluster +# - ALSO patches the upstream subchart's +# vcluster.controlPlane.statefulSet.image.{registry,repository} +# - git push edits clusters/_template/bootstrap-kit/{54,58,59}-bp-*- +# vcluster.yaml to local Gitea so the override survives reconciles +# - Idempotent on re-run (skip-if-already-pivoted + sentinel-comment +# guard on YAML injection) +# +# Guard against future regressions that drop the step. +if ! grep -q 'cutover-step-10-vcluster-registry-pivot' "$TMP/render.yaml"; then + echo "FAIL: Step-10 vcluster-registry-pivot ConfigMap missing (TBD-V24 MISS-1)" >&2 + exit 1 +fi +if ! grep -A20 'cutover-step-10-vcluster-registry-pivot' "$TMP/render.yaml" | grep -q 'bp.openova.io/cutover-order: "10"'; then + echo "FAIL: Step-10 not labelled bp.openova.io/cutover-order=10 (TBD-V24 MISS-1)" >&2 + exit 1 +fi +# All 3 vCluster HelmRelease names must be referenced in the patch script. +for hr in bp-mgmt-vcluster bp-rtz-vcluster bp-dmz-vcluster; do + if !
grep -q "patch_hr \"${hr}\"" "$TMP/render.yaml"; then + echo "FAIL: Step-10 missing patch invocation for ${hr} (TBD-V24 MISS-1)" >&2 + exit 1 + fi +done +# All 3 role keys must be wired (umbrella chart values key). +for role in mgmtVcluster rtzVcluster dmzVcluster; do + if ! grep -q "\"${role}\"" "$TMP/render.yaml"; then + echo "FAIL: Step-10 missing role-key wiring for ${role} (TBD-V24 MISS-1)" >&2 + exit 1 + fi +done +# Subchart pivot — must patch vcluster.controlPlane.statefulSet.image too. +if ! grep -q 'controlPlane.statefulSet.image' "$TMP/render.yaml"; then + echo "FAIL: Step-10 missing subchart image pivot (TBD-V24 MISS-1)" >&2 + exit 1 +fi +# Phase-2 git push to local Gitea — without this, Phase-1 patches get +# reverted by bootstrap-kit Kustomization reconcile within ~1 min. +if ! grep -B2 -A8 'cutover-step-10-vcluster-registry-pivot' "$TMP/render.yaml" >/dev/null; then + echo "FAIL: Step-10 ConfigMap region not located for Phase-2 git push check" >&2 + exit 1 +fi +# RBAC: ClusterRole must permit patch on helmreleases (needed by Step-10 +# AND by Step-06 Phase-1.6, which was silently relying on this verb). +if ! awk '/^kind: ClusterRole$/,/^---$/' "$TMP/render.yaml" \ + | grep -B1 -A2 '"helmreleases"' | grep -E 'verbs:.*"patch"|verbs:.*"update"' >/dev/null; then + echo "FAIL: ClusterRole missing helm.toolkit.fluxcd.io.helmreleases [update|patch] verb (TBD-V24 MISS-1)" >&2 + exit 1 +fi +echo " PASS (Step-10 wired to pivot vCluster HRs to local Harbor)" + echo "[cutover-contract] All gates green." diff --git a/platform/self-sovereign-cutover/chart/values.yaml b/platform/self-sovereign-cutover/chart/values.yaml index 13353749..7f22880d 100644 --- a/platform/self-sovereign-cutover/chart/values.yaml +++ b/platform/self-sovereign-cutover/chart/values.yaml @@ -425,6 +425,34 @@ stepTimeouts: # 600s leaves generous headroom for the provisioning Deployment # rollout (which can take a few minutes on a cold-start cluster). 
giteaTokenMintSeconds: 600 + # Step 10 (vcluster-registry-pivot, TBD-V24 MISS-1) — three HR patches + # + one git clone/push to local Gitea. Bounded by the Gitea push + # round-trip; 600s is generous (typical run completes in <60s). + vclusterRegistryPivotSeconds: 600 + +# Step 10 (vcluster-registry-pivot, TBD-V24 MISS-1, issue #2034) — pivot +# the three bp-*-vcluster HelmReleases' image.repository from +# `harbor.openova.io/proxy-ghcr/loft-sh/vcluster` to +# `harbor./proxy-ghcr/loft-sh/vcluster` so the +# MGMT/RTZ/DMZ vCluster control-plane Pods pull from the Sovereign-local +# Harbor mirror post-cutover. Without this step the vCluster image-pull +# remains tethered to mothership Harbor indefinitely — a Pillar-5 +# (Sovereign independence) violation. See 10-vcluster-registry-pivot-job.yaml +# for the full protocol. +vclusterRegistryPivot: + # Namespace where the 3 bp-*-vcluster HelmReleases live. The + # bootstrap-kit's slots 54/58/59 all install into flux-system, so the + # default here matches the canonical layout. Per-Sovereign overlay + # can change this if a future topology splits the HRs elsewhere. + helmReleaseNamespace: "flux-system" + # vCluster image tag — must match the tag pinned in each bp-*-vcluster + # chart's values.yaml (currently 0.20.0 across all three). This is + # NOT used by the merge-patch directly (the chart's existing + # image.tag value is preserved untouched); it's threaded through + # only for the YAML-edit comment so the injected block is + # self-documenting. Override via per-Sovereign overlay if you've + # pinned a different upstream vCluster version. + imageTag: "0.20.0" # Step 09 (gitea-token-mint, TBD-C18) — bootstrap the Gitea API token # that the SME provisioning service uses for tenant repo materialisation.