Commit Graph

2662 Commits

Author SHA1 Message Date
hatiyildiz
9774d7ba53 feat(self-sovereign-cutover): add step 10 — pivot vCluster HelmReleases to Sovereign Harbor (Refs #2034)
The chart's own comment at platform/bp-mgmt-vcluster/chart/values.yaml:77-79
promised "post-handover, the per-Sovereign overlay rewrites to
`harbor.<sovereign-fqdn>/proxy-ghcr/...`" — but the rewrite step never
existed anywhere in the cutover sequence. As a result, every Sovereign
post-handover keeps pulling vCluster control-plane images from
`harbor.openova.io` indefinitely, a direct violation of Principle #11
(no tether to harbor.openova.io after handover). Caught by the TBD-V24
tether audit on 2026-05-20.

Why step 04 (containerd registries.yaml pivot) doesn't catch it:
registries.yaml.v2 only mirrors the 7 canonical UPSTREAMS (ghcr.io,
docker.io, registry.k8s.io, gcr.io, quay.io, xpkg.upbound.io,
public.ecr.aws). The host `harbor.openova.io` is treated as a literal
endpoint, not an upstream, so containerd routes those image pulls
direct to mothership Harbor regardless of mirror config.
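
The routing gap described above can be sketched in a few lines. This is a simplified model, not containerd's actual resolver: it assumes the registry host is the first path segment of the image reference, and `pull_endpoint` plus the example hosts are hypothetical names used only for illustration.

```python
# The 7 upstream hosts named in the commit message.
MIRRORED_UPSTREAMS = frozenset({
    "ghcr.io", "docker.io", "registry.k8s.io", "gcr.io",
    "quay.io", "xpkg.upbound.io", "public.ecr.aws",
})


def pull_endpoint(image_ref: str, local_mirror: str) -> str:
    """Return the host a pull for image_ref would be sent to.

    Simplified model: the first path segment is the registry host, and only
    hosts configured as mirrored upstreams are redirected to local_mirror.
    """
    host = image_ref.split("/", 1)[0]
    return local_mirror if host in MIRRORED_UPSTREAMS else host
```

Under this model an image on `ghcr.io` is redirected to the local mirror, while one whose reference names `harbor.openova.io` literally still goes to the mothership, which is why the reference itself has to be rewritten.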

This step adds:
- Phase 1: live `kubectl patch helmrelease` against each of
  {bp-mgmt-vcluster, bp-rtz-vcluster, bp-dmz-vcluster} in flux-system,
  patching BOTH `spec.values.<role>Vcluster.image.repository`
  (umbrella) AND `spec.values.vcluster.controlPlane.statefulSet.image.
  {registry,repository}` (loft-sh subchart). Topology-aware: secondaries
  skip MGMT (not present), primary skips RTZ (not present). Idempotent:
  re-runs no-op when already pivoted.
- Phase 2: git push to local Gitea injecting the same override blocks
  into clusters/_template/bootstrap-kit/{54,58,59}-bp-*-vcluster.yaml
  so the bootstrap-kit Kustomization doesn't revert the live patch on
  next reconcile (same pattern as step 06 Phase 2 + Phase 2.5).
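
A rough sketch of the idempotent Phase 1 rewrite, under stated assumptions: the real step uses `kubectl patch`, whereas this models only the values transformation. It assumes the role key is spelled `<role>Vcluster` (e.g. `mgmtVcluster`) and that the pivot is a plain host swap that keeps the rest of the path. `pivot_values`, `_swap_host` and the example hosts are hypothetical, and topology-aware skipping is not modeled.

```python
import copy

MOTHERSHIP_HOST = "harbor.openova.io"


def _swap_host(value: str, local_host: str) -> str:
    """Replace a leading mothership host with the Sovereign-local one."""
    if value == MOTHERSHIP_HOST:
        return local_host
    if value.startswith(MOTHERSHIP_HOST + "/"):
        return local_host + value[len(MOTHERSHIP_HOST):]
    return value


def pivot_values(values: dict, role: str, local_host: str):
    """Return (new_values, changed) for one HelmRelease's spec.values."""
    new = copy.deepcopy(values)
    # Umbrella field: spec.values.<role>Vcluster.image.repository
    wrapper = new.get(f"{role}Vcluster", {}).get("image", {})
    # Subchart fields: spec.values.vcluster.controlPlane.statefulSet.image.*
    sub = (new.get("vcluster", {}).get("controlPlane", {})
              .get("statefulSet", {}).get("image", {}))
    for image, keys in ((wrapper, ("repository",)),
                        (sub, ("registry", "repository"))):
        for key in keys:
            if isinstance(image.get(key), str):
                image[key] = _swap_host(image[key], local_host)
    return new, new != values
```

Running `pivot_values` a second time on its own output reports `changed == False`, which is the no-op re-run behaviour the step describes.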

Coordination with chart 0.1.34 (TBD-V25, PR #2036, already merged):
totalSteps bumped from "9" → "10" in 09-cutover-status-configmap.yaml.
Contract test (tests/cutover-contract.sh) asserts shift from 9 → 10
step ConfigMaps and from 8 → 9 job-mode ConfigMaps. New Case 21
verifies Step 10's wrapper + subchart patches are wired correctly.

RBAC: ClusterRole gains helm.toolkit.fluxcd.io.helmreleases
{update,patch}. Step-06 Phase-1.6 (the openova-catalog HR patch shipped
in chart 0.1.31) already depended on these verbs, but chart 0.1.31 never
added the matching RBAC rule. This bump therefore ALSO closes a latent
permission gap that would have surfaced on any cluster where that
earlier patch actually had to run.

Operator note: existing actively-running vCluster Pods do NOT churn on
this step — they're already running with images pulled at startup. The
patch ensures the NEXT image-pull (chart bump, Pod restart, region
add) routes through the Sovereign-local Harbor.

Refs #2034 (NOT Closes — operator-walk on fresh prov + screenshot
required per CLAUDE.md §4 anti-theater discipline).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-20 04:18:04 +02:00
e3mrah
9c340efe43
fix(self-sovereign-cutover): strip mothership-side auths from ghcr-pull Secret on cutover (Refs #2034) (#2041)
* fix(self-sovereign-cutover): strip mothership-side auths from ghcr-pull Secret on cutover (Refs #2034)

TBD-V24 MISS-2 — close the credential-hygiene gap flagged by the Pillar-5
Sovereign-independence audit. Pre-cutover the `flux-system/ghcr-pull`
Secret carries auth for `ghcr.io` and `harbor.openova.io` (mothership-
side registries that source-controller and containerd use during cold-
start). Phase-0 of step-06 already MERGES the local Harbor entry in
(chart 0.1.24 / PR #1184) but never STRIPS the original two — leaving
standing creds for upstream registries that the post-cutover cluster
must NOT depend on per CLAUDE.md §3 Principle #11.

This PR adds the strip side. Key choices:

  * Strip list is driven by `.Values.harbor.mothershipAuthsToStrip`
    (defaults: ghcr.io, harbor.openova.io) — never-hardcode per
    INVIOLABLE-PRINCIPLES #4. Operators can extend the list via overlay
    if their Sovereign carries additional mothership-rooted creds.
  * Strip runs in the SAME jq pipeline as the add, so the Secret takes
    a SINGLE resourceVersion bump per Phase-0 invocation (avoids the
    "noisy reflector cascade" the existing idempotency guard already
    protects against).
  * Idempotency check extended: Phase-0 skips entirely only when BOTH
    the local Harbor entry is in place AND every strip target is
    already absent. Re-runs after the initial strip no-op via
    jq `del(.auths[$h])` semantics (deleting a missing key is silent).
  * Defence-in-depth: the strip loop never deletes the local harbor
    host, even if an operator overlay erroneously lists it, because
    deleting it would deadlock Phase-1.
  * POSIX-sh portable: positional-param-array construction via
    `set --` works in the alpine/k8s busybox `ash` the Job uses; no
    bash-only array syntax.
  * `--arg` injection: every strip host lands as a JSON-string operand
    to jq's `del()` filter — never shell-interpolated, so even a
    malicious overlay value is contained.
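
The add-and-strip behaviour listed above can be approximated as follows. This is a Python sketch of the semantics only, not the jq pipeline the Job runs. `rewrite_auths`, its parameters and the example hosts are hypothetical names, and it assumes the decoded Secret payload has the usual `{"auths": {...}}` shape.

```python
import copy


def rewrite_auths(dockerconfig: dict, local_host: str, local_entry: dict,
                  strip_hosts: list) -> tuple:
    """Add the local Harbor entry and strip mothership hosts in one pass.

    Returns (new_config, changed). changed is False when the local entry is
    already in place and every strip target is already absent.
    """
    new = copy.deepcopy(dockerconfig)
    auths = new.setdefault("auths", {})
    auths[local_host] = local_entry
    for host in strip_hosts:
        if host == local_host:
            continue  # never delete the local Harbor entry
        auths.pop(host, None)  # removing an absent key is a silent no-op
    return new, new != dockerconfig
```

Because the add and the strip produce a single new document, a caller would write the Secret at most once per invocation and could skip the write entirely when `changed` is False.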

Verification (Principle #15):
  * `bash tests/cutover-contract.sh` — all 20 contract gates green.
  * Fixture script proves the rendered jq filter takes a 3-auth fixture
    (ghcr.io + harbor.openova.io + new harbor.t99.omani.works) →
    produces a 1-auth result with only harbor.t99.omani.works remaining;
    idempotent on re-run; del() of absent key is a no-op.
  * `go test ./internal/handler/... -count=1 -run Cutover` — cutover
    handler tests pass.
  * Smoke render with overlay-supplied `harbor.mothershipAuthsToStrip`
    list confirms the comma-joined env var picks up overrides.

Chart bump 0.1.34 -> 0.1.35. Bootstrap-kit pin bumped in lockstep.

ORDERING: this fix lives in Phase-0 of step-06 (before Phase-1 URL
rewrites). There is NO dependency on TBD-V24 MISS-1 (the vCluster
image-registry pivot) because the strip operates on the `ghcr-pull`
Secret data plane, not on per-chart `values.yaml`.

NOT closing TBD-V24 — the Pillar-5 claim only flips VERIFIED-PASS
after an operator walks a fresh prov through the full deny-egress
hold (TBD-V23 sibling) and confirms `.auths` contains ONLY the local
harbor host. Operator-walk-with-screenshot per CLAUDE.md §0 anti-
theater discipline.

Refs #2034

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(self-sovereign-cutover): bump blueprint.yaml version pin to 0.1.35 in lockstep with Chart.yaml

---------

Co-authored-by: hatiyildiz <hatice.yildiz@openova.io>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-20 06:13:36 +04:00
github-actions[bot]
56f1e407f0 deploy: update Catalyst marketplace image to 84b35d2
2026-05-20 02:01:31 +00:00
e3mrah
84b35d22c2
feat(marketplace-ui): render configSchema fields on AppDetail (Pillar 1 step 2 unblock) (#2038)
Refs #2026 (TBD-V18). Marketplace AppDetail now renders the
per-instance configSchema declared by the catalog (replicas / disk /
backup for Postgres-backed bundles, replicas / persistence for Redis,
etc.) directly under Description / Features.

Pre-fix Pillar 1 step 2 of the CLAUDE.md §0 deterministic walk
("Click the canonical Postgres-backed bundle → app card opens;
configSchema renders") failed: the catalog Go store carries
`ConfigSchema []ConfigField` (core/services/catalog/store/store.go:55)
and serialises it as `config_schema` over the wire via the embedded
`appResponse` (json/bson tag), but `core/marketplace/src/lib/api.ts::getApps`
mapper dropped the field entirely, so AppDetail.svelte had no per-instance
tunables section.

Root cause: TS interface drift from the Go contract. No backend change
required — the wire already carries the field.

Fix:
  * api.ts — add a `ConfigField` shape mirroring
    `store.ConfigField` one-for-one (key/label/type/default/min/max/
    options/description/advanced) + `configSchema?: ConfigField[]` on
    the `App` interface. getApps mapper reads `a.config_schema`.
  * AppDetail.svelte — render one widget per ConfigField.type
    (int → numeric input with min/max bounds, bool → checkbox,
    enum → select, string/size → text input). 'advanced' fields
    carry a muted badge. Local form state is seeded from per-field
    defaults so the rendered surface is always populated.
  * customer-journey.spec.ts — add `03b` regression: navigate to
    /app?slug=wordpress, assert the section + 3 fields render with
    seeded defaults + 'advanced' badge on the backups_enabled field.
  * Chart.yaml + bootstrap-kit pin — bump 1.4.221 → 1.4.222 in
    lockstep (Inviolable Principle #14).
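
As an illustration of the type-to-widget mapping and default seeding described in the fix list, here is a minimal sketch. It is written in Python rather than the Svelte/TypeScript the PR touches. The widget labels, the `build_form` name and the fallback to a text input for unknown types are assumptions, not taken from the PR.

```python
WIDGET_BY_TYPE = {
    "int": "number-input",
    "bool": "checkbox",
    "enum": "select",
    "string": "text-input",
    "size": "text-input",
}


def build_form(config_schema: list) -> list:
    """Turn config fields into widget descriptors with seeded values."""
    form = []
    for field in config_schema:
        widget = {
            "key": field["key"],
            # Assumption: unknown types fall back to a plain text input.
            "widget": WIDGET_BY_TYPE.get(field.get("type"), "text-input"),
            "value": field.get("default"),  # seed local state from default
            "advanced": bool(field.get("advanced", False)),
        }
        if widget["widget"] == "number-input":
            widget["min"], widget["max"] = field.get("min"), field.get("max")
        form.append(widget)
    return form
```

Seeding `value` from each field's default is what keeps the rendered form populated before the user touches anything.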

Threading customer-chosen values into the install POST is a follow-up
(TBD-V18-D) — this PR's scope is "configSchema renders" only, per
the issue body.

Validation:
  * `npm run build` in core/marketplace — succeeds, AppDetail bundle
    grows 7.43 → 10.31 kB (configSchema + widgets).
  * `helm template products/catalyst/chart` — renders clean.
  * Did NOT use `--dry-run=server` (Inviolable Principle #15).

DoD reminder (anti-theater): operator must walk the surface on a
fresh multi-region prov + screenshot configSchema rendering →
attached as a comment on #2026 before the issue can close. PR body
uses `Refs #N`, NOT `Closes #N`.

Co-authored-by: hatiyildiz <emrah.baysal@openova.io>
2026-05-20 06:00:21 +04:00
github-actions[bot]
749519fa12 deploy: bump sandbox-controller image to 4ac1db1
2026-05-20 01:57:50 +00:00
github-actions[bot]
c7db557055 deploy: bump sandbox-mcp-server image to 4ac1db1
2026-05-20 01:56:19 +00:00
hatiyildiz
f19df1410a deploy(bp-newapi): bump bootstrap-kit pin 1.4.31 -> 1.4.32 (auto, Refs TBD-A6)
Also locksteps platform blueprint.yaml spec.version 1.4.31 -> 1.4.32 (Refs TBD-A20, #1856).
2026-05-20 01:55:41 +00:00
github-actions[bot]
ccc5ae5ec4 deploy: bump bp-newapi upstream v0.13.2 chart 1.4.32
2026-05-20 01:54:56 +00:00
e3mrah
4ac1db1d93
fix(sandbox-controller): add 4 missing SANDBOX_* env vars + LLM_GATEWAY_TOKEN case fix (Refs #2032) (#2037)
* fix(sandbox-controller): add 4 missing SANDBOX_* env vars + LLM_GATEWAY_TOKEN case fix (Refs #2032)

Ships the 4 residual MCP env vars that PR #1987 did not cover (per
the Pillar-4 audit at /tmp/audit-pillar4-deep-wiring-2026-05-20.md
finding D1, tracked in TBD-V21 #2032):

  SANDBOX_TOKEN (P1)        — mounted from the per-Sandbox Secret's
                              LLM_GATEWAY_TOKEN key (same source as the
                              pre-existing LLM_GATEWAY_TOKEN env mount;
                              single source of truth, zero Secret
                              writes per Principle #4). Without this
                              env every marketplace.* tool call from
                              the MCP returned "SANDBOX_TOKEN not set"
                              and blocked Pillar-4 Phase-2 step 2d
                              (qwen-code provisioning an additional
                              app via the marketplace.* family).
  SANDBOX_JWT_SECRET (P1)   — mounted from
                              newapi-bp-newapi-token-signing-key
                              Secret's SIGNING_KEY key (chart default;
                              bp-newapi 1.4.31 extends
                              reflectorNamespaces to include the
                              sandbox-.* regex pattern so
                              emberstack/reflector mirrors the key
                              into every per-Sandbox namespace).
                              Without this env the MCP's auth gate
                              degrades to test-dev mode
                              (registry.go:71): bearer claims are not
                              validated, org-scope + capability gates
                              silently short-circuit.
  SANDBOX_REPOS (P3)        — comma-joined sb.Spec.Repos[].giteaRepo
                              list. Without this gitea.repos.list
                              returns the un-filtered org repo list
                              instead of the per-Sandbox subset.
  SANDBOX_KUBECONFIG (P4)   — intentionally NOT emitted; empty is the
                              canonical in-cluster value per MCP
                              env.go:78.

Also fixes a pre-existing case-mismatch bug at the MCP and pty-server
LLM_GATEWAY_TOKEN / OPENAI_API_KEY secretKeyRef: the key ref was
lowercase 'llm-gateway-token' while the per-Sandbox Secret's stringData
writes uppercase 'LLM_GATEWAY_TOKEN' (newapiTokenSecretTemplate, line
270). With 'optional: true' the mismatch silently no-opped — every
agent CLI spawned in the pty-server shell ran without an LLM bearer,
and every newapi-proxy call from the MCP missed its credential.
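
The silent failure mode is easy to reproduce in miniature. The sketch below mimics how an env entry with a secretKeyRef resolves against a Secret. `resolve_secret_key_ref` is a hypothetical helper, not a Kubernetes API, and it simplifies the non-optional case to an exception.

```python
def resolve_secret_key_ref(secret_data: dict, key: str, optional: bool):
    """Mimic how a secretKeyRef env entry resolves against a Secret's data.

    Returns the value, or None when the key is missing and the ref is
    optional. A missing key on a non-optional ref raises instead.
    """
    if key in secret_data:  # key lookup is case-sensitive
        return secret_data[key]
    if optional:
        return None  # the env var is simply not set; nothing fails
    raise KeyError(f"secret key {key!r} not found")
```

With `optional` set, the lowercase `llm-gateway-token` lookup against a Secret holding `LLM_GATEWAY_TOKEN` yields nothing and raises nothing, which matches the behaviour the commit describes.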

Changes:

  - core/controllers/sandbox/internal/gitops/manifests.go:
    + Add SANDBOX_TOKEN, SANDBOX_JWT_SECRET, SANDBOX_REPOS env vars
      to mcpDeploymentTemplate.
    + Fix LLM_GATEWAY_TOKEN / OPENAI_API_KEY secretKeyRef.key case
      (lowercase 'llm-gateway-token' -> uppercase 'LLM_GATEWAY_TOKEN')
      on BOTH the MCP Deployment AND pty-server StatefulSet.
    + Add JWTSigningKeySecretName, JWTSigningKeySecretKey, SandboxRepos
      fields to Inputs. Render() populates SandboxRepos from in.Repos
      and defaults the JWT Secret coordinates to canonical bp-newapi
      values when unset.

  - core/controllers/sandbox/internal/controller/sandbox_controller_test.go:
    + Extend the regression test to assert the 3 new env vars + the
      LLM_GATEWAY_TOKEN key case + the canonical JWT secret ref. The
      existing negative assertion on bare ORG_ID / SOVEREIGN_FQDN on
      the MCP Deployment is unchanged (those names remain on the
      pty-server for user-shell-inherited agent context — separate
      contract).

  - platform/newapi/chart/values.yaml:
    + Extend sandboxTokenSigningKey.reflectorNamespaces default from
      "catalyst-system,sandbox" to "catalyst-system,sandbox,sandbox-.*"
      so emberstack/reflector mirrors SIGNING_KEY into every per-
      Sandbox namespace. Emberstack reflector treats each comma-
      separated entry as a regex (kubernetes-reflector#162).

  - platform/newapi/chart/templates/sandbox-token-signing-key-secret.yaml:
    + Update fallback in 'default' filter to match new canonical value.

  - platform/newapi/chart/Chart.yaml: 1.4.30 -> 1.4.31.
  - platform/sandbox/chart/Chart.yaml: 0.3.1 -> 0.3.2.
  - clusters/_template/bootstrap-kit/80-newapi.yaml: pin 1.4.30 -> 1.4.31.
  - clusters/_template/bootstrap-kit/19a-bp-sandbox.yaml: pin 0.3.1 -> 0.3.2.
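
To see why the reflectorNamespaces default needed the extra `sandbox-.*` entry, the matching can be sketched like this. It assumes each comma-separated entry is matched against the whole namespace name (anchored semantics). The presence of both `sandbox` and `sandbox-.*` in the new default suggests this, but the commit does not state it. `namespace_allowed` is a hypothetical helper.

```python
import re


def namespace_allowed(namespace: str, allowed: str) -> bool:
    """Check a namespace against a comma-separated list of regex entries."""
    patterns = [p.strip() for p in allowed.split(",") if p.strip()]
    # Assumption: an entry must match the full namespace name.
    return any(re.fullmatch(p, namespace) for p in patterns)
```

Under that assumption, a per-Sandbox namespace such as `sandbox-1234` is rejected by the old default and accepted by the new one.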

Validation:

  - go test ./sandbox/... -count=1: ALL PASS (sandbox controller +
    gitops + idlescaler + sandboxapi + newapi). Includes the extended
    regression test asserting the new env vars on the MCP Deployment.
  - helm dependency update + helm template platform/newapi/chart:
    confirms the rendered Secret carries
    reflection-{allowed,auto}-namespaces:
    "catalyst-system,sandbox,sandbox-.*"
  - helm template platform/sandbox/chart with runtime values: chart
    renders cleanly (no new chart values added; manifests.go defaults
    cover the new secretKeyRef coords).
  - Did NOT use --dry-run=server (its results are misleading, per the
    PR #1933 lesson; Principle #15).

DoD: per CLAUDE.md anti-theater discipline, TBD-V21 #2032 stays OPEN
(Refs, not Closes) until an operator walks the surface on a fresh
prov:
  - kubectl exec -n sandbox-<owner-uid> deploy/openova-sandbox-mcp env
    | grep -E '^SANDBOX_(TOKEN|REPOS|JWT_SECRET)=' returns 3 lines,
    each with a non-empty value
  - A marketplace.* MCP tool/call no longer returns
    "SANDBOX_TOKEN not set"
  - The MCP auth gate fires (a tool/call with no bearer returns 401,
    not silently passes).

Refs #2032
Refs #1986

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix: lockstep bp-newapi blueprint.yaml to 1.4.31 (Refs #2032)

CI manifest-validation flagged lockstep drift between
platform/newapi/blueprint.yaml (1.4.30) and platform/newapi/chart/
Chart.yaml (1.4.31). Bumping blueprint.yaml in lockstep per TBD-A20
(#1856).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Hatice Yildiz <hatice.yildiz@openova.io>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-20 05:54:36 +04:00
e3mrah
2f5937ac0e
fix(self-sovereign-cutover): correct totalSteps from "8" to "9" in cutover status ConfigMap (#2036)
The status ConfigMap shipped by bp-self-sovereign-cutover hardcoded
totalSteps: "8" but the chart has shipped 9 step ConfigMaps since
0.1.30 (TBD-C18 added step 09 gitea-token-mint). The contract test
(tests/cutover-contract.sh:64) already asserts step_count -eq 9, but
the literal in the initial-state ConfigMap was decoupled from that
gate.

Post-trigger this is harmless: catalyst-api overwrites totalSteps with
the live discovered count on /start (cutover.go:763 patches with
strconv.Itoa(len(steps))). Pre-trigger though — between chart install
and the auto-trigger Job firing the /start POST, typically seconds but
up to ~25 min on a slow cold-start cluster — any GET /status returns
totalSteps=8 for 9 actual steps. UIs rendering progress as
`<currentIndex>/<totalSteps>` show the wrong denominator in that window.

Cross-impact on TBD-V13 (#2016) resume logic: NONE. The resume engine
derives totalSteps via len(steps) from live ConfigMap discovery
(cutover.go:1087, 1190, 1221), not from the literal. The literal is
only read for the /status response shape (cutover.go:1371). Resume was
never affected by the off-by-one.
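
The pre-trigger versus post-trigger behaviour described above can be modeled in a few lines. This is a toy model of the ConfigMap, not the Go handlers in cutover.go. `start`, `status` and the step names are hypothetical.

```python
def start(status_cm: dict, steps: list) -> None:
    """Model /start: overwrite totalSteps with the live discovered count."""
    status_cm["totalSteps"] = str(len(steps))


def status(status_cm: dict) -> dict:
    """Model /status: report whatever literal the ConfigMap holds."""
    return {"totalSteps": int(status_cm["totalSteps"])}
```

With the shipped literal of "8" and nine discovered steps, `status` reports 8 until `start` runs and 9 afterwards, which is the window the literal swap closes.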

Single-literal swap (Option B from the audit). Option A (drop the
literal + default from live discovery in HandleCutoverStatus) is
deferred — Option B is the smaller, contract-test-gated fix.

Chart 0.1.33 -> 0.1.34. Blueprint manifest + bootstrap-kit pin bumped
in lockstep (Principle #14).

Refs #2035

Co-authored-by: hatiyildiz <hatice.yildiz@openova.io>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-20 05:44:01 +04:00
github-actions[bot]
d05e981d3c deploy: bump projector image to d74298c
2026-05-20 01:31:28 +00:00
e3mrah
d74298c234
feat(ci): add build-projector workflow + publish to GHCR (unblocks controllers.projector.enabled flip) (#2031)
Adds .github/workflows/build-projector.yaml — the missing CI pipeline
that builds the `core/cmd/projector/` Go binary, publishes it to
`ghcr.io/openova-io/openova/projector:<short-sha>` + `:latest`, signs
with cosign keyless (Sigstore), attests SBOM, then auto-bumps
`controllers.projector.image.tag` in products/catalyst/chart/values.yaml
and dispatches blueprint-release for catalyst chart re-publish.

Why
---
enabled:false audit (V18-B): the projector source landed in
`core/cmd/projector/` with its own Containerfile but NO CI workflow
was ever added to publish the image. That means
`controllers.projector.enabled` CANNOT be flipped on — the chart
template would render an empty `image.tag` and `helm template` would
fail-fast (Inviolable Principle #4a). Every prior attempt at wiring
the CQRS read-side for the NATS event spine (Pillar 1 marketplace +
Pillar 4 sandbox control-plane, per CLAUDE.md §11) silently stalled
here.

Scope
-----
- Adds the CI workflow ONLY.
- Does NOT flip `controllers.projector.enabled` to true — that
  remains a separate chain (TBD-V18-C) that needs the NACK consumer
  installed and JetStream catalystStreams reconciled before the gate
  can flip safely.
- Does NOT bump the bp-catalyst-platform chart version (CI does
  that automatically on the first push-to-main, then dispatches
  blueprint-release).

Sibling-modeled on
------------------
- build-blueprint-controller.yaml (auth flow + auto-bump pattern)
- build-k8s-ws-proxy.yaml (per-cmd go.mod layout + Containerfile)

Both already in production; this PR uses the same Buildx + cosign
keyless + SBOM-attest + values.yaml auto-bump + blueprint-release
dispatch shape — no novel patterns.

Refs TBD-V22 (filed alongside this PR) — projector image-build
pipeline missing.
Refs #1099 — EPIC-4: Cloud Resources / projector.
Refs #1094 — EPIC: Catalyst Phase 0/1 (control-plane).

Co-authored-by: hatiyildiz <noreply@anthropic.com>
2026-05-20 05:29:06 +04:00
github-actions[bot]
7bf19317c4 deploy: update catalyst images to dc968a4
2026-05-20 01:21:16 +00:00
e3mrah
dc968a429c
Merge pull request #2029 from openova-io/fix-tbd-v20-wizard-issue-first-voucher-anti-canon-cta
fix(catalyst-ui): wizard StepSuccess CTA → BSS menu in operator console (kills admin.<fqdn> anti-canon ref)
2026-05-20 05:19:19 +04:00
hatiyildiz
73be865d85 fix(catalyst-ui): wizard StepSuccess CTA → BSS menu in operator console (kills admin.<fqdn> anti-canon ref)
The wizard's terminal "Issue first voucher" CTA in StepSuccess linked to
`https://admin.<sovereign-fqdn>/billing/vouchers/new`. Per CLAUDE.md §0
canon there is no `admin.*` subdomain — voucher + billing operations
live under the BSS menu inside the operator console:

  https://console.<fqdn>/bss/vouchers

The BSS routes are already correctly mounted at router.tsx:1576
(`/bss/vouchers` → VouchersPage with consoleLayoutRoute parent). This
PR points the wizard CTA at them.

Changes:
- products/catalyst/bootstrap/ui/src/pages/wizard/steps/StepSuccess.tsx
    voucherURL now derives from `consoleURL` + `/bss/vouchers` (drops
    the unused `adminURL` computation; updates the doc-comment header).
- products/catalyst/bootstrap/ui/src/pages/wizard/steps/StepSuccess.test.tsx
    3 fixture assertions bumped to the BSS canonical URL.
- products/catalyst/bootstrap/ui/e2e/sme-demo.spec.ts
    stale doc-comment in a skipped fixme test updated for consistency.
- products/catalyst/chart/Chart.yaml
    bp-catalyst-platform 1.4.220 → 1.4.221 with a header entry.
- clusters/_template/bootstrap-kit/13-bp-catalyst-platform.yaml
    pin bumped 1.4.220 → 1.4.221 (lockstep — Principle #14).

Surfaces-only — no API / wire / chart-template changes; image SHAs
unchanged from 1.4.220.

Validated with `helm template products/catalyst/chart/` from a fresh
clone of origin/main (Principle #15 — not --dry-run=server). Templates
clean; no schema regressions.

Refs #2028
2026-05-20 03:16:59 +02:00
e3mrah
a068d210c7
fix(security/kyverno-policies): annotate chart catalyst.openova.io/no-upstream=true (#2023)
The Blueprint-Release CI workflow's "hollow-chart guard" (issue #181)
requires every umbrella chart at `platform/<name>/chart/` to declare
upstream dependencies — OR opt out via the annotation
`catalyst.openova.io/no-upstream: "true"` for charts that legitimately
ship only Catalyst-authored CRs.
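
The guard's rule, as described here, amounts to a two-branch check. The sketch below assumes Chart.yaml has already been parsed into a dict. `hollow_chart_error` is a hypothetical name and the returned message is abbreviated, not the workflow's real output.

```python
OPT_OUT = "catalyst.openova.io/no-upstream"


def hollow_chart_error(chart: dict):
    """Return an error string for a hollow chart, or None if it passes."""
    if chart.get("dependencies"):
        return None  # declares at least one upstream dependency
    annotations = chart.get("annotations") or {}
    if annotations.get(OPT_OUT) == "true":
        return None  # explicit opt-out for Catalyst-authored-CR charts
    return f"Chart {chart.get('name', '<unnamed>')} declares NO dependencies"
```

A chart with neither dependencies nor the annotation fails the check, which is the state bp-kyverno-policies was in after PR #2022.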

bp-kyverno-policies is the latter shape (18+2 ClusterPolicy templates
targeting the kyverno.io CRDs installed by bp-kyverno at slot 27 — no
upstream Helm subchart to bundle). PR #2022 missed this annotation and
the post-merge Blueprint Release run failed with:

  ERROR: Chart platform/kyverno-policies/chart/Chart.yaml declares NO
  dependencies. ... (To opt out for charts that legitimately ship only
  Catalyst-authored CRs, set annotations.catalyst.openova.io/no-upstream:
  "true".)

Adds the annotation. Chart version stays 1.0.0 since no artifact was
published yet (the failed run aborted before `helm push`). The slot pin
in clusters/_template/bootstrap-kit/27a-kyverno-policies.yaml already
points at 1.0.0, so this single Chart.yaml edit retriggers the workflow
on the same version tag.

Same shape as bp-crossplane-claims/chart/Chart.yaml which already opts
out via this annotation.

Refs #2019, Refs #1096

Co-authored-by: hatiyildiz <hatice.yildiz@openova.io>
2026-05-20 04:48:10 +04:00
e3mrah
7f2a121a9a
feat(security/kyverno): split policies into bp-kyverno-policies@1.0.0 Blueprint (Refs #2019) (#2022)
* feat(security/kyverno): split policies into bp-kyverno-policies@1.0.0 Blueprint

Splits the 20 EPIC-1 (#1096) compliance ClusterPolicy templates out of
bp-kyverno (engine umbrella chart) into a dedicated Blueprint
bp-kyverno-policies@1.0.0 with its own HelmRelease, ordered via HR-to-HR
dependsOn on bp-kyverno in the bootstrap-kit Kustomization.

WHY (the bug we're killing):
PR #1138 (2026-05-08) shipped 20 ClusterPolicy templates with
`enabled: false` defaults → dead-on-arrival for 11 days. PR #1933
(2026-05-19) flipped 18 defaults to `enabled: true` + bumped chart
1.1.0 → 1.2.0 + bumped the bootstrap-kit pin — but hit a CRD install-
ordering race on fresh prov t33: ClusterPolicy CRs (in
templates/policies/baseline/*.yaml) and Kyverno CRDs (in upstream
charts/crds/templates/) render in the SAME Helm pass, and the
apiserver's RESTMapper has not yet learned kyverno.io/v1.ClusterPolicy
when Helm applies the ClusterPolicy CRs. PR #1935 reverted ONLY the
bootstrap-kit pin (1.2.0 → 1.1.0) — chart source kept claiming policies
were on by default while the deployed pin pulled an engine-only artifact
with zero policies. "Theater on theater" — founder walk on t34 confirmed
GET /api/v1/sovereigns/<id>/compliance/policies returns `policyCount=0`,
only `useraccess-boundary` (from bp-crossplane-claims) was installed.

The structural fix is splitting the chart so the engine + CRDs reconcile
+ register first, THEN the policy chart applies its CRs cleanly. Audit
mode default = non-blocking (admission still passes, PolicyReport rows
populate). Operators flip individual policies to Enforce per-Sovereign
overlay or via EnvironmentPolicy.spec.compliance.modes (slice C2
controller path — separate work item).

CHANGES:

1. NEW chart `platform/kyverno-policies/chart/`:
   - Chart.yaml: name=bp-kyverno-policies, version=1.0.0, no subchart deps
   - values.yaml: `compliancePolicies:` block moved verbatim from bp-kyverno
     (defaults: 18 enabled+Audit, 2 intentionally OFF — `hubbleFlowsSeen`
     stub for W2 evaluator, `cosignVerified` until operator supplies PEM)
   - templates/baseline/01-..20-*.yaml: 20 ClusterPolicy templates moved
     via `git mv` (preserves blame; preserves PR #1933's 3 operator fixes
     — regex_match JMESPath + operator: Equals for 11/12/19)
   - tests/fixtures/: moved with the policies (fixtures reference policy
     output, not engine output)

2. ENGINE chart `platform/kyverno/chart/`:
   - Chart.yaml: 1.2.0 → 1.2.1 (policies removed, so the engine source no
     longer claims compliance content it does not ship)
   - values.yaml: `compliancePolicies:` block deleted (now lives in
     bp-kyverno-policies)
   - templates/clusterpolicy-mutate-add-openova-labels.yaml + ...require-
     openova-labels.yaml KEPT (engine-coupled mutating policies, EPIC-0
     label-vocab E1/E2, defaults OFF — separate concern from EPIC-1
     compliance library)
   - Empty `templates/policies/` directory removed

3. NEW bootstrap-kit slot `clusters/_template/bootstrap-kit/27a-kyverno-
   policies.yaml`:
   - HelmRelease bp-kyverno-policies pinned at chart `1.0.0`
   - HR-level `dependsOn: [bp-kyverno]` — same-kind, honored by Flux
     (per docs/INVIOLABLE-PRINCIPLES.md #14, cross-kind HR→Kustomization
     dependsOn is silently ignored, so we keep ordering at HR→HR within
     the single bootstrap-kit Kustomization)
   - targetNamespace: kyverno (same as engine — ClusterPolicy is cluster-
     scoped but the umbrella overlay namespacing matches the engine)
   - disableWait: true — Kyverno reports ClusterPolicy Ready asynchronously
     so we don't want downstream HRs stalling on policy-level health

4. UPDATED `clusters/_template/bootstrap-kit/kustomization.yaml`:
   - Added `27a-kyverno-policies.yaml` immediately after `27-kyverno.yaml`

5. BUMPED `clusters/_template/bootstrap-kit/27-kyverno.yaml`:
   - Engine pin 1.1.0 → 1.2.1 (engine-only; install behavior identical
     to 1.1.0 since policies + their values are no longer in this chart)

VALIDATION (Principle #15 — validate against fresh state, not stable state):

  $ helm template bp-kyverno-policies platform/kyverno-policies/chart \
      | grep -c '^kind: ClusterPolicy'
  18
  $ helm lint platform/kyverno-policies/chart && helm lint platform/kyverno/chart
  ==> 1 chart(s) linted, 0 chart(s) failed (both)
  $ helm template bp-kyverno platform/kyverno/chart \
      | grep -c '^kind: ClusterPolicy'
  0   # engine no longer renders any ClusterPolicy CRs
  $ helm package platform/kyverno-policies/chart
  Successfully packaged → bp-kyverno-policies-1.0.0.tgz (20 templates)

  CRD-race REPRODUCED locally without container runtime: applying the
  rendered policy YAML to a cluster WITHOUT Kyverno CRDs returns
    "no matches for kind \"ClusterPolicy\" in version \"kyverno.io/v1\"
     ensure CRDs are installed first"
  for every policy — proving the install-order fix is necessary.

  Full `helm install` from-scratch on Kind blocked locally (no container
  runtime on bastion); the Blueprint-Release CI workflow runs the full
  `helm dependency build` + package + GHCR push pipeline AND a
  `helm template` smoke render at publish time — that is the fresh-state
  Helm install gate before any pin lands.

CI / GHCR (Principle #13):
  Blueprint-Release workflow auto-detects `platform/kyverno-policies/chart/**`
  and publishes `oci://ghcr.io/openova-io/bp-kyverno-policies:1.0.0`
  on push to main. The slot pin in 27a-kyverno-policies.yaml is set to
  `1.0.0` to match (auto-bump-pin step is a no-op when source version
  already matches the slot pin).

DELIBERATELY OUT OF SCOPE:
  - W2 Go evaluator for `hubble-flows-seen` (stub stays a no-op)
  - Cosign publicKey supply path for `cosign-verified`
  - Per-Environment EnvironmentPolicy.spec.compliance.modes enforcement
    flip controller
  - Score-aggregator weight defaults configuration UI
  - `useraccess-boundary` (lives in bp-crossplane-claims, unchanged)

This does NOT close #1096. The EPIC remains open until a fresh-prov walk
shows `kubectl get clusterpolicies -A` returning the 18 baseline policies
+ useraccess-boundary, plus the AppDetail Compliance tab rendering non-
zero policyCount. Founder closes #1096 after that walk.

Refs #1096, Refs #2019, Refs #1929, Refs #1936

* fix(ci): register bp-kyverno-policies in expected-bootstrap-deps.yaml

* fix(blueprints): blueprint.yaml lockstep for kyverno 1.2.1 + add kyverno-policies 1.0.0 blueprint.yaml

---------

Co-authored-by: hatiyildiz <hatice.yildiz@openova.io>
2026-05-20 04:42:29 +04:00
github-actions[bot]
5c2b934295 deploy: bump continuum-controller image to e72efb8
2026-05-20 00:14:25 +00:00
github-actions[bot]
b03deddda0 deploy: bump environment-controller image to e72efb8 2026-05-20 00:14:18 +00:00
github-actions[bot]
470c3950a4 deploy: bump useraccess-controller image to e72efb8 2026-05-20 00:14:12 +00:00
github-actions[bot]
440eb47233 deploy: bump application-controller image to e72efb8 2026-05-20 00:12:52 +00:00
e3mrah
e72efb87cd
chore(ci): add auto-bump-images + pkg/** path filter to all build-*-controller workflows (Closes #2006) (#2012)
TBD-A69. PR #2005 fixed build-organization-controller.yaml only. The
other six controller workflows (application, blueprint, continuum,
environment, sandbox, useraccess) had the same gaps that caused the
#1997 18h deploy gap:

- application-controller: missing pkg/** in path filter (auto-bump
  already present from earlier work).
- blueprint, continuum, environment, useraccess: missing BOTH pkg/**
  path filter AND auto-bump pipeline (permissions promotion +
  values.yaml bump + commit/push + blueprint-release dispatch).
- sandbox: already complete (pkg/** + auto-bump to platform/sandbox
  chart) — left untouched.

Each updated workflow inherits the canonical shape from
build-organization-controller.yaml (PR #2005):

  1. `core/controllers/pkg/**` added to BOTH push.paths and
     pull_request.paths. Without this, a fix that only touches the
     shared HTTP-client tree (gitea/keycloak/kc-mappers) silently
     fails to rebuild the controller image.
  2. `permissions.contents: write` + `actions: write` so the build
     job can push the values.yaml bump and dispatch the downstream
     chart re-publish.
  3. An awk-scoped `Bump controllers.<who>.image.tag in values.yaml`
     step that updates ONLY the targeted controller's tag (verified
     locally — sibling tags remain untouched).
  4. A commit/push step that bumps
     products/catalyst/chart/values.yaml (or
     products/continuum/chart/values.yaml for continuum, which has
     its own chart).
  5. A `gh workflow run blueprint-release.yaml` dispatch so the
     bot-pushed commit fires the downstream chart re-publish
     (GitHub Actions silently filters bot pushes from path-trigger
     workflows otherwise).
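
The scoping described in step 3 can be sketched as follows. This is a minimal Python illustration, not the awk step the workflows actually run; the function name `bump_controller_tag` and the 2-space `controllers.<name>.image.tag` layout are assumptions made for the example.

```python
import re


def bump_controller_tag(values_text: str, controller: str, new_tag: str) -> str:
    """Rewrite only controllers.<controller>.image.tag; siblings stay untouched.

    Assumes a hypothetical values.yaml layout with 2-space indentation:
        controllers:
          <name>:
            image:
              tag: "<value>"
    """
    out = []
    in_controllers = False
    current = None  # controller block we are currently inside, if any
    for line in values_text.splitlines():
        if re.match(r"^[^\s#]", line):
            # A top-level key either opens or closes the controllers section.
            in_controllers = line.startswith("controllers:")
            current = None
        elif in_controllers:
            name = re.match(r"^  (\S[^:]*):\s*$", line)
            if name:
                current = name.group(1)
            elif current == controller and re.match(r"^      tag:", line):
                line = f'      tag: "{new_tag}"'
        out.append(line)
    return "\n".join(out) + "\n"
```

Tracking the enclosing controller name, rather than replacing the first `tag:` line found, is what keeps sibling tags out of the blast radius.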

Adds two new files to lock the shape in:

  - `scripts/check-controller-workflow-uniformity.sh` — a CI
    regression test that grep-asserts every controller workflow has
    the canonical pkg/** filter + auto-bump pipeline. Fails loudly
    if any new controller workflow ships without the canonical shape,
    or if an existing one regresses.
  - `.github/workflows/check-controller-workflow-uniformity.yaml` —
    push-on-touch + pull_request-on-touch event-driven wrapper that
    runs the script. Mirrors the shape of check-vendor-coupling.yaml.

Verified locally:
  - YAML syntax valid for all 7 controller workflows + the new check
    workflow.
  - Regression script passes on all 7 controller workflows.
  - Simulated awk bumps against products/catalyst/chart/values.yaml
    and products/continuum/chart/values.yaml — each script bumps
    ONLY the targeted controller's tag, sibling tags untouched.

No chart bumps. No Go/chart changes. CI-workflow-only.

Co-authored-by: hatiyildiz <hatice.yildiz@openova.io>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-20 04:11:04 +04:00
github-actions[bot]
c194767187 deploy: update catalyst images to e2ba34a 2026-05-20 00:10:59 +00:00
e3mrah
e2ba34a70f
fix(self-sovereign-cutover): make state-resume idempotent across orchestrator restart (#2018)
The bp-self-sovereign-cutover orchestrator got stuck at step 5/9 on t38
2026-05-19 when catalyst-api restarted mid-cutover. The in-process
runCutover goroutine died; the durable status ConfigMap captured the
in-flight state but NOTHING auto-fired the engine on the fresh Pod.
The chart's auto-trigger Helm Job only runs on post-install /
post-upgrade hooks; a catalyst-api Pod restart AFTER the chart is
already installed leaves the cutover stranded. Step 09 (gitea-token-mint)
was never created → PR #2008's provisioning init-container blocked
forever waiting for the cutover-step-09 token annotation → tenant
onboarding flow stuck (Pillar 1 + 4 + 5 blocked).

Root cause (cutover.go, lines 770-790): the engine reads `priorStatus`
on a fresh /start call and skips steps where result==success, but only
HandleCutoverStart / HandleCutoverInternalTrigger can trigger that
code path. No startup hook → no auto-resume. Additionally, in-flight
step rows whose result==running stay "running" forever in the durable
record.

Fix (single PR, no chart changes — purely catalyst-api Go code):

1. Handler.ResumeInterruptedCutover(ctx) — new exported method that
   reads the cutover status ConfigMap, detects in-flight cutovers
   (cutoverComplete=false AND cutoverStartedAt!=""), resets every
   step row whose .result=="running" back to "" (so the engine
   treats it as not-yet-attempted), and spawns runCutover with a
   background context.

2. cmd/api/main.go — call h.ResumeInterruptedCutover(ctx) just before
   ListenAndServe so a startup-resume race against a stale auto-
   trigger Job retry is serialised through the in-process running
   flag (tryStartRun).

3. createCutoverJob — Create-or-Get on AlreadyExists (concurrent
   trigger fires from operator CTA + auto-trigger Job hitting
   catalyst-api simultaneously is now benign).
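
The resume decision in item 1 can be sketched as below. This is an illustrative Python model, not the Go implementation; `plan_resume` and the per-step `name`/`result` keys are assumptions, while `cutoverComplete` and `cutoverStartedAt` are the fields named above.

```python
from typing import Dict, List


def plan_resume(status: Dict) -> List[str]:
    """Return the step names to (re-)run after a restart, mutating `status`.

    `status` stands in for the durable ConfigMap contents.
    """
    in_flight = (not status.get("cutoverComplete", False)
                 and status.get("cutoverStartedAt", "") != "")
    if not in_flight:
        # Complete or never started: leave it to the normal triggers.
        return []
    todo = []
    for step in status["steps"]:
        if step["result"] == "running":
            # An interrupted step is treated as not yet attempted.
            step["result"] = ""
        if step["result"] != "success":
            todo.append(step["name"])
    return todo
```

Resetting `running` rows before planning means the durable record never reports a step as in progress when no goroutine is actually driving it.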

Tests (cutover_test.go):
- TestResumeInterruptedCutover_ResumesAndCompletes — seeds 3-step
  status with step-1 success, step-2 running, step-3 untouched.
  Asserts after resume: step-1 NOT re-run, step-2 re-run, step-3
  run, cutoverComplete=true.
- TestResumeInterruptedCutover_NoOpWhenComplete — already-done
  status produces zero Job creates.
- TestResumeInterruptedCutover_NoOpWhenNeverStarted — empty
  cutoverStartedAt MUST not pre-empt the chart's auto-trigger Job.

Chart bump: bp-catalyst-platform 1.4.219 → 1.4.220 + bootstrap-kit
pin in lockstep (clusters/_template/bootstrap-kit/
13-bp-catalyst-platform.yaml). No bp-self-sovereign-cutover chart
changes — every step PodSpec is already idempotent by design.

Refs #2016

Co-authored-by: hatiyildiz <hatiyildiz@openova.io>
2026-05-20 04:08:00 +04:00
github-actions[bot]
4568298c9b deploy: bump sandbox-mcp-server image to 0ee94cb 2026-05-20 00:06:21 +00:00
e3mrah
0ee94cb7bc
fix(continuum-witness/cfkv): stabilise RenewExtendsTTLAndBumpsGeneration (unblocks pre-existing CI red) (#2014)
Root cause: CFKVClient.Renew compared the server-stamped ExpiresAt
against the client's wall-clock (time.Now()). The Cloudflare Worker
is the timestamping authority — ExpiresAt is in the Worker's clock
frame. Whenever the Worker's clock and the client's wall-clock
diverged (NTP skew, fake-clock tests, or simply the test fixture
clock pinned to 2026-05-09 while CI runs on a later date), the
client's check declared the lease expired and Renew returned
ErrLeaseLost — even though the Worker still considered the lease
healthy.

This caused the Build continuum-controller workflow to go red on every
push since 2026-05-09 with:

  --- FAIL: TestCFKV_ContractSuite/RenewExtendsTTLAndBumpsGeneration
      contract.go:214: Renew: witness: lease lost
  --- FAIL: TestCFKV_ContractSuite/GenerationMonotonicityAcrossOps
      contract.go:298: Renew: witness: lease lost

Fix: remove the client-side wall-clock expiry check. Expiry is
enforced server-side — an expired renew returns 412, which write()
already maps to ErrLeaseHeldByAnother, which the Renew wrapper then
re-maps to ErrLeaseLost. This keeps a single source of truth for
"is the lease alive" (the Worker), avoiding the dual-clock
disagreement. The non-holder early return (cur.Holder != holder ->
ErrLeaseLost) is preserved because it never depended on time.
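
The single-source-of-truth idea can be sketched as follows. This is a Python model with a fake Worker, not the Go client; `FakeWorker`, `put`, and treating a missing lease as lost are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional


class LeaseLost(Exception):
    pass


class LeaseHeldByAnother(Exception):
    pass


@dataclass
class Lease:
    holder: str
    expires_at: float
    generation: int


class FakeWorker:
    """Stand-in for the timestamping authority; owns the only clock that matters."""

    def __init__(self, now: float):
        self.now = now
        self.lease: Optional[Lease] = None

    def put(self, holder: str, ttl: float, expected_generation: int) -> int:
        """Return an HTTP-like status; 412 when the precondition fails."""
        cur = self.lease
        if cur is not None:
            if cur.expires_at <= self.now or cur.generation != expected_generation:
                return 412
        generation = 1 if cur is None else cur.generation + 1
        self.lease = Lease(holder, self.now + ttl, generation)
        return 200


def write(worker: FakeWorker, holder: str, ttl: float, expected_generation: int) -> None:
    if worker.put(holder, ttl, expected_generation) == 412:
        raise LeaseHeldByAnother()


def renew(worker: FakeWorker, holder: str, ttl: float) -> Lease:
    cur = worker.lease
    if cur is None or cur.holder != holder:
        raise LeaseLost()  # non-holder early return; no clock involved
    # Deliberately no comparison of cur.expires_at against the local wall clock.
    try:
        write(worker, holder, ttl, cur.generation)
    except LeaseHeldByAnother:
        raise LeaseLost() from None
    return worker.lease
```

Because the client never consults its own clock, a Worker clock pinned far in the past (as in a test fixture) still renews cleanly, and only the Worker's 412 can declare the lease dead.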

Validation:
- TestCFKV_ContractSuite/RenewExtendsTTLAndBumpsGeneration GREEN
- All 14 contract suite sub-tests GREEN
- ./continuum/internal/witness/cloudflarekv/... -count=10 GREEN
- All ./continuum/... packages GREEN

Refs #2012

Co-authored-by: Emrah Baysal <emrah.baysal@openova.io>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-20 04:04:44 +04:00
e3mrah
271855d419
fix(bp-sandbox): correct default NEWAPI_BASE_URL to actual bp-newapi service name (#2017)
The bp-sandbox chart defaulted `env.newapiBaseURL` to
`http://newapi.newapi.svc.cluster.local:3000`. That assumes the bp-newapi
ClusterIP Service is named bare `newapi`. In practice the canonical
service name rendered by `helm template newapi platform/newapi/chart/
-s templates/service.yaml` is `newapi-bp-newapi`, because
`bp-newapi.fullname` in `platform/newapi/chart/templates/_helpers.tpl`
emits `{Release.Name}-{Chart.Name}` and `clusters/_template/bootstrap-kit/
80-newapi.yaml` sets `releaseName: newapi` against chart `bp-newapi`.
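
The naming rule above can be checked with a small sketch. It is written in Python for illustration only; the helper names `fullname` and `service_url` are hypothetical, and the real `_helpers.tpl` may apply additional rules (such as length truncation) that are not modelled here.

```python
def fullname(release_name: str, chart_name: str) -> str:
    """Mirror the `{Release.Name}-{Chart.Name}` shape described above."""
    return f"{release_name}-{chart_name}"


def service_url(release_name: str, chart_name: str, namespace: str, port: int) -> str:
    """Compose the in-cluster URL for the Service rendered by the chart."""
    host = f"{fullname(release_name, chart_name)}.{namespace}.svc.cluster.local"
    return f"http://{host}:{port}"
```

With `releaseName: newapi` and chart `bp-newapi` in namespace `newapi`, the composed host starts with `newapi-bp-newapi`, not the bare `newapi` the old default assumed.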

The bootstrap-kit overlay at
`clusters/_template/bootstrap-kit/19a-bp-sandbox.yaml` does NOT override
`env.newapiBaseURL`, so every Sovereign's sandbox-controller resolved a
DNS name no Service ever publishes:

  POST /admin/tokens/sandbox → lookup newapi.newapi.svc.cluster.local
  on 10.43.0.10:53: no such host

Walker on t38 (chart 1.4.216, substrate be4f78bc872e2c56, 2026-05-19)
caught the live regression. Every qwen-code Sandbox session failed at
TokenMint, blocking the canonical Pillar-4 customer journey
(console.<orgslug>.omani.homes → Sandbox → qwen-code provisions
additional app).

Fix scope:
- platform/sandbox/chart/values.yaml: default flipped to
  `http://newapi-bp-newapi.newapi.svc.cluster.local:3000`.
- platform/sandbox/chart/templates/deployment.yaml: inline `default` in
  the env block matched.
- platform/sandbox/chart/Chart.yaml: bp-sandbox 0.3.0 -> 0.3.1.
- clusters/_template/bootstrap-kit/19a-bp-sandbox.yaml: pin 0.3.0 ->
  0.3.1 in lockstep (Inviolable Principle #14).

Verification:
- `helm template bp-sandbox platform/sandbox/chart/ -s
  templates/deployment.yaml` with required values set renders the env
  literal `value: "http://newapi-bp-newapi.newapi.svc.cluster.local:3000"`.
- `helm template newapi platform/newapi/chart/ -s templates/service.yaml`
  renders `metadata.name: newapi-bp-newapi`.

DoD per anti-theater discipline (CLAUDE.md §0): issue stays open until a
fresh-prov Sandbox session successfully mints a NewAPI token and reaches
qwen-code. This PR ships the source-of-truth env-var fix only; it does
NOT defensively retry alternate names in the dial path.

Refs #2015

Co-authored-by: hatiyildiz <hatice.yildiz@openova.io>
2026-05-20 03:52:53 +04:00
github-actions[bot]
a7a2a9f5fe deploy: bump sandbox-mcp-server image to 5b44a66
2026-05-19 23:44:37 +00:00
e3mrah
5b44a66991
fix(blueprint-controller): align mode enum with bp-*-vcluster blueprint files (unblocks pre-existing CI red) (#2013)
Two tiers of placement modes coexist in the Blueprint corpus but only
one was registered in the validator + CRD enum, causing
TestValidate_ExistingBlueprintCorpus to fail on the 4 bp-*-vcluster
blueprints since 2026-05-09:

  - Application-tier (marketplace 99%): single-region / active-active /
    active-hotstandby
  - Bootstrap-topology tier (docs/SOVEREIGN-MULTI-REGION-DOD.md A4):
    primary-only / secondary-only / every-region

The 4 affected blueprints (bp-mgmt-vcluster / bp-dmz-vcluster /
bp-rtz-vcluster / bp-vcluster-helmrepo) correctly use the bootstrap-
topology tier — these are NOT operator-selectable; they document
which regions the bootstrap layer auto-installs the chart into.

Extends:
  - validate.go canonicalPlacementModes with the three bootstrap modes
    + inline documentation of the two-tier taxonomy
  - blueprint.yaml CRD enum (placementSchema.modes.items + .default)
    kept in sync per the validator's "must mirror" comment
  - 4 new unit-test cases for the bootstrap-topology modes
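
The two-tier taxonomy can be sketched as a pair of sets. This is a Python illustration, not the contents of validate.go; the constant and function names are assumptions, while the six mode strings are the ones listed above.

```python
from typing import Iterable, List

# Application tier: selectable per Application.
APPLICATION_MODES = {"single-region", "active-active", "active-hotstandby"}

# Bootstrap-topology tier: documents where the bootstrap layer installs a chart.
BOOTSTRAP_TOPOLOGY_MODES = {"primary-only", "secondary-only", "every-region"}

CANONICAL_PLACEMENT_MODES = APPLICATION_MODES | BOOTSTRAP_TOPOLOGY_MODES


def invalid_modes(modes: Iterable[str]) -> List[str]:
    """Return the modes that belong to neither tier, preserving input order."""
    return [m for m in modes if m not in CANONICAL_PLACEMENT_MODES]
```

Keeping the tiers as separate sets and validating against their union lets the validator accept both while the distinction stays visible in the source.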

Result: TestValidate_ExistingBlueprintCorpus 71/71 GREEN
(previously 67/71, 4 FAIL).

Unblocks #2012 and every other PR touching blueprint-controller.

Co-authored-by: hatiyildiz <hatice.yildiz@openova.io>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-20 03:42:54 +04:00
hatiyildiz
3de581c906 deploy(bp-catalyst-platform): bump bootstrap-kit pin 1.4.218 -> 1.4.219 (auto, Refs TBD-A6)
2026-05-19 23:23:16 +00:00
github-actions[bot]
497f782d17 deploy: update sme service images to 59367b2 + bump chart to 1.4.219 2026-05-19 23:22:40 +00:00
e3mrah
59367b2fe5
fix(billing): transactional voucher redemption — only decrement on order.placed success (Closes #2000) (#2011)
t38 walk caught the canonical TBD-V9 bug: customer redeems voucher
WALK-T38-2138 on a 50 OMR order, voucher credit is only 10 OMR, Stripe
is unconfigured in the Sovereign, Checkout returns 503 "payment processor
not configured" — but promo_codes.times_redeemed had already advanced
0→1, promo_redemptions row was inserted, and a credit_ledger grant was
written. Voucher shows "Exhausted 1/1" with no order to show for it; the
customer's one-per-customer promo is silently burned.

Root cause: store.RedeemPromoCode runs its own transaction (necessary
for the FOR UPDATE concurrency cap) and commits the three side effects
up front. The rest of the Checkout pipeline (GetCreditBalance, GetSettings,
CreditOnlyCheckout, Stripe customer + session creation) can fail without
undoing the redemption.

Fix (saga / compensating action):
- store.RollbackPromoCodeRedemption(customerID, code) — single tx that
  DELETEs promo_redemptions, decrements times_redeemed (GREATEST(..,0)
  underflow guarded), and DELETEs the credit_ledger redeem grant (filter
  reason='promo:<code>' AND order_id IS NULL so order spend ledger rows
  are not touched). Idempotent: 0-row DELETE on promo_redemptions
  short-circuits the rest, so re-running a failed checkout never
  double-decrements.
- handlers.Checkout tracks voucherRedeemed and calls
  RollbackPromoCodeRedemption on every downstream failure: settings load,
  Stripe-unconfigured 503 (the t38 walk path), CreateOrder failure,
  Stripe customer rejection, Stripe session rejection, plan-price
  unresolvable.
- Voucher only stays committed once (a) CreditOnlyCheckout commits the
  order+spend+sub transactionally and order.placed fires, or (b) the
  Stripe Checkout Session URL is handed back to the customer (canonical
  abandoned-cart: credit persists on ledger for the next attempt).
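
The compensating action can be sketched against an in-memory database. This is a Python/SQLite illustration, not the Go store code: the table columns in `SCHEMA` are assumptions, and SQLite's two-argument `MAX` stands in for the `GREATEST` guard mentioned above.

```python
import sqlite3

# Hypothetical minimal schema, just enough to exercise the rollback.
SCHEMA = """
CREATE TABLE promo_codes (code TEXT PRIMARY KEY, times_redeemed INTEGER NOT NULL);
CREATE TABLE promo_redemptions (customer_id TEXT, code TEXT);
CREATE TABLE credit_ledger (customer_id TEXT, reason TEXT, order_id TEXT, amount INTEGER);
"""


def rollback_promo_code_redemption(conn: sqlite3.Connection,
                                   customer_id: str, code: str) -> bool:
    """Undo a redemption in one transaction; True if something was undone."""
    if not customer_id or not code:
        return False  # empty-args no-op, no DB hit
    with conn:  # commits on success, rolls back if an exception escapes
        cur = conn.execute(
            "DELETE FROM promo_redemptions WHERE customer_id = ? AND code = ?",
            (customer_id, code),
        )
        if cur.rowcount == 0:
            # Nothing to undo: short-circuit so a retry never double-decrements.
            return False
        conn.execute(
            "UPDATE promo_codes SET times_redeemed = MAX(times_redeemed - 1, 0) "
            "WHERE code = ?",
            (code,),
        )
        # Only the redeem grant is removed; order spend rows carry an order_id.
        conn.execute(
            "DELETE FROM credit_ledger "
            "WHERE customer_id = ? AND reason = ? AND order_id IS NULL",
            (customer_id, f"promo:{code}"),
        )
    return True
```

Using the redemption row as the guard is what makes the rollback idempotent: a second call finds no row to delete and leaves the counter and ledger alone.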

Tests:
- store_test.go: three new tests cover the rollback contract — happy
  path (all three side effects undone in one tx), idempotent no-op
  when no redemption row exists, empty-args no-op (no DB hit).
- checkout_test.go: TestCheckout_VoucherPartialCover_StripeUnconfigured_RollsBackRedemption
  is the t38 regression — full sqlmock walk asserting the rollback tx
  fires before the 503 response.

bp-catalyst-platform Chart.yaml + bootstrap-kit pin bumped 1.4.214 → 1.4.215.

Co-authored-by: Claude Code (hatiyildiz) <claude@openova.io>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-20 03:21:34 +04:00
github-actions[bot]
3b2b9c2ffd deploy: update Catalyst marketplace image to 17ea3f3
2026-05-19 23:12:49 +00:00
e3mrah
17ea3f3873
fix(marketplace): post-checkout redirects to console.<slug>.<pool-tld> not operator console (Closes #2001) (#2010)
TBD-V10 — t38 walk: after successful /redeem + /checkout the customer
was redirected to the operator console URL (`console.<sov-fqdn>`)
instead of the per-tenant console (`console.<slug>.<sov-fqdn>`).

Root cause: `core/marketplace/src/lib/config.ts::deriveConsoleURL`
mapped `marketplace.<sov-fqdn> → console.<sov-fqdn>`, never prepending
the tenant slug. PR #1993 (TBD-A67) restored the `console.` prefix in
the chart-side HTTPRoute (tenant-public-routes.yaml) AND the runtime
organization-controller's tenant_route.go (both emit
`console.<slug>.<parentDomain>` byte-identically), but the marketplace
JS that does the post-checkout redirect never picked up the slug-
prefixed shape.

Fix
---
- `src/lib/config.ts`: `deriveConsoleURL(slug?)` now splices the slug
  as the left-most label when the marketplace host is
  `marketplace.<sov-fqdn>`. Slug source: explicit arg → localStorage
  (`sme-active-org-slug`) → fallback to slug-less operator host.
  Exported pure helper `composeTenantConsoleURL(host, slug)` for
  testability. Mothership (`marketplace.openova.io`) and partner
  vanity hosts unchanged.
- `src/lib/api.ts`: new `setActiveOrgSlug()`. `logout()` clears both
  `sme-active-org-slug` and `sme-checkout-tenant-slug`.
- `src/components/CheckoutStep.svelte`: persist `tenant.slug` to
  `sme-checkout-tenant-slug` BEFORE the Stripe hop so the cross-
  origin return can re-stamp it; call `setActiveOrgSlug(tenant.slug)`
  on credit-covered path; pass the slug through `consoleHref(...,
  { slug })` for the redirect navigation.
- `src/layouts/Layout.astro`: inline returning-user redirect now
  pulls the slug from the live-orgs response (preferring the org
  matching `sme-active-org`) and stamps `sme-active-org-slug` before
  redirecting to `console.<slug>.<sov-fqdn>`.
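
The host mapping can be sketched as a pure function. This is a Python illustration of the behaviour described for `composeTenantConsoleURL`, not the TypeScript in config.ts; returning `None` for non-marketplace hosts and the slug-less URL for the mothership are assumptions, since the source only says those hosts are unchanged.

```python
from typing import Optional

MOTHERSHIP_HOST = "marketplace.openova.io"
MARKETPLACE_PREFIX = "marketplace."


def compose_tenant_console_url(host: str, slug: Optional[str]) -> Optional[str]:
    """Map marketplace.<sov-fqdn> to console.<slug>.<sov-fqdn>.

    Falls back to the slug-less operator host when no slug is available.
    Returns None for hosts outside the marketplace.<...> shape (assumption).
    """
    host = host.strip().lower()
    if not host.startswith(MARKETPLACE_PREFIX):
        return None
    sov_fqdn = host[len(MARKETPLACE_PREFIX):]
    clean_slug = (slug or "").strip().lower()
    if host == MOTHERSHIP_HOST or not clean_slug:
        # Mothership ignores the slug; an empty slug has nothing to splice in.
        return f"https://console.{sov_fqdn}"
    # The slug becomes the label immediately after `console.`.
    return f"https://console.{clean_slug}.{sov_fqdn}"
```

Keeping the function free of `localStorage` or `window` access is what makes the cases from the validation list below checkable in isolation.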

Validation
----------
- `playwright/customer-journey.spec.ts` step 16 extended with the
  brief's exact assertion: `marketplace.omani.homes` + slug `demo`
  → `https://console.demo.omani.homes`. Plus regression guards for
  multi-label sov-fqdn (`marketplace.t38.omani.works` + `acme` →
  `console.acme.t38.omani.works`), mixed-case slug lowercasing, empty/
  null slug falling back to operator host, and mothership ignoring
  the slug.
- `git grep '\.openova\.io"' core/marketplace/src/` returns ZERO new
  hits introduced by this PR (existing references are the tenant
  table for `omantel.openova.io` and the canonical mothership host
  guard — both intentional).
- `npm run build` clean on the affected files (Astro static export
  including CheckoutStep.svelte rebuild).

Chart bump
----------
- products/catalyst/chart/Chart.yaml: 1.4.213 → 1.4.214
- clusters/_template/bootstrap-kit/13-bp-catalyst-platform.yaml pin:
  1.4.213 → 1.4.214

Refs: PR #1993 (TBD-A67 console-prefix chart fix), #1949 (/redeem)

Co-authored-by: hatiyildiz <hatice.yildiz@openova.io>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-20 03:11:51 +04:00
hatiyildiz
d4b995c551 deploy(bp-catalyst-platform): bump bootstrap-kit pin 1.4.215 -> 1.4.216 (auto, Refs TBD-A6)
2026-05-19 23:03:59 +00:00
github-actions[bot]
0785e3470e deploy: update sme service images to b190566 + bump chart to 1.4.216 2026-05-19 23:03:15 +00:00
e3mrah
b190566c40
fix(sme-notification): align JWT signing secret with catalyst-api bridge (Closes #1999) (#2009)
TBD-V8: voucher email never delivered. On t38 canonical walk (agent
a550281a, 2026-05-19 21:37:33Z) operator issued voucher, row persisted,
HTTP 200 returned, but recipient IMAP stayed empty. catalyst-api logs
showed sme/notification returning 401 to the downstream dispatch.

Trace (end-to-end, per docs/INVIOLABLE-PRINCIPLES.md #1):

  FE → catalyst-api → SME gateway → billing → notification

catalyst-api → gateway → billing wire is correct: catalyst-api mints an
HS256 bridge token from the operator's RS256 Keycloak session via
sharedauth.MintSMEAccessToken (signed with the reflector mirror of
sme-secrets/JWT_SECRET into catalyst-system), gateway and billing both
verify HS256 with the same bytes.

billing → notification wire was broken: billing's sendVoucherIssuedEmail
(core/services/billing/handlers/vouchers.go) POSTed with only
Content-Type — NO Authorization header. notification's HTTP surface is
gated by the shared HS256 JWTAuth middleware
(core/services/shared/middleware/jwt.go); a missing header returns 401
silently. The voucher upsert already persisted so the operator saw 200,
but no email ever landed.

TBD-V8's hypothesis ("JWT signing-secret mismatch between catalyst-api and
sme/notification") was incorrect: both Pods already read from the SAME
sme-secrets/JWT_SECRET (chart templates/sme-services/billing.yaml lines
67-71 and notification.yaml lines 47-51, both pointing at the same
secretKeyRef). The real gap was that billing never USED those bytes to
mint an outbound service token.

Fix (Go-side only, no chart-template change):

  1. Add JWTSecret []byte to billing's Handler struct
     (core/services/billing/handlers/handlers.go).
  2. Wire it in core/services/billing/main.go from the same JWT_SECRET
     env the inbound JWTAuth middleware already consumes.
  3. In sendVoucherIssuedEmail, mint a 5-minute HS256 service token
     via sharedauth.MintSMEAccessToken (the SAME helper catalyst-api's
     RS256→HS256 bridge uses, so the wire contract is symmetric) and
     forward it as Authorization: Bearer <token>.
     Claims: sub="sme-billing", role="superadmin", typ="session".
  4. Empty JWTSecret falls back to the legacy no-header path so a stale
     chart that doesn't wire JWT_SECRET into billing doesn't crash the
     voucher upsert (mirrors optional:true on catalyst-api's
     CATALYST_SME_JWT_SECRET secretKeyRef).

Tests:

  - TestIssueVoucher_SendsAuthorizationHeader: exercises the full round-
    trip. Billing mints with the test bytes; we re-parse the captured
    token with the SAME bytes (the exact path notification's JWTAuth
    middleware takes on receive) and assert claim shape — sub, role,
    typ, exp. Pre-fix the captured request had no Authorization header
    so this would have failed at the first check.
  - TestIssueVoucher_NoAuthHeader_WhenJWTSecretUnset: back-compat guard
    for the legacy no-secret path.
  - All pre-existing TestIssueVoucher_* tests still pass.

Chart bumped 1.4.213 → 1.4.214 and bootstrap-kit pin in
clusters/_template/bootstrap-kit/13-bp-catalyst-platform.yaml updated
to match.

Validation:
  - go test ./core/services/billing/... → PASS (3 packages)
  - helm template products/catalyst/chart --set
    ingress.marketplace.enabled=true → both sme/billing and
    sme/notification Deployments read JWT_SECRET from
    secretKeyRef.name=sme-secrets, key=JWT_SECRET.

Refs #1842 (D28 voucher email arrival), #1829 (D29 customer journey).

Co-authored-by: hatiyildiz <hatice.yildiz@openova.io>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-20 03:02:02 +04:00
github-actions[bot]
886c48d4e2 deploy: update catalyst images to cdd7eac
2026-05-19 22:45:02 +00:00
e3mrah
cdd7eac20a
fix(bp-sme): wait for gitea user-bootstrap before provisioning starts (Closes #2002) (#2008)
TBD-V11 / Issue #2002. On a fresh t38 provision, sme/provisioning Pod logged
`HTTP 401 user does not exist [uid: 0, name: ""]` on the first tenant
Org CR creation. Root cause: provisioning Pod started with the chart's
first-install placeholder GITHUB_TOKEN (the Gitea admin password mirrored
verbatim by provisioning-github-token.yaml, enough to clear
ContainerConfigError but NOT a valid Gitea API token). Step 09 of bp-self-
sovereign-cutover later mints a real API token + patches the Secret
+ rollout-restarts the Pod, but the FIRST tenant journey always 401'd
because the Pod was already serving with the bad placeholder.

Approach (B): add an init container `wait-for-cutover-token` to the
SME provisioning Deployment that polls the Secret for the cutover
annotation `catalyst.openova.io/token-source: self-sovereign-cutover-
step-09` (stamped by Step 09 alongside the minted token bytes). The
Pod stays in Init:0/1 until Step 09 has actually completed, then the
main container starts with a guaranteed-valid token. Default poll
budget = 10s × 180 = 1800s (covers Hetzner cold-start ~18m + slack).

Why NOT HelmRelease.dependsOn:
- Per Principle #14, HR.dependsOn → Kustomization is silently ignored.
- bp-self-sovereign-cutover HR is dormant + disableWait:true: it goes
  Ready=True at install BEFORE Step 09's Job actually runs. Adding it
  to bp-catalyst-platform.dependsOn would buy nothing.
- Pod-level init gating waits on the actual condition (Secret
  annotation set by Step 09), not on a proxy.

Why NOT change bp-self-sovereign-cutover trigger order:
- Step 09 must run AFTER bp-catalyst-platform creates the Secret
  (otherwise the patch has no target). Reordering would break the
  inverse dependency.

Why NOT a Job that bootstraps the user upfront:
- Step 09 already mints the token; we don't need a second bootstrap.
- The bug is timing, not absence of bootstrap.

Files changed:
- products/catalyst/chart/templates/sme-services/provisioning.yaml:
  add initContainers block gated on
  smeServices.provisioning.waitForCutoverToken.enabled (default true).
  Re-uses existing `provisioning` SA (already has secrets get/list/watch
  in `sme` ns via sme-provisioning ClusterRole — no new RBAC).
- products/catalyst/chart/values.yaml: add
  smeServices.provisioning.waitForCutoverToken.{enabled,image,
  intervalSeconds,timeoutSeconds} block.
- products/catalyst/chart/Chart.yaml: bump 1.4.213 → 1.4.214 with
  full TBD-V11 changelog entry.
- clusters/_template/bootstrap-kit/13-bp-catalyst-platform.yaml: bump
  HelmRelease pin 1.4.213 → 1.4.214 (chart bump only delivers the fix
  when the pin moves — TBD-A68 / 1.4.213 precedent).

Validation:
- `helm template` Sovereign-mode render shows the init container in
  the provisioning Deployment with kubectl-poll loop.
- Default-values smoke render unaffected (gate is
  ingress.marketplace.enabled=true; smoke uses defaults where false).
- `helm lint products/catalyst/chart/` passes.
- Contabo-Zero render path safe by construction (chart only renders
  the Deployment when ingress.marketplace.enabled=true; contabo
  doesn't enable marketplace via this chart).

Closes #2002. Refs #1829 (D29 tenant materialisation gate).

Co-authored-by: hatiyildiz <hatice.yildiz@openova.io>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-20 02:43:01 +04:00
github-actions[bot]
a090477aa1 deploy: bump organization-controller image to 8d8ce40
2026-05-19 22:33:22 +00:00
github-actions[bot]
102521cb71 deploy: bump sandbox-controller image to 8d8ce40 2026-05-19 22:33:13 +00:00
github-actions[bot]
f9ea3c9c9e deploy: bump sandbox-mcp-server image to 8d8ce40 2026-05-19 22:32:00 +00:00
e3mrah
8d8ce40045
fix(build-organization-controller): add missing auto-bump pipeline + pkg/** path filter + wire-level test (Refs #1997) (#2005)
Followup hardening for #1997 (PR #2004 catch-up bumped the
organization-controller chart pin to c9b58ea). PR #2004 unblocks t38
right now, but the underlying cause — `build-organization-controller.yaml`
has no auto-bump step and its path filter misses `core/controllers/pkg/**`
— is still live and will re-strand the next gitea-client fix the
moment it lands. This PR closes both gaps so the bug cannot recur.

Two surgical additions:

  1. `.github/workflows/build-organization-controller.yaml`
     a. Promote `permissions.contents: read` → `write` (+ `actions:
        write`), mirroring `build-application-controller.yaml`.
     b. Add `Bump controllers.organization.image.tag in values.yaml`
        step (awk-scoped to the `organization:` block only — cannot
        accidentally bump a sibling controller's tag).
     c. Add `Commit and push values.yaml bump` step (rebase-safe,
        skip-if-no-change).
     d. Add `Dispatch blueprint-release for chart re-publish` step
        — anti-recursion bypass for the GH-Actions rule that bot
        pushes don't fire downstream workflows. Without this the
        rebuilt image is NEVER baked into a new chart version.
     e. Add `core/controllers/pkg/**` to push + pull_request path
        filters. The shared HTTP-client tree (gitea, keycloak,
        kc-mappers, …) is COPYed into every Group C controller's
        image via the Containerfile, so a change to it MUST rebuild.
        PR #1910 only triggered a rebuild because it happened to
        also touch `organization_controller_test.go`; a pure pkg/
        fix would silently skip the workflow.

  2. `core/controllers/pkg/gitea/client_test.go`
     New `TestCreateOrg_HitsOrgsEndpointWithAuth` — wire-level
     regression guard that:
     - Fails hard if the client EVER hits `/api/v1/admin/orgs` (would
       catch a refactor accident that re-introduces the Gitea 1.22+
       405 bug regardless of which chart pin is deployed).
     - Asserts the request is `POST /api/v1/orgs` exactly once.
     - Asserts the request carries `Authorization: token <hex>` with
       the exact expected value (defense-in-depth: even if the URL
       is right, Gitea 1.22+ still returns 405 without the token).
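
A wire-level guard of the kind described in item 2 might look roughly like the sketch below. It is not the repository's `TestCreateOrg_HitsOrgsEndpointWithAuth`: `createOrg` is a hypothetical stand-in for the Gitea client call, `checkWire` is an illustrative helper, and the token value is made up. It uses the standard `net/http/httptest` server to observe which path the client actually hits.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync/atomic"
)

// createOrg is a hypothetical stand-in for the client's org-creation call.
func createOrg(c *http.Client, baseURL, token string) (int, error) {
	req, err := http.NewRequest(http.MethodPost, baseURL+"/api/v1/orgs", nil)
	if err != nil {
		return 0, err
	}
	req.Header.Set("Authorization", "token "+token)
	resp, err := c.Do(req)
	if err != nil {
		return 0, err
	}
	defer resp.Body.Close()
	return resp.StatusCode, nil
}

// checkWire runs createOrg against a fake server and returns a description
// of the first contract violation, or "" when the wire contract holds.
func checkWire(token string) string {
	var orgHits, adminHits, authOK int32
	want := "token " + token
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if r.URL.Path == "/api/v1/admin/orgs" {
			atomic.AddInt32(&adminHits, 1)
			w.WriteHeader(http.StatusMethodNotAllowed)
			return
		}
		if r.Method == http.MethodPost && r.URL.Path == "/api/v1/orgs" {
			atomic.AddInt32(&orgHits, 1)
			if r.Header.Get("Authorization") == want {
				atomic.AddInt32(&authOK, 1)
			}
			w.WriteHeader(http.StatusCreated)
			return
		}
		w.WriteHeader(http.StatusNotFound)
	}))
	defer srv.Close()

	status, err := createOrg(srv.Client(), srv.URL, token)
	switch {
	case err != nil:
		return "request failed: " + err.Error()
	case atomic.LoadInt32(&adminHits) != 0:
		return "client hit /api/v1/admin/orgs"
	case atomic.LoadInt32(&orgHits) != 1:
		return "expected exactly one POST /api/v1/orgs"
	case atomic.LoadInt32(&authOK) != 1:
		return "missing or wrong Authorization header"
	case status != http.StatusCreated:
		return fmt.Sprintf("unexpected status %d", status)
	}
	return ""
}

func main() {
	if msg := checkWire("0123abcd"); msg != "" {
		panic(msg)
	}
	fmt.Println("wire contract holds")
}
```

Counting hits on the server side, instead of inspecting the client's URL string, is what makes the guard independent of how the client assembles its request.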

Sibling controllers (environment, blueprint, useraccess, …) likely
have the same missing-auto-bump + missing-pkg/** path filter. NOT
fixing them in this PR — blast-radius discipline. Follow-up
recommended: audit every `build-*-controller.yaml` for both gaps.

Validation:
  • go vet ./pkg/gitea/... — clean
  • go test -race ./pkg/gitea/... — ok, all pre-existing + new tests pass
  • go test -run TestCreateOrg_HitsOrgsEndpointWithAuth -v — PASS

Refs #1997 (PR #2004 closed the immediate symptom; this PR closes
the deploy gap so #1997 cannot recur)
Refs #1910 (the original /admin/orgs → /orgs code fix)
Refs #1829 (D29 customer journey hardening)

Co-authored-by: hatiyildiz <hatice.yildiz@openova.io>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-20 02:29:59 +04:00
e3mrah
1ca37ea7f8
fix(bp-valkey): default auth.enabled=false to match bp-newapi passwordless REDIS_CONN_STRING (Closes #2003) (#2007)
Pre-1.0.2 bp-valkey shipped `valkey.auth.enabled: true` (bitnami default)
while bp-newapi's REDIS_CONN_STRING default was the passwordless URL
`redis://valkey-primary.valkey.svc.cluster.local:6379`. On every
freshly-franchised Sovereign the newapi Pod CrashLoopBackOff'd 45x on
the Redis ping probe with `NOAUTH Authentication required` — caught
on t38 sandbox walk 2026-05-20. This is the Pillar-4 verifier-killing
bug for the Sandbox + qwen-code + MCP end-user DoD (#1986).

Approach A (simpler, this PR): flip bp-valkey's default to
`auth.enabled: false` so the upstream bitnami chart exports
`ALLOW_EMPTY_PASSWORD=yes` to the Valkey container. Verified via
`helm template` — the render now contains:

    - name: ALLOW_EMPTY_PASSWORD
      value: "yes"

Other in-cluster consumers tolerate the change:
  - products/catalyst sme-services (auth.yaml + gateway.yaml) read
    VALKEY_PASSWORD via `secretKeyRef ... optional: true` and fall
    back to the no-auth connect path in
    core/services/shared/db/valkey.go when the value is empty.
  - products/catalyst projector wraps the password Secret mount in
    `{{- with .Values.services.projector.valkey.passwordSecret }}`
    so an absent Secret simply skips the password env var.
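
The optional-password fallback described for the other consumers can be sketched as below. This is an assumption-laden illustration, not the code in core/services/shared/db/valkey.go: `valkeyURL` is a hypothetical helper, and only the host name and the VALKEY_PASSWORD variable come from the commit message.

```go
package main

import (
	"fmt"
	"net/url"
	"os"
)

// valkeyURL builds a redis:// URL and omits the credentials section
// entirely when the password is empty (the no-auth connect path).
func valkeyURL(hostPort, password string) string {
	u := url.URL{Scheme: "redis", Host: hostPort}
	if password != "" {
		// Empty username, password only.
		u.User = url.UserPassword("", password)
	}
	return u.String()
}

func main() {
	// With auth.enabled=false the env var is unset and the URL is passwordless.
	host := "valkey-primary.valkey.svc.cluster.local:6379"
	fmt.Println(valkeyURL(host, os.Getenv("VALKEY_PASSWORD")))
}
```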

Approach B (deferred): make bp-newapi mirror the bp-valkey
auto-generated password Secret into the newapi namespace and template
it into REDIS_CONN_STRING. Larger scope, tracked under #2003 follow-up.

Changes:
  - platform/valkey/chart/values.yaml — auth.enabled: true → false
  - platform/valkey/chart/Chart.yaml — version 1.0.1 → 1.0.2
  - platform/valkey/blueprint.yaml — spec.version + configSchema default
  - clusters/_template/bootstrap-kit/17-valkey.yaml — chart pin 1.0.1 → 1.0.2

Verified:
  - `helm dependency build` succeeds (bitnami/valkey 5.5.1 unchanged)
  - `helm template` renders `ALLOW_EMPTY_PASSWORD=yes` on the Pod
  - tests/observability-toggle.sh — all 4 cases PASS

Closes #2003
Refs #1986

Co-authored-by: hatiyildiz <catalyst@openova.io>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-20 02:26:56 +04:00
github-actions[bot]
c45d10a1d9 deploy: update catalyst images to f90d697
2026-05-19 22:23:46 +00:00
e3mrah
f90d697846
fix(chart): bump organization-controller 72e3f08 -> c9b58ea so PR #1910's gitea-client fix actually ships (Closes #1997) (#2004)
TBD-A68: t38 walkthrough on 2026-05-19 21:41Z (chart 1.4.211) put two
tenant Organization CRs (walkdemo38, walk-t38-2138) into
Ready=False/GiteaOrgFailed with `POST .../api/v1/admin/orgs HTTP 405`.

Investigation showed the code fix already landed on main as PR #1910
(merged 2026-05-19 03:59Z, commit f442c28): `gitea.EnsureOrg` now hits
`POST /api/v1/orgs` (the user-token endpoint) instead of the admin-only
`/api/v1/admin/orgs` that returns 405 to the in-cluster service-account
token. The build-organization-controller workflow successfully produced
fresh images at f442c28 and then again at c9b58ea (most recent main-
HEAD push touching the controller, 2026-05-19 20:58Z).

The bug on t38 was deployment-time: the chart's image pin at
products/catalyst/chart/values.yaml:369 still pointed at `72e3f08`
from 2026-05-10 across three subsequent chart bumps (1.4.210 / 1.4.211
/ 1.4.212). The CI auto-bump-images job covers SME images only, not
controller images, so this class of stale pin slips through. Filing
TBD-A69 separately to close that CI gap.

Files (pure deployment-pin update, no code change):
- products/catalyst/chart/values.yaml:369
  tag: "72e3f08" -> tag: "c9b58ea"
- products/catalyst/chart/Chart.yaml
  version + appVersion 1.4.212 -> 1.4.213, changelog entry added.
- clusters/_template/bootstrap-kit/13-bp-catalyst-platform.yaml
  version: 1.4.212 -> 1.4.213, changelog entry added.

Validation:
- `helm template products/catalyst/chart | grep organization-controller`
  -> `image: "ghcr.io/openova-io/openova/organization-controller:c9b58ea"`
- `grep -c "72e3f08" <helm template output>` -> 0
- GHCR manifest probe for c9b58ea returns HTTP 200 with
  application/vnd.docker.distribution.manifest.v2+json (image exists
  and is pullable by the in-cluster ghcr-pull secret).

Post-deploy expectation:
- organization-controller Pod rolls to c9b58ea on `helm upgrade`.
- Controller logs flip from `POST /api/v1/admin/orgs HTTP 405` (every 30s)
  to `POST /api/v1/orgs 201` on the existing stuck Organization CRs.
- walkdemo38 + walk-t38-2138 auto-recover to Ready=True without operator
  intervention (gitea EnsureOrg is idempotent; the reconcile loop will
  re-fire and succeed).
- Unblocks D29 tenant-org provisioning chain (Keycloak group +
  vCluster + tenant URL HTTPRoute + WordPress install all gate on the
  Organization CR being Ready).

Closes #1997
Refs #1829 (D29 tenant onboarding), #1842, #1945, #1910 (the upstream
code fix this chart bump finally ships).

Co-authored-by: hatiyildiz <hatice.yildiz@openova.io>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-20 02:21:46 +04:00
e3mrah
ebfc59c18e
fix(bp-flux-stuck-hr-recovery): grant helmreleases/status patch RBAC + log stderr (Closes #1995) (#1998)
* fix(bp-flux-stuck-hr-recovery): grant helmreleases/status patch RBAC + log stderr (Closes #1995)

Agent ae9d7638 verifying PR #1991 on t38 (2026-05-19 21:18Z) found
the bp-flux-stuck-hr-recovery CronJob correctly detected bp-alloy in
`Ready=Unknown for 427s, history[0].status=deployed` state, entered
the TBD-A66 branch B, and attempted the patch — but the in-Pod
`kubectl patch hr --subresource=status` silently failed because its
stderr was swallowed by `2>&1` into the same /dev/null pipe as
stdout. A manual identical patch from bastion succeeded immediately,
so RBAC was not the blocker.

Investigation: the 1.2.3 ClusterRole already grants `helmreleases`
+ `helmreleases/status` patch+update verbs (it was added in PR #1991
to enable the new branch in the first place). The actual root cause
of the silent failure was a diagnostic blind spot: the script could not
distinguish a successful patch from a failing one, so the
human-readable `RECOVER ... — patching` log line emitted in both
cases.

Fix (1.2.4):
- Capture `kubectl patch --subresource=status` stderr to a tempfile
  under /tmp (the writable emptyDir mount) so multi-line apiserver
  errors survive intact.
- Emit three structured `[A66]` log lines that operators / agents
  can grep:
    detection: `[A66] HR <ns>/<name> Ready=Unknown for <age>s,
                history[0]=deployed → attempting patch`
    success:   `[A66] HR <ns>/<name> patched to Ready=True`
    failure:   `[A66] HR <ns>/<name> patch FAILED: <stderr>`
- Same treatment for the annotation-rollback path so a stuck
  idempotency annotation can also be diagnosed.
- Add Case 8 to leader-election-and-recovery.sh asserting:
    * detection / success / failure log lines render in the script
    * the `>/dev/null 2>&1` pattern is no longer on the critical
      `kubectl patch --subresource=status` line
    * stderr is captured via `mktemp /tmp/a66-patch-err.XXXXXX`

Chart 1.2.3 -> 1.2.4; bootstrap-kit pin 03-flux.yaml bumped in
lockstep (bootstrap-kit pin-sync check passes for bp-flux).

Refs #1989 (TBD-A66). Closes #1995.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(bp-flux): bump blueprint.yaml spec.version 1.2.3 → 1.2.4 in lockstep with Chart.yaml

manifest-validation's TestBootstrapKit_BlueprintCardsHaveRequiredFields +
TestBootstrapKit_BlueprintVersionLockstepSweep require blueprint.yaml
spec.version to track Chart.yaml version exactly (TBD-A20 / #1856). This
was forgotten in the previous commit.

Refs #1995.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: hatiyildiz <hatiyildiz@users.noreply.github.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-20 01:55:14 +04:00
github-actions[bot]
9183bc938f deploy: update catalyst images to 3d20ee3
2026-05-19 21:40:14 +00:00
e3mrah
3d20ee35bf
fix: purge 5 .openova.io leaks — tenant users now reach their Sovereign not mothership (Closes #1994) (#1996)
Five surgical fixes for TBD-A68 (#1994) — every tenant-facing URL the
catalyst-api / SPA / chart could emit now follows the Sovereign FQDN
the deployment is bound to, instead of hardcoding the mothership host.

1. products/catalyst/bootstrap/api/internal/handler/auth.go
   PIN email plaintext + HTML bodies now read SOVEREIGN_FQDN env via a
   new pinEmailLoginURL() helper. Chroot mode (SOVEREIGN_FQDN set)
   emits `https://console.<fqdn>/login`; mothership mode keeps the
   historical `https://console.openova.io/sovereign/login`. The HTML
   visible-link text is also derived from the resolved host.

2. core/console/src/lib/config.ts
   MARKETPLACE_URL / CHECKOUT_URL / MARKETPLACE_HOME_URL now lazy-
   resolve via resolveMarketplaceOrigin() — Astro public env
   `PUBLIC_MARKETPLACE_ORIGIN` first, runtime `window.location.host`
   second (strip `console.<slug>?` + prepend `marketplace.`), legacy
   `https://marketplace.openova.io` fallback for SSR snapshots.

3. products/catalyst/chart/templates/sme-services/configmap.yaml
   CORS_ORIGIN_PUBLIC / CORS_ORIGIN_ADMIN / CORS_ORIGIN_GATEWAY /
   PUBLIC_BASE_URL / PUBLIC_API_BASE_URL / CNAME_TARGET /
   CHECKOUT_SUCCESS_URL / CHECKOUT_CANCEL_URL now templated against
   `marketplace.<global.sovereignFQDN>` + sibling platform zone.
   Catalyst-Zero render (no sovereignFQDN, no host override) keeps
   the legacy `sme.openova.io` byte-identical so contabo's existing
   CORS / public URLs don't drift.

4. products/catalyst/chart/templates/sme-services/notification.yaml
   Notification Deployment's CORS_ORIGIN env now sources from the
   shared `sme-services-config.CORS_ORIGIN_PUBLIC` key instead of
   hardcoding `https://sme.openova.io`. Per-Sovereign FQDN
   substitution flows through automatically.

5. Regression test:
   TestPinEmail_SovereignFQDNRoutesLoginURL in auth_pin_test.go covers
   both modes (chroot routes to sovereign console; mothership keeps
   openova.io target) and asserts the HTML body never routes tenant
   traffic through openova.io when SOVEREIGN_FQDN is set.
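
The login-URL selection described in item 1 can be sketched as below. This is a simplified illustration, not the code in auth.go: `loginURL` is a hypothetical name standing in for `pinEmailLoginURL()`, and it takes the FQDN as a parameter so both modes are easy to exercise. The two URL shapes and the SOVEREIGN_FQDN variable come from the commit message.

```go
package main

import (
	"fmt"
	"os"
)

// loginURL returns the Sovereign console login URL when an FQDN is set
// (chroot mode) and the historical mothership URL otherwise.
func loginURL(sovereignFQDN string) string {
	if sovereignFQDN != "" {
		return "https://console." + sovereignFQDN + "/login"
	}
	return "https://console.openova.io/sovereign/login"
}

func main() {
	// An unset SOVEREIGN_FQDN yields "", which selects mothership mode.
	fmt.Println(loginURL(os.Getenv("SOVEREIGN_FQDN")))
}
```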

Validation:
- `helm template products/catalyst/chart --set global.sovereignFQDN=t38.omani.works`
  renders ZERO openova.io strings in CORS / PUBLIC_BASE_URL / CHECKOUT
  keys. Catalyst-Zero render preserves the legacy sme.openova.io paths.
- `go test ./internal/handler/` passes in 101.4s (full suite + new
  TestPinEmail regression test).

Chart bump: bp-catalyst-platform 1.4.211 -> 1.4.212 + bootstrap-kit
pin in clusters/_template/bootstrap-kit/13-bp-catalyst-platform.yaml.

Closes #1994

Co-authored-by: hatiyildiz <claude@anthropic.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-20 01:38:01 +04:00
hatiyildiz
675e863082 deploy(bp-catalyst-platform): bump bootstrap-kit pin 1.4.210 -> 1.4.211 (auto, Refs TBD-A6)
2026-05-19 21:00:59 +00:00