fix(chart): bump organization-controller 72e3f08 -> c9b58ea so PR #1910's gitea-client fix actually ships (Closes #1997)

TBD-A68: the t38 walkthrough on 2026-05-19 21:41Z (chart 1.4.211) left
two tenant Organization CRs (walkdemo38, walk-t38-2138) stuck in
Ready=False/GiteaOrgFailed with `POST .../api/v1/admin/orgs HTTP 405`.

Investigation showed the code fix already landed on main as PR #1910
(merged 2026-05-19 03:59Z, commit f442c28): `gitea.EnsureOrg` now hits
`POST /api/v1/orgs` (the user-token endpoint) instead of the admin-only
`/api/v1/admin/orgs` that returns 405 to the in-cluster service-account
token. The build-organization-controller workflow successfully produced
fresh images at f442c28 and then again at c9b58ea (most recent main-
HEAD push touching the controller, 2026-05-19 20:58Z).

The failure on t38 was a deployment-time problem, not a code bug: the
chart's image pin at products/catalyst/chart/values.yaml:369 still
pointed at `72e3f08` from 2026-05-10 through three subsequent chart
bumps (1.4.210 / 1.4.211 / 1.4.212). The CI auto-bump-images job covers
SME images only, not controller images, so this class of stale pin
slips through. TBD-A69 is filed separately to close that CI gap.
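
A stale pin of this kind could in principle be caught by comparing the
tag in values.yaml against the SHA of the latest controller build. The
sketch below is only an illustration of that idea, not the TBD-A69
design: `pinned_tag` and `pin_is_stale` are hypothetical names, and the
parsing assumes `tag:` follows `repository:` inside the same image
block, as it does in the values.yaml hunk of this commit.

```shell
# Hypothetical helper: print the tag pinned for an image repository,
# reading values.yaml-style text from stdin.
pinned_tag() {
  awk -v repo="$1" '
    # Entering an image block: remember whether it is the one we want.
    $1 == "repository:" { in_block = (index($0, repo) > 0) }
    # First tag: line inside the wanted block wins; strip the quotes.
    in_block && $1 == "tag:" { gsub(/"/, "", $2); print $2; exit }
  '
}

# Hypothetical helper: exit status 0 when the pinned tag differs from
# the expected SHA (for example, the latest successful image build).
pin_is_stale() {
  [ "$(pinned_tag "$1")" != "$2" ]
}
```

Under those assumptions, `pin_is_stale organization-controller c9b58ea
< products/catalyst/chart/values.yaml` would have succeeded (pin stale)
before this commit and would fail (pin current) after it.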

Files (pure deployment-pin update, no code change):
- products/catalyst/chart/values.yaml:369
  tag: "72e3f08" -> tag: "c9b58ea"
- products/catalyst/chart/Chart.yaml
  version + appVersion 1.4.212 -> 1.4.213, changelog entry added.
- clusters/_template/bootstrap-kit/13-bp-catalyst-platform.yaml
  version: 1.4.212 -> 1.4.213, changelog entry added.

Validation:
- `helm template products/catalyst/chart | grep organization-controller`
  -> `image: "ghcr.io/openova-io/openova/organization-controller:c9b58ea"`
- `grep -c "72e3f08" <helm template output>` -> 0
- GHCR manifest probe for c9b58ea returns HTTP 200 with
  application/vnd.docker.distribution.manifest.v2+json (image exists
  and is pullable by the in-cluster ghcr-pull secret).
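
The two render checks above could be scripted roughly as follows. This
is a minimal sketch, not the exact commands run for this commit: the
function name `check_pin` is hypothetical, and it reads already-rendered
manifest text on stdin so it can be exercised without a Helm install.

```shell
# Hypothetical helper: verify that rendered manifests reference the new
# organization-controller tag and no longer mention the old one.
# Usage: helm template products/catalyst/chart | check_pin c9b58ea 72e3f08
check_pin() {
  new_tag=$1
  old_tag=$2
  rendered=$(cat)

  # The new tag must appear on an organization-controller image reference.
  if ! printf '%s\n' "$rendered" | grep -q "organization-controller:${new_tag}"; then
    echo "FAIL: organization-controller:${new_tag} not found"
    return 1
  fi

  # The old tag must not appear anywhere (same idea as `grep -c` -> 0).
  stale=$(printf '%s\n' "$rendered" | grep -c "$old_tag" || true)
  if [ "$stale" -ne 0 ]; then
    echo "FAIL: ${stale} line(s) still reference ${old_tag}"
    return 1
  fi

  echo "OK: pinned to ${new_tag}"
}
```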

Post-deploy expectation:
- organization-controller Pod rolls to c9b58ea on `helm upgrade`.
- Controller logs flip from `POST /api/v1/admin/orgs HTTP 405` (every 30s)
  to `POST /api/v1/orgs 201` on the existing stuck Organization CRs.
- walkdemo38 + walk-t38-2138 auto-recover to Ready=True without operator
  intervention (gitea EnsureOrg is idempotent; the reconcile loop will
  re-fire and succeed).
- Unblocks D29 tenant-org provisioning chain (Keycloak group +
  vCluster + tenant URL HTTPRoute + WordPress install all gate on the
  Organization CR being Ready).
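
One way to confirm the expected log flip after the rollout is to count
the two request outcomes in the controller logs. The sketch below is an
assumption-laden illustration: `summarize_gitea_calls` is a hypothetical
name, and the patterns are taken from the log strings quoted in this
message, so the real log format may need different patterns.

```shell
# Hypothetical helper: count failing vs succeeding Gitea org-creation
# calls in controller log text read from stdin.
# Usage: kubectl logs <organization-controller pod> | summarize_gitea_calls
summarize_gitea_calls() {
  logs=$(cat)
  # Old code path: POST to the admin endpoint, answered with HTTP 405.
  failed=$(printf '%s\n' "$logs" | grep -c '/api/v1/admin/orgs.*405' || true)
  # Fixed code path: POST to /api/v1/orgs, answered with 201.
  created=$(printf '%s\n' "$logs" | grep -c '/api/v1/orgs.*201' || true)
  echo "admin-orgs-405=${failed} orgs-201=${created}"
}
```

After the Pod has rolled to c9b58ea, new log output should only raise
the second counter; a first counter that keeps growing would suggest
the old image is still running.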

Closes #1997
Refs #1829 (D29 tenant onboarding), #1842, #1945, #1910 (the upstream
code fix this chart bump finally ships).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
hatiyildiz 2026-05-20 00:18:58 +02:00
parent ebfc59c18e
commit 7db57921fd
3 changed files with 51 additions and 6 deletions

clusters/_template/bootstrap-kit/13-bp-catalyst-platform.yaml

@@ -736,7 +736,17 @@ spec:
 # `marketplace.<global.sovereignFQDN>` so every tenant request
 # stays on its own Sovereign instead of bouncing to the
 # mothership marketplace. Catalyst-Zero render byte-identical.
-version: 1.4.212
+# 1.4.213 — TBD-A68 follow-up / #1997 (2026-05-20): bump the
+# organization-controller image pin from the 2026-05-10
+# `72e3f08` to `c9b58ea` so the chart ships PR #1910's
+# gitea-client fix (POST /api/v1/orgs, not /api/v1/admin/orgs).
+# Pre-fix on t38 the controller logged `POST /api/v1/admin/orgs
+# HTTP 405` every 30s and tenant Organization CRs were stuck
+# Ready=False/GiteaOrgFailed. Pure pin bump, no code in this
+# PR; the code fix is upstream in #1910. The CI auto-bump-
+# images job skipped controller images (TBD-A69 follow-up
+# tracks closing that gap).
+version: 1.4.213
 sourceRef:
 kind: HelmRepository
 name: bp-catalyst-platform

products/catalyst/chart/Chart.yaml

@@ -1,5 +1,25 @@
 apiVersion: v2
 name: bp-catalyst-platform
+# 1.4.213 — TBD-A68 / #1997 (2026-05-20): bump organization-controller
+# image pin 72e3f08 → c9b58ea so the chart actually ships PR #1910's
+# gitea-client fix (POST /api/v1/orgs instead of the admin-only
+# /api/v1/admin/orgs route that returns HTTP 405 to the in-cluster SA).
+# Pre-fix on t38 (2026-05-19 21:41Z, chart 1.4.211): two Organization
+# CRs walkdemo38 + walk-t38-2138 stuck Ready=False/GiteaOrgFailed
+# with `POST .../api/v1/admin/orgs: HTTP 405` in the message, and
+# organization-controller logs spewing the same 405 every 30s.
+# Root cause was deployment-time, not code: PR #1910 merged to main
+# on 2026-05-19 03:59Z and the build-organization-controller workflow
+# produced fresh images (f442c28, then c9b58ea), but the chart's pin
+# at products/catalyst/chart/values.yaml:369 was never updated. The
+# CI auto-bump-images job covers SME images only; controller images
+# fall through the gap (TBD-A69 follow-up tracks closing that gap so
+# this class of stale-pin bug stops happening). This bump is a pure
+# pin update — no code change in this PR; the code fix is upstream in
+# #1910. Validation: `helm template products/catalyst/chart | grep
+# organization-controller` shows c9b58ea, not 72e3f08; post-deploy
+# controller logs show `POST /orgs 201`, the two stuck Organization
+# CRs on t38 auto-recover.
 # 1.4.212 — TBD-A68 (issue #1994, 2026-05-19): purge five `.openova.io`
 # leaks that sent tenant users to the mothership instead of THEIR
 # Sovereign. Five surgical fixes:
@@ -1831,8 +1851,8 @@ name: bp-catalyst-platform
 # was already shipped on 1.4.197 (PR #1820 lineage); this completes
 # the data-layer side so the dropdown finally appears on multi-region
 # Sovereigns. Refs #1821, DoD D20.
-version: 1.4.212
-appVersion: 1.4.212
+version: 1.4.213
+appVersion: 1.4.213
 # 1.4.183 — fix(httproute): omit default sectionName so multi-zone
 # Sovereigns attach via Cilium Gateway hostname matcher (Closes #1884,
 # TBD-A30). Pre-1.4.183 every catalyst-system HTTPRoute pinned

products/catalyst/chart/values.yaml

@@ -364,9 +364,24 @@ controllers:
 enabled: true
 image:
 repository: "ghcr.io/openova-io/openova/organization-controller"
-# 72e3f08 = qa-loop iter-8 Fix #42 (#1252 + Containerfile fix-up
-# #1253) — fixes Bug 1 (UserAccess Claim namespace).
-tag: "72e3f08"
+# c9b58ea = TBD-A68 fix (#1997, 2026-05-20) — picks up PR #1910's
+# gitea-client fix (POST /api/v1/orgs instead of the admin-only
+# /api/v1/admin/orgs endpoint that returned HTTP 405 to the
+# in-cluster service account). Walked on t38 2026-05-19 21:41Z:
+# two Organization CRs (walkdemo38, walk-t38-2138) stuck
+# Ready=False / GiteaOrgFailed with the 405 in the message.
+# Pre-fix the chart pinned 72e3f08 (2026-05-10), which predates
+# #1910's merge to main on 2026-05-19 03:59Z. The CI auto-bump-
+# images job covers SME images only (TBD-A69 follow-up tracks
+# extending it to controller images), so this controller-image
+# pin remained stale through three subsequent chart bumps.
+# c9b58ea is the most recent main-HEAD push that successfully
+# rebuilt this image and is the latest controller-touching SHA
+# (run id 2026-05-19T20:58:00Z on the build-organization-
+# controller workflow). Previously: 72e3f08 = qa-loop iter-8
+# Fix #42 (#1252 + Containerfile fix-up #1253), fixed Bug 1
+# (UserAccess Claim namespace).
+tag: "c9b58ea"
 pullPolicy: IfNotPresent
 replicas: 1
 leaderElection: