Commit Graph

2662 Commits

Author SHA1 Message Date
hatiyildiz
a52bda30cb docs(pass-9b): retry banners on harbor / falco / sigstore / syft-grype
Pass 9's commit ea81c38 only landed banners on grafana + kyverno —
the harbor / falco / sigstore / syft-grype edits failed because the
Edit tool requires a Read pass per file before write. Now Read'd
and applied:

- harbor: per-host-cluster registry, pointer to PLATFORM-TECH-STACK §3.5.
- falco: per-host-cluster runtime security, pointer to §3.3 + SRE §10
  (SIEM/SOAR pipeline).
- sigstore: cosign signing chain on every Blueprint OCI artifact,
  Kyverno admission verifies signatures.
- syft-grype: CI-side SBOM + runtime CVE matching.

Pass 9 now complete.

Refs #37
2026-04-27 21:41:22 +02:00
hatiyildiz
ea81c38e15 docs(pass-9): role-in-Catalyst banners on grafana / harbor / falco / kyverno / sigstore / syft-grype
Pass 9 — six more component READMEs got Catalyst-role banners
matching the rule of thumb in CLAUDE.md (every platform/<x>/README.md
should state its role in Catalyst).

- grafana: observability stack on every host cluster; Catalyst's
  own self-monitoring + Application telemetry flows here.
- harbor: per-host-cluster container registry for Catalyst images,
  mirrored Blueprint OCI artifacts, customer images.
- falco: runtime security on every host cluster; feeds SIEM/SOAR.
- kyverno: policy engine on every host cluster; enforces Catalyst
  policy contracts (cosign on Blueprints, default-deny NetworkPolicies
  on Organization namespaces, priority-class injection).
- sigstore: cosign-signed Blueprint OCI artifacts + admission
  verification chain on every host cluster.
- syft-grype: SBOM generation in CI per Blueprint + runtime CVE scans.

Plus Kyverno priority-class clarification: prose around `tenant-high`
/ `tenant-default` / `tenant-batch` priority class names now reads
"Organization workloads" instead of "tenant workloads", with an
explicit note that the priority class artifact names themselves stay
as-is until a separate migration ticket renames them in deployed
clusters (renaming PriorityClass objects requires recreate, not
in-place rename).

VALIDATION-LOG: Pass 9 entry added.

Refs #37
2026-04-27 21:40:51 +02:00
hatiyildiz
14ed84de41 docs(pass-8): role-in-Catalyst banners + dead-link fix in component READMEs
Pass 8 — line-by-line read of platform/cnpg, platform/strimzi,
platform/k8gb, platform/keycloak, platform/cert-manager, platform/cilium.

CNPG and Strimzi: read in full and confirmed clean — they correctly
position themselves as Application Blueprints and don't drift from
the canonical model. CNPG's `<org>-postgres-dr` cluster name
(Application-tier database role) is acceptable per NAMING-CONVENTION
§1.3 (which only forbids primary/dr in K8s host-cluster names, not
in Application-internal CRD names).

Four READMEs updated:

k8gb:
- Header reframed: per-host-cluster infrastructure pointer to
  PLATFORM-TECH-STACK §3.1 and SRE §2.4 split-brain protection.
- Removed dead link to ../failover-controller/docs/ADR-FAILOVER-
  CONTROLLER.md (the failover-controller folder has no docs/);
  replaced with link to that component's README + SRE §2.4.

keycloak:
- Header reframed from "FAPI Authorization Server for Open Banking"
  (narrow) to "User identity for Catalyst Sovereigns" (broad).
  Keycloak handles ALL user identity in Catalyst, not just FAPI.
- Added per-Org / per-Sovereign topology callout matching SECURITY
  §6. Clarified that "Multi-tenant TPP" refers to PSD2 Third Party
  Providers, not Catalyst's Organization-level multi-tenancy.
- FAPI features kept since Keycloak still serves Fingate as the
  FAPI Authorization Server.

cert-manager:
- Header reframed as per-host-cluster infrastructure with pointer
  to PLATFORM-TECH-STACK §3.3.

cilium:
- Header reframed as per-host-cluster infrastructure with pointer
  to PLATFORM-TECH-STACK §3.1, including the install-first note
  (CNI must come before any other workload during Phase 0).

VALIDATION-LOG: Pass 8 entry added.

Refs #37
2026-04-27 21:39:03 +02:00
hatiyildiz
a5ffa1a716 docs(pass-7): align Gitea + Flux multi-region story; fix broken mermaid id
Continuing Pass 7 cleanup after the OpenBao/ESO rewrite (42aeb62).

Gitea README:
- Was describing "Bidirectional mirroring for multi-region" with two
  Gitea instances mirroring repos cross-region. Wrong: Catalyst's
  agreed model has one Gitea per Sovereign on the management cluster
  (PLATFORM-TECH-STACK §2.3). Replaced the multi-region mirror
  diagram with a single-Gitea + intra-cluster HA topology and added
  a "Why not cross-region bidirectional mirror" explainer (write-
  conflict semantics would break EnvironmentPolicy enforcement).
- Status banner: notes the canonical references.
- Backup section: removed "Repository mirror for redundancy"
  (replaced with Velero scheduled backups).

Flux README:
- "Multi-Region GitOps" section was showing one Gitea per region
  with bidirectional mirror. Replaced with one Gitea per Sovereign
  topology. Per-vcluster Flux pulls from this single Gitea.

Mermaid syntax bug:
- Earlier mass replace_all of "Catalyst IDP" → "Catalyst console"
  had left an invalid mermaid node identifier
  `Catalyst console[Catalyst console]` (mermaid forbids spaces in
  node IDs). Fixed to `Console[Catalyst console]`. Would have
  rendered as a broken diagram on GitHub.

VALIDATION-LOG: Pass 7 entry added documenting the OpenBao/ESO
active-active rewrite (the most consequential drift fix in any pass).

Refs #37
2026-04-27 21:36:20 +02:00
hatiyildiz
42aeb629bb docs(pass-7): rewrite OpenBao + ESO READMEs to match agreed multi-region semantics
Pass 7 — line-by-line read of platform/openbao/README.md and
platform/external-secrets/README.md found a major architectural drift:
both files described an OLD active-active bidirectional sync model
that contradicts docs/SECURITY.md §5 (the canonical reference).

The active-active design was rejected during the architecture session
because it would have been a stretched cluster — a single region's
network blip would block writes everywhere. The agreed model is:

- Independent Raft cluster per region (intra-region quorum only).
- Single-primary writes; replicas accept reads only.
- Async Performance Replication primary → replicas (lag <1s typical).
- Explicit DR promotion (sovereign-admin or failover-controller).

Fixes:

platform/openbao/README.md:
- Overview: removed "active-active deployments" / "either region can
  update secrets". Replaced with "independent Raft cluster per region",
  "asynchronous Performance Replication".
- Architecture diagram: replaced bidirectional-push diagram with the
  primary→replicas async perf replication topology that matches
  SECURITY.md §5.
- ClusterSecretStores: simplified from "two stores (local+remote)" to
  "one local store"; reads always pull locally.
- Renamed "PushSecret (Bidirectional)" → "Writes go to the primary
  region" with a single-target PushSecret pointing at bao-primary.
- Added DR promotion section pointing at SECURITY.md §5.2.
- Status banner: notes that the canonical multi-region reference is
  SECURITY.md.

platform/external-secrets/README.md:
- Header line: repositioned as per-host-cluster infrastructure with
  pointer to PLATFORM-TECH-STACK §3.3.
- Removed broken link to non-existent ../openbao/docs/ADR-OPENBAO.md
  (replaced with link to ../openbao/README.md).
- "Multi-region sync | Push to both OpenBao instances simultaneously"
  → "Multi-region reads | Async perf replication".
- "PushSecret to Multiple OpenBao Instances" example was writing to
  two ClusterSecretStores in parallel — replaced with single-target
  primary write.
- "Multi-region sync via single PushSecret" in Consequences →
  "Cross-region availability via Performance Replication".
- Mermaid sequence diagram: "Bootstrap Wizard" actor → "Catalyst
  Bootstrap (Phase 0)"; "Terraform" → "OpenTofu"; ESO connection
  description "via K8s auth" → "via SPIFFE SVID (workload identity)".

These were the most consequential drift fixes found in any pass —
two READMEs were documenting an architecture explicitly rejected by
the agreed model.

Refs #37
2026-04-27 21:34:09 +02:00
hatiyildiz
8072b012b9 docs: record Pass 6 entry in VALIDATION-LOG 2026-04-27 21:30:59 +02:00
hatiyildiz
fec0c342a8 docs(pass-6): reconcile topology diagram + unify JetStream Account scoping
Pass 6 — fresh-eyes line-by-line read of ARCHITECTURE.md. Found two
internal contradictions that earlier passes missed.

ARCHITECTURE §3 (topology diagram) listed Crossplane, Flux, Harbor,
and grafana-stack INSIDE the Catalyst control plane block. But §11
(Catalyst-on-Catalyst) explicitly says these are per-host-cluster
infrastructure, NOT Catalyst control-plane components. PLATFORM-TECH-
STACK §3 also classifies them as per-host-cluster.

Fixed: §3 topology diagram now shows only true Catalyst control-plane
components (console, marketplace, admin, catalog-svc, projector,
provisioning, environment-controller, blueprint-controller, billing,
gitea, nats-jetstream, openbao, keycloak, spire-server, observability)
and adds a separate line for "Plus per-host-cluster infrastructure"
that defers to PLATFORM-TECH-STACK §3 for the full list (Cilium, Flux,
Crossplane, cert-manager, ESO, Kyverno, Harbor, Reloader, Trivy, Falco,
Sigstore, Syft+Grype, VPA, KEDA, External-DNS, k8gb, Coraza, MinIO,
Velero, failover-controller). Also added the previously-missing
`provisioning` row.

JetStream Account scoping was contradictory:
- ARCHITECTURE §5 said "Per-Org account: ws.{org}-{env_type}.>" —
  reads ambiguously: is the Account per-Org or per-Env?
- NAMING-CONVENTION §11.2 said "One JetStream Account scoped to
  ws.{org}-{env_type}.>" — implied per-Environment.
- GLOSSARY + PLATFORM-TECH-STACK + SECURITY all say per-Organization.

Reconciled to the per-Org-Account-with-per-Env-subjects model:
- Account isolation: ONE NATS Account per Organization.
- Subjects within the Account use prefix `ws.{org}-{env_type}.>` for
  per-Environment partitioning.

This is the cleanest isolation model: Accounts are NATS' strongest
isolation boundary (per-Org); subjects partition further within each
Account (per-Env).

Refs #37
2026-04-27 21:30:03 +02:00
hatiyildiz
7298a7ddca docs(pass-5c): add VALIDATION-LOG.md — trail of multi-pass integrity work
Concluding the validation loop with a process artifact. The new file
records:

- Why the validation existed (post-rewrite trust verification).
- Each pass's scope and concrete fixes (16 iterations across Pass 1
  + sweeps in Passes 2/3/4/5).
- The acceptance criteria as runnable grep commands so any future
  contributor can re-verify.
- Authorship convention (hatiyildiz, per-commit identity flags).
- Re-validation cadence (after rewrites, after new banned terms,
  after component renames, quarterly drift check).

Linked from README.md docs table.

This file is meant as a playbook for the next validation, not a
status snapshot — for status, IMPLEMENTATION-STATUS.md remains
canonical.

Refs #37
2026-04-27 21:27:40 +02:00
hatiyildiz
ba048d2fd7 docs(pass-5b): scrub remaining "instance" usages where "Application" is meant
Two user-facing residuals where the banned product term "instance"
slipped through:

- docs/ARCHITECTURE.md §9: example console dialog "Use existing
  instance or create a dedicated one?" → "Use an existing Postgres
  Application or create a new dedicated one?". This is a UI prompt
  text — must use the user-facing noun "Application", not "instance".

- docs/NAMING-CONVENTION.md §6.2 tag comment: "Application instance
  name" → "Application name within the Environment". The CRD might
  internally still use the noun Instance for class-vs-instance
  semantics, but in tag annotations and user-visible context the
  Application IS the instance.

Other "instance" occurrences confirmed legitimate (Postgres instance
as Crossplane resource type, Flux instance as software deployment,
EC2/Hetzner instance as cloud-provider terminology) and retained.

Final cross-reference check: all Markdown links across all canonical
docs resolve. No residual banned terms.

Refs #37
2026-04-27 21:26:27 +02:00
hatiyildiz
79c59a27a2 docs(pass-5): reconcile Phase-0 install order, IMPLEMENTATION-STATUS section numbering
Pass-5A — fresh-eyes deep read found two structural drifts.

ARCHITECTURE §10 Phase-0 install order:
- Old order: cert-manager → Cilium → Flux → ... → Catalyst control plane.
- SOVEREIGN-PROVISIONING §3 has the correct order: Cilium first
  (CNI must be in place before pods can network), THEN cert-manager.
- ARCHITECTURE updated to match: Cilium → cert-manager → Flux →
  Crossplane → Sealed Secrets → SPIRE → JetStream → OpenBao →
  Keycloak → Gitea → Catalyst control plane (11 items, matching
  the SOVEREIGN-PROVISIONING list which had Keycloak and Gitea
  spelled out separately).

IMPLEMENTATION-STATUS section numbering:
- Old: §1 → §2 → §2bis → §3 → §4 → §5 → §6 → §7 → §8.
  The "§2bis" was a workaround for inserting per-host-cluster
  infrastructure without renumbering. Reads weird.
- New: §1 → §2 → §3 → §4 → §5 → §6 → §7 → §8 → §9. Clean numbering.

Refs #37
2026-04-27 21:25:07 +02:00
hatiyildiz
d1a2ed73a3 docs(pass-4): align ARCHITECTURE phase numbering with SOVEREIGN-PROVISIONING
ARCHITECTURE §10 listed 3 provisioning phases (Phase 0 / 1 / 2) and
labeled Phase 2 as "Self-sufficient". SOVEREIGN-PROVISIONING.md uses
4 phases (Phase 0 Bootstrap / Phase 1 Hand-off / Phase 2 Day-1 setup
/ Phase 3 Steady-state). The same phase number meant different things
in the two docs.

Aligned ARCHITECTURE to the 4-phase numbering. SOVEREIGN-PROVISIONING
is now explicitly the canonical reference for phase semantics.

Refs #37
2026-04-27 21:22:07 +02:00
hatiyildiz
f4e99bb882 docs(pass-3): normalize muscatpharmacy Org-slug example consistency
PERSONAS-AND-JOURNEYS and SECURITY were using two competing slugs
for the same example Organization:
- "muscat-pharmacy" (with hyphen) — used as Org name + Environment
  name in the Ahmed journey narrative.
- "muscatpharmacy" (no hyphen) — used as the vcluster name in the
  same paragraph, and used everywhere else (NAMING-CONVENTION
  examples, ARCHITECTURE topology diagram, SECURITY SPIFFE ID).

NAMING §2.5 allows both spellings (Org slug regex permits hyphens).
But within a single example the spelling must be stable, otherwise
readers see a contradiction between Org and vcluster names.

Normalized to single-token "muscatpharmacy" throughout (matches the
predominant usage and produces simpler URLs / paths).

Result: all docs now show the same example Org consistently —
muscatpharmacy as Org, muscatpharmacy as vcluster, muscatpharmacy-prod
as Environment, gitea.omantel.openova.io/muscatpharmacy/muscatpharmacy-prod
as Environment Gitea repo.

Refs #37
2026-04-27 21:20:52 +02:00
hatiyildiz
b810002b16 docs(pass-3): align IMPLEMENTATION-STATUS with PLATFORM-TECH-STACK §2/§3 split
After the PLATFORM-TECH-STACK reorganization (§2 = Catalyst control
plane, §3 = per-host-cluster infrastructure), IMPLEMENTATION-STATUS
§2 was still mixing the two — listing cilium, k8gb, kyverno, falco,
etc. under "Catalyst control plane components" alongside console,
projector, etc.

Split into:
- §2 (renumbered subsections 2.1, 2.2): Catalyst control plane only —
  the per-Sovereign components that make a cluster a Sovereign.
- §2bis: Per-host-cluster infrastructure — the substrate every host
  cluster needs (Cilium, Flux, Crossplane, cert-manager, ESO, Kyverno,
  Trivy, Falco, Sigstore, Syft+Grype, VPA, KEDA, Reloader, MinIO,
  Velero, Harbor, failover-controller).

Status flags retained per component (📐 design / 🚧 README only / 
implemented / ⏸ deferred). All per-host-cluster components currently
🚧 (READMEs exist; none yet packaged as deployable Blueprints).

This brings IMPLEMENTATION-STATUS into 1:1 correspondence with the
PLATFORM-TECH-STACK §2 / §3 / §4 categorization that other docs
reference.

Refs #37
2026-04-27 21:19:57 +02:00
hatiyildiz
d6a51b8a7a docs(pass-2): final entity-noun sweep — external-secrets sequence diagram
Pass 2 — fresh-eyes sweep across the entire docs tree. One residual
entity-noun usage found:

- platform/external-secrets/README.md:75 (in a Mermaid sequence
  diagram): "Note over Wizard: Operator saves unseal keys offline"
  — "Operator" used as person/entity. Renamed to "sovereign-admin"
  to match the role from GLOSSARY.md.

All other banned-term sweeps clean:
- No tenant (architectural) anywhere.
- No Catalyst IDP anywhere.
- No Synapse-as-product anywhere (only the legitimate
  "Matrix/Synapse server" usages).
- No workspace-controller (only the banned-term entries that define
  the rename).
- No capital-W Workspace as Catalyst scope.
- No github.com/openova (without -io).
- All cross-doc Markdown links resolve.
- All §X references resolve to the new section numbering after
  PLATFORM-TECH-STACK reorg.
- API group catalyst.openova.io/v1alpha1 consistent across 6 references.
- OCI artifact prefix `bp-` consistent across README, CLAUDE,
  BLUEPRINT-AUTHORING, IMPLEMENTATION-STATUS.

Other "Operator" mentions intentionally retained (legitimate
technical usage):
- "External Secrets Operator (ESO)", "Trivy Operator" — K8s
  Operator pattern (controllers), explicitly allowed by GLOSSARY.
- "Operator compatibility" in BUSINESS-STRATEGY's OpenShift migration
  table — refers to compatibility with K8s Operators (the technology),
  not as an entity/role.

Refs #37
2026-04-27 21:18:55 +02:00
hatiyildiz
15905cee6f docs(iter-9-12): repo structure clarity, PLATFORM-TECH-STACK reorg, SRE alignment
README + CLAUDE.md (iter 9):
- README's "Build a Blueprint" section was contradicting itself: said
  "A Blueprint is a Git repo" while elsewhere we'd locked in the
  monorepo decision. Rewritten: Blueprint = a folder under
  platform/<name>/ or products/<name>/ in this monorepo. CI publishes
  per-folder OCI artifacts.
- CLAUDE.md "Repo structure": replaced the brief tree with a more
  honest one that distinguishes target structure from current
  placeholders (core/apps/ is target console+projector+...; current
  has only legacy bootstrap/ and manager/ .gitkeep dirs). Annotated
  each products/<name>/ folder with current state (axon = real code;
  others = README only; catalyst = bootstrap/ui scaffold).
- CLAUDE.md banned-terms entry "Workspace": now covers component
  names too (was only Catalyst scope), matching GLOSSARY's expanded
  banned-term entry.

PLATFORM-TECH-STACK (iter 10) — substantive reorganization:

The §1 categorization established three buckets:
  (a) Catalyst control plane (per-Sovereign on mgt)
  (b) Per-host-cluster infrastructure (every host cluster)
  (c) Application Blueprints (a la carte)

But §2 "Catalyst control plane components" was mixing buckets (a)
and (b): it listed flux, crossplane, cert-manager, kyverno, harbor,
external-secrets, reloader, vpa, keda, k8gb, coraza, falco, trivy,
sigstore, syft-grype, minio, velero, failover-controller all under
"Catalyst control plane" — but those are per-host-cluster
infrastructure per §1, and §1 itself said Crossplane "Never
user-facing" / per-host-cluster.

Reorganized §2 + §3:
- §2 now contains ONLY the Catalyst control plane:
    2.1 User-facing surfaces (console, marketplace, admin)
    2.2 Catalyst backend services (projector, catalog-svc, provisioning,
        environment-controller, blueprint-controller, billing)
    2.3 Per-Sovereign supporting services (keycloak, openbao, spire-
        server, nats-jetstream, gitea, observability)
- New §3 Per-host-cluster infrastructure with subsections for
  networking, GitOps+IaC, security+policy, scaling+ops, storage+
  registry, resilience.
- Application Blueprints renumbered §3 → §4. Added missing
  opensearch row to §4.1 (was previously misplaced in observability).
- Composite Blueprints (Products) §4 → §5.
- Multi-Region §5 → §6. Resource estimates §6 → §7. Cluster
  deployment §7 → §8. User choice §8 → §9. SIEM §9 → §10. License §10 → §11.

Cross-doc references to PLATFORM-TECH-STACK §1 / §2 (in NAMING,
ARCHITECTURE, IMPLEMENTATION-STATUS) all still resolve correctly
under the new numbering.

SRE (iter 11):
- §2.4 split-brain table: "MongoDB" → "FerretDB" (MongoDB was
  retired in favor of FerretDB-on-CNPG per project-memory).
- §2.5 data replication: clarified each row's layer (Application
  Blueprint vs per-host-cluster vs Catalyst control plane) instead
  of misclassifying MinIO/Harbor as Application Blueprints. Added
  OpenSearch row.
- §3.1 Flagger and §3.2 Flipt: explicitly marked "Status: design,
  not yet a deployed Blueprint" since they're "components to watch"
  in TECHNOLOGY-FORECAST, not in the current PLATFORM-TECH-STACK §3
  inventory.

BUSINESS-STRATEGY + TECHNOLOGY-FORECAST (iter 12):
- Final scan: clean. No tenant/operator-team/Catalyst-IDP/Lifecycle
  Manager/Synapse(product) violations remaining.

Refs #37
2026-04-27 21:17:15 +02:00
hatiyildiz
8d351d7001 docs(iter-6-8): security/provisioning/blueprint corrections + OCI artifact naming
SECURITY (iter 6):
- "Environment repo" → "Environment Gitea repo" in §3 secrets diagram.
- "ChangePolicy enforces approvals" → "EnvironmentPolicy enforces
  approvals" in §9 SOC2 row (ChangePolicy was a fictional CRD —
  EnvironmentPolicy is the real one defined in ARCHITECTURE §8).
- "Catalyst's compliance-controller surfaces evidence" → "evidence
  surfaced via Catalyst console audit views and SIEM exports"
  (compliance-controller wasn't defined elsewhere; this avoids
  inventing new components in compliance prose).

SOVEREIGN-PROVISIONING (iter 7):
- "vault-stored" → "stored in OpenBao on the provisioner"
  (Vault was replaced by OpenBao; "vault-stored" was generic English
  but read as a contradiction).

BLUEPRINT-AUTHORING (iter 8):
- OCI artifact naming locked: `ghcr.io/openova-io/bp-<name>:<semver>`
  where `<name>` is the folder name. The `bp-` prefix lives in the
  OCI artifact name (self-identifying), not the folder name.
  Fixed in §1, §10, §11, §13 — and propagated to README.md so the
  pattern is consistent across the repo.
- Crossplane Composition example: `compositeTypeRef.apiVersion`
  changed from `bp-wordpress.openova.io/v1alpha1` (per-Blueprint
  group, ugly) to `compose.openova.io/v1alpha1` (shared XRD group
  across all Blueprints).
- §11 CI pipeline final step: "publish blueprint.yaml as the
  manifest" → "as the OCI manifest's metadata layer" (clearer about
  what it does in the OCI sense).

Refs #37
2026-04-27 21:12:14 +02:00
hatiyildiz
80b91709e1 docs(iter-3-5): purge operator-as-entity, fix Workspace-controller capital, JetStream KV references
ARCHITECTURE (iter 3):
- Removed catalystctl from the §4 write-side diagram (it's read-only;
  presenting it as a write input contradicted §7.4).
- "Both tabs read the same Valkey snapshot" → "JetStream KV snapshot"
  in §5 (Valkey is no longer in the control plane).
- §7.4: catalystctl reframed as "may exist as small read-only debug
  CLI" rather than implying it ships today.
- §11 dependency list: added bp-catalyst-provisioning; removed
  bp-catalyst-crossplane (Crossplane is per-host-cluster infra, not a
  Catalyst control-plane component); added clarifying note.
- §12 CRD list: added SecretPolicy + Runbook (were already in
  IMPLEMENTATION-STATUS but missing from the principles table).
- §2 SME-style description: "SaaS Operator team (Omantel staff)" →
  "SaaS provider's cloud team" (Operator banned as entity).

NAMING-CONVENTION (iter 4):
- §5.1 heading "operator domain" → "Sovereign domain".
- §7 multi-region diagram: replaced piecemeal Catalyst component list
  with a deferral to PLATFORM-TECH-STACK §2; added SPIRE server;
  fixed "per-Org workspaces" → "per-Environment Gitea repos"; added
  per-host-cluster infrastructure callout.

SECURITY (iter 6 — partial; fold into this commit):
- "operator-approved" → "sovereign-admin-approved" for DR promotion.
- Realm name "catalyst-operator" → "catalyst-admin" (entity-noun
  scrubbed from the realm naming itself).

SOVEREIGN-PROVISIONING (iter 7 — partial):
- "single operator's laptop" → "single person's laptop" (avoid
  "operator" as entity).
- "the next operator" → "the next Sovereign provisioning request,
  regardless of who initiates it".
- "catalyst-operator realm" → "catalyst-admin realm" (×2).
- Capital-W "Workspace-controller" residuals (3) → "Environment-
  controller" (replace_all is case-sensitive; previous iter caught
  lowercase only).

PERSONAS (iter 5):
- P3 "within a Sovereign Operator team" → "within a Sovereign's
  operations team".
- Two capital-W "Workspace-controller" residuals fixed.

SRE (iter 11 — partial):
- §13.2 "Workspace-controller stuck" runbook entry →
  "Environment-controller stuck".

Banned-term sweep result post-fix: no `Operator team|role|account|
user|admin` anywhere; no capital-W Workspace as Catalyst scope;
no Valkey-as-control-plane refs.

Refs #37
2026-04-27 21:09:31 +02:00
hatiyildiz
27325edb32 docs(iter-2): glossary alignment — rename workspace-controller, fix definitions
GLOSSARY.md line-by-line audit. Eight corrections.

1. workspace-controller → environment-controller everywhere. The
   controller reconciles the Environment CRD; "workspace" is banned as
   a Catalyst scope, so it cannot be in a component name either. Fixed
   in: GLOSSARY, ARCHITECTURE, PLATFORM-TECH-STACK, NAMING-CONVENTION,
   SOVEREIGN-PROVISIONING, IMPLEMENTATION-STATUS, core/README,
   BUSINESS-STRATEGY. Banned-term entry in GLOSSARY now explicitly
   covers component names too.

2. "workspace repos" (per-Environment Gitea repos) → "Environment
   Gitea repos" in GLOSSARY, PLATFORM-TECH-STACK.

3. JWT claim {workspace, org, role} → {environment, org, role} in
   ARCHITECTURE projector diagram.

4. OpenOva definition refined: was "Never used to name a product",
   which contradicted "OpenOva Catalyst", "OpenOva Cortex". Now: brand
   prefix in product names; bare "OpenOva" = the company; bare
   "Catalyst" = the platform.

5. Catalyst definition completed: was missing provisioning, billing,
   gitea, observability — now lists all 14 control-plane components,
   pointing at the table below.

6. Catalyst components table: added `provisioning` (validates
   configSchema, commits to Environment Gitea); reordered to match
   ARCHITECTURE §3 grouping; clarified each component's source-of-truth
   (catalog-svc reads monorepo + Gitea, blueprint-controller watches
   monorepo + Gitea, etc.).

7. Environment definition: refers to NAMING §2.4 for env_type values;
   removed inline list that didn't match canonical ordering. Added
   concrete examples (acme-prod, acme-dev, bankdhofar-uat).

8. Application example: dropped "RocketChat" which appeared nowhere
   else; replaced with generic "running deployment" plus the
   established WordPress / Postgres examples.

9. sovereign-admin description: was "runs Crossplane" — Crossplane is
   platform plumbing not user-facing. Now: "manages the underlying
   clusters via Crossplane (which is platform plumbing, not a
   user-facing surface)".

Banned-term coverage:
- "Workspace" entry now covers BOTH the Catalyst scope AND component
  naming (workspace-controller → environment-controller).

Refs #37
2026-04-27 21:06:09 +02:00
hatiyildiz
2c4902b409 docs(iter-1): add IMPLEMENTATION-STATUS, fix wrong-org refs, reconcile monorepo
First validation iteration. Three concrete corrections.

1. Add docs/IMPLEMENTATION-STATUS.md as the bridge between target
   architecture and current code state. Status legend ( / 🚧 / 📐 / ⏸)
   applied per-component. Catalyst control plane = mostly 📐. Component
   READMEs = 🚧 (README only, no Blueprint manifests yet). products/axon
   =  (only product with real code). core/ = 📐 (just .gitkeep).

2. Status banner added to ARCHITECTURE, SECURITY, SOVEREIGN-PROVISIONING,
   BLUEPRINT-AUTHORING, PERSONAS-AND-JOURNEYS, PLATFORM-TECH-STACK, SRE
   pointing readers at IMPLEMENTATION-STATUS.md before they treat any
   described feature as built. GLOSSARY also references it.

3. Architectural decision (Option A — monorepo canonical):
   - Each platform/<name>/ and products/<name>/ folder is the source of
     ONE Blueprint, published as ghcr.io/openova-io/<name>:<semver> by
     CI fan-out from the monorepo root.
   - BLUEPRINT-AUTHORING.md §1, §2, §13 rewritten to match.
   - README.md "what's in this repo" rewritten to clarify monorepo +
     OCI-fan-out shape; no longer claims every directory is a Blueprint
     in a way that contradicts BLUEPRINT-AUTHORING.

Wrong-org fixes (3 places):
   - docs/PERSONAS-AND-JOURNEYS.md:13   github.com/openova → openova-io
   - docs/BLUEPRINT-AUTHORING.md:13     github.com/openova → openova-io
   - docs/BLUEPRINT-AUTHORING.md:404    github.com/openova → openova-io
   - docs/BLUEPRINT-AUTHORING.md ghcr.io/openova/* (3 refs) → openova-io

API group consistency:
   - All references unified to catalyst.openova.io/v1alpha1
     (was mixed v1 / v1alpha1; v1alpha1 is correct since the CRDs are
     design-stage with no implementation).

core/README.md updated to honestly describe the directory tree as
"target structure with .gitkeep placeholders" rather than implying
the apps/console, apps/projector, etc. binaries already exist.
The legacy apps/bootstrap and apps/manager directories are
acknowledged as transitional placeholders that will be removed when
the new apps/ layout is scaffolded.

CLAUDE.md and .claude/project-memory.md updated to put
IMPLEMENTATION-STATUS.md second in the read-first ordering.

Refs #37
2026-04-27 20:43:31 +02:00
hatiyildiz
119a1e53a0 docs(components): terminology pass across platform and product READMEs
Bring per-component READMEs in line with the canonical glossary
(docs/GLOSSARY.md). Substantive architectural content unchanged —
this is a terminology + reference correctness pass.

Placeholder rename: <tenant> → <org> in YAML / IaC examples across
- platform/cnpg/README.md           (Cluster + Pooler + ScheduledBackup)
- platform/debezium/README.md       (PostgreSQL connector + topic patterns)
- platform/external-secrets/README.md (ExternalSecret / SecretStore)
- platform/grafana/README.md        (Instrumentation namespace)
- platform/k8gb/README.md           (Gslb + namespace + kubectl examples)
- platform/keda/README.md           (ScaledObject + Kafka triggers + Prometheus)
- platform/opentofu/README.md       (server resource example)
- platform/velero/README.md         (BackupStorageLocation buckets)
- platform/vpa/README.md            (VerticalPodAutoscaler examples)
- platform/flux/README.md           (kustomization name + tenants/ → organizations/)

"Catalyst IDP" → "Catalyst console":
- platform/crossplane/README.md     (integration section retitled and
                                      rewritten — Crossplane is platform
                                      plumbing, not user-facing)
- platform/gitea/README.md          (architecture diagram + integration table)
- platform/kyverno/README.md        (rollout tracking surface)
- products/fingate/README.md        (TPP onboarding portal)

"Bootstrap wizard" → "Catalyst bootstrap":
- platform/openbao/README.md        (bootstrap procedure rewritten —
                                      independent Raft per region clarified;
                                      cross-references docs/SECURITY.md §5)
- platform/opentofu/README.md       (Quick Start)

Kyverno labels & prose:
- openova.io/tenant → openova.io/organization (label rename for
  consistency; deployed clusters will add new label as a co-label
  during migration window)
- "tenant labels" / "tenant namespace" prose updated to
  "Organization labels" / "Organization-labeled namespace"
- Priority class names (tenant-high, tenant-default, tenant-batch)
  retained as deployed artifact names — rename pending in a
  separate migration ticket

No banned-term hits remain in component READMEs (verified by grep
in docs/GLOSSARY.md banned-terms table).

Refs #37
2026-04-27 20:06:51 +02:00
hatiyildiz
b857f46706 docs(strategy,forecast): terminology pass — Catalyst as platform, console not IDP
Targeted updates to BUSINESS-STRATEGY.md §5.1 and §9.2 plus
TECHNOLOGY-FORECAST §removed-components.

- BUSINESS-STRATEGY.md §5.1: OpenOva Catalyst row repositioned. It is
  the platform itself (the self-sufficient Kubernetes-native control
  plane that turns any cluster into a Sovereign), not a sub-product
  bundling bootstrap+IDP+lifecycle manager. Other OpenOva products
  (Cortex, Fingate, Fabric, Relay, Specter, Axon) run ON Catalyst as
  composite Blueprints.

- BUSINESS-STRATEGY.md §9.2: capability matrix "Developer portal" cell
  updated from "Catalyst IDP" to "Catalyst console" — IDP function is
  one of the console's responsibilities, not a separate product.

- TECHNOLOGY-FORECAST.md §removed-components: Backstage row updated to
  describe replacement as "Catalyst console (the platform's own
  developer-facing UI)" rather than the now-retired "Catalyst IDP"
  sub-product.

Strategy narrative, market segmentation, pricing model, and migration
playbook are unchanged — they stand on their own.

Refs #37
2026-04-27 20:06:31 +02:00
hatiyildiz
4b3a6884f5 docs(stack,sre): align tech stack and SRE handbook with Catalyst control plane
Two related rewrites that put the control plane / application Blueprint
distinction front and center.

PLATFORM-TECH-STACK.md
  - §1: explicit three-way component categorization — Catalyst control
    plane (one per Sovereign), per-host-cluster infrastructure (every
    cluster), Application Blueprints (inside per-Org vclusters).
  - §2: Catalyst control plane components listed by responsibility —
    user-facing surfaces, backend services, identity, secrets, event
    spine, GitOps, networking, security, scaling, storage,
    observability, resilience.
  - §3: Application Blueprints (the a-la-carte catalog) — Valkey and
    Strimzi explicitly callout that they are Application Blueprints,
    NOT control-plane components (control plane uses NATS JetStream).
  - §4: composite Blueprints (Cortex, Axon, Fingate, Fabric, Relay)
    repositioned as Applications running ON Catalyst, not as parallel
    products.
  - §5: multi-region diagram showing independent OpenBao Raft per
    region, NATS leaf nodes, Crossplane on mgt.
  - §6: resource estimates updated for control plane (~12 GB +
    per-Org Keycloak in SME tier).
  - §10: license posture table — every control-plane component carries
    a redistribution-safe license (no BSL).

SRE.md
  - §2: multi-region principles updated; explicit "no stretched
    clusters" applies to OpenBao, JetStream, etcd, every quorum-
    based component.
  - §2.5: data replication patterns now scoped to Application
    Blueprints (the things a customer installs), separate from
    control-plane patterns documented in SECURITY.md and
    ARCHITECTURE.md.
  - §4: alert-to-action mapping segmented by Catalyst control plane
    vs per-product (Cortex, Fingate); new alerts: OpenBaoSealed,
    JetstreamLagHigh.
  - §7-§13: terminology aligned to Catalyst (console instead of IDP);
    runbooks now Runbook CRD-backed; incident severities updated.
  - §13.2-13.3: Catalyst-specific incidents (workspace-controller,
    OpenBao seal, projector lag) plus AI Hub incidents under
    bp-cortex installation.

Refs #37
2026-04-27 20:06:20 +02:00
hatiyildiz
039a724f31 docs: rewrite repository foundation around Catalyst as the platform
Repositions the public repo's identity. OpenOva is the company; Catalyst
is the platform. Sovereign is a deployed Catalyst. The historical
positioning (OpenOva = platform, Catalyst = bootstrap+IDP+lifecycle
sub-product) is retired. Catalyst now subsumes bootstrap, lifecycle, and
IDP responsibilities into one control plane.

- README.md             Catalyst-first front door. Sovereign concept,
                        repo structure, stack at a glance, cloud
                        provider matrix, getting-started paths
                        (managed via marketplace.openova.io vs
                        self-host via catalyst-provisioner).

- CLAUDE.md             Codebase guide for Claude. Banned-term table,
                        commit conventions (hatiyildiz default for
                        public repo), the no-fourth-surface rule,
                        per-component README rule of thumb.

- .claude/project-memory.md   Reduced to an index + decision log;
                        full architecture moved to docs/. Stack
                        decisions locked (NATS JetStream, OpenBao,
                        SPIFFE/SPIRE, per-Org Keycloak SME / per-
                        Sovereign corporate, Crossplane only IaC,
                        no Terraform/Pulumi user-facing surface).

- core/README.md        Catalyst control-plane Go application. Drops
                        the bootstrap-vs-manager split (both fold under
                        "Catalyst control plane"). Lists each component
                        deployable from this codebase: console,
                        marketplace, admin, projector, catalog-svc,
                        provisioning, workspace-controller, blueprint-
                        controller, billing. CRD list updated:
                        Sovereign / Organization / Environment /
                        Application / Blueprint / EnvironmentPolicy /
                        SecretPolicy / Runbook.

Refs #37
2026-04-27 20:05:58 +02:00
hatiyildiz
217c882916 docs(naming): rename {env}→{env_type}, add Organization + vcluster + Catalyst Environment layers
The naming convention pre-dates vcluster and Catalyst's user-facing
Environment object. Three additions, one rename:

- §2.4: {env} dimension renamed to {env_type} to disambiguate from the
  Catalyst Environment object (which is the user-facing scope, not a
  dimension).

- §2.5: new Organization dimension (slug, lowercase, hyphenated). Used
  for vcluster identity and any Organization-scoped resource.

- §4.7: new vcluster naming layer. Pattern is just {org} within the
  parent host cluster (Don't Repeat the Parent — Principle 1.2). Globally-
  qualified form is {prov}-{reg}-{bb}-{env_type}-{org} for cross-cluster
  references and kubeconfig contexts.

- §11: Catalyst Environment defined as the user-facing {org}-{env_type}
  scope. One Environment is realized by N vclusters across regions × bb
  filtered by Application Placement. Each Environment has its own Gitea
  repo and JetStream Account.

Tags updated: openova.io/environment → openova.io/env-type for
disambiguation; new openova.io/organization, openova.io/vcluster,
openova.io/environment (for Catalyst scope), openova.io/sovereign tags.

DNS pattern §5 split into two: control-plane (component.{location-code}.
{sovereign-domain}) and Application (app.{environment}.{sovereign-or-org-
domain}) — supporting white-label Sovereigns where the Application DNS
uses the customer's own domain.

Refs #37
2026-04-27 20:05:42 +02:00
hatiyildiz
d51a3fba4d docs: add canonical Catalyst documentation set
Six new docs that establish the unified Catalyst model — Sovereign as
deployed instance, Organization as multi-tenancy unit, Environment as
{org}-{env_type} scope, Application as user-facing handle, Blueprint as
unified module+template successor.

- docs/GLOSSARY.md           single source of truth for terminology;
                             every other doc defers to it; banned terms
                             (tenant, operator-as-entity, module, template,
                             Backstage, etc.) listed with replacements.

- docs/ARCHITECTURE.md       overall Catalyst architecture: control plane
                             vs application Blueprints, write path
                             (Git → Flux → K8s + Crossplane), read path
                             (CQRS via NATS JetStream → projector → SSE),
                             SPIFFE/SPIRE workload identity, OpenBao
                             independent Raft per region (no stretched
                             cluster), Keycloak per-Org (SME) vs
                             per-Sovereign (corporate).

- docs/PERSONAS-AND-JOURNEYS.md   personas × journeys matrix; only
                             three first-class surfaces (UI, Git, API);
                             explicit removal of Terraform/Pulumi/CLI as
                             user-facing IaC; Application card anatomy.

- docs/SECURITY.md           identity (workload + user), OpenBao + ESO
                             credential flow, dynamic credentials with
                             auto-rotation sidecar, multi-region
                             OpenBao (independent Raft per region with
                             async perf replication — explicitly NOT
                             stretched), rotation policy CRDs, threat
                             model.

- docs/SOVEREIGN-PROVISIONING.md   Phase 0 (catalyst-provisioner +
                             OpenTofu one-shot) → Phase 1 (Crossplane
                             adopts) → Phase 2 (self-sufficient Catalyst
                             control plane); air-gap procedure;
                             Organization migration; decommission.

- docs/BLUEPRINT-AUTHORING.md   Blueprint CRD spec, configSchema,
                             placementSchema, depends, manifests,
                             overlays; Crossplane Composition authoring
                             for non-K8s; signing/publishing pipeline;
                             public vs private (Org-scoped) visibility;
                             contribution path.

Refs #37
2026-04-27 20:05:25 +02:00
e3mrah
69706a80ec feat(axon): make qwen3-coder thinking mode toggleable via request parameter
Client sends `thinking: true` to enable reasoning tokens. Default remains
disabled for instant streaming.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-26 09:20:33 +02:00
e3mrah
63fc7a381f fix(axon): disable qwen3-coder thinking mode for instant streaming
Qwen3-coder generates hundreds of `reasoning` tokens before `content`
tokens, causing 10+ second perceived delay. The reasoning tokens stream
through Axon but the ChatWidget only renders `delta.content`, so users
see a long pause then a burst. Passing `enable_thinking: false` via
chat_template_kwargs skips the reasoning phase entirely.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-26 09:08:47 +02:00
e3mrah
5201bdc962 fix(axon): tighten WAF payload limits — system 4000, assistant 800, total 8000
3-turn conversations passed at ~9120 chars but 4-turn failed at ~10640.
WAF anomaly threshold is between those values. Lowered all limits to keep
multi-turn conversations well under the threshold.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-26 08:52:04 +02:00
e3mrah
00ddc1437c fix(axon): cap assistant messages and total payload to prevent WAF rejection on long conversations
WAF anomaly scoring accumulates across the entire request body. After 2-3 turns,
assistant responses containing infrastructure terms (security, scanning, etc.)
push the total past the threshold. Added per-assistant trim (1500 chars) and a
12000-char sliding window that drops oldest messages.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-26 08:44:33 +02:00
e3mrah
40c4abe4f6 fix(axon): deduplicate system messages before forwarding to vLLM
vLLM requires system messages to be at the beginning. When Axon merges
conversation history with new messages, duplicate system messages cause
a 400 error. Strip all but the first system message.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-26 08:35:28 +02:00
e3mrah
4110161577 fix(axon): trim large system prompts to avoid vLLM WAF rejection
The vLLM backend at Bank Dhofar runs behind an Istio/Envoy WAF with
ModSecurity-style anomaly scoring. The ChatWidget's 41KB system prompt
accumulates enough infrastructure/security keywords to trigger a 403.

Trim system messages to 6000 chars (70% head + 30% tail) before
forwarding to vLLM — preserves identity/behavior instructions at the
start and FAQ/response guidelines at the end.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-26 08:27:14 +02:00
e3mrah
85e1319e01 fix(axon): resolve unknown model names to vLLM default
Clients (e.g. ChatWidget) send OpenAI model names like gpt-4o-mini which
vLLM doesn't recognize. The provider now queries available models on
startup and remaps any unrecognized name to the configured default.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-26 07:54:07 +02:00
e3mrah
68fcbe1aed feat(axon): add toggleable vLLM provider backend
Introduces a provider abstraction so Axon can proxy to either Claude SDK
(existing behavior) or a vLLM-compatible endpoint. Toggled via
AXON_PROVIDER env var ("claude" | "vllm"). When vllm, requests pass
through as-is (no prompt translation), session pool and OAuth are skipped.

Closes openova-io/openova#36

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-26 07:36:58 +02:00
e3mrah
500aca483e refactor(catalyst/ui): canonical OpenOva mark everywhere; drop OctagonAlert placeholder
Part of the brand-consistency sweep (openova-private#116). All brand
surfaces must use the swirl mark with gradient #3B82F6 → #818CF8.

- public/favicon.svg: replaced with canonical mark (was a purple
  placeholder logo)
- OOLogo.tsx: default c1 coerced from #38BDF8 → #3B82F6 to match the
  brand/logo-mark.svg canonical gradient
- AuthLayout.tsx: OctagonAlert in blue-square placeholder → OOLogo;
  label 'OpenOva Corporate' → 'OpenOva Sovereign'
- AppLayout.tsx: same — OctagonAlert → OOLogo; 'Corporate' → 'OpenOva Sovereign'
2026-04-22 09:08:20 +02:00
e3mrah
60938c0775 refactor(catalyst/ui): serve at /sovereign base path; centralize URL config; never hardcode
Part of the tier-URL cutover (openova-private issue #116). Catalyst UI moves
from catalyst.openova.io to console.openova.io/sovereign.

- vite.config.ts: base '/sovereign/' so built HTML/asset refs are prefixed
- src/app/router.tsx: TanStack Router basepath '/sovereign' so Link/navigate
  emit /sovereign-prefixed URLs
- src/shared/config/urls.ts (NEW): central BASE / API_BASE / path() helper
- StepReview.tsx: fetch('/api/v1/deployments') → fetch(`${API_BASE}/v1/deployments`)
  and window.location.href = '/provision.html' → path('provision.html')
- StepCredentials.tsx: same treatment for /api/v1/credentials/validate
- nginx.conf SPA fallback simplified to try_files $uri /index.html — avoids
  nginx's directory trailing-slash redirect that would strip the /sovereign
  prefix client-side

No more hardcoded URLs in the source (see feedback_never_hardcode_urls rule).
2026-04-21 18:31:08 +02:00
e3mrah
5dfb4f3867 fix(catalyst): persistent footer (no more bounce on step change)
Footer used to be rendered inside StepShell — every step navigation
unmounted the old footer and mounted a new one, which produced a
visible flicker (the 'bounce' the user reported).

Architecture change:
- New shared/lib/wizardNav.ts — tiny Zustand store holding the active
  step's nav handlers (onNext, onBack, disabled, loading, label, title)
- StepShell now publishes its nav state via useEffect; renders no footer
- WizardLayout renders ONE persistent footer that reads from wizardNav

Result: footer DOM is stable across step transitions. Buttons swap
behavior synchronously (no remount, no fade-in flicker). Stepper
counter and progress pill stay in place too.
2026-04-16 12:44:16 +02:00
e3mrah
09ed921b81 feat(catalyst): compact step header + bottom-placed helper text
User: 'I don't think anyone will read that section. If you believe
the info is required, put it at the bottom.'

Matching SME's pattern now:
- Compact 1.1rem title with subtle border-bottom divider
- Description paragraph removed from the top
- Description re-rendered at the BOTTOM of the step content as muted
  helper text (0.82rem, dashed border-top), so users who want the
  context can still read it after completing the main task
- Reclaims ~70-90 px of vertical space above the step content
2026-04-16 11:58:04 +02:00
e3mrah
b09503cdf9 feat(catalyst): align light borders + useful footer
1) Light-mode palette alignment with SME:
   - --wiz-border #d1d5db → #e2e8f0 (SME slate-200)
   - --wiz-text-md #334155#475569 (SME text-dim)
   - --wiz-border-sub adjusted to SME slate-100
   Fixes the darker-border-in-light-mode discrepancy the user spotted.

2) Sticky footer now uses its space. Adds a left-hand summary:
     'Step 3 of 6 · Provider' + progress pill (33%)
   Matches SME's summary-on-left pattern.
   Responsive: pill hides <820px, title/divider hide <640px.
2026-04-16 11:35:46 +02:00
e3mrah
3a836a1884 feat(catalyst): deep unification with SME — colors, buttons, footer, flat cards
User requested Corporate inherit every polish element where SME is
better. Changes (all minimum-touch):

Palette (globals.css):
- Accent channel: sky-400 (56,189,248) → SME blue-500 (59,130,246)
- Light-mode accent: sky-600 (2,132,199) → SME blue-600 (37,99,235)
- New --wiz-success-ch: SME emerald (16,185,129 dark / 5,150,105 light)
- Unifies green dots and blue pills with SME on sight

WizardLayout.tsx:
- Stepper circle 'done' state: #22C55E → rgba(var(--wiz-success-ch))
- Stepper separator 'done': same
- Driven entirely by CSS vars — light/dark flips automatically

_shared.tsx (StepShell flat + SME footer):
- Removed outer card wrapper (no bg, no border, no shadow, no blur)
- Content flows flat — child cards are the only surfaces now
- Nav buttons moved to a sticky bottom footer (like SME):
    • Back: transparent outline, SME border style
    • Continue: solid SME accent, subtle shadow, no gradient
- Backdrop-blur on footer matches the header
- Loading spinner inline in Continue button
2026-04-16 10:51:28 +02:00
e3mrah
427023cf6a revert(catalyst): restore stepper-below-header (previous approach)
User preferred the previous approach — stepper as its own centered
row below the header, matching SME's current pattern exactly.

Reverts the in-header stepper change; restores WizardLayout.tsx to
the state from commit 7ed9239.
2026-04-16 10:27:34 +02:00
e3mrah
3bb3666e09 fix(catalyst): integrate stepper INTO header row
User wanted the stepper to live in the header area to reclaim
vertical space, not as a separate row below. Now:

- Header: 3-zone grid (logo · stepper · actions) in one row
- Stepper: inline pills (row: circle + label side-by-side)
- Active pill has accent bg + ring; done shows green circle with check
- Responsive: labels hide below 980px, pill strips to compact dots
- Phone: header reflows to 2 rows (logo/actions + stepper below)
- All chrome fits in ~56 px of header height total
2026-04-16 09:50:06 +02:00
e3mrah
7ed9239c2d feat(catalyst): unify wizard with SME — horizontal stepper, flat palette
User request: unify both wizards on the horizontal pattern and bring
Corporate in line with SME's look-and-feel (dark/light mode, colors,
cards) with minimum changes.

Minimum-touch changes:
- globals.css: flatten --wiz-page-bg from radial gradient to solid
  #0b1220 (dark) / #f8fafc (light) — matches SME's flat bg.
  --wiz-panel-bg bumped to #111827 (dark) / #ffffff (light) to match
  SME card surfaces.
- WizardLayout.tsx: complete rewrite as a horizontal top-stepper
  (header + stepper row + content), mirroring the SME stepper pattern
  (32px numbered circles + labels below + 44px connecting lines).
  Done circles turn green with a check; active is accent blue with a
  soft ring; pending stays as a hollow circle.
- Responsive: labels hide below 720px, circles shrink to 28px so 6
  steps remain legible on tablets and phones.

Step content components (StepOrganisation, StepTopology, ...) are
unchanged — they inherit the new palette via the existing --wiz-*
variables.
2026-04-16 08:57:34 +02:00
e3mrah
afd9df01de fix(catalyst): balls adjacent to content card + +30% vertical gap
User feedback: 1km gap between balls and main card, and vertical spacing
between balls was too tight at 22px.

- Body padding-left 40px → 8px (desktop)
- Content wrapper: margin: 0 auto → marginLeft: 0 (left-align to hug
  the sidebar; card right edge now rests against the balls)
- Desktop step gap: 22 → 28 (+27%)
- Tablet step gap: 18 → 24 (+33%)
- Content maxWidth 960 → 1000 to fill the extra breathing room
2026-04-16 08:46:20 +02:00
e3mrah
7fe0133b9e fix(catalyst): strip pane, cap gap, labels-left, right-aligned stepper
User feedback: previous revision brought back a subtle sidebar pane
(tint + right border) which was wrong direction. Also gaps between
balls stretched to fill full viewport height, making spacing excessive.

Redesign:
- Sidebar width 260 → 200 px, NO bg, NO border (fully transparent)
- Fixed 22 px gap between balls — no more flex:1 stretch
- Stepper right-aligned within sidebar so balls sit flush against
  the main content card (tight visual proximity, as requested)
- Labels rendered LEFT of balls (one word each — dropped the
  two-line title+description pattern)
- Logo also right-aligned to match direction
- Progress bar compact at the bottom, right-aligned
- Tablet variant: icon-only balls, same transparent + centered pattern
2026-04-15 23:04:22 +02:00
e3mrah
403d5083c2 fix(catalyst): sidebar balance + visible pending rails
Previous redesign killed sidebar bg entirely — content read as left-aligned
because the left 260px was visually empty (no counterbalance).

Also: pending rails used --wiz-border-sub (rgba 255/0.06) for a dashed
pattern, which rendered as invisible. User reported 'no lines between
balls when not selected'.

Fixes:
- Sidebar: subtle tint rgba(var(--wiz-ch), 0.015) + thin right border
  rgba(var(--wiz-ch), 0.08). Enough weight to balance the page without
  returning to 'menu' feel.
- Rail thickness: 1.5px → 2px for cleaner rendering
- Pending rail: solid rgba(var(--wiz-ch), 0.2) instead of invisible
  dashed. Always visible regardless of state.
- Border radius 1px on rails for softer edges.
- Applied consistently to desktop and tablet variants.
2026-04-15 22:51:39 +02:00
e3mrah
9e344078a2 feat(catalyst): wizard sidebar redesign + rename UI to "Corporate"
Sidebar:
- Removed distinct bg + border + backdrop-filter that made it read as a menu
- Added vertical connecting rail between step circles (solid gradient for
  done/current, dashed grey for pending) — clearly signals journey, not nav
- Distributed steps with flex: 1 grow on each item so the rail fills
  the full viewport height instead of clustering at top
- Active step circle has a soft pulse ring animation
- Progress bar integrated at rail's end (no hard divider)
- Same rail pattern applied to tablet variant

Rename (user-facing only — internal codename stays "catalyst"):
- index.html title: OpenOva Catalyst → OpenOva Corporate
- WizardLayout logo sub-label: Catalyst → Corporate
- AuthLayout brand text: OpenOva Catalyst → OpenOva Corporate
- AppLayout sidebar label: Catalyst → Corporate
- LoginPage subtitle: "Catalyst account" → "Corporate account"

Not renamed (internal): store names, CSS vars, repo paths, k8s namespace,
catalyst.openova.io domain — avoids SEO/DNS/infra churn.
2026-04-15 22:38:54 +02:00
e3mrah
7a9f308eb5 feat(catalyst): swap provisioning screen to DAG visualization
Replace the live-SSE phase+log view with a static DAG animation page
at /provision.html. Launch OpenOva now redirects there via
window.location. The old React ProvisionPage and /provision route are
removed. Backend POST /api/v1/deployments still fires so the API side
is unchanged; only the rendered provisioning view is swapped.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-14 22:48:13 +02:00
e3mrah
dd2e9b1de3 fix(axon): handle missing credentials file in token refresh
Skip refresh gracefully when .credentials.json doesn't exist (e.g. CI
smoke test with no Claude auth mounted).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-30 15:08:28 +02:00
e3mrah
0cfe1bc361 feat(axon): add OAuth token refresh on startup and periodic timer
The Claude Agent SDK does not refresh OAuth tokens. Axon now:
1. Refreshes the token on startup before creating session pool
2. Runs a periodic refresh every 4 hours
3. Writes refreshed credentials to disk so session subprocesses use them

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-30 15:07:07 +02:00
e3mrah
2da38e9f7a feat(axon): CronJob for automatic OAuth token refresh
The Claude Agent SDK does not handle OAuth token refresh. Adds a CronJob
(every 4h) that refreshes the token via Anthropic's OAuth endpoint and
updates the K8s secret. Disabled by default.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-30 14:44:40 +02:00