feat(sandbox-controller): inject mcp.json config so agents auto-discover openova-sandbox-mcp (Refs #1986)

TBD-P4 audit Surface B / finding B1: NO MCP config file was injected
anywhere. Even after PR #1988 bundled agent binaries (B1) and PR #1992
wired the slug->binary spawn registry, the agents had zero discovery
mechanism for the openova-sandbox-mcp server. A customer opens a Sandbox,
picks qwen-code, and the agent launches with no idea where MCP lives.

This PR adds the foundation wire:

  - New per-Sandbox ConfigMap `sandbox-mcp-config` carrying a single
    `mcp.json` key in the canonical "mcpServers" schema.
  - The pty-server StatefulSet mounts the same ConfigMap key at every
    canonical agent-config path via subPath projections:
      * /workspace/.mcp.json            (project-level, claude-code)
      * /home/node/.claude.json         (user-level,    claude-code)
      * /home/node/.qwen/settings.json  (qwen-code; same shape as
                                         gemini-cli, which it forks)
      * /workspace/.cursor/mcp.json     (cursor-agent)
    Aider does not natively support MCP -- the mounts are inert there
    by design (no error path).
  - `kustomization.yaml` resources list is extended to include the new
    ConfigMap so Flux applies it ahead of the pty-server StatefulSet
    (a Pod that mounts a ConfigMap as a volume stays in
    ContainerCreating until that ConfigMap exists).

mcp.json schema (matches the standard documented at
https://modelcontextprotocol.io/):

    {
      "mcpServers": {
        "openova-sandbox-mcp": {
          "command": "/usr/local/bin/openova-sandbox-mcp",
          "args": [],
          "env": {}
        }
      }
    }
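
A minimal sketch of how a consumer could decode this shape, assuming
nothing beyond the three fields shown above. The type names
(`mcpConfig`, `mcpServer`) and `parseMCPConfig` are made up for the
example; they are not taken from the controller or from any agent CLI.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// mcpServer mirrors one entry under "mcpServers" (hypothetical type).
type mcpServer struct {
	Command string            `json:"command"`
	Args    []string          `json:"args"`
	Env     map[string]string `json:"env"`
}

// mcpConfig mirrors the top-level document (hypothetical type).
type mcpConfig struct {
	MCPServers map[string]mcpServer `json:"mcpServers"`
}

// sampleMCPJSON is the payload from the commit message, verbatim.
const sampleMCPJSON = `{
  "mcpServers": {
    "openova-sandbox-mcp": {
      "command": "/usr/local/bin/openova-sandbox-mcp",
      "args": [],
      "env": {}
    }
  }
}`

// parseMCPConfig decodes raw JSON into the sketch types above.
func parseMCPConfig(raw []byte) (mcpConfig, error) {
	var cfg mcpConfig
	if err := json.Unmarshal(raw, &cfg); err != nil {
		return mcpConfig{}, fmt.Errorf("parse mcp.json: %w", err)
	}
	return cfg, nil
}

func main() {
	cfg, err := parseMCPConfig([]byte(sampleMCPJSON))
	if err != nil {
		panic(err)
	}
	srv := cfg.MCPServers["openova-sandbox-mcp"]
	// Prints the command followed by the number of args and env entries.
	fmt.Println(srv.Command, len(srv.Args), len(srv.Env))
}
```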

Empty `env: {}` lets the MCP binary inherit the per-Sandbox env vars
the controller already plumbs (SANDBOX_*, LLM_GATEWAY_*, KEYCLOAK_*) so
credentials do NOT live in the ConfigMap.
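
The inheritance argument can be sketched as follows. This assumes a
launcher that starts from its own environment and overlays the `env` map
from the config; how each agent CLI actually builds the child environment
is not confirmed here. `childEnv` is a hypothetical helper and the
variable value is a placeholder.

```go
package main

import (
	"fmt"
	"sort"
)

// childEnv is a hypothetical helper: it copies the parent environment and
// appends the config overrides in sorted key order. With os/exec, a later
// duplicate key wins, so overrides would shadow inherited values.
func childEnv(parent []string, overrides map[string]string) []string {
	env := append([]string(nil), parent...)
	keys := make([]string, 0, len(overrides))
	for k := range overrides {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	for _, k := range keys {
		env = append(env, k+"="+overrides[k])
	}
	return env
}

func main() {
	// Placeholder value; real credentials come from the Pod environment.
	parent := []string{"SANDBOX_TOKEN=placeholder"}
	// An empty override map leaves the inherited environment untouched.
	fmt.Println(childEnv(parent, map[string]string{}))
}
```

Under that assumption, an empty `env` adds nothing and overrides nothing,
so the child sees exactly what the pty-server container already has.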

HONEST DISCLOSURE -- this is FOUNDATION work:

  - The MCP binary must ALSO be installed INTO the pty-server
    agent-runner image at /usr/local/bin/openova-sandbox-mcp for the
    stdio child shape to resolve end-to-end. That is follow-up work
    tracked under TBD-P4 audit finding B2 (the existing
    deployment-mcp.yaml ships the binary as a standalone Deployment
    Pod; per the MCP main.go contract it is a stdio child of the agent
    and the Deployment shape CrashLoops on stdin EOF).
  - Until B2 ships, this config references a path that ENOENTs at
    spawn. The agent surfaces a clean "mcp server not found" error
    instead of the current silent no-discovery state -- a strict
    improvement, but not full Pillar-4 Phase 2 readiness.

Validation:
  go test ./core/controllers/sandbox/... -count=1                PASS
  helm template platform/sandbox/chart ...                       PASS
  gofmt: no new violations introduced (pre-existing field-alignment
    drift in Inputs unrelated to this PR).

Did NOT use kubectl --dry-run=server (per founder principle #15;
fresh helm-template-from-scratch only).

Chart / pin lockstep:
  platform/sandbox/chart/Chart.yaml           0.3.2 -> 0.3.3
  clusters/_template/bootstrap-kit/19a-bp-sandbox.yaml
                                              version: 0.3.2 -> 0.3.3
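
A lockstep check along these lines could guard the two files against
drifting apart. This is a sketch under stated assumptions, not an
existing repo tool: `firstVersion` and `inLockstep` are hypothetical, and
the regex simply takes the first `version:` line in each document rather
than parsing YAML.

```go
package main

import (
	"fmt"
	"regexp"
)

// versionLine matches a line such as `version: 0.3.3`, optionally
// indented or quoted. It does not match `appVersion:`.
var versionLine = regexp.MustCompile(`(?m)^\s*version:\s*"?([0-9]+\.[0-9]+\.[0-9]+)"?\s*$`)

// firstVersion returns the first semantic version found on a `version:` line.
func firstVersion(yamlText string) (string, bool) {
	m := versionLine.FindStringSubmatch(yamlText)
	if m == nil {
		return "", false
	}
	return m[1], true
}

// inLockstep reports whether both documents carry the same version.
func inLockstep(chartYAML, pinYAML string) bool {
	a, okA := firstVersion(chartYAML)
	b, okB := firstVersion(pinYAML)
	return okA && okB && a == b
}

func main() {
	chart := "version: 0.3.3\nappVersion: \"0.3.3\"\n"
	pin := "      chart: sandbox\n      version: 0.3.3\n"
	fmt.Println(inLockstep(chart, pin))
}
```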

Refs #1986
hatiyildiz 2026-05-20 05:54:30 +02:00
parent 9d9feccff7
commit d180714dc2
4 changed files with 198 additions and 5 deletions

@@ -68,6 +68,19 @@ spec:
chart:
spec:
chart: sandbox
# 0.3.3 (TBD-P4 B3 #1986, 2026-05-20): inject mcp.json config so
# agent CLIs (claude-code, qwen-code, cursor-agent) auto-discover
# the openova-sandbox-mcp server on session start. The renderer
# now emits `configmap-mcp-config.yaml` and the pty-server
# StatefulSet mounts it at every canonical per-agent config path
# via subPath projections (~/.claude.json, ~/.qwen/settings.json,
# ./.mcp.json, .cursor/mcp.json). FOUNDATION wire — pairs with
# TBD-P4 audit finding B2 (the MCP binary must also be installed
# INTO the pty-server agent-runner image). Until B2 lands, this
# config references a path that ENOENTs at spawn; agents surface a
# clean "mcp server not found" error instead of the current silent
# no-discovery state.
#
# 0.3.2 (TBD-V21 #2032, 2026-05-20): ship 4 residual MCP env vars
# not covered by PR #1987 — SANDBOX_TOKEN (P1; unblocks marketplace.*
# tools), SANDBOX_JWT_SECRET (P1; auth gate exits test-mode),
@@ -85,7 +98,7 @@ spec:
# helper), not bare `newapi`. Pre-fix every Sovereign's
# sandbox-controller TokenMint POST returned `no such host`,
# blocking the canonical Pillar-4 qwen-code customer journey.
version: 0.3.2
version: 0.3.3
sourceRef:
kind: HelmRepository
name: bp-sandbox

@@ -279,8 +279,9 @@ func TestReconcile_HappyPath(t *testing.T) {
t.Errorf("happy path should not requeue: got %v", res)
}
// Wave 1 + Wave 8: 6 fixed + 1 kust + 2 repo PVCs + 4 wave-8 = 13.
expectedFiles := 6 + 1 + 2 + 4
// Wave 1 + Wave 8 + TBD-P4 B3: 6 fixed + 1 kust + 2 repo PVCs +
// 4 wave-8 runtime + 1 MCP-config ConfigMap (TBD-P4 B3 #1986) = 14.
expectedFiles := 6 + 1 + 2 + 4 + 1
if gs.createFiles != expectedFiles {
t.Errorf("expected %d file creates, got %d", expectedFiles, gs.createFiles)
}
@@ -468,12 +469,48 @@ func TestReconcile_Wave8RuntimeShape(t *testing.T) {
"name: repo-acme-eventforge",
"mountPath: /workspace/acme-eventforge",
"name: repo-acme-internal-tools",
// TBD-P4 B3 (#1986) — MCP config ConfigMap volume + mounts at
// every canonical agent-config path so claude-code, qwen-code,
// and cursor-agent all auto-discover openova-sandbox-mcp without
// any user-typed config. ASSERTING ALL four mount paths so any
// future renderer change that drops one is caught at test time.
"name: mcp-config",
"mountPath: /workspace/.mcp.json",
"mountPath: /home/node/.claude.json",
"mountPath: /home/node/.qwen/settings.json",
"mountPath: /workspace/.cursor/mcp.json",
"subPath: mcp.json",
"name: sandbox-mcp-config",
} {
if !strings.Contains(ss, want) {
t.Errorf("statefulset-pty-server.yaml missing %q", want)
}
}
// TBD-P4 B3 (#1986) — the MCP config ConfigMap MUST be rendered as
// a sibling file under the Gitea prefix. The pty-server StatefulSet
// references it by name (`sandbox-mcp-config`) via a configMap
// volume source; missing this ConfigMap = pty-server Pod stays in
// ContainerCreating with FailedMount.
cm := get("configmap-mcp-config.yaml")
for _, want := range []string{
"kind: ConfigMap",
"name: sandbox-mcp-config",
"namespace: sandbox-ceo-at-acme-com",
"openova.io/sandbox: emrah",
`openova.io/sandbox-mcp-config-version: "v1"`,
"mcp.json: |",
`"mcpServers"`,
`"openova-sandbox-mcp"`,
`"command": "/usr/local/bin/openova-sandbox-mcp"`,
`"args": []`,
`"env": {}`,
} {
if !strings.Contains(cm, want) {
t.Errorf("configmap-mcp-config.yaml missing %q", want)
}
}
dep := get("deployment-mcp.yaml")
for _, want := range []string{
"kind: Deployment",
@@ -604,6 +641,11 @@ func TestReconcile_Wave8RuntimeShape(t *testing.T) {
"service-pty-server.yaml",
"deployment-mcp.yaml",
"httproute-pty-server.yaml",
// TBD-P4 B3 (#1986) — the MCP config ConfigMap MUST be listed
// in the kustomization so Flux applies it. Without this entry
// the ConfigMap never lands in the cluster and the pty-server
// Pod sits in ContainerCreating with FailedMount.
"configmap-mcp-config.yaml",
} {
if !strings.Contains(kust, want) {
t.Errorf("kustomization.yaml missing %q", want)

@@ -256,6 +256,84 @@ stringData:
placeholder: ""
`
// mcpConfigMapTemplate renders the canonical `mcp.json` config that
// agent CLIs (claude-code, qwen-code, cursor-agent, …) read on session
// start to auto-discover the `openova-sandbox-mcp` server.
//
// TBD-P4 B3 (#1986) — Pillar-4 audit Surface B / finding B1 caught that
// NO MCP config file is injected anywhere. Even after PR #1988 bundled
// the agent binaries (B1) and PR #1992 wired slug→binary spawn (the
// other B3), the agents had zero discovery for the MCP server. This
// ConfigMap closes that gap.
//
// Schema is the canonical "claude-code / standard MCP" shape:
//
// {
// "mcpServers": {
// "openova-sandbox-mcp": {
// "command": "/usr/local/bin/openova-sandbox-mcp",
// "args": [],
// "env": {}
// }
// }
// }
//
// The MCP binary path matches the canonical install location the MCP
// Dockerfile uses (products/sandbox/mcp-server/Dockerfile:46). NOTE:
// for the stdio child shape to work end-to-end, the MCP binary must
// also be installed INTO the pty-server agent-runner image — that is
// follow-up work (TBD-P4 audit B2, separate PR). This ConfigMap is the
// FOUNDATION wire: when B2 lands, the journey works without further
// controller changes.
//
// The agents pick their config up from multiple paths:
// - claude-code: project-level `./.mcp.json` (CWD) + user-level
// `~/.claude.json` with a `mcpServers` key
// - qwen-code: `~/.qwen/settings.json` with `mcpServers` (qwen-code
// is a fork of gemini-cli; same shape)
// - cursor-agent: project-level `.cursor/mcp.json`
//
// We mount the SAME ConfigMap key at all canonical paths via multiple
// volumeMount entries. Empty `env: {}` lets the MCP binary inherit the
// per-Sandbox env vars the controller already plumbs (SANDBOX_*,
// LLM_GATEWAY_*, etc.) so credentials do NOT live in the ConfigMap.
const mcpConfigMapTemplate = `apiVersion: v1
kind: ConfigMap
metadata:
name: sandbox-mcp-config
namespace: {{ .NamespaceName }}
labels:
openova.io/sandbox: {{ .Name }}
openova.io/sandbox-owner: {{ .OwnerUID }}
openova.io/managed-by: catalyst
app.kubernetes.io/name: sandbox-mcp-config
app.kubernetes.io/component: mcp-config
annotations:
openova.io/sandbox-mcp-config-version: "v1"
data:
# Canonical MCP config per the standard "mcpServers" schema documented
# at https://modelcontextprotocol.io/. Claude Code, qwen-code, and
# cursor-agent all read this shape; aider does not natively support
# MCP (no-op for that agent, by design).
#
# TBD-P4 B3 (#1986) foundation wire. Pairs with TBD-P4 audit B2:
# the MCP binary must be installed INTO the pty-server agent-runner
# image at /usr/local/bin/openova-sandbox-mcp. Until B2 ships the
# binary into the image, this config will reference a path that
# ENOENTs at spawn; the agent surfaces a clean "mcp server not found"
# error rather than the current silent-no-discovery state.
mcp.json: |
{
"mcpServers": {
"openova-sandbox-mcp": {
"command": "/usr/local/bin/openova-sandbox-mcp",
"args": [],
"env": {}
}
}
}
`
// newapiTokenSecretTemplate renders the per-Sandbox NewAPI bearer
// Secret (Wave 9). Materialized into the Org vcluster's
// sandbox-<owner-uid> namespace by Flux; Wave 8's pty-server
@@ -388,6 +466,35 @@ spec:
- name: repo-{{ .Slug }}
mountPath: /workspace/{{ .Slug }}
{{- end }}
# TBD-P4 B3 (#1986) MCP config mounts. ConfigMap
# sandbox-mcp-config carries a single mcp.json key in the
# canonical "mcpServers" schema. We project it at every
# canonical agent-config path so claude-code (user-level
# ~/.claude.json + project ./.mcp.json), qwen-code
# (~/.qwen/settings.json), and cursor-agent (.cursor/mcp.json)
# all auto-discover the openova-sandbox-mcp server without
# any user-typed config. Aider does not natively support MCP
# so the mounts are inert there (by design).
#
# subPath is used so each mount stays a single file (not a
# whole directory) and does NOT shadow other entries the
# agent might write into the same parent dir at runtime.
- name: mcp-config
mountPath: /workspace/.mcp.json
subPath: mcp.json
readOnly: true
- name: mcp-config
mountPath: /home/node/.claude.json
subPath: mcp.json
readOnly: true
- name: mcp-config
mountPath: /home/node/.qwen/settings.json
subPath: mcp.json
readOnly: true
- name: mcp-config
mountPath: /workspace/.cursor/mcp.json
subPath: mcp.json
readOnly: true
readinessProbe:
httpGet:
path: /healthz
@@ -418,6 +525,14 @@ spec:
persistentVolumeClaim:
claimName: repo-{{ .Slug }}
{{- end }}
# TBD-P4 B3 (#1986) MCP config ConfigMap source. Projected at
# multiple agent-canonical paths via the volumeMounts above.
- name: mcp-config
configMap:
name: sandbox-mcp-config
items:
- key: mcp.json
path: mcp.json
terminationGracePeriodSeconds: 30
`
@@ -707,6 +822,7 @@ resources:
{{- range .RepoPaths }}
- {{ . }}
{{- end }}
- configmap-mcp-config.yaml
- statefulset-pty-server.yaml
- service-pty-server.yaml
- deployment-mcp.yaml
@@ -933,6 +1049,14 @@ func Render(in Inputs) (map[string][]byte, error) {
"service-pty-server.yaml": ptyServerServiceTemplate,
"deployment-mcp.yaml": mcpDeploymentTemplate,
"httproute-pty-server.yaml": httpRouteTemplate,
// TBD-P4 B3 (#1986) — `configmap-mcp-config.yaml` carries the
// canonical `mcp.json` that agent CLIs read on session start to
// auto-discover openova-sandbox-mcp. The pty-server StatefulSet
// mounts this ConfigMap at every canonical per-agent path
// (~/.claude.json, ~/.qwen/settings.json, ./.mcp.json,
// .cursor/mcp.json). See mcpConfigMapTemplate for the full
// design discussion.
"configmap-mcp-config.yaml": mcpConfigMapTemplate,
} {
buf, err := renderTemplate(path, raw, rctx)
if err != nil {

@@ -24,6 +24,20 @@ annotations:
# (see issue #181 + docs/BLUEPRINT-AUTHORING.md §11.1).
catalyst.openova.io/no-upstream: "true"
catalyst.openova.io/smoke-render-mode: "default-off"
# 0.3.3 (TBD-P4 B3 #1986, 2026-05-20): inject mcp.json config so agent
# CLIs (claude-code, qwen-code, cursor-agent) auto-discover the
# openova-sandbox-mcp server on session start. New `configmap-mcp-config.yaml`
# carries the canonical "mcpServers" schema; the pty-server StatefulSet
# mounts it at every canonical agent-config path (~/.claude.json,
# ~/.qwen/settings.json, ./.mcp.json, .cursor/mcp.json) via subPath
# projections. FOUNDATION work: pairs with TBD-P4 audit finding B2 —
# the MCP binary itself must also be installed INTO the pty-server
# agent-runner image at /usr/local/bin/openova-sandbox-mcp for the
# stdio child shape to resolve end-to-end. Until B2 lands, this config
# references a path that ENOENTs at spawn; the agent surfaces a clean
# "mcp server not found" error instead of the current silent
# no-discovery state.
#
# 0.3.2 (TBD-V21 #2032, 2026-05-20): ship the 4 remaining MCP env-var
# residuals not covered by PR #1987 — manifests.go now emits
# `SANDBOX_TOKEN` (mounted from the per-Sandbox Secret's
@@ -58,8 +72,8 @@ annotations:
# `helm template newapi platform/newapi/chart/ -s templates/service.yaml`
# → metadata.name: newapi-bp-newapi. Walker on t38 (chart 1.4.216,
# substrate be4f78bc872e2c56) caught the live regression.
version: 0.3.2
appVersion: "0.3.2"
version: 0.3.3
appVersion: "0.3.3"
keywords:
- catalyst
- sandbox