feat: add mcp_catalog toolset for on-demand MCP server discovery#2794
feat: add mcp_catalog toolset for on-demand MCP server discovery#2794dgageot wants to merge 10 commits into
Conversation
docker-agent
left a comment
There was a problem hiding this comment.
Assessment: 🟡 NEEDS ATTENTION
Two medium-severity findings in pkg/tools/builtin/mcpcatalog/mcpcatalog.go — see inline comments for details.
missingAPIKeyEnv and ExpandMap perform external I/O (env.Get hits the OS keychain, ExpandMap invokes the expander interface) and were being called while holding t.mu write lock. This blocked all concurrent readers (Tools, handleSearch, handleList) for the full duration of the external round-trips. Both functions operate exclusively on 'server' from t.byID, which is read-only after construction and requires no mutex protection. Moving them before t.mu.Lock() eliminates the lock contention. Addresses review feedback on PR docker#2794.
Tools() snapshots t.enabled under RLock, then iterates outside the lock. If handleDisable runs concurrently and removes a server from t.enabled and calls Stop(), the Tools() iteration could still hold a reference to the same *tools.StartableToolSet and call Start() on it, potentially restarting a server the user explicitly disabled. Add a re-check of t.enabled membership before calling Start() to skip servers that were disabled after the snapshot was taken. Addresses review feedback on PR docker#2794.
3d9e7f4 to
cbbc61d
Compare
|
/review |
docker-agent
left a comment
There was a problem hiding this comment.
Assessment: 🟡 NEEDS ATTENTION
Two medium-severity findings in — both in the newly added toolset code. No issues found in the other changed files.
| } | ||
|
|
||
| if !e.ts.IsStarted() { | ||
| if err := e.ts.Start(ctx); err != nil { |
There was a problem hiding this comment.
[MEDIUM] TOCTOU race: disabled server may be restarted via Start() after Stop() in Tools()
The stillEnabled guard in Tools() has a narrow but real race window:
Tools()acquires RLock, confirms the server is still int.enabled, then releases the lock.- Before
Start()is called on the next line,handleDisableacquires the write lock, removes the server fromt.enabled, releases the lock, and callsStop()on the same*tools.StartableToolSet. Tools()then callsStart()on the already-stopped object.
The PR comments acknowledge this race ("a concurrent handleDisable could remove the server from t.enabled and call Stop(), but we'd still hold a reference and potentially restart it") but rely on the stillEnabled check to prevent it. That check does not eliminate the window — it only narrows it to after the check passes.
Whether this is harmful depends on StartableToolSet's semantics: if Start() after Stop() reconnects (restarting a server the user just disabled) or panics, this is a real correctness bug. A user-visible symptom would be a server that was disabled continuing to surface its tools for one more Tools() cycle.
Suggested fix: Re-check stillEnabled immediately before calling Start() under a short-lived RLock, or have handleDisable set a "stopped" flag on the StartableToolSet that Start() respects.
| msg.WriteString("auth: OAuth — an authorization URL will be elicited on the next turn.\n") | ||
| case "api_key": | ||
| if len(missing) > 0 { | ||
| fmt.Fprintf(&msg, "auth: API key — WARNING: the following env vars are NOT set: %s. Set them and re-enable, otherwise tool calls will fail.\n", strings.Join(missing, ", ")) |
There was a problem hiding this comment.
[MEDIUM] Misleading message: "Set them and re-enable" won't work without disabling first
When enable_remote_mcp_server is called for an api_key server with missing environment variables, the server is added to t.enabled and then the user is told:
"Set them and re-enable, otherwise tool calls will fail."
However, if the user sets the env vars and calls enable_remote_mcp_server again, they will hit the early-return guard at the top of handleEnable:
if _, exists := t.enabled[id]; exists {
t.mu.Unlock()
return tools.ResultSuccess(fmt.Sprintf("server %q is already enabled", id)), nil
}Because the server is already in t.enabled, the second call returns "already enabled" without creating a new toolset or re-expanding the headers. The env var snapshot from the first call is permanently baked into the existing toolset.
The user would need to call disable_remote_mcp_server first, then re-enable — but the message does not say this.
Suggested fix: Change the message to "Set them, then disable and re-enable this server, otherwise tool calls will fail." to give accurate instructions.
persist managedOAuth flag so it applies to servers enabled after SetManagedOAuth was called, honor ctx.Err() when iterating enabled toolsets in Tools() so cancelled contexts return early, fix package doc comment that said three meta-tools instead of four, and add three tests for these fixes.
Adds a fifth tool to the mcp_catalog toolset for clearing persisted OAuth credentials. Useful when a token is revoked, scopes change, or the user signs in to the wrong account. For oauth servers: stops the active MCP session, removes the token from the keyring, and fires the tools-changed handler if enabled. For api_key/ none servers: no-op. Keyring errors surface as tool errors, not panics. Includes 5 new tests validating the auth reset behavior.
The new pkg/tools/builtin/mcpcatalog package embeds servers.json via //go:embed. The .dockerignore default-denies everything and only whitelists specific paths, so the embedded JSON would be missing from the Docker build context, causing the build to fail.
missingAPIKeyEnv and ExpandMap perform external I/O (env.Get hits the OS keychain, ExpandMap invokes the expander interface) and were being called while holding t.mu write lock. This blocked all concurrent readers (Tools, handleSearch, handleList) for the full duration of the external round-trips. Both functions operate exclusively on 'server' from t.byID, which is read-only after construction and requires no mutex protection. Moving them before t.mu.Lock() eliminates the lock contention. Addresses review feedback on PR docker#2794.
Tools() snapshots t.enabled under RLock, then iterates outside the lock. If handleDisable runs concurrently and removes a server from t.enabled and calls Stop(), the Tools() iteration could still hold a reference to the same *tools.StartableToolSet and call Start() on it, potentially restarting a server the user explicitly disabled. Add a re-check of t.enabled membership before calling Start() to skip servers that were disabled after the snapshot was taken. Addresses review feedback on PR docker#2794.
- Fix TOCTOU race in Tools(): re-check t.enabled after Start so a
concurrent disable can no longer leave a started session orphaned.
- Cache catalog parse with sync.OnceValues; Load() returns shallow
copies so test-only mutations stay isolated.
- Stable iteration order in Tools() (sort by id) to keep prompt caches
and TUI rendering steady across turns.
- Cap empty-query search at 25 results so the model doesn't get the
whole catalog dumped into its context window.
- Resolve catalog header ${VAR} placeholders directly through the env
provider (catalog uses bare ${VAR}, not ${env.VAR}); add a
post-expansion scan that surfaces any placeholder still left in the
expanded headers, even for auth.type=none servers.
- Replace misleading 'Set them and re-enable' message with explicit
disable + enable instructions (a plain second enable would short-circuit).
- Document why we pass nil for *RemoteOAuthConfig in NewRemoteToolset.
Tests added for the new behaviours: empty-query truncation, unresolved
header placeholder warnings, sync.OnceValues isolation, stable Tools()
ordering, and the disable/enable wording in the missing-env message.
d9f35e5 to
6cff5b0
Compare
Summary
Adds a new built-in toolset,
mcp_catalog, that lets agents discover and on-demand activate remote (streamable-http) MCP servers from the Docker MCP Catalog without declaring each one ahead of time in YAML.The catalog (45 servers) is embedded at build time from
https://desktop.docker.com/mcp/catalog/v3/catalog.json(filtered totype=remote AND remote.transport_type=streamable-http). The toolset surfaces five meta-tools to the model:search_remote_mcp_servers— case-insensitive fuzzy search over id / title / description / category / tags.list_remote_mcp_servers— show currently enabled servers.enable_remote_mcp_server— instantiate an*mcp.Toolsetfor the server (deferred — no network until tools are next enumerated).disable_remote_mcp_server— stop the toolset and remove its tools.reset_remote_mcp_server_auth— wipe persisted OAuth credentials so the next enable triggers a fresh authorization flow.How it works
*mcp.Toolsetis wrapped in atools.NewStartableso it gets the same lazy single-flight Start semantics as YAML-declared toolsets.Tools()lazily Start()s each enabled wrapped toolset on first enumeration. Treatsmcp.IsAuthorizationRequiredas expected deferral (silent on the non-interactive startup probe; user-facing OAuth dialog on the next interactive turn). Real Start failures get the StartableToolSet de-duplicated warning streak.mcp.remotetoolset.${ENV_VAR}placeholders in catalog headers (e.g. Apify'sAuthorization: Bearer ${APIFY_API_KEY}) resolve at enable time viajs.Expander— same code path the YAML toolsets use.reset_remote_mcp_server_authstops the inner toolset (so the live MCP session drops its now-stale token) before notifying the runtime and clearing the keyring entry. The notify fires even when the keyring removal fails, so the runtime's tool list never desyncs from the actualenabledstate.Configuration
See
examples/mcp_catalog.yamlfor a full example.What's in this branch
pkg/tools/builtin/mcpcatalog/{mcpcatalog.go, servers.go, servers.json}— toolset, embedded catalog (45 servers).pkg/teamloader/registry.go— registers the newmcp_catalogtoolset type.agent-schema.json— addsmcp_catalogto the toolset-type enums.examples/mcp_catalog.yaml— example agent usinganthropic/claude-sonnet-4-6.pkg/tools/builtin/mcpcatalog/mcpcatalog_test.go— 19 tests (load, search, lifecycle, race-friendly concurrent enable/disable, ctx cancellation, lazy-start regression test against a fake MCP server, auth-required deferral, reset_auth happy path / unknown id / no-op for non-oauth / disable-on-reset / error surfacing / notify-on-keyring-failure).Known limitation
The runtime's MCP-prompt discovery uses
tools.As[*mcp.Toolset]to find prompt-capable toolsets. Becausemcp_catalogis a container, prompts exposed by servers activated through this catalog are not surfaced via/prompts. Tools (the primary interface) work fine. Plumbing prompt discovery through container toolsets would require a new interface and changes topkg/runtime/runtime.go— out of scope for this PR.Validation
task build✅go vet ./...✅golangci-lint run ./...→ 0 issues ✅go test ./pkg/tools/builtin/mcpcatalog/... -race -count=1→ all 19 tests pass ✅