MCP Server
The agtOS MCP server exposes voice pipeline capabilities via the Model Context Protocol. External AI clients (Claude Desktop, Cursor, custom agents) can connect and call agtOS tools.
- Transport: Streamable HTTP (the current MCP specification standard)
- Port: 4100 (configurable via MCP_PORT)
- Endpoint: POST /mcp
Available Tools
The server exposes 9 tools organized into three categories:
Voice Tools
- voice.speak: Synthesize and play speech
- voice.listen: Capture and transcribe audio
System Tools
- system.health: Service health status
- session.status: Active session info
Orchestration Tools
- workflow.run: Execute a workflow
- workflow.list: List workflows
- schedule.create: Create a scheduled task
- schedule.list: List scheduled tasks
- schedule.cancel: Cancel a scheduled task
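As an illustration of what a client sends to the endpoint, the sketch below builds (but does not send) a JSON-RPC 2.0 tools/call request using only the Python standard library. The host name and the "text" argument are assumptions made for the example; the actual input schema of each tool should be read from a tools/list response, and whether an initialize handshake is needed first depends on the server.

```python
import json
import urllib.request

# Port 4100 is the documented default; "localhost" is an assumption.
MCP_URL = "http://localhost:4100/mcp"


def build_tool_call(name: str, arguments: dict, request_id: int = 1) -> urllib.request.Request:
    """Build (without sending) a JSON-RPC 2.0 tools/call request."""
    payload = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }
    return urllib.request.Request(
        MCP_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Streamable HTTP clients accept a plain JSON reply or an SSE stream.
            "Accept": "application/json, text/event-stream",
        },
        method="POST",
    )


# The "text" argument name is hypothetical.
request = build_tool_call("voice.speak", {"text": "Hello from MCP"})
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

Passing the request to urllib.request.urlopen would send it to a running instance.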
Connecting from Claude Desktop
Add agtOS to your Claude Desktop MCP configuration. Once connected, Claude Desktop can call tools such as voice.speak on your agtOS instance.
The MCP server runs in stateless mode — each request creates a fresh transport. No sticky sessions or persistent connections are required, making it compatible with load balancers and proxies.
Architecture
The server uses the MCP SDK’s StreamableHTTPServerTransport:
- Each incoming POST /mcp creates a fresh transport instance (stateless mode)
- A shared McpServer instance holds all tool registrations
- GET /mcp handles SSE streaming for server-initiated notifications
- DELETE /mcp handles session termination (returns 405 in stateless mode)
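The request handling listed above can be pictured with a toy model. This is not the MCP SDK and not the agtOS implementation: SharedServer, Transport, and route are hypothetical stand-ins written in plain Python to show the shape of the pattern (one shared registry, one throwaway transport per POST).

```python
from dataclasses import dataclass, field
from itertools import count
from typing import Any, Dict, Optional, Tuple

_transport_ids = count(1)


@dataclass
class SharedServer:
    """Hypothetical stand-in for the shared server that holds tool registrations."""
    tools: Dict[str, Any] = field(default_factory=dict)


@dataclass
class Transport:
    """Hypothetical per-request transport; nothing survives between requests."""
    server: SharedServer
    transport_id: int = field(default_factory=lambda: next(_transport_ids))

    def handle(self, body: Dict[str, Any]) -> Dict[str, Any]:
        request_id = body.get("id")
        if body.get("method") == "tools/list":
            tools = [{"name": name} for name in sorted(self.server.tools)]
            return {"jsonrpc": "2.0", "id": request_id, "result": {"tools": tools}}
        # -32601 is the JSON-RPC 2.0 "Method not found" error code.
        error = {"code": -32601, "message": "Method not found"}
        return {"jsonrpc": "2.0", "id": request_id, "error": error}


def route(server: SharedServer, http_method: str, path: str,
          body: Optional[Dict[str, Any]] = None) -> Tuple[int, Any]:
    """Return (HTTP status, payload) for a request in stateless mode."""
    if path != "/mcp":
        return 404, None
    if http_method == "POST":
        # A fresh transport for every request, all sharing one server.
        return 200, Transport(server).handle(body or {})
    if http_method == "GET":
        return 200, "text/event-stream"  # placeholder for opening an SSE stream
    # DELETE has no session to terminate; other methods are not allowed either.
    return 405, None


server = SharedServer(tools={"voice.speak": None, "system.health": None})
status, reply = route(server, "POST", "/mcp", {"jsonrpc": "2.0", "id": 1, "method": "tools/list"})
print(status, [tool["name"] for tool in reply["result"]["tools"]])
print(route(server, "DELETE", "/mcp")[0])
```

Because no transport outlives its request, any replica behind a load balancer can serve any call.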
agtOS uses MCP SDK v1.27.x with Streamable HTTP transport. The older SSE transport is deprecated and not supported. See ADR-005 for the migration rationale.
MCP Client
The MCP client connects agtOS to external MCP servers, discovering their tools and making them available to the agent reasoning loop. This is how agtOS integrates with smart home servers, knowledge bases, file systems, web search, and other MCP-enabled services.
How It Works
Configure Servers
Define external MCP servers in your configuration with their endpoint URLs, optional tool prefix, and reconnection settings.
Auto-Discovery
On startup, the client connects to each server and calls tools/list to discover available tools.
Tool Registration
Discovered tools are registered in the shared ToolRegistry with their prefixed names (e.g., home.set_temperature for a tool from the “home” server).
Configuration
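A minimal sketch of the settings described above (endpoint URL, optional tool prefix, reconnection settings) and of how a prefix could be applied to discovered tool names. The class, the field names, and the example URL are assumptions made for illustration; only the autoReconnect setting and the home.set_temperature naming come from this page.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class McpServerConfig:
    """Hypothetical shape of one external-server entry."""
    name: str
    url: str
    tool_prefix: Optional[str] = None
    auto_reconnect: bool = True          # mirrors the autoReconnect setting
    reconnect_interval_s: float = 5.0    # assumed unit and default


def registry_name(config: McpServerConfig, tool_name: str) -> str:
    """Apply the server's prefix, if any, to a discovered tool name."""
    return f"{config.tool_prefix}.{tool_name}" if config.tool_prefix else tool_name


def register_discovered(registry: Dict[str, dict], config: McpServerConfig,
                        discovered: List[str]) -> List[str]:
    """Record tools returned by tools/list under their prefixed names."""
    added = []
    for tool_name in discovered:
        name = registry_name(config, tool_name)
        registry[name] = {"source": config.name, "remote_name": tool_name}
        added.append(name)
    return added


home = McpServerConfig(name="home", url="http://localhost:4200/mcp", tool_prefix="home")
registry: Dict[str, dict] = {}
register_discovered(registry, home, ["set_temperature", "get_temperature"])
print(sorted(registry))  # ['home.get_temperature', 'home.set_temperature']
```

Keeping the unprefixed remote name alongside the prefixed one lets the client translate back when it forwards a call to the external server.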
Reconnection
When autoReconnect is enabled, the client automatically re-attempts connections at the configured interval when a server drops. Tool registrations are refreshed on reconnection, so newly added tools on the external server become available automatically.
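The reconnection behavior can be sketched as a fixed-interval retry loop that returns a fresh tool list once the server is reachable again. The function name, the max_attempts parameter, and the injected connect and sleep callables are assumptions made so the example runs without a real server.

```python
import time
from typing import Callable, List, Optional


def reconnect_until_up(
    connect: Callable[[], List[str]],
    interval_s: float,
    max_attempts: Optional[int] = None,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[List[str]]:
    """Retry connect() at a fixed interval.

    Returns the freshly discovered tool names, or None if max_attempts
    is set and every attempt failed.
    """
    attempt = 0
    while max_attempts is None or attempt < max_attempts:
        attempt += 1
        try:
            return connect()  # on success the caller refreshes its registrations
        except ConnectionError:
            if max_attempts is not None and attempt >= max_attempts:
                break
            sleep(interval_s)
    return None


# Simulated server that comes back on the third attempt with a new tool.
attempts = {"count": 0}


def flaky_connect() -> List[str]:
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise ConnectionError("server still down")
    return ["set_temperature", "set_humidity"]


waits: List[float] = []
tools = reconnect_until_up(flaky_connect, interval_s=5.0, sleep=waits.append)
print(tools, waits)
```

Passing waits.append as the sleep function records the intended delays instead of actually waiting, which keeps the example instant.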
Metrics
Every tool call through the MCP client records latency and errors via the global metrics collector. Health endpoints surface MCP client performance alongside other services.
Dynamic Tool Selection
As the number of available tools grows (especially with multiple external MCP servers), loading every tool schema into the LLM’s context becomes expensive. Each tool definition consumes 550-1,400 tokens, and a deployment with 50+ tools can spend 25-50% of the context window on tool definitions alone. agtOS addresses this with dynamic tool selection, which loads only the most relevant tools for each request.
The Problem
The figures below assume roughly 1,000 tokens per tool definition:
| Tool Count | Token Cost | Context Usage (200K window) |
|---|---|---|
| 10 tools | ~10,000 tokens | 5% |
| 50 tools | ~50,000 tokens | 25% |
| 100 tools | ~100,000 tokens | 50% |
| 200 tools | ~200,000 tokens | 100% (impossible) |
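The arithmetic behind the table is simple enough to reproduce. The sketch below assumes an average of 1,000 tokens per tool definition (the figure the table rows imply) and a 200K-token window, then applies the same calculation to the default top-K of 8 tools used by dynamic selection.

```python
from typing import Tuple

CONTEXT_WINDOW = 200_000   # tokens, as in the table header
TOKENS_PER_TOOL = 1_000    # assumption: average implied by the table rows


def context_usage(tool_count: int,
                  tokens_per_tool: int = TOKENS_PER_TOOL,
                  window: int = CONTEXT_WINDOW) -> Tuple[int, float]:
    """Return (tokens spent on tool definitions, percent of the context window)."""
    tokens = tool_count * tokens_per_tool
    return tokens, 100 * tokens / window


for count in (10, 50, 100, 200):
    tokens, percent = context_usage(count)
    print(f"{count:>3} tools  {tokens:>7,} tokens  {percent:.0f}% of window")

# Loading only the default top-K of 8 tools instead of the full catalog:
selected_tokens, selected_percent = context_usage(8)
print(f"  8 tools  {selected_tokens:>7,} tokens  {selected_percent:.0f}% of window")
```

Under these assumptions, selecting 8 tools costs about 8,000 tokens (4% of the window) no matter how large the catalog grows.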
The Solution
Before each LLM call, a lightweight routing step selects the relevant tools:
Similarity Search
Find the top-K tools (default: 8) whose description embeddings are most similar to the query embedding using cosine similarity.
Category Boost
If recent conversation context mentions specific categories, boost tools in those categories.
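The two routing steps above can be sketched in a few lines. The additive boost, its size, and the dictionary layout of each tool are assumptions; the page does not specify how the boost is applied. The toy two-dimensional embeddings stand in for real embedding vectors, and the set of active categories is assumed to have been extracted from the conversation already.

```python
import math
from typing import Dict, FrozenSet, List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def select_tools(query_vec: Sequence[float], tools: List[Dict],
                 active_categories: FrozenSet[str] = frozenset(),
                 top_k: int = 8, boost: float = 0.1) -> List[str]:
    """Rank tools by similarity plus an assumed additive category boost."""
    scored = []
    for tool in tools:
        score = cosine(query_vec, tool["embedding"])
        if active_categories & set(tool["categories"]):
            score += boost
        scored.append((score, tool["name"]))
    # Highest score first; ties broken by name for a stable order.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for _, name in scored[:top_k]]


catalog = [
    {"name": "home.set_temperature", "embedding": [1.0, 0.0], "categories": ["smart_home", "climate"]},
    {"name": "voice.speak", "embedding": [0.0, 1.0], "categories": ["voice"]},
    {"name": "workflow.run", "embedding": [0.6, 0.8], "categories": ["orchestration"]},
]
query = [0.8, 0.6]

print(select_tools(query, catalog, top_k=2))
print(select_tools(query, catalog, frozenset({"voice"}), top_k=2, boost=0.25))
```

In the second call the boost lifts voice.speak past home.set_temperature, which shows how recent context can change the selection without changing the query.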
Tool Registry
The tool registry is an in-memory catalog of all available tools across the MCP server, MCP clients, and built-in capabilities. Each entry contains:
- Tool name: Unique identifier (e.g., home.set_temperature)
- Description: Natural language description for the LLM
- Category tags: Semantic categories for routing (e.g., smart_home, climate)
- Embedding: Dense vector for similarity search
- Full schema: Complete JSON schema, stored but only loaded when selected
- Source: Which MCP server provides this tool
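One way to picture an entry with the fields listed above is a small dataclass plus a registry that hands out lightweight summaries for routing and full schemas only for selected tools. The class and method names here are illustrative assumptions, not the agtOS ToolRegistry API.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Iterable, List, Tuple


@dataclass
class ToolEntry:
    """Hypothetical registry entry mirroring the fields listed above."""
    name: str
    description: str
    categories: Tuple[str, ...]
    embedding: List[float]
    source: str
    schema: Dict[str, Any] = field(default_factory=dict, repr=False)


class InMemoryToolRegistry:
    """Hypothetical in-memory catalog keyed by tool name."""

    def __init__(self) -> None:
        self._entries: Dict[str, ToolEntry] = {}

    def register(self, entry: ToolEntry) -> None:
        # Overwrites an existing entry, so a refresh after reconnection is an upsert.
        self._entries[entry.name] = entry

    def summaries(self) -> List[Tuple[str, str]]:
        """Cheap view for routing: names and descriptions, no schemas."""
        return [(e.name, e.description) for e in self._entries.values()]

    def schemas_for(self, names: Iterable[str]) -> Dict[str, Dict[str, Any]]:
        """Full JSON schemas, loaded only for the tools that were selected."""
        return {n: self._entries[n].schema for n in names if n in self._entries}


tool_registry = InMemoryToolRegistry()
tool_registry.register(ToolEntry(
    name="home.set_temperature",
    description="Set the target temperature for a room",
    categories=("smart_home", "climate"),
    embedding=[0.12, 0.87, 0.05],  # placeholder vector
    source="home",
    schema={"type": "object", "properties": {"celsius": {"type": "number"}}},
))
print(tool_registry.summaries())
print(tool_registry.schemas_for(["home.set_temperature"]))
```

Separating summaries from schemas_for reflects the "stored but only loaded when selected" rule: routing never touches the larger schema payloads.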