agtOS implements both an MCP server (exposing agtOS capabilities to external AI clients) and an MCP client (connecting to external MCP servers to expand the agent’s tool set). Dynamic tool selection ensures only relevant tools are loaded into the LLM’s context window.

MCP Server

The agtOS MCP server exposes voice pipeline capabilities via the Model Context Protocol. External AI clients (Claude Desktop, Cursor, custom agents) can connect and call agtOS tools.
  • Transport: Streamable HTTP (the current MCP specification standard)
  • Port: 4100 (configurable via MCP_PORT)
  • Endpoint: POST /mcp

Available Tools

The server exposes 9 tools organized into three categories:

Voice Tools

  • voice.speak — Synthesize and play speech
  • voice.listen — Capture and transcribe audio

System Tools

  • system.health — Service health status
  • session.status — Active session info

Orchestration Tools

  • workflow.run — Execute a workflow
  • workflow.list — List workflows
  • schedule.create — Create a scheduled task
  • schedule.list — List scheduled tasks
  • schedule.cancel — Cancel a scheduled task
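Over the wire, a tool invocation is a standard JSON-RPC 2.0 tools/call request sent to POST /mcp. The input shape below (a single text field) is illustrative; consult each tool's published schema for the actual arguments:

```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "voice.speak",
    "arguments": { "text": "Hello from an external client" }
  }
}
```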

Connecting from Claude Desktop

Add agtOS to your Claude Desktop MCP configuration:
{
  "mcpServers": {
    "agtos": {
      "url": "http://localhost:4100/mcp"
    }
  }
}
Once connected, Claude Desktop can call agtOS tools directly. For example, asking Claude to “speak the weather forecast” will trigger voice.speak on your agtOS instance.
The MCP server runs in stateless mode — each request creates a fresh transport. No sticky sessions or persistent connections are required, making it compatible with load balancers and proxies.

Architecture

The server uses the MCP SDK’s StreamableHTTPServerTransport:
  • Each incoming POST /mcp creates a fresh transport instance (stateless mode)
  • A shared McpServer instance holds all tool registrations
  • GET /mcp handles SSE streaming for server-initiated notifications
  • DELETE /mcp handles session termination (returns 405 in stateless mode)
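As a rough sketch, the stateless pattern looks like the following. The Express wiring and handler details are illustrative, not the agtOS source; it assumes the @modelcontextprotocol/sdk package:

```typescript
import express from "express";
import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { StreamableHTTPServerTransport } from "@modelcontextprotocol/sdk/server/streamableHttp.js";

// One shared McpServer instance holds all tool registrations.
const server = new McpServer({ name: "agtos", version: "1.0.0" });
// ...tool registrations happen once, on the shared instance...

const app = express();
app.use(express.json());

app.post("/mcp", async (req, res) => {
  // Stateless mode: a fresh transport per request, no session id generated.
  const transport = new StreamableHTTPServerTransport({
    sessionIdGenerator: undefined,
  });
  res.on("close", () => transport.close());
  await server.connect(transport);
  await transport.handleRequest(req, res, req.body);
});

app.listen(4100);
```

Because no per-client state survives between requests, any replica behind a load balancer can serve any POST.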
agtOS uses MCP SDK v1.27.x with Streamable HTTP transport. The older SSE transport is deprecated and not supported. See ADR-005 for the migration rationale.

MCP Client

The MCP client connects agtOS to external MCP servers, discovering their tools and making them available to the agent reasoning loop. This is how agtOS integrates with smart home servers, knowledge bases, file systems, web search, and other MCP-enabled services.

How It Works

1. Configure Servers: Define external MCP servers in your configuration with their endpoint URLs, an optional tool prefix, and reconnection settings.
2. Auto-Discovery: On startup, the client connects to each server and calls tools/list to discover the available tools.
3. Tool Registration: Discovered tools are registered in the shared ToolRegistry under their prefixed names (e.g., home.set_temperature for a tool from the "home" server).
4. Transparent Routing: When the agent loop invokes a tool, the client routes the call to the correct server automatically. The agent does not need to know which server hosts which tool.
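The prefix-and-route steps above can be sketched as follows. The class and callback names are illustrative, not agtOS source; the real client forwards calls over Streamable HTTP rather than invoking a local function:

```typescript
// Each external server's tools are registered under "<prefix>.<toolName>";
// the router strips the prefix and forwards to the owning server.
type ToolCaller = (tool: string, args: Record<string, unknown>) => string;

class McpRouter {
  private servers = new Map<string, ToolCaller>();

  register(prefix: string, caller: ToolCaller): void {
    this.servers.set(prefix, caller);
  }

  call(prefixedName: string, args: Record<string, unknown>): string {
    const dot = prefixedName.indexOf(".");
    if (dot < 0) throw new Error(`no prefix in tool name: ${prefixedName}`);
    const prefix = prefixedName.slice(0, dot);
    const tool = prefixedName.slice(dot + 1);
    const caller = this.servers.get(prefix);
    if (!caller) throw new Error(`no MCP server for prefix: ${prefix}`);
    return caller(tool, args);
  }
}

const router = new McpRouter();
router.register("home", (tool) => `home server ran ${tool}`);
console.log(router.call("home.set_temperature", { value: 21 }));
// → "home server ran set_temperature"
```

The agent loop only ever sees the prefixed name; the routing table stays an implementation detail of the client.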

Configuration

# In .env.local or environment
AGTOS_MCP_SERVERS='[
  {
    "name": "home",
    "url": "http://localhost:5000/mcp",
    "prefix": "home",
    "autoReconnect": true,
    "reconnectInterval": 30000
  },
  {
    "name": "files",
    "url": "http://localhost:5001/mcp",
    "prefix": "fs"
  }
]'
Or in the orchestrator configuration:
const orchestrator = new VoicePipelineOrchestrator({
  mcp: {
    servers: [
      {
        name: 'home',
        url: 'http://localhost:5000/mcp',
        prefix: 'home',
        autoReconnect: true,
        reconnectInterval: 30000,
      },
    ],
  },
});

Reconnection

When autoReconnect is enabled, the client automatically re-attempts connections at the configured interval when a server drops. Tool registrations are refreshed on reconnection, so newly added tools on the external server become available automatically.
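The retry behavior can be sketched as a simple loop. Everything here is illustrative (agtOS internals may differ); connect() stands in for reopening the Streamable HTTP connection and re-running tools/list, and onTools for refreshing the ToolRegistry:

```typescript
async function reconnectLoop(
  connect: () => Promise<string[]>,   // resolves with the server's tool list
  onTools: (tools: string[]) => void, // refresh the registry on success
  intervalMs: number,
  maxAttempts: number,
): Promise<number> {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      onTools(await connect());
      return attempt; // connected; report how many attempts were needed
    } catch {
      // Server still down: wait the configured interval, then retry.
      await new Promise((resolve) => setTimeout(resolve, intervalMs));
    }
  }
  throw new Error("server unreachable after max attempts");
}
```

Because onTools runs on every successful (re)connection, tools added to the external server while it was down appear in the registry as soon as the link is restored.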

Metrics

Every tool call through the MCP client records latency and errors via the global metrics collector. Health endpoints surface MCP client performance alongside other services.

Dynamic Tool Selection

As the number of available tools grows (especially with multiple external MCP servers), loading all tool schemas into the LLM’s context becomes a significant problem. Each tool definition consumes 550-1,400 tokens, and a deployment with 50+ tools can burn 25-50% of the context window on tool definitions alone. agtOS solves this with dynamic tool selection, which loads only the most relevant tools for each request.

The Problem

Tool Count    Token Cost          Context Usage (200K window)
10 tools      ~10,000 tokens      5%
50 tools      ~50,000 tokens      25%
100 tools     ~100,000 tokens     50%
200 tools     ~200,000 tokens     100% (impossible)

The Solution

Before each LLM call, a lightweight routing step selects the relevant tools:
1. Embed the Query: Generate a dense vector embedding for the current user input.
2. Similarity Search: Find the top-K tools (default: 8) whose description embeddings are most similar to the query embedding, using cosine similarity.
3. Threshold Filter: Only include tools above a minimum similarity threshold (default: 0.7).
4. Category Boost: If recent conversation context mentions specific categories, boost tools in those categories.
5. Schema Loading: Only the selected tools' JSON schemas are included in the LLM request. All other tools remain in the registry but are invisible to the model.
This reduces tool token usage from potentially 100,000+ tokens to approximately 8,000 tokens (8 tools x ~1,000 tokens each) — an 80-90% reduction.
A small set of core tools (like discover_tools) are always included regardless of similarity score. If the agent needs a tool that was not selected, it can call discover_tools to search the registry and request specific tools for the next turn.
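A minimal sketch of the similarity, threshold, and top-K steps (omitting the category boost). Names and the toy 2-d embeddings are illustrative, not agtOS source:

```typescript
function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] ** 2;
    nb += b[i] ** 2;
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

interface IndexedTool { name: string; embedding: number[]; core?: boolean; }

function selectTools(
  query: number[],
  tools: IndexedTool[],
  topK = 8,
  threshold = 0.7,
): string[] {
  const scored = tools
    .filter((t) => !t.core) // core tools skip scoring entirely
    .map((t) => ({ name: t.name, score: cosine(query, t.embedding) }))
    .filter((t) => t.score >= threshold)
    .sort((a, b) => b.score - a.score)
    .slice(0, topK)
    .map((t) => t.name);
  // Core tools (e.g. discover_tools) are always included.
  const core = tools.filter((t) => t.core).map((t) => t.name);
  return [...core, ...scored];
}

const catalog: IndexedTool[] = [
  { name: "discover_tools", embedding: [0, 0], core: true },
  { name: "home.lights_off", embedding: [1, 0] },
  { name: "fs.read_file", embedding: [0, 1] },
];
console.log(selectTools([1, 0.1], catalog));
// → ["discover_tools", "home.lights_off"]
```

Only the returned names have their full JSON schemas loaded into the prompt; everything else stays dark until discover_tools pulls it in.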

Tool Registry

The tool registry is an in-memory catalog of all available tools across the MCP server, MCP clients, and built-in capabilities. Each entry contains:
  • Tool name: Unique identifier (e.g., home.set_temperature)
  • Description: Natural language description for the LLM
  • Category tags: Semantic categories for routing (e.g., smart_home, climate)
  • Embedding: Dense vector for similarity search
  • Full schema: Complete JSON schema, stored but only loaded when selected
  • Source: Which MCP server provides this tool
Embeddings are generated at startup and refreshed when MCP servers report tool changes. The embedding + similarity search step adds approximately 10-20ms per request.
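The entry fields above map naturally onto a record type. This shape is a sketch mirroring the list, not the actual agtOS type definition:

```typescript
interface ToolRegistryEntry {
  name: string;         // unique identifier, e.g. "home.set_temperature"
  description: string;  // natural-language description for the LLM
  categories: string[]; // semantic tags for routing, e.g. ["smart_home"]
  embedding: number[];  // dense vector for similarity search
  schema: object;       // full JSON schema, loaded only when selected
  source: string;       // which MCP server provides the tool
}

const entry: ToolRegistryEntry = {
  name: "home.set_temperature",
  description: "Set the thermostat target temperature",
  categories: ["smart_home", "climate"],
  embedding: [0.12, -0.08, 0.33],
  schema: { type: "object", properties: { value: { type: "number" } } },
  source: "home",
};
```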

Integration Example

Here is a complete example showing agtOS as both an MCP server (receiving calls from Claude Desktop) and an MCP client (connecting to a smart home server):
Claude Desktop                    agtOS                     Smart Home MCP Server
     │                              │                              │
     │  POST /mcp                   │                              │
     │  tool: voice.speak           │                              │
     │  input: "Turn off lights"    │                              │
     │ ─────────────────────────▶   │                              │
     │                              │  Agent loop recognizes       │
     │                              │  smart home intent           │
     │                              │                              │
     │                              │  POST /mcp                   │
     │                              │  tool: home.lights_off       │
     │                              │ ─────────────────────────▶   │
     │                              │                              │
     │                              │  ◀───── { success: true }    │
     │                              │                              │
     │                              │  TTS: "The lights are off"   │
     │  ◀──── { audio: ... }        │                              │

Configuration Reference

# MCP Server
MCP_PORT=4100                      # Server port (default: 4100)

# MCP Client - external server connections
AGTOS_MCP_SERVERS='[...]'          # JSON array of server configs

# Dynamic tool selection
AGTOS_TOOL_SELECTION_TOP_K=8       # Max tools per request (default: 8)
AGTOS_TOOL_SELECTION_THRESHOLD=0.7 # Min similarity score (default: 0.7)

# Embedding provider (shared with semantic memory)
AGTOS_EMBEDDING_PROVIDER=ollama
AGTOS_EMBEDDING_MODEL=nomic-embed-text
OLLAMA_URL=http://localhost:11434