agtOS is an open-source platform that turns AI models into voice-enabled agents. It handles the hard parts — real-time audio processing, model routing, tool execution, and persistent memory — so you can focus on building experiences.

What makes agtOS different

Voice-native, not voice-added

Built from the ground up for voice. The cascade pipeline (STT → LLM → TTS) runs with sub-second latency, sentence-level streaming, and barge-in support.
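Sentence-level streaming can be pictured as a small buffer sitting between the LLM and TTS stages: text chunks arrive as the model generates them, and each sentence is released for synthesis as soon as it is complete. The sketch below is illustrative only. `SentenceBuffer` and its methods are hypothetical names, not part of the agtOS API, and the boundary rule (a `.`, `!`, or `?` followed by whitespace) is a simplifying assumption.

```typescript
// Hypothetical helper, not the agtOS implementation: collects streamed
// LLM text and releases complete sentences for TTS as soon as they end.
class SentenceBuffer {
  private buffer = "";

  // Add a streamed chunk and return any sentences it completed.
  push(chunk: string): string[] {
    this.buffer += chunk;
    const sentences: string[] = [];
    // Assumption: a sentence ends at ., ! or ? followed by whitespace.
    const boundary = /^[\s\S]*?[.!?](?=\s)/;
    let match = this.buffer.match(boundary);
    while (match !== null) {
      sentences.push(match[0].trim());
      this.buffer = this.buffer.slice(match[0].length);
      match = this.buffer.match(boundary);
    }
    return sentences;
  }

  // Release whatever text remains once the model stops generating.
  flush(): string[] {
    const rest = this.buffer.trim();
    this.buffer = "";
    return rest.length > 0 ? [rest] : [];
  }
}
```

In a cascade like this, every string returned by `push` could be sent to the speech synthesizer immediately, so playback can begin before the model has finished its full reply. Barge-in handling (stopping playback when the user starts speaking) is not covered by this sketch.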

Local-first architecture

In-process speech engine (sherpa-onnx), local model routing (Ollama), and optional cloud. Works offline — cloud is an enhancement, not a dependency.
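One way to read "cloud is an enhancement, not a dependency" is as a selection rule that prefers a reachable local backend and falls back to cloud only when one is configured and reachable. The sketch below is a deliberate simplification under that assumption: the `Backend` type and `pickBackend` function are hypothetical, and the actual Model Router may weigh other factors when choosing a model.

```typescript
// Hypothetical types and names for illustration; not the agtOS API.
type Backend = {
  name: string;
  kind: "local" | "cloud";
  available: boolean;
};

// Assumed rule: prefer any reachable local backend, then any reachable
// cloud backend. Returns undefined when nothing is reachable.
function pickBackend(backends: Backend[]): Backend | undefined {
  const usable = backends.filter((b) => b.available);
  const local = usable.find((b) => b.kind === "local");
  return local ?? usable.find((b) => b.kind === "cloud");
}
```

With this rule, removing every cloud entry from the list never prevents a request from being served as long as a local backend is reachable.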

Protocol-agnostic

Built on MCP (Model Context Protocol) with A2A readiness. Tools are defined once and work across voice, chat, CLI, and external AI clients.
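The "defined once" idea can be sketched as a single registry that every front end calls into. MCP tool listings describe a tool by name, description, and a JSON Schema for its input; the rest of this example (`ToolDefinition`, `ToolRegistry`, and the `get_time` tool) is hypothetical and only illustrates the pattern, not the agtOS Tool Registry itself.

```typescript
// Hypothetical shape for illustration. The name/description/inputSchema
// fields mirror how MCP describes tools; the handler field is an assumption.
interface ToolDefinition {
  name: string;
  description: string;
  inputSchema: Record<string, unknown>; // JSON Schema for the arguments
  handler: (args: Record<string, unknown>) => string;
}

class ToolRegistry {
  private tools = new Map<string, ToolDefinition>();

  // Register a tool once; every surface looks it up by name.
  register(tool: ToolDefinition): void {
    this.tools.set(tool.name, tool);
  }

  // List the registered tool names, e.g. for a discovery response.
  list(): string[] {
    return Array.from(this.tools.keys());
  }

  // Invoke a tool regardless of which surface the request came from.
  call(name: string, args: Record<string, unknown>): string {
    const tool = this.tools.get(name);
    if (tool === undefined) throw new Error(`Unknown tool: ${name}`);
    return tool.handler(args);
  }
}

const registry = new ToolRegistry();
registry.register({
  name: "get_time", // hypothetical example tool
  description: "Returns a fixed timestamp for demonstration",
  inputSchema: { type: "object", properties: {} },
  handler: () => "2024-01-01T00:00:00Z",
});
```

A voice session, a chat window, and an external MCP client would all go through `registry.call("get_time", {})`, so the tool's behavior stays identical across surfaces.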

Progressive infrastructure

Start with just Node.js. Add Redis for memory and scheduling. Add Ollama for local AI. Each piece unlocks more capabilities without breaking what works.

System overview

┌─────────────────────────────────────────────────────────────┐
│                     agtOS Server                            │
│                                                             │
│  :3000 Voice          :4100 MCP           :4102 API         │
│  ┌──────────┐         ┌──────────┐        ┌──────────┐      │
│  │WebSocket │         │Streamable│        │REST API  │      │
│  │Audio     │         │HTTP      │        │Dashboard │      │
│  │WebRTC    │         │10 Tools  │        │Metrics   │      │
│  └──────────┘         └──────────┘        └──────────┘      │
│        │                    │                   │           │
│        └────────────┬───────┴───────────────────┘           │
│                     │                                       │
│              ┌──────┴───────┐                               │
│              │ Orchestrator │                               │
│              │              │                               │
│              │ Model Router │◄── Ollama (local)             │
│              │ Agent Loop   │◄── Claude / OpenAI (cloud)    │
│              │ Tool Registry│◄── OpenRouter (aggregator)    │
│              │ Memory       │◄── Redis (sessions, vectors)  │
│              └──────────────┘                               │
└─────────────────────────────────────────────────────────────┘

Key components

| Component | What it does | Required? |
| --- | --- | --- |
| Voice Pipeline | STT, TTS, VAD (in-process or external) | Included |
| Model Router | Routes requests to the best model (local or cloud) | Included |
| Agent Loop | Multi-step tool execution with progress streaming | Included |
| Memory System | Working, episodic, and semantic memory tiers | Working always; episodic/semantic need Redis |
| MCP Server | Exposes 10 tools to external AI clients | Included |
| MCP Client | Connects to external MCP servers for tool discovery | Included |
| Device Registry | Manages ESP32, browser, CLI, and custom devices | Needs Redis |
| Task Scheduler | Cron, one-time, and interval task scheduling | Needs Redis |
| Web Dashboard | 17-page management UI | Included |
| Desktop App | Tauri 2 with system tray and global PTT hotkey | Separate download |
| CLI | 9 commands for setup, management, and interaction | Included |
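The "Required?" column above can be read as a capability map keyed on whether Redis is present. The sketch below encodes that reading directly. The `Infra` type and `availableComponents` function are hypothetical names used only for illustration, and the Desktop App is left out because it is a separate download.

```typescript
// Hypothetical names; the component lists are taken from the table above.
interface Infra {
  redis: boolean;
}

// Returns the server-side components usable with the given infrastructure.
function availableComponents(infra: Infra): string[] {
  // Components the table marks as included with a plain Node.js install.
  const components = [
    "Voice Pipeline",
    "Model Router",
    "Agent Loop",
    "Working memory",
    "MCP Server",
    "MCP Client",
    "Web Dashboard",
    "CLI",
  ];
  // Components the table marks as needing Redis.
  if (infra.redis) {
    components.push(
      "Episodic memory",
      "Semantic memory",
      "Device Registry",
      "Task Scheduler",
    );
  }
  return components;
}
```

Calling `availableComponents({ redis: false })` yields the eight always-included entries, and adding Redis extends the list without removing anything from it.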

Requirements

| Component | Version | Purpose |
| --- | --- | --- |
| Node.js | 22+ | Required runtime |
| Redis | 7.2+ with RediSearch | Memory, scheduling, devices (optional) |
| Ollama | Latest | Local AI models and embeddings (optional) |
| Docker | Latest | Convenient Redis management (optional) |
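Since Node.js 22+ is the only hard requirement, a quick pre-flight check on the runtime version can catch the most common setup problem early. The helper below is a minimal sketch; `meetsMinimumNode` is a hypothetical name and not part of the agtOS CLI.

```typescript
// Hypothetical pre-flight helper: returns true when a Node.js version
// string such as "22.3.0" or "v22.3.0" meets the minimum major version.
function meetsMinimumNode(version: string, minMajor = 22): boolean {
  const cleaned = version.replace(/^v/, "");
  const major = Number.parseInt(cleaned.split(".")[0] ?? "", 10);
  // parseInt yields NaN for unparseable input, which fails the check.
  return Number.isInteger(major) && major >= minMajor;
}
```

In a Node.js script, the running version is available as `process.versions.node`, so `meetsMinimumNode(process.versions.node)` reports whether the current runtime qualifies.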

Get started

Download the App

Install the desktop app and start chatting in minutes. No terminal required.

Developer Setup

Clone, install, and run from source with full CLI and API access.

Use Cases

Personal assistant, smart home hub, developer tool, IoT voice device.

FAQ

Common questions about setup, voice, memory, privacy, and integration.