All REST API endpoints are served on the health server (default port 4102) under the /api prefix. Responses are JSON with Content-Type: application/json. CORS headers are applied at the server level.
Base URL: http://<host>:4102/api
When the AGTOS_API_KEY environment variable is set, all /api/* endpoints require a Bearer token. See Authentication below.
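As a minimal client-side sketch of the authentication rule above: the helper below attaches a Bearer token to a request for an /api endpoint. It assumes the token travels in the standard `Authorization: Bearer <key>` header and that the server runs on localhost at the default port; `build_request` and the placeholder key are illustrative names, not part of agtOS.

```python
import urllib.request
from typing import Optional

# Assumption: server reachable on localhost at the documented default port.
BASE_URL = "http://localhost:4102/api"


def build_request(path: str, api_key: Optional[str] = None) -> urllib.request.Request:
    """Build a request for an /api endpoint, adding a Bearer token when a key is given."""
    req = urllib.request.Request(BASE_URL + path)
    if api_key:
        # Assumption: the standard Authorization header carries the Bearer token.
        req.add_header("Authorization", f"Bearer {api_key}")
    return req


req = build_request("/health", api_key="example-key")  # placeholder key
```

Passing the request to `urllib.request.urlopen(req)` would send it; that step is omitted here so the sketch runs without a server.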

Health

Comprehensive health status from all registered service checkers (Redis, STT, TTS, Ollama, Claude, MCP).

Response

200 OK — All services healthy.
{
  "status": "healthy",
  "services": {
    "redis": { "status": "healthy", "responseTime": 2 },
    "stt-speaches": { "status": "healthy", "responseTime": 45 },
    "tts-speaches": { "status": "healthy", "responseTime": 38 },
    "ollama": { "status": "healthy", "responseTime": 12 },
    "mcp-server": { "status": "healthy", "responseTime": 5 }
  },
  "timestamp": 1711612800000
}
503 Service Unavailable — One or more services degraded.
{
  "error": "Health check failed",
  "message": "Redis connection refused"
}
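A monitoring script might reduce the health payload to the list of services that need attention. The sketch below works on the two response shapes shown above; `unhealthy_services` is a hypothetical helper, and treating any status other than "healthy" as a problem is an assumption.

```python
def unhealthy_services(body: dict) -> list:
    """Return the names of services whose status is anything other than 'healthy'."""
    services = body.get("services", {})  # absent in the 503 error shape
    return sorted(
        name for name, info in services.items()
        if info.get("status") != "healthy"  # assumption: only 'healthy' counts as OK
    )


ok_body = {
    "status": "healthy",
    "services": {
        "redis": {"status": "healthy", "responseTime": 2},
        "ollama": {"status": "healthy", "responseTime": 12},
    },
    "timestamp": 1711612800000,
}
mixed_body = {
    "services": {
        "redis": {"status": "unhealthy", "error": "Connection refused"},
        "ollama": {"status": "healthy", "responseTime": 12},
    }
}
```

The 503 body shown above carries only error and message, so a caller should check the HTTP status code first and fall back to this helper for per-service detail.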
Aggregated metrics summary from the internal metrics collector (request counts, error rates, latency percentiles).

Response

200 OK
{
  "requests": { "total": 1420, "perMinute": 15 },
  "errors": { "total": 3, "perMinute": 0 },
  "latency": { "p50": 120, "p95": 450, "p99": 800 },
  "uptime": 3621.45
}
Health status for a single registered service. Returns the same shape as the corresponding entry in GET /api/health, but for one service only.

Path Parameters

serviceName
string
required
Service name (e.g., redis, ollama, stt-sherpa-onnx, tts-sherpa-onnx, mcp-server, memory-maintenance, memory-semantic, nli-pipeline, provider-claude, provider-openai, provider-ollama, provider-openrouter).

Response

200 OK — Service is healthy.
{
  "status": "healthy",
  "responseTime": 2
}
503 Service Unavailable — Service is degraded.
{
  "status": "unhealthy",
  "error": "Connection refused"
}
404 Not Found — No such service registered.
{
  "error": "Unknown service: foo"
}
Prometheus-format metrics endpoint for integration with Prometheus, Grafana, and other monitoring tools. Returns all collected metrics in the standard text exposition format.

Response

200 OK (text/plain; version=0.0.4)
# HELP agtos_http_requests_total Total HTTP requests
# TYPE agtos_http_requests_total counter
agtos_http_requests_total{method="GET",path="/api/health",status="200"} 142
agtos_http_requests_total{method="POST",path="/api/chat",status="200"} 38

# HELP agtos_http_request_duration_seconds HTTP request latency
# TYPE agtos_http_request_duration_seconds histogram
agtos_http_request_duration_seconds_bucket{le="0.1"} 120
agtos_http_request_duration_seconds_bucket{le="0.5"} 155

# HELP agtos_nli_inferences_total NLI cross-encoder verdicts
# TYPE agtos_nli_inferences_total counter
agtos_nli_inferences_total{result="contradiction"} 12
agtos_nli_inferences_total{result="neutral"} 230
agtos_nli_inferences_total{result="entailment"} 58

# HELP agtos_provider_catalog_fetch_total Provider catalog refresh attempts
# TYPE agtos_provider_catalog_fetch_total counter
agtos_provider_catalog_fetch_total{provider="claude",status="success"} 5

# HELP agtos_provider_catalog_models_count Current model count per provider
# TYPE agtos_provider_catalog_models_count gauge
agtos_provider_catalog_models_count{provider="openrouter"} 342
The /metrics endpoint is served at the root path (not under /api). It does not require authentication even when AGTOS_API_KEY is set.
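For ad-hoc checks without a Prometheus server, the exposition text can be parsed directly. This is a simplified sketch, not a full parser: it skips comment lines and ignores escaped quotes inside label values and optional sample timestamps. `parse_metrics` is an illustrative name.

```python
import re

# metric_name{label="value",...} number
SAMPLE_RE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)$')
LABEL_RE = re.compile(r'(\w+)="([^"]*)"')


def parse_metrics(text: str) -> list:
    """Parse exposition-format text into (name, labels, value) tuples."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # skip blanks, HELP and TYPE lines
            continue
        match = SAMPLE_RE.match(line)
        if not match:
            continue  # simplification: unparseable lines are dropped
        name, labels, value = match.groups()
        samples.append((name, dict(LABEL_RE.findall(labels or "")), float(value)))
    return samples


SAMPLE = """\
# HELP agtos_http_requests_total Total HTTP requests
# TYPE agtos_http_requests_total counter
agtos_http_requests_total{method="GET",path="/api/health",status="200"} 142
agtos_provider_catalog_models_count{provider="openrouter"} 342
"""
samples = parse_metrics(SAMPLE)
```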

Sessions

List active voice sessions with connection metadata.

Response

200 OK
{
  "sessions": [
    {
      "id": "session-a1b2c3d4",
      "connectionId": "conn-xyz",
      "isActive": true,
      "isSpeaking": false,
      "lastActivity": 1711612800000
    }
  ],
  "count": 1
}
503 Service Unavailable — Voice pipeline not initialized.
{
  "error": "Voice pipeline not available"
}

Voice Status

Voice pipeline availability and active session count. Always returns 200, even when the pipeline is unavailable, so the dashboard can degrade gracefully.

Response

200 OK — Pipeline available.
{
  "available": true,
  "pipeline": "cascade",
  "activeSessions": 2
}
200 OK — Pipeline not wired up.
{
  "available": false
}

Memory

Retrieve recent episodic memory entries.

Query Parameters

limit
number
default:"20"
Max results to return (1-100).

Response

200 OK
{
  "episodes": [
    {
      "id": "ep-abc123",
      "summary": "User asked about weather forecast",
      "keywords": ["weather", "forecast"],
      "topic": "weather",
      "timestamp": 1711612800000,
      "importance": 0.7,
      "type": "conversation",
      "score": 0.95,
      "matchType": "keyword"
    }
  ],
  "count": 1,
  "available": true
}
503 Service Unavailable — Memory system not available.
{
  "error": "Memory system not available"
}

Scheduler

List all scheduled tasks.

Response

200 OK
{
  "tasks": [
    {
      "id": "task-abc123",
      "name": "Morning briefing",
      "status": "active",
      "schedule": { "type": "cron", "expression": "0 7 * * *" },
      "action": {
        "eventTopic": "briefing.morning",
        "payload": {}
      },
      "nextRunAt": 1711699200000,
      "lastRunAt": 1711612800000,
      "runCount": 5,
      "createdAt": 1711526400000
    }
  ],
  "count": 1
}
503 Service Unavailable — Scheduler not available (Redis down or not configured).
{
  "error": "Scheduler not available"
}
Create a new scheduled task.

Request Body

name
string
required
Human-readable task name.
scheduleType
string
required
One of: cron, once, interval.
expression
string
Cron expression. Required when scheduleType is cron.
atTimestamp
number
Unix ms timestamp. Required when scheduleType is once.
intervalMs
number
Interval in milliseconds. Required when scheduleType is interval.
eventTopic
string
required
Event bus topic published when the task fires.
payload
object
Optional payload included in the fired event.

Schedule Type Examples

{
  "name": "Every 5 minutes",
  "scheduleType": "cron",
  "expression": "*/5 * * * *",
  "eventTopic": "check.status"
}
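The per-type requirements in the field list above can be checked on the client before sending the request. The sketch below is a hypothetical pre-flight validator, not the server's validation logic; its error strings are illustrative, and the once and interval bodies are made-up examples built from the documented fields.

```python
# Field that must accompany each documented scheduleType.
REQUIRED_BY_TYPE = {"cron": "expression", "once": "atTimestamp", "interval": "intervalMs"}


def validate_task(body: dict) -> list:
    """Return a list of problems found in a create-task body (empty when it looks valid)."""
    errors = []
    for field in ("name", "eventTopic"):
        if not isinstance(body.get(field), str) or not body[field]:
            errors.append(f"missing or invalid field: {field}")
    schedule_type = body.get("scheduleType")
    if schedule_type not in REQUIRED_BY_TYPE:
        errors.append("scheduleType must be one of: cron, once, interval")
    elif REQUIRED_BY_TYPE[schedule_type] not in body:
        errors.append(f"{schedule_type} schedule requires {REQUIRED_BY_TYPE[schedule_type]}")
    return errors


cron_task = {"name": "Every 5 minutes", "scheduleType": "cron",
             "expression": "*/5 * * * *", "eventTopic": "check.status"}
once_task = {"name": "One-off reminder", "scheduleType": "once",
             "atTimestamp": 1711699200000, "eventTopic": "check.status"}
interval_task = {"name": "Every minute", "scheduleType": "interval",
                 "intervalMs": 60000, "eventTopic": "check.status"}
```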

Response

201 Created
{
  "task": {
    "id": "task-def456",
    "name": "Morning briefing",
    "status": "active",
    "schedule": { "type": "cron", "expression": "0 7 * * *" },
    "action": {
      "eventTopic": "briefing.morning",
      "payload": { "workflowId": "morning-routine" }
    },
    "nextRunAt": 1711699200000,
    "runCount": 0,
    "createdAt": 1711612800000
  }
}
400 Bad Request — Validation errors.
{
  "error": "Missing or invalid field: name"
}
{
  "error": "Cron schedule requires an expression field"
}
503 Service Unavailable — Scheduler not available.
Cancel a scheduled task by ID.

Path Parameters

id
string
required
The task ID to cancel.

Response

200 OK
{
  "success": true,
  "taskId": "task-def456"
}
400 Bad Request — Invalid ID format.
{
  "error": "Invalid task ID format"
}
404 Not Found — Task does not exist.
{
  "error": "Failed to cancel task",
  "message": "Task not found"
}
503 Service Unavailable — Scheduler not available.

Workflows

List all registered workflow definitions.

Response

200 OK
{
  "workflows": [
    {
      "id": "morning-routine",
      "name": "Morning Routine",
      "description": "Plays morning briefing and checks calendar",
      "stepCount": 3,
      "steps": [
        { "id": "step-1", "name": "Check Weather", "type": "ACTION" },
        { "id": "step-2", "name": "Read Calendar", "type": "ACTION" },
        { "id": "step-3", "name": "Speak Summary", "type": "ACTION" }
      ]
    }
  ],
  "count": 1
}
503 Service Unavailable — Workflow engine not available.
{
  "error": "Workflow engine not available"
}
Trigger execution of a registered workflow.

Path Parameters

id
string
required
The workflow ID to execute.

Request Body (optional)

input
object
Optional input payload for the workflow.
{
  "input": {
    "location": "San Francisco",
    "units": "metric"
  }
}

Response

200 OK
{
  "execution": {
    "id": "exec-789",
    "workflowId": "morning-routine",
    "state": "completed",
    "startTime": 1711612800000,
    "endTime": 1711612802000,
    "output": { "summary": "Sunny, 72F. No meetings today." },
    "error": null
  }
}
400 Bad Request — Invalid workflow ID or execution failure.
{
  "error": "Invalid workflow ID format"
}
{
  "error": "Workflow execution failed",
  "message": "Step 'Check Weather' timed out"
}
404 Not Found — Workflow not registered.
{
  "error": "Workflow not found",
  "workflowId": "nonexistent"
}
503 Service Unavailable — Workflow engine not available.

System Info

System information including uptime, runtime versions, memory usage, and port configuration.

Response

200 OK
{
  "uptime": 3621.45,
  "nodeVersion": "v22.12.0",
  "platform": "linux",
  "arch": "x64",
  "memoryUsage": {
    "rss": 85983232,
    "heapTotal": 42598400,
    "heapUsed": 38291456,
    "external": 2845696,
    "arrayBuffers": 1048576
  },
  "ports": {
    "voice": 3000,
    "mcp": 4100,
    "health": 4102
  },
  "timestamp": 1711612800000
}

Chat

Text-based chat endpoint. Routes user text through the agent reasoning loop (model router with tool execution) and returns the response.

Request Body

text
string
required
The user’s message text (must be non-empty).
sessionId
string
Session ID for conversation continuity. Omit for stateless.
{
  "text": "What's the weather like today?",
  "sessionId": "session-abc123"
}
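A minimal sketch of building this request from Python follows. It assumes the endpoint is POST /api/chat (the path that appears in the Prometheus sample on this page) and a server on localhost at the default port; `build_chat_request` is an illustrative helper.

```python
import json
import urllib.request
from typing import Optional


def build_chat_request(text: str, session_id: Optional[str] = None,
                       base_url: str = "http://localhost:4102/api") -> urllib.request.Request:
    """Build a POST request for the chat endpoint; sessionId is omitted for stateless calls."""
    if not text:
        raise ValueError("text must be non-empty")  # mirrors the documented 400 case
    body = {"text": text}
    if session_id:
        body["sessionId"] = session_id
    return urllib.request.Request(
        base_url + "/chat",  # assumption: path taken from the metrics sample
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


chat_req = build_chat_request("What's the weather like today?", "session-abc123")
```

When AGTOS_API_KEY is set on the server, the request would also need the Bearer token described at the top of this page.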

Response

200 OK
{
  "text": "Based on current conditions, it's sunny and 72 degrees in your area.",
  "sessionId": "session-abc123",
  "metadata": {
    "stepCount": 2,
    "toolCallCount": 1,
    "durationMs": 1450
  }
}
400 Bad Request — Missing or empty text.
{
  "error": "Missing or empty required field: text"
}
500 Internal Server Error — Chat processing failed.
{
  "error": "Chat processing failed",
  "message": "Model provider timeout"
}
503 Service Unavailable — Voice pipeline not available.
{
  "error": "Voice pipeline not available"
}

Chat Streaming

SSE streaming chat endpoint. Returns a text/event-stream response with real-time content tokens, thinking/reasoning blocks, tool call events, and a final metadata event. Used by the dashboard Chat page.

Request Body

text
string
required
The user’s message (must be non-empty, max 10,000 characters).
sessionId
string
Session ID for conversation continuity.
platform
string
Client platform identifier (defaults to web).
files
array
Images to include. Each entry: { content: string, mimeType: string, encoding: "base64" }.
{
  "text": "Explain how neural networks learn",
  "sessionId": "session-abc123",
  "files": [
    { "content": "base64...", "mimeType": "image/png", "encoding": "base64" }
  ]
}
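Each files entry carries the image bytes as a base64 string. The sketch below shows one way to produce an entry in the documented shape and to enforce the 10,000-character limit before sending; `make_file_entry` and `build_stream_body` are hypothetical helper names.

```python
import base64
from typing import Optional

MAX_TEXT_CHARS = 10_000  # documented limit for the text field


def make_file_entry(data: bytes, mime_type: str) -> dict:
    """Encode raw bytes into the { content, mimeType, encoding } shape."""
    return {
        "content": base64.b64encode(data).decode("ascii"),
        "mimeType": mime_type,
        "encoding": "base64",
    }


def build_stream_body(text: str, files: Optional[list] = None) -> dict:
    """Assemble a streaming-chat request body, rejecting empty or oversized text."""
    if not text or len(text) > MAX_TEXT_CHARS:
        raise ValueError("text must be 1 to 10,000 characters")
    body = {"text": text}
    if files:
        body["files"] = files
    return body


entry = make_file_entry(b"hi", "image/png")  # stand-in bytes, not a real PNG
body = build_stream_body("Explain how neural networks learn", [entry])
```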

SSE Events

The response is a stream of data: lines, each containing a JSON object with a type field:
{ "type": "content", "content": "Neural networks learn through" }
400 Bad Request — Missing or empty text.
429 Too Many Requests — Rate limit exceeded.
500 Internal Server Error — Chat processing failed.
Use fetch() with ReadableStream to consume this endpoint — native EventSource does not support POST requests. The thinking event type carries provider-agnostic reasoning output (Claude thinking, OpenAI reasoning, Ollama think tags, OpenRouter reasoning).
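Whichever client reads the stream, the parsing step is the same: keep the data: lines and decode each payload as JSON. The sketch below does this over an iterable of already-decoded text lines; `iter_sse_events` is an illustrative name, and the second sample event is invented to show concatenation.

```python
import json


def iter_sse_events(lines):
    """Yield the JSON object from every 'data:' line, skipping blanks and other fields."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue
        payload = line[len("data:"):].strip()
        if payload:
            yield json.loads(payload)


stream = [
    'data: {"type": "content", "content": "Neural networks learn through"}',
    "",
    'data: {"type": "content", "content": " backpropagation."}',  # made-up second token
]

# Concatenate content tokens in arrival order to rebuild the reply text.
reply = "".join(e["content"] for e in iter_sse_events(stream) if e["type"] == "content")
```

Events of other types (thinking, tool calls, final metadata) would be dispatched on the same type field; their payload fields are not shown on this page, so the sketch leaves them out.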

Conversation History

Retrieve conversation messages for a session. Returns an empty array when the session has expired or doesn’t exist.

Path Parameters

sessionId
string
required
The session ID to retrieve history for.

Response

200 OK
{
  "sessionId": "session-abc123",
  "sessionExpired": false,
  "messages": [
    {
      "role": "user",
      "content": "What is my name?",
      "timestamp": 1711612800000
    },
    {
      "role": "assistant",
      "content": "Your name is Alex.",
      "timestamp": 1711612802000
    }
  ],
  "count": 2
}
200 OK — Session expired or not found.
{
  "sessionId": "session-old",
  "sessionExpired": true,
  "messages": [],
  "count": 0
}

Tasks

Submit a background agent task. Accepts a topic string, routes it through the agent reasoning loop, and returns the result with a generated task ID.

Request Body

topic
string
required
The task topic/prompt (must be non-empty).
{
  "topic": "Summarize the latest news about AI safety"
}

Response

200 OK
{
  "taskId": "550e8400-e29b-41d4-a716-446655440000",
  "text": "Here is a summary of recent AI safety developments...",
  "metadata": {
    "stepCount": 3,
    "toolCallCount": 2,
    "durationMs": 4200
  }
}
400 Bad Request — Missing or empty topic.
{
  "error": "Missing or empty required field: topic"
}
500 Internal Server Error — Task processing failed.
{
  "error": "Task processing failed",
  "message": "Model provider timeout",
  "taskId": "550e8400-e29b-41d4-a716-446655440000"
}
503 Service Unavailable — Voice pipeline not available.

Voice Stats

Audio quality metrics including STT/TTS latency percentiles, audio chunk counts, and active session count. Returns zeroed stats when no data has been collected yet.

Response

200 OK
{
  "stt": { "p50": 120, "p95": 300, "p99": 500, "count": 42 },
  "tts": { "p50": 80, "p95": 200, "p99": 400, "count": 38 },
  "audioChunksIn": 1200,
  "audioChunksOut": 950,
  "requestsPerMinute": 15,
  "activeSessions": 2,
  "timestamp": 1711612800000
}

Memory Profile

Retrieve the user profile built from conversation history (name, communication style, patterns, preferences).

Response

200 OK
{
  "profile": {
    "name": "Alex",
    "communicationStyle": "concise, technical",
    "patterns": ["prefers metric units", "morning person"],
    "preferences": { "units": "metric", "language": "en" }
  }
}
503 Service Unavailable — Memory coordinator not available.
Update user profile fields.

Request Body

{
  "name": "Alex",
  "communicationStyle": "concise, technical",
  "patterns": ["prefers metric units"],
  "preferences": { "units": "metric" }
}
All fields are optional.
name
string
User display name (max 200 chars).
communicationStyle
string
Description of communication preferences (max 500 chars).
patterns
string[]
Behavioral patterns (max 50 items, 500 chars each).
preferences
object
Key-value preference pairs (string values, max 500 chars).

Response

200 OK
{
  "updated": true
}
400 Bad Request — Validation failure.
503 Service Unavailable — Memory coordinator not available.

Memory Conclusions

Retrieve conclusions drawn from conversation history by the Dialectic reasoning engine.

Query Parameters

type
string
Filter by conclusion type: explicit, inferred, corrected.
minConfidence
number
Minimum confidence threshold (0.0-1.0).

Response

200 OK
{
  "conclusions": [
    {
      "id": "conc-abc123",
      "text": "User prefers morning meetings",
      "type": "inferred",
      "confidence": 0.85,
      "sources": ["ep-001", "ep-003"]
    }
  ]
}
503 Service Unavailable — Memory coordinator not available.
Delete a specific conclusion by ID.

Path Parameters

id
string
required
The conclusion ID to delete.

Response

200 OK
{
  "deleted": true,
  "conclusionId": "conc-abc123"
}
503 Service Unavailable — Memory coordinator not available.

Memory Ask (Dialectic)

Ask a question about the user via the Dialectic reasoning engine. Gathers profile, conclusions, and episodes to synthesize an answer.

Request Body

question
string
required
The question to ask about the user (max 2,000 characters).
userId
string
Optional user ID scope.
{
  "question": "What are this user's preferred communication channels?"
}

Response

200 OK
{
  "answer": "Based on conversation history, the user prefers Slack for quick questions and email for detailed discussions.",
  "confidence": 0.78,
  "sources": ["ep-abc123", "ep-def456"]
}
400 Bad Request — Missing or invalid question.
503 Service Unavailable — Memory coordinator not available.

Memory Maintenance

Trigger an on-demand memory maintenance sweep (memory lint). Runs the Dreamer’s six-step sweep (stale detection, confidence decay, redundancy merge, orphan flagging, contradiction detection, low-confidence pruning) and returns a MaintenanceReport. Gated by ResourceGuard — busy systems defer the run. See ADR-021.

Request Body

The body is validated with .strict() — unknown fields are rejected. The userId is resolved server-side from the profile manager and must not be passed in the request body (per ADR-025 Rule 3).
{}

Response

200 OK — Sweep completed. Returns the full MaintenanceReport.
{
  "timestamp": 1712534400000,
  "durationMs": 4321,
  "conclusionsExamined": 142,
  "episodesChecked": 580,
  "summary": {
    "contradictions": 2,
    "stale": 14,
    "orphans": 3,
    "redundant": 5,
    "pruned": 1,
    "decayed": 14,
    "danglingSources": 0,
    "contradictionPipeline": {
      "candidatesSelected": 210,
      "nliConfirmed": 18,
      "llmConfirmed": 2,
      "stageLatencyMs": { "stage1": 45, "stage2": 812, "stage3": 2104 }
    }
  },
  "issues": [
    {
      "type": "contradiction",
      "conclusionId": "conc-abc123",
      "action": "flagged",
      "description": "Conflicts with conc-def456 (NLI verdict: contradiction, confidence 0.92)"
    }
  ]
}
400 Bad Request — Validation failed (e.g., unknown fields in the body).
503 Service Unavailable — errorCode: PROFILE_DISCONNECTED — Memory profile manager is not connected (Redis down, manager disconnected). Persistent — operator action required.
{
  "error": "Memory maintenance unavailable",
  "errorCode": "PROFILE_DISCONNECTED",
  "message": "Profile manager is not connected"
}
503 Service Unavailable — errorCode: RESOURCES_BUSY — ResourceGuard deferred this run (active sessions, high CPU load, Ollama VRAM contention). Transient — retry later. The reason field carries the guard’s diagnosis.
{
  "error": "Memory maintenance deferred",
  "errorCode": "RESOURCES_BUSY",
  "message": "Background work not safe right now",
  "reason": "active sessions: 1"
}
500 Internal Server Error — Dreamer.maintain() threw (genuine internal error, not a deferral).
The CLI (agtos memory maintain) reads errorCode to decide between exit code 2 (transient, retry later) and exit code 3 (operator action required).
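The exit-code rule can be sketched as a small mapping from the response to a process exit code. Codes 2 and 3 follow the description above; returning 0 on success and 1 for any other failure is an assumption, and `exit_code_for` is a hypothetical name rather than the CLI's actual function.

```python
def exit_code_for(status_code: int, body: dict) -> int:
    """Map a maintenance response to a CLI-style exit code."""
    if status_code == 200:
        return 0  # assumption: success exits with 0
    error_code = body.get("errorCode")
    if error_code == "RESOURCES_BUSY":
        return 2  # transient, retry later
    if error_code == "PROFILE_DISCONNECTED":
        return 3  # operator action required
    return 1  # assumption: any other failure is a generic error


busy = {"error": "Memory maintenance deferred", "errorCode": "RESOURCES_BUSY",
        "message": "Background work not safe right now", "reason": "active sessions: 1"}
disconnected = {"error": "Memory maintenance unavailable", "errorCode": "PROFILE_DISCONNECTED",
                "message": "Profile manager is not connected"}
```

Keying on errorCode rather than the HTTP status matters here because both deferral and disconnection arrive as 503.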
List recent MaintenanceReport entries. Backs the dashboard’s maintenance history widget. Reports are stored with a 30-day TTL under a sorted-set index capped at 200 entries.

Query Parameters

limit
number
default:"20"
Max reports to return (1-100).

Response

200 OK
{
  "reports": [
    {
      "timestamp": 1712534400000,
      "durationMs": 4321,
      "conclusionsExamined": 142,
      "episodesChecked": 580,
      "summary": { "contradictions": 2, "stale": 14, "orphans": 3, "redundant": 5, "pruned": 1, "decayed": 14 },
      "issues": []
    }
  ],
  "count": 1
}
500 Internal Server Error — Failed to list maintenance reports.
503 Service Unavailable — Memory coordinator or profile manager not connected.
Fetch a single MaintenanceReport by its timestamp (Unix milliseconds).

Path Parameters

timestamp
number
required
Unix milliseconds timestamp of the report to fetch.

Response

200 OK — Returns the full MaintenanceReport (same shape as POST /api/memory/maintain).
400 Bad Request — Invalid timestamp.
404 Not Found — Report not found or expired (past the 30-day TTL).
503 Service Unavailable — Memory coordinator or profile manager not connected.

Memory Import

Scan for available external AI tool memory sources (Claude, ChatGPT, etc.) that can be imported into agtOS.

Response

200 OK
{
  "sources": [
    {
      "name": "claude",
      "available": true,
      "memoryCount": 42,
      "path": "~/.claude/memory"
    }
  ]
}
503 Service Unavailable — Memory coordinator not available.
Import memories from external AI tool sources into the agtOS memory system.

Request Body

sources
string[]
Optional list of source names to import from. If omitted, imports from all available sources.
{
  "sources": ["claude"]
}

Response

200 OK
{
  "imported": 35,
  "skipped": 7,
  "errors": 0,
  "message": "Imported 35 memories from claude"
}
503 Service Unavailable — Memory coordinator not available.

Entities

List or search entities from the entity-centric memory system.

Query Parameters

q
string
Search entities by name (partial match).
type
string
Filter by entity type: person, place, organization, event, thing.
limit
number
default:"20"
Max results (1-100).

Response

200 OK
{
  "entities": [
    {
      "id": "ent-abc123",
      "name": "Alice",
      "type": "person",
      "aliases": ["Alice Smith"],
      "confidence": 0.92,
      "episodeCount": 15,
      "lastMentioned": 1711612800000
    }
  ],
  "count": 1
}
503 Service Unavailable — Entity manager not available (Redis down).
Entity counts grouped by type.

Response

200 OK
{
  "stats": {
    "person": 12,
    "place": 5,
    "organization": 3,
    "event": 8,
    "thing": 15
  },
  "total": 43
}
Get a single entity by ID with full details.

Path Parameters

id
string
required
The entity ID.

Response

200 OK
{
  "entity": {
    "id": "ent-abc123",
    "name": "Alice",
    "type": "person",
    "aliases": ["Alice Smith"],
    "confidence": 0.92,
    "episodeCount": 15,
    "lastMentioned": 1711612800000
  }
}
404 Not Found — Entity does not exist.
Update an entity’s name, aliases, or confidence.

Path Parameters

id
string
required
The entity ID.

Request Body

name
string
Updated entity name.
aliases
string[]
Updated alias list.
confidence
number
Updated confidence score (0.0-1.0).

Response

200 OK — Returns the updated entity.
404 Not Found — Entity does not exist.
Soft-delete an entity.

Path Parameters

id
string
required
The entity ID.

Response

200 OK
{
  "deleted": true,
  "entityId": "ent-abc123"
}
404 Not Found — Entity does not exist.
Merge a duplicate entity into this entity. The duplicate entity's episodes, conclusions, and relationships are transferred to the primary entity (the one identified by id in the path).

Path Parameters

id
string
required
The primary entity ID (the entity that receives the merged data).

Request Body

targetId
string
required
The duplicate entity ID to merge into the primary.
{
  "targetId": "ent-duplicate456"
}

Response

200 OK — Returns the merged primary entity.
404 Not Found — Entity does not exist.
Episodes that mention this entity.

Path Parameters

id
string
required
The entity ID.

Query Parameters

limit
number
default:"20"
Max results (1-100).

Response

200 OK
{
  "entityId": "ent-abc123",
  "episodes": [
    {
      "id": "ep-xyz789",
      "summary": "User discussed project plans with Alice",
      "timestamp": 1711612800000
    }
  ],
  "count": 1
}
Conclusions that reference this entity.

Path Parameters

id
string
required
The entity ID.

Response

200 OK
{
  "entityId": "ent-abc123",
  "conclusions": [
    {
      "id": "conc-abc123",
      "text": "Alice is a software engineer at Acme Corp",
      "type": "explicit",
      "confidence": 0.9
    }
  ],
  "count": 1
}
Relationships involving this entity (as subject or object).

Path Parameters

id
string
required
The entity ID.

Response

200 OK
{
  "entityId": "ent-abc123",
  "relationships": [
    {
      "id": "rel-001",
      "subject": { "id": "ent-abc123", "name": "Alice" },
      "predicate": "works-at",
      "object": { "id": "ent-def456", "name": "Acme Corp" },
      "confidence": 0.85
    }
  ],
  "count": 1
}

Devices

Register a new device (ESP32, browser, CLI, MCP client, or custom).

Request Body

name
string
required
Device display name (max 200 chars).
deviceType
string
required
One of: esp32, browser, cli, mcp-client, custom.
platform
string
required
Platform identifier (max 100 chars).
capabilities
object
required
Device capability flags: audio, microphone, speaker, display, buttons, sensors (all boolean).
secret
string
Optional shared secret for device authentication (8-256 chars). Hashed with SHA-256.
firmwareVersion
string
Firmware version string (max 50 chars).
metadata
object
Arbitrary key-value metadata.
{
  "name": "Kitchen Speaker",
  "deviceType": "esp32",
  "platform": "esp32-s3",
  "capabilities": {
    "audio": true,
    "microphone": true,
    "speaker": true,
    "display": false,
    "buttons": false,
    "sensors": false
  },
  "secret": "my-device-secret",
  "firmwareVersion": "1.2.0"
}
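To illustrate what "hashed with SHA-256" can look like in practice, the sketch below validates the documented length bounds, stores only a digest, and compares digests in constant time. This is an assumption-laden illustration: the page does not say whether the server salts the secret or how it encodes the hash, and `hash_secret` and `verify_secret` are hypothetical names.

```python
import hashlib
import hmac

MIN_LEN, MAX_LEN = 8, 256  # documented bounds for the secret field


def hash_secret(secret: str) -> str:
    """Check the length bounds and return a SHA-256 hex digest of the secret."""
    if not MIN_LEN <= len(secret) <= MAX_LEN:
        raise ValueError(f"secret must be {MIN_LEN}-{MAX_LEN} characters")
    # Assumption: plain unsalted SHA-256 over the UTF-8 bytes.
    return hashlib.sha256(secret.encode("utf-8")).hexdigest()


def verify_secret(stored_hash: str, candidate: str) -> bool:
    """Compare digests in constant time so timing does not reveal partial matches."""
    return hmac.compare_digest(stored_hash, hash_secret(candidate))


stored = hash_secret("my-device-secret")
```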

Response

201 Created
{
  "device": {
    "id": "dev-abc123",
    "name": "Kitchen Speaker",
    "deviceType": "esp32",
    "platform": "esp32-s3",
    "status": "pending",
    "trustLevel": 2,
    "capabilities": { "audio": true, "microphone": true, "speaker": true, "display": false, "buttons": false, "sensors": false },
    "firmwareVersion": "1.2.0"
  }
}
409 Conflict — Device already registered.
503 Service Unavailable — Device registry not available (Redis down).
List registered devices with optional filters.

Query Parameters

status
string
Filter by status: pending, active, suspended, revoked.
deviceType
string
Filter by device type: esp32, browser, cli, mcp-client, custom.
limit
number
Max results.
offset
number
Pagination offset.

Response

200 OK
{
  "devices": [
    {
      "id": "dev-abc123",
      "name": "Kitchen Speaker",
      "deviceType": "esp32",
      "platform": "esp32-s3",
      "status": "active",
      "trustLevel": 2,
      "capabilities": { "audio": true, "microphone": true, "speaker": true, "display": false, "buttons": false, "sensors": false }
    }
  ],
  "count": 1
}
503 Service Unavailable — Device registry not available.
Get a single device by ID.

Path Parameters

id
string
required
The device ID.

Response

200 OK
{
  "device": {
    "id": "dev-abc123",
    "name": "Kitchen Speaker",
    "deviceType": "esp32",
    "status": "active",
    "trustLevel": 2
  }
}
404 Not Found — Device does not exist.
503 Service Unavailable — Device registry not available.
Update a device’s properties.

Path Parameters

id
string
required
The device ID.

Request Body

All fields are optional. Only provided fields are updated.
name
string
Updated device name.
status
string
New status: pending, active, suspended, revoked.
capabilities
object
Updated capability flags.
secret
string
New shared secret (8-256 chars).
firmwareVersion
string
Updated firmware version.

Response

200 OK — Returns the updated device.
404 Not Found — Device does not exist.
503 Service Unavailable — Device registry not available.
Remove a device from the registry.

Path Parameters

id
string
required
The device ID to remove.

Response

200 OK
{
  "success": true,
  "deviceId": "dev-abc123"
}
404 Not Found — Device does not exist.
503 Service Unavailable — Device registry not available.
Authenticate a device by verifying its shared secret.

Path Parameters

id
string
required
The device ID.

Request Body

secret
string
required
The device’s shared secret.
{
  "secret": "my-device-secret"
}

Response

200 OK
{
  "authenticated": true,
  "device": {
    "id": "dev-abc123",
    "name": "Kitchen Speaker",
    "status": "active"
  }
}
401 Unauthorized — Invalid secret.
{
  "authenticated": false,
  "message": "Invalid device secret"
}
503 Service Unavailable — Device registry not available.
Link a device to a user account for personalized preferences.

Path Parameters

id
string
required
The device ID.

Request Body

userId
string
required
The user ID to link (max 200 chars).
{
  "userId": "user-abc123"
}

Response

200 OK
{
  "linked": true,
  "deviceId": "dev-abc123",
  "userId": "user-abc123"
}
503 Service Unavailable — User preferences not available.

User Preferences

Get user preferences (TTS voice, language, wake word, privacy settings).

Path Parameters

userId
string
required
The user ID.

Response

200 OK
{
  "preferences": {
    "ttsVoice": "af_heart",
    "ttsSpeed": 1.0,
    "language": "en",
    "wakeWord": "hey agtos",
    "privacySettings": {
      "storeTranscripts": true,
      "storeAudio": false
    }
  }
}
503 Service Unavailable — User preferences not available (Redis down).
Update user preferences.

Path Parameters

userId
string
required
The user ID.

Request Body

All fields are optional.
ttsVoice
string
Preferred TTS voice (max 100 chars).
ttsSpeed
number
Speech speed (0.5-2.0).
language
string
Preferred language code (max 10 chars).
wakeWord
string
Custom wake word (max 50 chars).
privacySettings
object
Privacy flags: storeTranscripts (boolean), storeAudio (boolean).
{
  "ttsVoice": "nova",
  "ttsSpeed": 1.2,
  "language": "en",
  "privacySettings": {
    "storeTranscripts": true,
    "storeAudio": false
  }
}

Response

200 OK — Returns the updated preferences.
400 Bad Request — Validation failure.
503 Service Unavailable — User preferences not available.

Setup Token

Retrieve the onboarding setup token. This token is generated at server startup and has a 30-minute TTL. It is used by the onboarding wizard to authenticate credential storage requests without requiring an API key.
This endpoint is localhost-only — requests from non-loopback addresses return 403 Forbidden.

Response

200 OK
{
  "token": "a1b2c3d4e5f6..."
}
403 Forbidden — Request not from localhost.
{
  "error": "Setup token only available from localhost"
}
404 Not Found — No setup token available (server started with AGTOS_API_KEY set, or token has expired).
{
  "error": "No setup token available"
}
The setup token is only generated when AGTOS_API_KEY is not set (onboarding mode). Once the user configures an API key, the setup token is no longer needed.
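The localhost-only rule amounts to a loopback check on the caller's address. A minimal sketch of such a check follows; `is_loopback` is an illustrative helper, and how the server actually derives the remote address (for example behind a reverse proxy) is not covered on this page.

```python
import ipaddress


def is_loopback(remote_addr: str) -> bool:
    """Return True when the address is an IPv4 or IPv6 loopback address."""
    try:
        return ipaddress.ip_address(remote_addr).is_loopback
    except ValueError:
        return False  # not a parseable IP address, so treat it as non-local


def status_for(remote_addr: str) -> int:
    """Illustrative gate: 403 for non-loopback callers, 200 otherwise."""
    return 200 if is_loopback(remote_addr) else 403
```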
Auto-configure the slot registry from a mode preset. Used by the setup wizard to configure slots based on the user’s chosen mode and available providers.

Request Body

mode
string
required
Configuration mode: cloud, local, or hybrid.
cloudProvider
string
Cloud provider for cloud/hybrid modes: claude, openai, or openrouter.
ollamaModel
string
Ollama model for local/hybrid modes.
fallbackStrategy
string
Fallback strategy: cloud-backup, ollama-local, or none.
{
  "mode": "hybrid",
  "cloudProvider": "claude",
  "ollamaModel": "qwen3:7b",
  "fallbackStrategy": "cloud-backup"
}

Response

200 OK
{
  "slots": { "chat": { "provider": "claude", "model": "..." }, "..." : "..." },
  "restartRequired": true,
  "mode": "hybrid",
  "message": "Auto-configured 6 slots in hybrid mode. Restart to activate."
}
400 Bad Request — Invalid mode or missing required fields.
Reset the slot configuration to built-in defaults. Removes all custom slots and restores the default provider/model assignments.

Response

200 OK
{
  "slots": { "chat": { "provider": "claude", "model": "..." }, "..." : "..." },
  "restartRequired": true,
  "message": "Slot config reset. Restart the server to activate."
}

Credentials

Store encrypted credentials for a provider. Credentials are encrypted with AES-256-GCM before storage. The request must provide either apiKey or setupToken.
This endpoint is localhost-only — requests from non-loopback addresses return 403 Forbidden. It is designed for the setup wizard and Settings UI running on the same machine.

Request Body

provider
string
required
Provider identifier: provider-anthropic, provider-openai, or provider-openrouter.
apiKey
string
API key (e.g., sk-ant-api03-... for Anthropic, sk-... for OpenAI, sk-or-v1-... for OpenRouter). Max 256 chars.
{
  "provider": "provider-anthropic",
  "apiKey": "sk-ant-api03-your-key"
}

Response

200 OK
{
  "success": true,
  "provider": "provider-anthropic"
}
400 Bad Request — Validation failure.
403 Forbidden — Request not from localhost.
{
  "error": "Credential storage only available from localhost"
}
503 Service Unavailable — Credential manager not available.
Validate credentials against the actual provider API without storing them. Useful for the setup wizard and Settings UI. Supports all four providers.

Request Body

provider
string
required
Provider to validate: provider-anthropic, provider-openai, provider-ollama, or provider-openrouter.
apiKey
string
API key to validate. Max 256 chars.
{
  "provider": "provider-anthropic",
  "apiKey": "sk-ant-api03-your-key"
}

Response

200 OK
{
  "valid": true,
  "provider": "provider-anthropic",
  "reason": "API key is valid (Claude Haiku 4.5 responded)"
}
{
  "valid": false,
  "provider": "provider-anthropic",
  "reason": "Invalid API key: 401 Unauthorized"
}
400 Bad Request — Missing or invalid fields.
Delete stored credentials for a specific provider.
This endpoint is localhost-only and requires either a valid API key or setup token.

Path Parameters

providerId
string
required
The provider scope to delete (e.g., provider-anthropic, provider-openai, provider-openrouter). Must start with provider-.

Response

200 OK
{
  "success": true,
  "provider": "provider-anthropic"
}
403 Forbidden — Request not from localhost.
404 Not Found — No credentials stored for this provider.
Delete all stored credentials.
This endpoint is localhost-only and requires either a valid API key or setup token. This action is irreversible.

Response

200 OK
{
  "success": true,
  "cleared": 3
}
403 Forbidden — Request not from localhost.

Dependencies

Check all optional system dependencies (Node.js, Docker, Ollama, Redis, sherpa-onnx models).

Response

200 OK
{
  "node": { "installed": true, "version": "v22.12.0", "hint": null },
  "docker": { "installed": true, "running": true },
  "ollama": { "installed": true, "reachable": true },
  "redis": { "installed": true, "reachable": true },
  "sherpaModels": { "downloaded": 3, "total": 5 }
}
Install and start Redis via Docker. Creates a redis/redis-stack:latest container with port 6379 exposed and an unless-stopped restart policy.

Response

200 OK — Redis installed, started, or already running.
{
  "status": "installed"
}
Possible status values: installed (new container created), started (existing container started), already_running (no action needed).
400 Bad Request — Docker is not running.
{
  "error": "Docker is not running",
  "hint": "Start Docker Desktop or run 'sudo systemctl start docker'"
}
500 Internal Server Error — Installation failed.
{
  "error": "Failed to install Redis",
  "details": "docker: Error response from daemon..."
}

Config (Legacy)

List the configuration keys that can be updated at runtime via the legacy config endpoint.

Response

200 OK
{
  "keys": ["ttsVoice", "ttsSpeed", "sttLanguage", "sttModel", "logLevel", "apiRateLimit", "chatRateLimit"],
  "description": {
    "ttsVoice": "TTS voice name (e.g., \"alloy\", \"nova\")",
    "ttsSpeed": "TTS speech speed (0.5-2.0)"
  },
  "current": {
    "ttsVoice": "af_heart",
    "logLevel": "info"
  }
}
Update a single runtime config value. The value is applied immediately to the environment variable and persisted to the config file.

Request Body

key
string
required
Config key name (must be in the writable whitelist).
value
string | number | boolean
required
New value for the key.
{
  "key": "ttsVoice",
  "value": "nova"
}

Response

200 OK
{
  "key": "ttsVoice",
  "value": "nova",
  "envKey": "AGTOS_TTS_VOICE",
  "applied": true,
  "persisted": true
}
403 Forbidden — Key is not writable.

Settings

Retrieve all writable config keys with current values, descriptions, reload types, and categories. Used by the Settings UI for dynamic form generation.

Response

200 OK
{
  "settings": {
    "ttsVoice": {
      "value": "af_heart",
      "description": "TTS voice name",
      "reloadType": "immediate",
      "category": "tts"
    },
    "llmProvider": {
      "value": "auto",
      "description": "Primary LLM provider",
      "reloadType": "provider-restart",
      "category": "llm"
    }
  }
}
Update multiple configuration values at once. All values are validated with Zod schemas, applied to environment variables, persisted to the config file, and announced via a config:changed event.

Request Body

Pass a flat object with config keys and their new values:
{
  "ttsVoice": "nova",
  "ttsSpeed": 1.2,
  "logLevel": "debug"
}
See Configuration > Environment Variables for the full list of writable keys.

Response

200 OK
{
  "changed": ["ttsVoice", "ttsSpeed", "logLevel"],
  "persisted": true
}
400 Bad Request — Validation failure with field-level details.
{
  "error": "Invalid settings update",
  "details": [
    { "field": "ttsSpeed", "message": "Number must be less than or equal to 2" }
  ]
}
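A client such as a settings form would typically attach each message to its field. A small Python sketch, assuming the 400 body has the shape shown above; the function name errors_by_field is hypothetical.

```python
def errors_by_field(response_body):
    """Group validation messages by field name from a 400 response body."""
    grouped = {}
    for detail in response_body.get("details", []):
        grouped.setdefault(detail["field"], []).append(detail["message"])
    return grouped


body = {
    "error": "Invalid settings update",
    "details": [
        {"field": "ttsSpeed", "message": "Number must be less than or equal to 2"}
    ],
}
print(errors_by_field(body))
```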
Return the Zod schema as a JSON description for dynamic form generation in the Settings UI. Includes type, description, reload behavior, category, and constraints for each key.

Response

200 OK
{
  "schema": {
    "ttsSpeed": {
      "type": "number",
      "description": "TTS speech speed (0.5–2.0)",
      "reloadType": "immediate",
      "category": "tts",
      "constraints": { "min": 0.5, "max": 2.0, "step": 0.1 }
    },
    "llmProvider": {
      "type": "enum",
      "description": "Primary LLM provider",
      "reloadType": "provider-restart",
      "category": "llm",
      "constraints": { "options": ["claude", "ollama", "auto"] }
    }
  }
}
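The constraints in this response are enough to pre-validate a value before sending an update. The Python sketch below assumes only the two entry types shown in the example (number and enum); other types may exist and are passed through unchecked. The function name check_value is hypothetical.

```python
def check_value(entry, value):
    """Return an error string if value violates the schema entry, else None."""
    constraints = entry.get("constraints", {})
    if entry["type"] == "number":
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            return "expected a number"
        if "min" in constraints and value < constraints["min"]:
            return f"must be >= {constraints['min']}"
        if "max" in constraints and value > constraints["max"]:
            return f"must be <= {constraints['max']}"
    elif entry["type"] == "enum":
        options = constraints.get("options", [])
        if value not in options:
            return f"must be one of {options}"
    return None


schema = {
    "ttsSpeed": {"type": "number", "constraints": {"min": 0.5, "max": 2.0, "step": 0.1}},
    "llmProvider": {"type": "enum", "constraints": {"options": ["claude", "ollama", "auto"]}},
}
print(check_value(schema["ttsSpeed"], 2.5))
```

Local checks like this only improve form feedback; the server's Zod validation remains authoritative.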

Slots

Get the live Model Slot Registry with current configuration, runtime stats, and slot names.

Response

200 OK
{
  "slots": {
    "chat": { "provider": "claude", "model": "claude-sonnet-4-20250514", "temperature": 0.7 },
    "reasoning": { "provider": "openai", "model": "gpt-4o", "temperature": 0.3, "fallback": "chat" },
    "coding": { "provider": "ollama", "model": "qwen2.5-coder", "temperature": 0.2, "fallback": "chat" }
  },
  "stats": {
    "requests": 142,
    "errors": 3,
    "fallbackCount": 5
  },
  "builtInSlotNames": ["chat", "reasoning", "coding", "tool_calling", "creative", "maintenance"],
  "activeSlotNames": ["chat", "reasoning", "coding"]
}
Update the Model Slot Registry configuration. Each slot maps a named capability (e.g., chat, reasoning) to a provider and model. The chat slot is required and cannot be removed.
Before writing, the server validates each slot’s model against the ProviderCatalog:
  • Unknown models (not in any catalog): warning returned, write allowed — the model may be private or unlisted.
  • Future-deprecated models: warning returned, write allowed.
  • Past-deprecated models: write blocked with HTTP 400.
  • Catalog fetch failure: warning returned, write allowed — transient upstream issue.
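The four rules above can be read as a small decision table. The Python sketch below mirrors them on the client side. It assumes a hypothetical catalog shape (model ID mapped to an ISO deprecation timestamp or None) and uses None for the whole catalog to stand in for a failed fetch. The server's actual implementation is not shown in this reference.

```python
from datetime import datetime, timezone


def classify_model(model, catalog, now=None):
    """Return (allowed, warning) for one slot's model, following the documented rules."""
    now = now or datetime.now(timezone.utc)
    if catalog is None:  # catalog fetch failure: warn, allow
        return True, "catalog unavailable"
    if model not in catalog:  # unknown model: warn, allow
        return True, "not found in catalog"
    deprecated_at = catalog[model]
    if deprecated_at is None:  # listed and not deprecated
        return True, None
    when = datetime.fromisoformat(deprecated_at.replace("Z", "+00:00"))
    if when <= now:  # past-deprecated: blocked (HTTP 400 on the server)
        return False, f"deprecated at {deprecated_at}"
    return True, f"scheduled for deprecation at {deprecated_at}"


# Hypothetical catalog; "future-model" is an invented ID for illustration.
catalog = {
    "gpt-4o": None,
    "gpt-3.5-turbo-0301": "2024-06-13T00:00:00Z",
    "future-model": "2999-01-01T00:00:00Z",
}
fixed_now = datetime(2025, 1, 1, tzinfo=timezone.utc)
print(classify_model("gpt-3.5-turbo-0301", catalog, fixed_now))
```

Only one of the four outcomes blocks the write, which keeps private or unlisted models usable while still refusing models that are already retired.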

Request Body

{
  "chat": { "provider": "claude", "model": "claude-sonnet-4-20250514" },
  "reasoning": { "provider": "openai", "model": "gpt-4o" },
  "maintenance": { "provider": "openrouter", "model": "openai/gpt-4o-mini" }
}

Response

200 OK
{
  "slots": {
    "chat": { "provider": "claude", "model": "claude-sonnet-4-20250514" },
    "reasoning": { "provider": "openai", "model": "gpt-4o" },
    "maintenance": { "provider": "openrouter", "model": "openai/gpt-4o-mini" }
  },
  "restartRequired": true,
  "message": "Slot configuration updated",
  "warnings": [
    {
      "slot": "maintenance",
      "model": "openai/gpt-4o-mini",
      "reason": "not found in catalog (private / unlisted / pinned model)"
    }
  ]
}
The warnings array is only present when at least one warning was generated.
400 Bad Request — Deprecated model(s) detected.
{
  "error": "Deprecated model(s)",
  "message": "Cannot use retired models",
  "deprecated": [
    {
      "slot": "chat",
      "model": "gpt-3.5-turbo-0301",
      "deprecatedAt": "2024-06-13T00:00:00Z"
    }
  ]
}

Provider Catalog

Aggregate model catalog from all configured providers. Returns models with capabilities, pricing, context length, and deprecation status.

Query Parameters

refresh
string
Set to 1 to bypass the 1-hour cache TTL and force a network fetch.
capability
string
Comma-separated list of required capabilities (AND semantics). E.g., tool-use,vision.

Response

200 OK
{
  "providers": [
    {
      "providerId": "claude",
      "models": [
        {
          "id": "claude-sonnet-4-20250514",
          "name": "Claude Sonnet 4",
          "contextLength": 200000,
          "maxOutputTokens": 8192,
          "capabilities": ["tool-use", "vision", "thinking"],
          "pricing": { "inputPer1M": 3.0, "outputPer1M": 15.0 }
        }
      ],
      "latencyMs": 245
    }
  ],
  "count": 1,
  "providerIds": ["claude", "openai", "ollama", "openrouter"]
}
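The AND semantics of the capability parameter can be reproduced locally against a cached catalog response. A Python sketch, assuming the response shape shown above; the function name filter_models is hypothetical.

```python
def filter_models(catalog, required):
    """Return IDs of models that have every capability in `required` (AND semantics)."""
    needed = set(required)
    return [
        model["id"]
        for provider in catalog["providers"]
        for model in provider["models"]
        if needed <= set(model.get("capabilities", []))
    ]


catalog_response = {
    "providers": [
        {
            "providerId": "claude",
            "models": [
                {
                    "id": "claude-sonnet-4-20250514",
                    "capabilities": ["tool-use", "vision", "thinking"],
                }
            ],
        }
    ]
}
# Same comma-separated form as the query parameter.
print(filter_models(catalog_response, "tool-use,vision".split(",")))
```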
Credit balance and usage snapshot for a single provider. Not all providers support account info.

Path Parameters

providerId
string
required
Provider identifier: claude, openai, openrouter, ollama.

Response

200 OK
{
  "creditBalance": 42.50,
  "creditCurrency": "USD"
}
404 Not Found — Provider not configured or does not support account info.

Models

List all models from the model registry with download status.

Response

200 OK
{
  "models": [
    {
      "id": "moonshine-tiny-en-v2",
      "name": "Moonshine Tiny (English)",
      "description": "Lightweight STT model",
      "type": "stt",
      "family": "moonshine",
      "sizeBytes": 48000000,
      "sizeMb": 45.8,
      "downloaded": true,
      "downloading": false,
      "format": "onnx"
    }
  ],
  "downloadedCount": 3,
  "totalCount": 5
}
Start downloading a model. Returns an SSE stream with progress events.

Request Body

modelId
string
required
The model ID to download.
{
  "modelId": "kokoro-int8-multi-v1"
}

Response

200 OK (text/event-stream)
{ "type": "progress", "modelId": "kokoro-int8-multi-v1", "downloaded": 50000000, "total": 100000000, "percent": 50 }
{ "type": "complete", "modelId": "kokoro-int8-multi-v1", "durationMs": 12000 }
{ "type": "error", "modelId": "kokoro-int8-multi-v1", "error": "Network timeout", "retryable": true }
400 Bad Request — Unknown model ID.
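A Python sketch of consuming the progress stream. The JSON payloads follow the three event shapes shown above. The `data:` line prefix is standard SSE framing and is assumed here, since this reference shows only the payloads. Both function names are hypothetical.

```python
import json


def parse_events(lines):
    """Yield decoded event dicts from an iterable of stream lines."""
    for raw in lines:
        line = raw.strip()
        if line.startswith("data:"):  # assumed SSE framing
            line = line[len("data:"):].strip()
        if not line.startswith("{"):
            continue  # skip blank lines, comments, and other SSE fields
        yield json.loads(line)


def follow_download(lines):
    """Return (ok, last_percent, error) after reading up to a terminal event."""
    percent = 0
    for event in parse_events(lines):
        if event["type"] == "progress":
            percent = event["percent"]
        elif event["type"] == "complete":
            return True, 100, None
        elif event["type"] == "error":
            return False, percent, event["error"]
    return False, percent, "stream ended without a terminal event"


sample = [
    'data: {"type": "progress", "modelId": "kokoro-int8-multi-v1", "percent": 50}',
    "",
    'data: {"type": "error", "modelId": "kokoro-int8-multi-v1", "error": "Network timeout", "retryable": true}',
]
print(follow_download(sample))
```

An error event carries retryable, so a caller could restart the download when it is true instead of surfacing the failure immediately.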
Check download status of a single model.

Path Parameters

modelId
string
required
The model ID.

Response

200 OK
{
  "modelId": "kokoro-int8-multi-v1",
  "name": "Kokoro TTS (Int8 Multi)",
  "type": "tts",
  "downloaded": true,
  "downloading": false,
  "sizeBytes": 100000000,
  "sizeMb": 95.4,
  "path": "/home/user/.agtos/models/kokoro-int8-multi-v1"
}
404 Not Found — Unknown model ID.
Delete a downloaded model.

Path Parameters

modelId
string
required
The model ID to delete.

Response

200 OK
{
  "success": true,
  "modelId": "kokoro-int8-multi-v1"
}
404 Not Found — Model not found or not downloaded.
Remove all downloaded models.

Response

200 OK
{
  "success": true,
  "removed": ["moonshine-tiny-en-v2", "kokoro-int8-multi-v1"],
  "count": 2
}

Ollama

Check Ollama installation and running status.

Response

200 OK
{
  "installed": true,
  "running": true,
  "version": "0.3.12",
  "models": [
    { "name": "qwen3:7b", "size": 4700000000, "digest": "abc123..." }
  ],
  "hint": null
}
200 OK — Ollama not installed.
{
  "installed": false,
  "running": false,
  "hint": "Install Ollama from https://ollama.com"
}
List models installed in Ollama.

Response

200 OK
{
  "models": [
    {
      "name": "qwen3:7b",
      "model": "qwen3:7b",
      "size": 4700000000,
      "digest": "abc123...",
      "family": "qwen3",
      "parameterSize": "7.6B",
      "quantization": "Q4_K_M"
    }
  ],
  "count": 1
}
503 Service Unavailable — Ollama not reachable.
Pull (download) a model from Ollama. Returns an SSE stream with progress events.

Request Body

model
string
required
The model name to pull (e.g., qwen3:7b, llama3.1:8b).
{
  "model": "qwen3:7b"
}

Response

200 OK (text/event-stream)
{ "type": "progress", "modelId": "qwen3:7b", "status": "pulling", "digest": "...", "total": 4700000000, "completed": 1200000000, "percent": 25 }
{ "type": "complete", "modelId": "qwen3:7b", "durationMs": 45000 }
503 Service Unavailable — Ollama not reachable.
Attempt to start the Ollama service.

Response

200 OK
{
  "started": true
}
500 Internal Server Error — Failed to start Ollama.

Billing

Aggregate billing status across all cloud providers. Shows exhaustion state, balance, and active fallback strategy.

Response

200 OK
{
  "providers": {
    "claude": {
      "exhausted": false,
      "balance": 42.50,
      "currency": "USD",
      "billingUrl": "https://console.anthropic.com/billing",
      "fallbackActive": false
    },
    "openai": {
      "exhausted": true,
      "billingUrl": "https://platform.openai.com/account/billing",
      "fallbackActive": true,
      "fallbackTarget": "claude"
    }
  },
  "strategy": "cloud-backup"
}
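To decide which providers need attention, a client can reduce this response to the exhausted providers and their fallback targets. A Python sketch, assuming the response shape shown above; the function name exhausted_providers is hypothetical.

```python
def exhausted_providers(status):
    """Map each exhausted provider to its fallback target (None if no fallback is active)."""
    return {
        name: info.get("fallbackTarget") if info.get("fallbackActive") else None
        for name, info in status["providers"].items()
        if info.get("exhausted")
    }


billing = {
    "providers": {
        "claude": {"exhausted": False, "balance": 42.50, "fallbackActive": False},
        "openai": {"exhausted": True, "fallbackActive": True, "fallbackTarget": "claude"},
    },
    "strategy": "cloud-backup",
}
print(exhausted_providers(billing))
```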
Reset billing exhaustion state for a provider. Use after adding credits to retry the provider.

Path Parameters

providerId
string
required
The provider to retry (e.g., openai, claude, openrouter).

Response

200 OK — Exhaustion state cleared, affected slots re-marked healthy.
404 Not Found — Provider not configured.

Capture Protocol (PACT)

Start a new multimodal capture stream. Validates consent before beginning capture.

Request Body

deviceId
string
required
The device ID initiating capture.
modalities
string[]
required
Capture modalities: audio, video, neural.
captureMode
string
Capture mode: active-participant (default), passive-observation, or ambient.
{
  "deviceId": "dev-abc123",
  "modalities": ["audio"],
  "captureMode": "active-participant"
}

Response

201 Created
{
  "streamId": "stream-xyz789",
  "state": "active",
  "correlationId": "corr-abc123"
}
400 Bad Request — Missing consent or invalid modality.
503 Service Unavailable — Capture protocol not available.
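A Python sketch of building the request body with local checks for the documented modalities and capture modes. Rejecting an empty modalities list is an assumption, and consent is validated by the server and cannot be checked here. The function name build_capture_request is hypothetical.

```python
MODALITIES = {"audio", "video", "neural"}
CAPTURE_MODES = {"active-participant", "passive-observation", "ambient"}


def build_capture_request(device_id, modalities, capture_mode="active-participant"):
    """Return a request body for starting a capture stream, or raise ValueError."""
    if not device_id:
        raise ValueError("deviceId is required")
    if not modalities:  # assumption: an empty list is treated as missing
        raise ValueError("at least one modality is required")
    unknown = sorted(set(modalities) - MODALITIES)
    if unknown:
        raise ValueError(f"invalid modalities: {unknown}")
    if capture_mode not in CAPTURE_MODES:
        raise ValueError(f"invalid captureMode: {capture_mode}")
    return {
        "deviceId": device_id,
        "modalities": list(modalities),
        "captureMode": capture_mode,
    }


print(build_capture_request("dev-abc123", ["audio"]))
```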
Stop an active capture stream.

Request Body

streamId
string
required
The stream ID to stop.

Response

200 OK
404 Not Found — Stream not found or already stopped.
List active capture streams for the current user.

Response

200 OK
{
  "streams": [
    {
      "streamId": "stream-xyz789",
      "deviceId": "dev-abc123",
      "modalities": ["audio"],
      "state": "active",
      "startedAt": 1711612800000
    }
  ],
  "count": 1
}

System (Extended)

Redis connection status and diagnostics.

Response

200 OK
{
  "connected": true,
  "url": "redis://localhost:6379",
  "version": "7.4.1"
}
Force Redis reconnection and hot-create Redis-dependent services (memory, scheduling, devices) without a server restart. Requires API key when AGTOS_API_KEY is set.

Response

200 OK
{
  "reconnected": true,
  "services": ["memory", "scheduler", "device-registry"]
}
503 Service Unavailable — Redis connection failed.
Hardware and software capability detection for the current host.

Response

200 OK
{
  "cpu": { "cores": 8, "model": "Apple M2 Pro" },
  "memory": { "totalMb": 16384, "availableMb": 8192 },
  "gpu": { "available": true, "type": "metal" },
  "ollama": { "available": true },
  "docker": { "available": true }
}
Reset the onboarding state so the setup wizard runs again on next app load.

Response

200 OK
{
  "success": true,
  "message": "Onboarding reset. Reload the app to re-run setup."
}

Common Error Responses

All endpoints may return the following error shapes:

400 Bad Request

Returned when the request is malformed or missing required fields. The error field contains a human-readable description.
{
  "error": "Invalid JSON body"
}

404 Not Found

Returned when a requested resource (task, workflow) does not exist.
{
  "error": "Not found"
}

500 Internal Server Error

Returned when an unexpected exception occurs during request processing.
{
  "error": "Internal server error",
  "message": "Detailed error description"
}

503 Service Unavailable

Returned when a required dependency (Redis, voice pipeline, memory system, scheduler, workflow engine) is not initialized or has failed.
{
  "error": "Component not available"
}

Notes

  • Body size limit: Request bodies are capped at 1 MB. Requests exceeding this limit receive no response (connection closed).
  • ID validation: Path parameter IDs (task IDs, workflow IDs) must match ^[a-zA-Z0-9\-_]+$ and be 1 to 128 characters long. Invalid IDs return 400.
  • CORS: Configurable via CORS_ORIGIN environment variable (default: http://localhost:5173). Tauri desktop origins are auto-added. Preflight OPTIONS requests return 204.
  • Metrics: All API requests are tracked in the internal metrics system. Latency is recorded per route.
  • Authentication: See Authentication below for opt-in API key auth.
  • Rate Limiting: See Rate Limiting below for per-endpoint limits.
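The ID validation rule in the list above can be checked on the client before a request is made. A Python sketch using the documented pattern and length bounds; the function name is_valid_id is hypothetical.

```python
import re

# Pattern and length bounds as documented for path parameter IDs.
ID_PATTERN = re.compile(r"^[a-zA-Z0-9\-_]+$")
MAX_ID_LENGTH = 128


def is_valid_id(value):
    """Return True if value is 1 to 128 characters of letters, digits, '-' or '_'."""
    return 1 <= len(value) <= MAX_ID_LENGTH and ID_PATTERN.fullmatch(value) is not None


print(is_valid_id("task-123_A"), is_valid_id("bad/id"))
```

fullmatch is used rather than match so that a trailing newline does not slip past the `$` anchor.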

Authentication

API key authentication is opt-in. When the AGTOS_API_KEY environment variable is set, all /api/* endpoints require a valid Bearer token.

Headers

| Header | Value | Required |
| --- | --- | --- |
| Authorization | Bearer <your-api-key> | When AGTOS_API_KEY is set |

Response (401 Unauthorized)

{
  "error": "Unauthorized",
  "message": "Valid API key required. Set Authorization: Bearer <key> header."
}

Exempt Paths

These paths do not require authentication:
  • GET /health (and sub-paths like /health/metrics, /health/:serviceName)
  • GET /metrics (Prometheus endpoint)
  • POST /api/credentials/validate (read-only key validation)
  • GET /api/setup-token (localhost-only, onboarding token retrieval)
POST /api/credentials is not auth-exempt. It requires either a valid Authorization: Bearer <key> header (when AGTOS_API_KEY is set) or a valid X-Setup-Token header (30-minute TTL token from GET /api/setup-token, used during onboarding).
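A Python sketch of choosing the header for a request. It assumes the client process can read the same AGTOS_API_KEY value the server was started with, which will not hold for every deployment. The function name auth_headers is hypothetical.

```python
import os


def auth_headers(setup_token=None, env=os.environ):
    """Return the auth header to attach to an /api/* request, or {} if none applies."""
    api_key = env.get("AGTOS_API_KEY")
    if api_key:
        return {"Authorization": f"Bearer {api_key}"}
    if setup_token:  # onboarding token obtained from GET /api/setup-token
        return {"X-Setup-Token": setup_token}
    return {}


print(auth_headers(setup_token="example-token", env={}))
```

The API key takes precedence in this sketch. The setup token is only useful during onboarding because of its 30-minute TTL.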

Rate Limiting

All API endpoints are rate-limited using a token bucket algorithm.

Default Limits

| Scope | Limit | Configurable Via |
| --- | --- | --- |
| General API (/api/*) | 100 requests/min | API_RATE_LIMIT |
| Chat, Tasks, Credentials (/api/chat, /api/tasks, /api/credentials) | 20 requests/min | CHAT_RATE_LIMIT |

Rate Limit Headers

Included on all API responses:
| Header | Description |
| --- | --- |
| X-RateLimit-Remaining | Tokens remaining in current window |
Included on 429 Too Many Requests responses:
| Header | Description |
| --- | --- |
| Retry-After | Seconds until rate limit resets |
| X-RateLimit-Limit | Maximum requests per window |
| X-RateLimit-Remaining | Always 0 on rate-limited responses |
| X-RateLimit-Reset | Seconds until rate limit resets |

429 Response Body

{
  "error": "Too many requests",
  "retryAfter": 60
}
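For readers unfamiliar with the token bucket algorithm, here is a generic Python sketch. It is not the server's implementation: the continuous refill and the way the retry delay is computed are assumptions, and the class name TokenBucket is hypothetical. Time is passed in explicitly so the behavior is deterministic.

```python
import math


class TokenBucket:
    """Bucket of `capacity` tokens, refilled continuously at capacity/window_s per second."""

    def __init__(self, capacity, window_s=60.0, now=0.0):
        self.capacity = capacity
        self.rate = capacity / window_s
        self.tokens = float(capacity)
        self.updated = now

    def allow(self, now):
        """Consume one token if available; return (allowed, retry_after_seconds)."""
        elapsed = now - self.updated
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True, 0
        # Whole seconds until one full token has been refilled.
        return False, math.ceil((1 - self.tokens) / self.rate)


# A bucket sized like the documented chat limit of 20 requests/min.
chat_bucket = TokenBucket(capacity=20, window_s=60.0)
allowed, retry_after = chat_bucket.allow(now=0.0)
print(allowed, retry_after)
```

On the client side, the simpler rule is to wait for the number of seconds given in Retry-After (or retryAfter in the body) before retrying a 429 response.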