Skip to content

The capability contract

The contract is what makes the layers swappable. A client reads it and adapts — it never hard-codes which engine is behind the address.

Every engine in the stack — OdyssAI-X, Telemak, and even third parties like Ollama, LM Studio or vLLM behind a shim — advertises what it can do at a single well-known URL:

GET /.well-known/inference-engine.json

Companion reads this during pairing to learn which models support tools, vision, thinking and embeddings, and how to route requests. Because the discovery is data, not code, you can put a different engine behind the same address and the client adapts on the next probe.

The document

{
  "engine": "telemak",
  "version": "0.6.x",
  "capabilities": {
    "stream": true,
    "tools": true,
    "vision": false,
    "embeddings": true,
    "max_context": 32768,
    "session_cache": true,
    "openai_compat": "v1",
    "anthropic_compat": "v1"
  },
  "models": [ /* per-model namespace with backend, nodes, tools, vision flags */ ]
}

Top-level fields

Field	Meaning
`engine`	Which runtime answers — `odyssai-x`, `telemak`, …
`version`	The engine version. Some features gate on it (e.g. mixed-quant models need Telemak `v0.6.33+`).
`capabilities`	The engine-wide defaults (below).
`models`	Per-model capability overrides — a model can disable tools or vision even when the engine supports them.

Capability flags

Flag	What it tells the client
`stream`	SSE streaming is available.
`tools`	Tool / function calling is supported.
`vision`	Image inputs are accepted.
`embeddings`	An embeddings endpoint is served.
`max_context`	The context window in tokens.
`session_cache`	The engine keeps a KV prefix cache scoped by `session_id`.
`openai_compat` / `anthropic_compat`	Which API dialects, and at which version.

Per-model overrides

The engine-wide flags are defaults; the models array refines them. A cluster serving a coder MoE and a small VLM advertises vision: true only on the VLM entry, with each model’s backend (local pool, Telemak proxy, cloud alias) and node layout. This is how a client builds an accurate model picker without probing each model by hand.

Why it matters

Replaceable layers. Swap Telemak for an OdyssAI-X cluster behind the same address; the client re-reads the contract and adjusts. No client redeploy.
Honest routing. Companion won’t offer a tool call to a model whose contract says tools: false, and won’t toggle thinking on a model that doesn’t ship it.
One picker, many engines. Pair several engines; each contract feeds one unified catalog with correct per-model capabilities.

Read next

HTTP API — the endpoints the contract describes.
Architecture overview — where the contract sits in the four layers.
Companion · engine pairing — how the client consumes it.