The capability contract
The contract is what makes the layers swappable. A client reads it and adapts — it never hard-codes which engine is behind the address.
Every engine in the stack — OdyssAI-X, Telemak, and even third parties like Ollama, LM Studio or vLLM behind a shim — advertises what it can do at a single well-known URL:
GET /.well-known/inference-engine.json

Companion reads this during pairing to learn which models support tools, vision, thinking and embeddings, and how to route requests. Because the discovery is data, not code, you can put a different engine behind the same address and the client adapts on the next probe.
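The probe itself is a single GET. Below is a minimal sketch in Python, assuming the engine is reachable over HTTP at some base address and returns the JSON document described on this page. The helper names `contract_url` and `fetch_contract`, and the injectable `opener` parameter, are illustrative only; they are not part of any engine or Companion API.

```python
import json
from urllib.parse import urljoin
from urllib.request import urlopen

WELL_KNOWN_PATH = "/.well-known/inference-engine.json"


def contract_url(base_url: str) -> str:
    """Build the discovery URL for an engine address such as http://host:8080."""
    # Joining with an absolute path discards any path on base_url,
    # so the document is always looked up at the root of the host.
    return urljoin(base_url, WELL_KNOWN_PATH)


def fetch_contract(base_url: str, opener=urlopen, timeout: float = 5.0) -> dict:
    """Fetch and decode the capability contract (hypothetical helper)."""
    with opener(contract_url(base_url), timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Passing the opener as a parameter keeps the sketch testable without a live engine. A real client would likely also handle HTTP errors and malformed JSON, and re-run the probe when pairing is refreshed.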
The document
    {
      "engine": "telemak",
      "version": "0.6.x",
      "capabilities": {
        "stream": true,
        "tools": true,
        "vision": false,
        "embeddings": true,
        "max_context": 32768,
        "session_cache": true,
        "openai_compat": "v1",
        "anthropic_compat": "v1"
      },
      "models": [ /* per-model namespace with backend, nodes, tools, vision flags */ ]
    }

Top-level fields
| Field | Meaning |
|---|---|
| engine | Which runtime answers: odyssai-x, telemak, … |
| version | The engine version. Some features gate on it (e.g. mixed-quant models need Telemak v0.6.33+). |
| capabilities | The engine-wide defaults (below). |
| models | Per-model capability overrides. A model can disable tools or vision even when the engine supports them. |
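A client that gates a feature on `version` needs a numeric comparison, not a string one ("0.6.9" sorts after "0.6.33" as text). The sketch below assumes dotted numeric versions such as "0.6.33"; the exact version format is not specified here, and the function names are hypothetical. A version with a non-numeric part, like the "0.6.x" placeholder in the sample document, is treated as unknown and does not unlock the feature.

```python
from typing import Optional, Tuple


def parse_version(text: str) -> Optional[Tuple[int, ...]]:
    """Turn '0.6.33' or 'v0.6.33' into (0, 6, 33); None if any part is not numeric."""
    parts = text.strip().lstrip("vV").split(".")
    if not all(part.isdigit() for part in parts):
        return None  # e.g. '0.6.x': the patch level is unknown
    return tuple(int(part) for part in parts)


def meets_minimum(contract: dict, engine: str, minimum: str) -> bool:
    """True only if the contract names `engine` and its version is >= `minimum`."""
    if contract.get("engine") != engine:
        return False
    have = parse_version(str(contract.get("version", "")))
    need = parse_version(minimum)
    if have is None or need is None:
        # Assumption: an unparseable version never unlocks a gated feature.
        return False
    return have >= need
```

For the mixed-quant example above, a client might call `meets_minimum(contract, "telemak", "0.6.33")` before listing such a model.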
Capability flags
| Flag | What it tells the client |
|---|---|
| stream | SSE streaming is available. |
| tools | Tool / function calling is supported. |
| vision | Image inputs are accepted. |
| embeddings | An embeddings endpoint is served. |
| max_context | The context window in tokens. |
| session_cache | The engine keeps a KV prefix cache scoped by session_id. |
| openai_compat / anthropic_compat | Which API dialects, and at which version. |
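One way a client might hold these flags is a small typed record. The sketch below is an assumption-laden illustration: the `Capabilities` class and `read_capabilities` helper are hypothetical, and the choice to treat a missing flag as "off" and to ignore unknown flags is a conservative default of this sketch, not documented engine behaviour.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass(frozen=True)
class Capabilities:
    """Engine-wide flags from the table above; absent flags default to off."""
    stream: bool = False
    tools: bool = False
    vision: bool = False
    embeddings: bool = False
    max_context: Optional[int] = None
    session_cache: bool = False
    openai_compat: Optional[str] = None
    anthropic_compat: Optional[str] = None


def read_capabilities(contract: dict) -> Capabilities:
    """Copy the known flags out of a contract, ignoring any the client does not know."""
    raw = contract.get("capabilities", {})
    known_names = {f.name for f in fields(Capabilities)}
    return Capabilities(**{k: v for k, v in raw.items() if k in known_names})
```

Ignoring unknown keys lets an older client keep working when an engine starts advertising a flag it has never seen.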
Per-model overrides
The engine-wide flags are defaults; the models array refines them. A cluster
serving a coder MoE and a small VLM advertises vision: true only on the VLM
entry. Each entry also carries that model's backend (local pool, Telemak proxy,
cloud alias) and node layout. This is how a client builds an accurate model
picker without probing each model by hand.
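The override rule can be sketched as a dictionary overlay: start from the engine-wide flags, then apply whatever the model entry sets. The shape of a model entry is not spelled out on this page, so the `id` key and the model names `coder-moe` and `small-vlm` below are assumptions made for illustration, as is the function name.

```python
def effective_capabilities(contract: dict) -> dict:
    """Map each model id to the engine defaults overlaid with that model's own flags."""
    defaults = contract.get("capabilities", {})
    result = {}
    for entry in contract.get("models", []):
        merged = dict(defaults)
        # Only keys that are engine-level capability flags act as overrides;
        # descriptive keys such as backend or nodes are left out of the result.
        merged.update({k: v for k, v in entry.items() if k in defaults})
        result[entry["id"]] = merged  # "id" is a hypothetical key name
    return result


example = {
    "capabilities": {"tools": True, "vision": False},
    "models": [
        {"id": "coder-moe", "backend": "local pool"},
        {"id": "small-vlm", "backend": "local pool", "vision": True, "tools": False},
    ],
}
picker = effective_capabilities(example)
```

Here `picker["coder-moe"]` inherits both engine defaults, while `picker["small-vlm"]` turns vision on and tools off for that model only.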
Why it matters
- Replaceable layers. Swap Telemak for an OdyssAI-X cluster behind the same address; the client re-reads the contract and adjusts. No client redeploy.
- Honest routing. Companion won’t offer a tool call to a model whose contract says tools: false, and won’t toggle thinking on a model that doesn’t ship it.
- One picker, many engines. Pair several engines; each contract feeds one unified catalog with correct per-model capabilities.
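The last two points can be sketched together: flatten several contracts into one catalog, then offer a feature only when the model's entry explicitly enables it. This is an illustration rather than Companion's actual logic. The `id` key, the per-model `thinking` flag, the addresses and the function names are all assumptions; this page does not define how thinking support is encoded.

```python
from typing import Dict, Iterable, List, Set, Tuple


def build_catalog(contracts: Dict[str, dict]) -> List[Tuple[str, str, dict]]:
    """Flatten several paired engines into (address, model id, capabilities) rows."""
    rows = []
    for address, contract in contracts.items():
        defaults = contract.get("capabilities", {})
        for entry in contract.get("models", []):
            # Engine defaults first, then everything the model entry sets itself.
            caps = {**defaults, **{k: v for k, v in entry.items() if k != "id"}}
            rows.append((address, entry["id"], caps))
    return rows


def offerable(caps: dict, wanted: Iterable[str]) -> Set[str]:
    """Keep only the features the model's contract explicitly enables."""
    return {feature for feature in wanted if caps.get(feature) is True}


contracts = {
    "http://engine-a:8080": {
        "capabilities": {"tools": True, "vision": False},
        "models": [{"id": "coder-moe", "thinking": True}],
    },
    "http://engine-b:8080": {
        "capabilities": {"tools": True, "vision": False},
        "models": [{"id": "small-vlm", "tools": False, "vision": True}],
    },
}
catalog = build_catalog(contracts)
```

Treating a missing flag as unsupported means a feature is never offered on the strength of a guess, which matches the "honest routing" intent.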
Read next
- HTTP API: the endpoints the contract describes.
- Architecture overview: where the contract sits in the four layers.
- Companion · engine pairing: how the client consumes it.