Telemak — architecture

One process serves the model, one icon controls it, one agent keeps it alive. No Docker, no Python, no orchestrator.

Telemak is a single native Swift app built on mlx-swift-lm. It has three parts, all on one machine.

The serving process. It loads MLX models into wired memory and answers /v1/chat/completions and /v1/messages on http://localhost:8003. Built on mlx-swift-lm, so there’s no Python venv to maintain and no MLX to rebuild on every toolchain bump.
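As a rough client-side illustration of the endpoint described above, the sketch below builds (but does not send) a POST to /v1/chat/completions on http://localhost:8003. The body uses the standard OpenAI chat-completions fields `model` and `messages`. The type names, the function name and the model identifier passed by the caller are assumptions made for this example, and whether Telemak expects any further fields or headers is not stated here.

```swift
import Foundation
#if canImport(FoundationNetworking)
import FoundationNetworking  // URLRequest lives in this module on Linux
#endif

// One chat message in the OpenAI-style wire format.
struct ChatMessage: Codable {
    let role: String
    let content: String
}

// Request body for /v1/chat/completions (only the two core fields).
struct ChatRequest: Codable {
    let model: String
    let messages: [ChatMessage]
}

// Builds a POST request for the local endpoint without sending it.
// `model` is whatever identifier the server knows the loaded model by;
// the naming scheme is an assumption, not something documented here.
func makeChatRequest(model: String, prompt: String) throws -> URLRequest {
    // Host, port and path are the ones given in the text.
    let url = URL(string: "http://localhost:8003/v1/chat/completions")!
    var request = URLRequest(url: url)
    request.httpMethod = "POST"
    request.setValue("application/json", forHTTPHeaderField: "Content-Type")
    let body = ChatRequest(
        model: model,
        messages: [ChatMessage(role: "user", content: prompt)]
    )
    request.httpBody = try JSONEncoder().encode(body)
    return request
}
```

To actually send it, the request can be handed to URLSession, for example `try await URLSession.shared.data(for: request)`, and the JSON response decoded from the returned data.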

Key behaviours:

  • Multi-model concurrent loading. Several models stay warm at once — a chat MoE plus an embedder plus a small VLM. Telemak tracks each model’s wired_limit_mb so the OS doesn’t page weights out from under the GPU.
  • Cross-turn KV cache. Conversations keep their KV prefix on disk under ~/.telemak/sessions/, and the least recently used sessions are evicted when the budget is hit. The second turn of a long conversation comes back much faster than the first because the cached prefix does not have to be prefilled again.
  • Request serialization. MLX work is serialized so concurrent requests don’t interleave on the GPU; tokens are routed back to the right caller.
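The LRU eviction mentioned for the session cache can be sketched as a small bookkeeping structure. This is a minimal illustration, not Telemak's actual code: the type `SessionCacheIndex`, its method names and the byte-based budget are all hypothetical. It only tracks sizes and recency, and leaves deleting the files under ~/.telemak/sessions/ to the caller.

```swift
import Foundation

// Hypothetical index over on-disk session caches. Sizes are in bytes.
struct SessionCacheIndex {
    private(set) var sizes: [String: Int] = [:]  // session id -> cache size
    private(set) var order: [String] = []        // least recently used first
    let budget: Int                              // maximum total size allowed

    init(budget: Int) { self.budget = budget }

    var totalSize: Int { sizes.values.reduce(0, +) }

    // Marks a known session as just used by moving it to the recent end.
    mutating func touch(_ id: String) {
        guard sizes[id] != nil else { return }
        order.removeAll { $0 == id }
        order.append(id)
    }

    // Records a new or updated cache, then evicts least recently used
    // sessions until the total fits the budget. Returns the evicted ids
    // so the caller can remove the corresponding files.
    @discardableResult
    mutating func store(_ id: String, size: Int) -> [String] {
        sizes[id] = size
        order.removeAll { $0 == id }
        order.append(id)
        var evicted: [String] = []
        // The entry just stored is never evicted, even if it alone
        // exceeds the budget.
        while totalSize > budget, let oldest = order.first, oldest != id {
            order.removeFirst()
            sizes[oldest] = nil
            evicted.append(oldest)
        }
        return evicted
    }
}
```

With a budget of 100, storing sessions "a" and "b" at 40 each, touching "a", and then storing "c" at 40 evicts "b", since "a" was used more recently.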

The control surface — a menu-bar icon, no Dock icon on purpose. Start, stop, restart; load and unload models; watch the live phase (prefill / decode / streaming / idle), tokens, tok/s, and the last error. It’s a thin client over the daemon’s own admin surface. Full tour in The menu bar.

What keeps it running. Telemak installs a per-user LaunchAgent that:

  • starts the daemon at login and restarts it on crash,
  • writes JSON logs, daily-rotated, under ~/.telemak/logs/.

So Telemak survives reboots and process death without you babysitting it. To stop it for good, unload it from the menu bar or remove the LaunchAgent.
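For readers unfamiliar with LaunchAgents, the sketch below generates a property list with the launchd keys that match the behaviour described: `RunAtLoad` to start at login and `KeepAlive` to relaunch after a crash. The keys are standard launchd ones, but the function name and the label and executable path the caller supplies are placeholders, and the plist Telemak really installs may differ.

```swift
import Foundation

// Builds an XML property list for a per-user launchd agent.
// `label` and `executable` are supplied by the caller; the values
// used in any example are illustrative, not Telemak's real ones.
func makeLaunchAgentPlist(label: String, executable: String) throws -> Data {
    let plist: [String: Any] = [
        "Label": label,
        "ProgramArguments": [executable],
        // Start the job as soon as the agent is loaded at login.
        "RunAtLoad": true,
        // Relaunch whenever the process exits unsuccessfully,
        // which includes crashes.
        "KeepAlive": ["SuccessfulExit": false],
    ]
    return try PropertyListSerialization.data(
        fromPropertyList: plist,
        format: .xml,
        options: 0
    )
}
```

Per-user agents are conventionally stored in ~/Library/LaunchAgents/, so removing the corresponding file there (after unloading it) is what "remove the LaunchAgent" amounts to in practice.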

Path                          What
/Applications/Telemak.app     the app (daemon + menu bar)
~/.telemak/sessions/          per-conversation KV prefix caches
~/.telemak/logs/              daily-rotated JSON logs
http://localhost:8003         the OpenAI + Anthropic API

Same API surface, different runtime target. Telemak is not a fork of OdyssAI-X but a sibling: native Swift, single-Mac, in-process. A Telemak instance can be enrolled in an OdyssAI-X cluster as a single-node provider (see Cluster enrolment), or run completely standalone with no orchestrator at all.