Telemak — getting started
One .app, one menu-bar icon, one port. No Docker, no Python, no SSH.
From a fresh Mac to a first inference in about five minutes.
1. Install the app
Download Telemak-<version>.dmg from the releases page, open it, and drag Telemak.app to Applications.
Telemak is not signed with an Apple Developer ID (distributed outside the App Store), so the first launch is blocked by Gatekeeper. Unblock it once:
- No terminal: open it, dismiss the warning, then System Settings → Privacy & Security → Open Anyway.
- One line:
xattr -dr com.apple.quarantine /Applications/Telemak.app
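To confirm that the quarantine flag is actually gone before relaunching, a check along these lines may help. It is a sketch, not something Telemak ships: `is_quarantined` is a hypothetical helper name. It relies only on the stock macOS `xattr -p` option, which prints an attribute's value and exits non-zero when the attribute is absent. It inspects the top-level bundle only, not every file inside it.

```shell
# Hypothetical helper (not part of Telemak): succeeds if the given path
# still carries the com.apple.quarantine attribute.
is_quarantined() {
  xattr -p com.apple.quarantine "$1" >/dev/null 2>&1
}

# Usage on the Mac where Telemak is installed:
#   if is_quarantined /Applications/Telemak.app; then
#     echo "still quarantined"
#   else
#     echo "quarantine flag cleared"
#   fi
```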
Open the app. A small icon appears in your menu bar — no Dock icon, on
purpose. Telemak installs a LaunchAgent that survives reboots and restarts on
crash, and starts serving on http://localhost:8003.
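If you script the install, you may want to wait until the server answers before sending requests. The following is a minimal sketch under stated assumptions: `wait_for_telemak` is a hypothetical helper name, and it polls the `/v1/models` endpoint used in the sanity check later in this guide. The timeout is approximate, since it counts one-second sleeps rather than wall-clock time.

```shell
# Hypothetical helper: poll the model list until the server answers
# or roughly `timeout` seconds have passed.
#   $1 = base URL (default http://localhost:8003)
#   $2 = timeout in seconds (default 30)
wait_for_telemak() {
  base_url="${1:-http://localhost:8003}"
  timeout="${2:-30}"
  elapsed=0
  while [ "$elapsed" -lt "$timeout" ]; do
    # -f: treat HTTP errors as failure, -s: silent, --max-time: cap each attempt
    if curl -fs --max-time 2 "$base_url/v1/models" >/dev/null 2>&1; then
      return 0
    fi
    sleep 1
    elapsed=$((elapsed + 1))
  done
  return 1
}

# Usage:
#   wait_for_telemak http://localhost:8003 60 && echo "Telemak is serving"
```

The same helper can be pointed at `http://<telemak-ip>:8003` to check reachability from another machine.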
2. Load a model
Click the menu-bar icon → Load model → pick one of the curated MLX models. The dialog tells you which models fit your RAM before you commit:
| Your Mac | Comfortable model class |
|---|---|
| 64 GB | a 35 B MoE 8-bit (e.g. Qwen3.6-35B-A3B) |
| 96–128 GB | a 70–80 B MoE 8-bit |
| 256 GB | a 122 B MoE 8-bit |
| 512 GB | a 200 B+ MoE (mixed-quant) |
Loading takes 10–60 seconds depending on the model and your SSD. When the status
flips from loading to idle, the model is warm. You can keep several models
co-loaded (a chat MoE + an embedder + a small VLM) — Telemak tracks each
model’s wired-memory budget so the OS doesn’t steal pages from the GPU.
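The RAM table above can be sketched as a simple lookup, for example to estimate a model class before downloading anything. This is an illustration only: `model_class_for_ram_gb` is a hypothetical helper, and the handling of sizes that fall between table rows (using the next lower row) is an assumption, not documented Telemak behaviour. The Load model dialog remains the authoritative check.

```shell
# Hypothetical helper: map installed RAM (whole GB) to the comfortable
# model class from the table.
# Assumption: sizes between rows fall back to the next lower row.
model_class_for_ram_gb() {
  ram_gb="$1"
  if [ "$ram_gb" -ge 512 ]; then
    echo "200 B+ MoE (mixed-quant)"
  elif [ "$ram_gb" -ge 256 ]; then
    echo "122 B MoE 8-bit"
  elif [ "$ram_gb" -ge 96 ]; then
    echo "70–80 B MoE 8-bit"
  elif [ "$ram_gb" -ge 64 ]; then
    echo "35 B MoE 8-bit"
  else
    echo "below the smallest row in the table"
  fi
}

# On macOS, hw.memsize reports physical RAM in bytes:
#   ram_gb=$(( $(sysctl -n hw.memsize) / 1073741824 ))
#   model_class_for_ram_gb "$ram_gb"
```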
3. Sanity-check
curl http://localhost:8003/v1/models
You should see your loaded model. Then a first completion:
curl http://localhost:8003/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "<your-model-id>",
    "messages": [{"role": "user", "content": "Hello, sea."}]
  }'
Telemak speaks both OpenAI (/v1/chat/completions) and Anthropic (/v1/messages) against the same loaded models, so Claude-style clients work too.
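As a hedged illustration of the Anthropic side, the sketch below fills in the model ID from `/v1/models` and builds a minimal `/v1/messages` request body. Several things here are assumptions rather than documented Telemak behaviour: that `/v1/models` returns the OpenAI-style `{"data": [{"id": ...}]}` shape, and that `/v1/messages` accepts the `max_tokens` field and `anthropic-version` header that Anthropic's own API requires. `extract_model_ids` and `anthropic_payload` are hypothetical helper names.

```shell
# Pull every "id" value out of an OpenAI-style model list read from stdin.
# This is a grep/sed sketch, not a real JSON parser.
extract_model_ids() {
  grep -o '"id"[[:space:]]*:[[:space:]]*"[^"]*"' |
    sed -e 's/^"id"[[:space:]]*:[[:space:]]*"//' -e 's/"$//'
}

# Build a minimal Anthropic-style request body.
#   $1 = model id, $2 = user prompt, $3 = max_tokens (default 256)
# The prompt must not contain double quotes or backslashes,
# because no JSON escaping is done here.
anthropic_payload() {
  printf '{"model": "%s", "max_tokens": %s, "messages": [{"role": "user", "content": "%s"}]}\n' \
    "$1" "${3:-256}" "$2"
}

# Usage against a running Telemak:
#   model=$(curl -s http://localhost:8003/v1/models | extract_model_ids | head -n 1)
#   curl http://localhost:8003/v1/messages \
#     -H "Content-Type: application/json" \
#     -H "anthropic-version: 2023-06-01" \
#     -d "$(anthropic_payload "$model" "Hello, sea.")"
```

For anything beyond a quick check, a real JSON tool is a safer way to read the model list than pattern matching.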
4. Add a chat window (optional)
Telemak is just the engine: it ships no chat UI. To get conversations, history, memory and projects on top, point Companion at it:
- Install Companion (Companion getting started).
- Companion → Settings → Infrastructure → Engine → add http://localhost:8003 (or http://<telemak-ip>:8003 from another machine) → Test endpoint.
- Companion reads Telemak’s capability contract and loads its model catalog. You now chat through the UI, with the KV cache surviving across turns of the same conversation.
5. Point an IDE agent at it (optional)
Any OpenAI- or Anthropic-compatible coding agent (Cline, Continue.dev, Claude Code, Codex) can drive Telemak directly:
export OPENAI_BASE_URL="http://localhost:8003/v1"
export OPENAI_API_KEY="dummy" # no key required on a LAN
You’re running
Telemak keeps the model warm in wired memory, the KV cache survives across turns, and the daemon restarts itself after a reboot or crash. From the menu bar you can watch the live phase (prefill / decode / streaming / idle), tokens/s, and the last error.
Read next
- The menu bar — every control and status indicator.
- Architecture — the daemon, the menu bar, the LaunchAgent.
- Performance — what to expect from each model class on real hardware.
- Cluster enrolment — add this Telemak to an OdyssAI-X cluster in 30 seconds.