The model picker

One list, every engine. Pick a model by hand, or let Companion pick for you.

At the top of every conversation sits the model picker. It shows a single catalog built from every engine you’ve paired — local clusters, Telemak Macs, cloud providers — each with its real capabilities (tools, vision, thinking) read from the capability contract.

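| Mode     | What you choose                    | For                             |
|----------|------------------------------------|---------------------------------|
| Easy     | nothing; Companion auto-routes     | "just answer me well"           |
| Advanced | one model, by hand                 | you know which model you want   |
| Expert   | model + sampling, thinking, effort | tuning a specific behaviour     |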

In Easy mode Companion picks the model for you with a fast semantic router: it embeds your message and routes it to one of a few buckets (e.g. quick / code / deep). This is Companion’s own chat router — cheap (~milliseconds) and tuned for the chat use case.
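The embed-then-route idea can be sketched in a few lines. This is a minimal illustration, not Companion's router: the bucket example phrases, the `route` function, and the bag-of-words `embed` stand-in are all assumptions made for the sketch. A real router would use a learned sentence embedding and its own bucket definitions.

```python
import math
from collections import Counter

# Hypothetical example phrases per bucket; only the bucket names
# (quick / code / deep) come from the text above.
BUCKET_EXAMPLES = {
    "quick": ["what time is it in tokyo", "convert 5 miles to km",
              "define the word ephemeral"],
    "code": ["fix this python function bug", "write a sql query to join tables",
             "why does my code throw an exception"],
    "deep": ["compare the tradeoffs of two database architectures in detail",
             "analyze the long term risks of this plan step by step"],
}

def embed(text: str) -> Counter:
    """Toy bag-of-words vector; stands in for a real sentence embedding."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def centroid(examples: list) -> Counter:
    """Sum the example vectors; cosine ignores the overall scale."""
    total = Counter()
    for example in examples:
        total.update(embed(example))
    return total

CENTROIDS = {name: centroid(ex) for name, ex in BUCKET_EXAMPLES.items()}

def route(message: str, default: str = "quick") -> str:
    """Return the bucket whose centroid is most similar to the message."""
    vec = embed(message)
    scores = {name: cosine(vec, c) for name, c in CENTROIDS.items()}
    best = max(scores, key=scores.get)
    # Fall back to the default bucket when nothing overlaps at all.
    return best if scores[best] > 0 else default

print(route("please fix the bug in my python code"))
```

The work per message is one embedding plus a handful of similarity comparisons, which is why this style of router can stay cheap compared with asking a model to classify the request.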

Auto-routing in chat is separate from CoeOS. CoeOS is the engine-side, benchmark-grounded router for agents; the chat picker’s router is for you, in the window. Both can be in play, at different layers.

In Advanced mode, open the picker and choose a model. It sticks for the conversation until you change it. The list shows the engine each model belongs to and flags what it supports, so you won’t, say, attach an image to a text-only model.
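One way to picture the capability flags is a merged catalog whose entries carry a capability set that the UI checks before accepting an attachment. The sketch below is an assumption about shape only: `ModelEntry`, `build_catalog`, `check_attachment`, and the engine and model names are hypothetical, and only the capability names (tools, vision, thinking) come from this page.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ModelEntry:
    engine: str                       # which paired engine serves the model
    name: str
    capabilities: frozenset = frozenset()

def build_catalog(engines: dict) -> list:
    """Flatten {engine: {model: [capabilities]}} into one sorted list."""
    catalog = []
    for engine, models in engines.items():
        for name, caps in models.items():
            catalog.append(ModelEntry(engine, name, frozenset(caps)))
    return sorted(catalog, key=lambda m: (m.engine, m.name))

def check_attachment(model: ModelEntry, kind: str):
    """Return a reason string if the attachment is unsupported, else None."""
    required = {"image": "vision"}.get(kind)  # hypothetical mapping
    if required and required not in model.capabilities:
        return f"{model.name} ({model.engine}) does not support {required}"
    return None

# Hypothetical engines and models, for illustration only.
catalog = build_catalog({
    "local-cluster": {"example-text-7b": ["tools"]},
    "cloud": {"example-vision-large": ["tools", "vision", "thinking"]},
})

for model in catalog:
    print(model.engine, model.name, check_attachment(model, "image"))
```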

Expert mode exposes the dials: temperature and the other sampling parameters, the Thinking toggle (enable_thinking), and reasoning effort (minimal to high). Useful when you want a model to stop over-thinking a simple ask, or to reason harder on a deep one.
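As a rough sketch of how these dials might be bundled into one settings object: the `ExpertSettings` class, the `to_request` method, the `top_p` and `reasoning_effort` field names, the temperature bounds, and the intermediate effort levels are all assumptions. Only `enable_thinking` and the minimal and high endpoints appear on this page.

```python
from dataclasses import dataclass
from typing import Optional

# Only "minimal" and "high" are named in the docs; the middle levels
# are assumed here for illustration.
EFFORT_LEVELS = ("minimal", "low", "medium", "high")

@dataclass
class ExpertSettings:
    temperature: float = 0.7
    top_p: float = 1.0                      # assumed sampling parameter
    enable_thinking: bool = True
    reasoning_effort: Optional[str] = None  # None leaves the engine default

    def to_request(self) -> dict:
        """Validate the dials and return a hypothetical request fragment."""
        if not 0.0 <= self.temperature <= 2.0:  # assumed bounds
            raise ValueError("temperature out of range")
        if (self.reasoning_effort is not None
                and self.reasoning_effort not in EFFORT_LEVELS):
            raise ValueError(f"unknown effort: {self.reasoning_effort}")
        params = {
            "temperature": self.temperature,
            "top_p": self.top_p,
            "enable_thinking": self.enable_thinking,
        }
        # Effort only makes sense when thinking is switched on.
        if self.enable_thinking and self.reasoning_effort:
            params["reasoning_effort"] = self.reasoning_effort
        return params

# Stop over-thinking a simple ask.
print(ExpertSettings(temperature=0.2, enable_thinking=False).to_request())
# Reason harder on a deep one.
print(ExpertSettings(reasoning_effort="high").to_request())
```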

You can change the model at any turn — the history stays, the next answer comes from the new model. Because the KV prefix cache is keyed per conversation, switching to a local model re-prefills (a one-time cost); switching to a cloud model just changes who bills the tokens.
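The re-prefill cost can be illustrated with a toy model of the cache. It assumes each conversation remembers which local model currently holds its KV prefix; the `Conversation` class, the `tokens_to_prefill` function, and the model names are hypothetical and say nothing about how the engine actually stores its cache.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Conversation:
    history_tokens: int = 0
    warm_model: Optional[str] = None  # local model holding this prefix, if any

def tokens_to_prefill(conv: Conversation, model: str, is_local: bool,
                      new_tokens: int) -> int:
    """Tokens a local engine must prefill for the next turn (toy estimate)."""
    if not is_local:
        # Cloud: nothing is prefilled locally; the provider bills the tokens.
        return 0
    if conv.warm_model == model:
        cost = new_tokens                        # prefix is already cached
    else:
        cost = conv.history_tokens + new_tokens  # one-time re-prefill
    conv.warm_model = model
    return cost

def finish_turn(conv: Conversation, new_tokens: int, reply_tokens: int) -> None:
    """Append the turn to the history once the answer is complete."""
    conv.history_tokens += new_tokens + reply_tokens

conv = Conversation(history_tokens=4000, warm_model="local-a")
print(tokens_to_prefill(conv, "local-a", True, 50))  # same model: new tokens only
print(tokens_to_prefill(conv, "local-b", True, 50))  # switch: whole history again
print(tokens_to_prefill(conv, "local-b", True, 50))  # warm again after the switch
```

In this model the cost of switching grows with the length of the history, so a switch late in a long conversation is more noticeable than one made after the first turn.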