Skip to content

Chat basics

Read once, refer back when something doesn’t behave the way you expect.

KeyDoes
Entersend
Shift+Enternewline in the input
Escstop the current stream (the partial reply stays)
⌘/Ctrl+Entersend and open a fresh chat

The input bar shows the active model on the left and Send on the right. The paperclip opens attachments; you can also drop files or paste images.

A conversation is a list of messages bound to a model (changeable mid-chat), an optional project (inherits its system prompt + memory), a frozen memory snapshot, and a session_id (the conversation UUID, passed to the engine for KV-cache reuse).

  • New chat — top-left Chat, or ⌘/Ctrl+N. Inside a project, the new chat inherits it.
  • Switch — click a sidebar row; the header (model, agent mode, memory) follows that conversation’s state.
  • Rename / Pin / Move to project — hover a row for the actions. Moving only re-tags; messages and snapshot stay intact (the project’s system prompt applies to the next turn).
  • Search — ⌘K matches title + last user message.

Hover a user bubble → pencil, or double-click it. ⌘/Ctrl+Enter resends, Esc cancels.

Editing truncates. Every message after the edited one is permanently deleted, then the model re-runs from the edit. There’s no branch tree — copy an assistant reply elsewhere first if you want to keep it.

You can also edit an assistant message (fix a typo, drop a hallucinated paragraph, pre-inject a tool result). Edited assistant content counts toward the next turn’s history and KV prefix.

  • Regenerate — same truncation rule; re-invokes with the original prompt. Regenerate with… lets you switch model or sampling first.
  • Esc stops a stream; the partial is saved. It does not auto-resume.
  • Continue — for a reply that hit max_tokens mid-sentence; sends a “continue from here” without truncating. (If the reply ended cleanly, Continue starts a new turn — use Regenerate instead.)

A header toggle, off by default.

  • Off — no tool definitions sent (~250-token prompt overhead, clean streaming).
  • On — the full toolset (fs_*, rag_search, web, every enabled MCP server). Skills are always on regardless.

When on, the model can emit tool_call blocks mid-reply; Companion runs them server-side and feeds results back, so one user turn can produce several tool calls + the final reply under one assistant message.

Each conversation freezes global + project memory at creation — that’s why a chat from yesterday still sees yesterday’s memory. Remember now (per-conv memory menu) re-snapshots at the current state for the next turn. The header Memory toggle opts a conversation out entirely. Full lifecycle in Memory.

  • chat (default) — text + multimodal.
  • talk — Voice Live mode, large-mic UI (see Voice).

The legacy hermes kind was retired (2026-05-19); old rows read as chat.

Hover a row → Delete (or selection mode for several). Confirmation required; no trash — deletes are immediate and cascade through the messages.