Voice & Talk
Hold a key, speak, release. The transcript lands in the composer for you to check before it goes anywhere.
Companion has two voice surfaces — one shipping, one coming.
Push-to-talk (shipping)
Hold Space anywhere outside a text input to dictate; release to send the transcript to the composer.
- Fully client-side. Transcription uses your browser’s Web Speech API — no audio ever leaves your device. The text drops into the input bar, not the conversation, so you can edit it before pressing Enter. (Mishears happen; a one-shot “you said this, sending it” flow is more frustrating than the half-second it saves.)
- Visual cue. While Space is held, the input shows a red dot + “Listening…”.
- Language follows your browser’s speech-recognition locale (usually your OS locale). Set the browser language to the one you speak.
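The transcript-to-composer behavior in the list above can be sketched roughly as follows. This is a minimal illustration, not Companion's actual code: `finalTranscript` and `appendToDraft` are hypothetical helper names, and the `*Like` interfaces are structural stand-ins for the Web Speech API result objects so the sketch does not depend on browser typings.

```typescript
// Structural stand-ins for the Web Speech API result objects
// (SpeechRecognitionResultList > SpeechRecognitionResult > SpeechRecognitionAlternative).
interface AlternativeLike {
  transcript: string;
}
interface ResultLike {
  isFinal: boolean;
  length: number;
  [index: number]: AlternativeLike;
}
interface ResultListLike {
  length: number;
  [index: number]: ResultLike;
}

// Join the top alternative of every finalized result into one string.
function finalTranscript(results: ResultListLike): string {
  const parts: string[] = [];
  for (let i = 0; i < results.length; i++) {
    const result = results[i];
    if (!result || !result.isFinal) continue; // skip interim hypotheses
    const best = result[0];
    if (best && best.transcript.trim() !== "") parts.push(best.transcript.trim());
  }
  return parts.join(" ");
}

// Hypothetical helper: merge dictated text into the composer draft.
// It only returns the new draft; nothing here submits a message.
function appendToDraft(draft: string, transcript: string): string {
  const spoken = transcript.trim();
  if (spoken === "") return draft; // nothing recognized, leave the draft alone
  const existing = draft.replace(/\s+$/, "");
  return existing === "" ? spoken : `${existing} ${spoken}`;
}
```

In a browser, a recognizer's `onresult` handler could pass `event.results` to `finalTranscript` and write the return value of `appendToDraft` back into the input bar. Sending remains a separate, explicit step (pressing Enter), which matches the review-before-send flow described above. Note that some browsers expose the recognizer constructor only under the prefixed name `webkitSpeechRecognition`.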
Space inserts a literal space instead of recording? Your focus is inside a text input — click anywhere on the message area first. The dot won’t appear while a typeable element is focused.
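The focus rule above can be illustrated with a small state machine. This is a sketch under assumptions: `isTypeable` and `createPushToTalk` are hypothetical names, and the set of elements treated as typeable (`INPUT`, `TEXTAREA`, `SELECT`, and content-editable elements) is a guess at what "a typeable element" covers, not a list taken from Companion.

```typescript
// Minimal shapes of the DOM objects this sketch reads.
interface FocusTarget {
  tagName: string;
  isContentEditable?: boolean;
}
interface KeyLike {
  code: string;     // KeyboardEvent.code, "Space" for the space bar
  repeat?: boolean; // true for auto-repeated keydown events while the key is held
}

// Assumption: these element kinds count as "typeable".
function isTypeable(el: FocusTarget | null): boolean {
  if (el === null) return false;
  const tag = el.tagName.toUpperCase();
  return tag === "INPUT" || tag === "TEXTAREA" || tag === "SELECT" || el.isContentEditable === true;
}

function createPushToTalk(onStart: () => void, onStop: () => void) {
  let listening = false;
  return {
    // Returns true when the key press was consumed (caller should preventDefault).
    keyDown(key: KeyLike, focused: FocusTarget | null): boolean {
      if (key.code !== "Space" || key.repeat === true || listening) return false;
      if (isTypeable(focused)) return false; // let the literal space through
      listening = true;
      onStart(); // e.g. show the red dot and start recognition
      return true;
    },
    // Returns true when a recording session was ended by this key release.
    keyUp(key: KeyLike): boolean {
      if (key.code !== "Space" || !listening) return false;
      listening = false;
      onStop(); // e.g. hide the cue and stop recognition
      return true;
    },
    isListening(): boolean {
      return listening;
    },
  };
}
```

In a page, the `keydown` and `keyup` listeners would pass the event and `document.activeElement` into these methods. Keeping the decision logic free of direct DOM access makes the "Space types a space inside an input" case easy to test in isolation.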
Voice mode & Talk (roadmap)
Auto-speak, meaning assistant replies streamed through text-to-speech (TTS) plus a full-screen Talk surface for hands-free conversation, is on the roadmap. The speaker icon and the 🎙 Talk button are reserved; the per-message Listen / Stop / Save controls appear once the backend ships. Push-to-talk (above) works today and is independent of this.
There’s also a Voice Live add-on (full-duplex voice via Gemini Live, requires a Gemini key) — a separate, key-gated path from the always-on, browser-side push-to-talk.
What stays local
| What | Where it runs |
|---|---|
| Push-to-talk transcription | 100% in your browser — no audio leaves the device |
| Assistant text | through the inference engine, like any chat |
Troubleshooting
- Browser asks for the mic on every reload: some browsers don’t persist Web Speech permission. Pin Companion as a PWA, or grant the site permanent mic permission in browser settings.
- Transcript is empty or in the wrong language: set the browser language to the one you’re speaking; Web Speech follows it.
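For the language item above, a quick diagnostic is to compare the language you speak with the browser's preferred locale. The sketch below assumes recognition follows the first entry of `navigator.languages`; this page only says it follows the browser's locale, so treat that as an assumption. `primaryLanguage` and `speaksBrowserLanguage` are hypothetical helper names.

```typescript
// Extract the primary language subtag: "pt-BR" -> "pt", "en_US" -> "en".
function primaryLanguage(tag: string): string {
  return (tag.trim().split(/[-_]/)[0] ?? "").toLowerCase();
}

// True when the spoken language matches the browser's first preferred locale.
// Assumption: the first non-empty entry is the one speech recognition uses.
function speaksBrowserLanguage(spoken: string, browserLanguages: readonly string[]): boolean {
  const first = browserLanguages.find((tag) => tag.trim() !== "");
  if (first === undefined) return false; // no locale information available
  return primaryLanguage(first) === primaryLanguage(spoken);
}
```

Calling `speaksBrowserLanguage("fr", navigator.languages)` in the browser console would return `false` when French is not the first preferred language, which points to the browser-language fix described above.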
Read next
- Chat basics: the regular chat flow.
- The chat window: where the composer and voice cue live.
- Troubleshooting: voice-specific issues.