Voice Routing
Voice routing is the core mechanism that connects your spoken words to a terminal. Callipso implements this as a state machine that handles the full lifecycle: listening, transcription, target selection, and text delivery.
The routing pipeline
Voice Input → STT Engine → Transcription → Target Selection → Terminal Delivery
Each step happens in sequence:
- Voice input — You speak while the STT engine is listening. Audio is captured via the macOS microphone API.
- STT engine — Parakeet (or your configured engine) converts audio to text. This produces a transcription string.
- Transcription — The text is placed on the system clipboard and a routing event is triggered.
- Target selection — Callipso determines which terminal should receive the text based on the current routing mode.
- Terminal delivery — The text is sent to the selected terminal via the IDE extension's API.
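The lifecycle above can be sketched as a small state machine. This is only an illustration and not Callipso's actual implementation: the state names, the `IDLE` resting state, and the `RoutingMachine` class are assumptions made for the example.

```python
from enum import Enum, auto


class RoutingState(Enum):
    """Lifecycle stages named after the pipeline steps (names are assumed)."""
    IDLE = auto()              # assumed resting state between utterances
    LISTENING = auto()
    TRANSCRIBING = auto()
    SELECTING_TARGET = auto()
    DELIVERING = auto()


# Each stage hands off to exactly one successor, so the steps run in sequence.
TRANSITIONS = {
    RoutingState.IDLE: RoutingState.LISTENING,
    RoutingState.LISTENING: RoutingState.TRANSCRIBING,
    RoutingState.TRANSCRIBING: RoutingState.SELECTING_TARGET,
    RoutingState.SELECTING_TARGET: RoutingState.DELIVERING,
    RoutingState.DELIVERING: RoutingState.IDLE,
}


class RoutingMachine:
    def __init__(self):
        self.state = RoutingState.IDLE
        self.history = [self.state]

    def advance(self):
        """Move to the next stage and record it."""
        self.state = TRANSITIONS[self.state]
        self.history.append(self.state)
        return self.state


def run_once(machine):
    """Drive one utterance through every stage, ending back at IDLE."""
    for _ in range(len(TRANSITIONS)):
        machine.advance()
    return [state.name for state in machine.history]
```

Calling `run_once(RoutingMachine())` walks through each stage once in pipeline order and returns the names of the states visited.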
Target selection modes
Auto-routing (default)
In auto-routing mode, Callipso sends text to the most recently focused terminal. The overlay tracks focus changes across all connected IDEs and maintains a "last active" pointer.
This works well when you have one or two terminals open and you naturally focus whichever one you want to talk to.
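A minimal sketch of the "last active" pointer, assuming focus events arrive as terminal IDs. The `FocusTracker` class and its method names are hypothetical, and the handling of closed terminals is an assumption rather than documented behavior.

```python
class FocusTracker:
    """Remembers which terminal was focused most recently (hypothetical)."""

    def __init__(self):
        self.last_active = None  # ID of the most recently focused terminal

    def on_focus(self, terminal_id):
        # Any focus change, from any connected IDE, moves the pointer.
        self.last_active = terminal_id

    def on_closed(self, terminal_id):
        # Assumption: a closed terminal should stop being a routing target.
        if self.last_active == terminal_id:
            self.last_active = None

    def auto_target(self):
        """Return the terminal that auto-routing would pick, or None."""
        return self.last_active
```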
Pinned routing
When a terminal is pinned, all voice input goes to that terminal regardless of focus. This is useful when:
- You are reading documentation in one pane but want commands to go to a specific terminal
- You have many terminals open and want predictable routing
- You are running a Claude Code session and want to ensure voice input stays in that session
Pin a terminal by pressing Ctrl+Shift+P while it is focused, or by clicking the pin icon in the Callipso terminal list.
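The precedence between a pin and focus can be expressed in a few lines. This is a sketch under the assumption that both values are terminal IDs, or `None` when unset; `select_target` is a hypothetical helper, not a Callipso function.

```python
from typing import Optional


def select_target(pinned: Optional[str], last_active: Optional[str]) -> Optional[str]:
    """Return the pinned terminal if one is set, else the last focused one."""
    if pinned is not None:
        return pinned        # a pin wins regardless of focus
    return last_active       # otherwise fall back to auto-routing
```

With a pin in place, `select_target("claude-term", "docs-pane")` returns `"claude-term"` even though another pane has focus.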
Session-aware routing
When Claude Code sessions are detected, Callipso uses session IDs for routing. Each Claude Code session registers itself via HTTP hooks, and Callipso maintains a bidirectional map of session IDs to terminal UUIDs.
This prevents cross-talk between sessions: if two Claude Code sessions are running in parallel, voice input is routed by session ID to the terminal that belongs to the intended session.
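One way to picture the bidirectional map is a pair of dictionaries kept in sync. The sketch below assumes a one-to-one relationship between sessions and terminals, which the description suggests but does not state outright; `SessionMap` and its methods are hypothetical names.

```python
class SessionMap:
    """Two-way lookup between session IDs and terminal IDs (hypothetical)."""

    def __init__(self):
        self._terminal_by_session = {}
        self._session_by_terminal = {}

    def bind(self, session_id, terminal_id):
        # Drop stale pairings first so both directions stay one-to-one.
        old_terminal = self._terminal_by_session.pop(session_id, None)
        if old_terminal is not None:
            self._session_by_terminal.pop(old_terminal, None)
        old_session = self._session_by_terminal.pop(terminal_id, None)
        if old_session is not None:
            self._terminal_by_session.pop(old_session, None)
        self._terminal_by_session[session_id] = terminal_id
        self._session_by_terminal[terminal_id] = session_id

    def terminal_for(self, session_id):
        return self._terminal_by_session.get(session_id)

    def session_for(self, terminal_id):
        return self._session_by_terminal.get(terminal_id)
```

Keeping both directions means a lookup is a single dictionary access whether routing starts from a session ID or from a terminal.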
How sessions register
Each Claude Code session registers when it starts by sending an HTTP POST request to http://localhost:3000/hooks/prompt-start. Claude Code generates the session ID and passes it to Callipso in the hook payload.
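For illustration, a registration request could be built like this with Python's standard library. The payload shape is an assumption: the field name `session_id` is hypothetical, and the real hook payload may carry different or additional fields.

```python
import json
import urllib.request

HOOK_URL = "http://localhost:3000/hooks/prompt-start"


def build_registration_request(session_id):
    """Build, but do not send, a POST carrying the session ID."""
    # Assumed payload: a JSON object with a single hypothetical field.
    body = json.dumps({"session_id": session_id}).encode("utf-8")
    return urllib.request.Request(
        HOOK_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Passing the result to `urllib.request.urlopen` would send it, which only succeeds if Callipso is listening on that port.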
Terminal discovery
Callipso discovers terminals through two mechanisms:
IDE extension polling — The IDE extension (running in VS Code, Cursor, or Windsurf) polls its terminal API every second and reports the list of open terminals to Callipso via HTTP.
Application adapters — Callipso also directly queries running applications (Terminal.app, iTerm2, Warp) via AppleScript or accessibility APIs for non-IDE terminals.
Each discovered terminal gets a stable unique ID based on its IDE, window, and index. This ID persists across polls so that pinned routing and session associations survive terminal list refreshes.
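A stable ID of this kind can be derived deterministically from the three inputs, so that every poll yields the same value for the same terminal. The sketch below uses a name-based UUID (`uuid.uuid5`) as one possible approach. The namespace, the key encoding, and the function name are assumptions, not Callipso's actual scheme.

```python
import json
import uuid

# Hypothetical namespace, used only to keep these IDs apart from other UUIDs.
TERMINAL_NAMESPACE = uuid.uuid5(uuid.NAMESPACE_DNS, "callipso.example")


def stable_terminal_id(ide, window, index):
    """Derive the same ID on every poll for the same IDE, window, and index."""
    # A JSON array keeps the three fields unambiguous even if one of them
    # contains a separator character.
    key = json.dumps([ide, window, index])
    return str(uuid.uuid5(TERMINAL_NAMESPACE, key))
```

Because the ID is a pure function of its inputs, nothing needs to be stored between polls for pins and session associations to keep pointing at the same terminal.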
Delivery mechanism
Once the target terminal is selected, text delivery depends on the terminal type:
| Terminal type | Delivery method |
|---|---|
| VS Code / Cursor / Windsurf | IDE extension API (sendText) |
| Terminal.app | AppleScript |
| iTerm2 | AppleScript |
| Warp | Accessibility API |
| Claude Desktop | HTTP API |
The IDE extension method is the most reliable because it uses the editor's built-in terminal API. The AppleScript and accessibility-based methods also work, but they can behave inconsistently with some terminal configurations.
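The table above amounts to a lookup from terminal type to delivery method. A minimal sketch of that dispatch follows; the enum, the dictionary keys, and the `delivery_method` function are hypothetical names, and the code only selects a method without delivering any text.

```python
from enum import Enum


class Delivery(Enum):
    IDE_EXTENSION = "IDE extension API (sendText)"
    APPLESCRIPT = "AppleScript"
    ACCESSIBILITY = "Accessibility API"
    HTTP = "HTTP API"


# Mirrors the delivery table row by row.
DELIVERY_BY_TERMINAL = {
    "VS Code": Delivery.IDE_EXTENSION,
    "Cursor": Delivery.IDE_EXTENSION,
    "Windsurf": Delivery.IDE_EXTENSION,
    "Terminal.app": Delivery.APPLESCRIPT,
    "iTerm2": Delivery.APPLESCRIPT,
    "Warp": Delivery.ACCESSIBILITY,
    "Claude Desktop": Delivery.HTTP,
}


def delivery_method(terminal_type):
    """Look up how text should be delivered to a given terminal type."""
    try:
        return DELIVERY_BY_TERMINAL[terminal_type]
    except KeyError:
        raise ValueError(f"no delivery method known for {terminal_type!r}") from None
```

Raising on an unknown terminal type, rather than guessing a fallback, is a design choice made for this example.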