
The only voice input tool that reads your active coding session and recognizes your project's variable names, class names, and API parameters in real-time.
Generic dictation tools have zero awareness of your codebase. They guess at technical terms and get them wrong every time.
Raw speech-to-text dumps a stream of words with no punctuation, no capitalization, and no structure. You spend more time fixing than coding.
Speaking Chinese with English code terms? Most tools force you to pick one language and mangle everything else.
A two-stage pipeline that understands both your voice and your code.
VoiceVibeCode reads your active Claude Code / Codex session, extracts variable names, class names, function names, and project-specific terminology.
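As a rough illustration of what extracting project terminology can look like, here is a minimal sketch that pulls camelCase, PascalCase, and snake_case identifiers out of a piece of source text. The `extract_terms` helper and its heuristics are assumptions made for this example, not VoiceVibeCode's actual extractor.

```python
import re
from collections import Counter

# Hypothetical helper for illustration only.
IDENTIFIER = re.compile(r"\b[A-Za-z_][A-Za-z0-9_]*\b")


def extract_terms(source: str, min_length: int = 4) -> list[str]:
    """Return identifiers that look project-specific, most frequent first."""
    counts = Counter(IDENTIFIER.findall(source))

    def looks_technical(name: str) -> bool:
        # An underscore or an inner capital suggests snake_case or camelCase.
        return "_" in name or any(c.isupper() for c in name[1:])

    terms = [n for n in counts if len(n) >= min_length and looks_technical(n)]
    # Sort by descending frequency, then alphabetically for a stable order.
    return sorted(terms, key=lambda n: (-counts[n], n))


sample = """
class UserService {
  authenticate(apiKey: string) { throw new TimeoutException(); }
}
const userService = new UserService();
"""
print(extract_terms(sample))
```

Plain words such as `authenticate` are skipped by this heuristic, since they carry no casing or underscore signal and a speech engine is likely to transcribe them correctly anyway.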
Choose WhisperKit (on-device), Qwen3-ASR (advanced Chinese), or Volcengine (cloud-based Flash API). Pick the engine that fits your workflow.
An LLM polishes the raw transcription using your coding context: fixes homophones, adds punctuation, removes fillers, and ensures technical terms are correct. This step can also run locally via NovaMLX.
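One way to picture this stage is as a prompt that combines the raw transcription with the extracted project terms. The sketch below only assembles such a prompt. The wording and the `build_polish_prompt` name are hypothetical, and the call to an actual LLM is left out.

```python
def build_polish_prompt(raw_text: str, project_terms: list[str]) -> str:
    """Assemble a clean-up prompt from dictated text and known project terms."""
    # Deduplicate and sort so the prompt is stable across runs.
    term_list = ", ".join(sorted(set(project_terms)))
    return (
        "Clean up this dictated text. Fix homophones, add punctuation, "
        "remove filler words, and keep the meaning unchanged.\n"
        f"Prefer these project terms when they fit: {term_list}\n"
        f"Dictated text: {raw_text}"
    )


print(build_polish_prompt("um call the user service", ["UserService", "apiKey"]))
```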
Text appears instantly in your terminal, IDE, browser, or any active application. No copy-paste needed.
VoiceVibeCode needs two system permissions to work. Both are standard macOS security prompts.
Required for global hotkeys (Push-to-Talk, language switching) and inserting text into other applications. VoiceVibeCode uses the macOS Accessibility API to type text directly at your cursor position.
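Required for recording your voice. With an on-device engine (WhisperKit or Qwen3-ASR), audio is processed entirely on your Mac and is never sent to a server. Audio leaves your Mac only if you choose the cloud-based Volcengine engine.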
Same voice input, different results. Active project: a React + TypeScript codebase.
Raw transcription: "Refactor the user service class to use the authenticate method instead of login and add error handling for time out exceptions"
With VoiceVibeCode: "Refactor the UserService class to use the authenticate method instead of login, and add error handling for TimeoutException."
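The term correction in the example above can be approximated without an LLM by snapping runs of spoken words onto known identifiers. This is a simplified sketch under that assumption. `snap_to_identifiers` is a hypothetical name, and the real product is described as using an LLM for this step.

```python
import re


def _normalize(text: str) -> str:
    """Lowercase and drop everything except letters and digits."""
    return re.sub(r"[^a-z0-9]", "", text.lower())


def snap_to_identifiers(transcript: str, identifiers: list[str],
                        max_window: int = 4) -> str:
    """Replace runs of 2..max_window words that spell out a known identifier."""
    lookup = {_normalize(name): name for name in identifiers}
    words = transcript.split()
    out = []
    i = 0
    while i < len(words):
        # Try the longest window first, down to two words.
        for size in range(min(max_window, len(words) - i), 1, -1):
            window = words[i:i + size]
            key = _normalize("".join(window))
            match = lookup.get(key)
            if match is None and key.endswith("s"):
                match = lookup.get(key[:-1])  # tolerate a spoken plural
            if match:
                # Keep any punctuation attached to the last spoken word.
                tail = re.search(r"\W*$", window[-1]).group()
                out.append(match + tail)
                i += size
                break
        else:
            out.append(words[i])
            i += 1
    return " ".join(out)


print(snap_to_identifiers(
    "use the user service class and handle time out exceptions.",
    ["UserService", "TimeoutException"],
))
```

Exact matching like this cannot repair homophones or add punctuation, which is where a context-aware LLM pass earns its place.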
Detects your active Claude Code or Codex session and extracts project-specific terminology. Works best with tmux for reliable working directory detection.
Post-processing via LLM removes fillers, fixes homophones, adds smart punctuation, and ensures technical terms match your project.
WhisperKit (on-device), Qwen3-ASR (advanced Chinese), or Volcengine (cloud Flash API). On-device or cloud — your choice.
Speak Chinese with English code terms, or any combination of 9 supported languages. Seamlessly.
Say "Enter" at the end to press Return. More commands coming soon.
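A trailing voice command like this can be detected by checking only the end of the transcription, so that "enter" in the middle of a sentence is left alone. The sketch below is an assumption about how such a check might work, with a hypothetical `split_trailing_enter` helper.

```python
import re

# Matches "enter" as the final word, with optional surrounding punctuation.
_TRAILING_ENTER = re.compile(r"[\s,]*\benter[\s.!]*$", re.IGNORECASE)


def split_trailing_enter(transcript: str) -> tuple[str, bool]:
    """Return (text_to_insert, press_return) for a dictated phrase."""
    match = _TRAILING_ENTER.search(transcript)
    if match is None:
        return transcript.strip(), False
    # Drop the command word and signal that Return should be pressed.
    return transcript[:match.start()].strip(), True


print(split_trailing_enter("run the tests enter"))
print(split_trailing_enter("please enter your name"))
```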
iTerm2, Terminal, VS Code, Xcode, Chrome, Safari, WeChat, Slack. Direct text insertion into any app.
Speech recognition can run on-device with WhisperKit or Qwen3-ASR. LLM post-processing can also run locally via NovaMLX, so no data has to leave your Mac.
Switch between formal, casual, and concise modes depending on where you're typing.
Your voice data never has to leave your Mac. Both stages of the pipeline can run entirely on-device.
VoiceVibeCode offers three ASR engines. WhisperKit and Qwen3-ASR run entirely on your Mac. Volcengine uses a cloud-based Flash API for instant results. Choose on-device for privacy, or cloud for convenience.
Pair with NovaMLX to run the LLM post-processing stage locally as well. With this setup, both stages run on your machine — your voice data never touches the internet.
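The privacy trade-off described here comes down to two choices: which ASR engine runs, and whether the LLM stage runs locally. The sketch below models that as a small table. The engine names and their on-device status come from the text above, while the data structure and the `is_fully_local` function are illustrative assumptions, not the app's real configuration format.

```python
# Where each ASR engine runs, as described in the text above.
ASR_ENGINES = {
    "WhisperKit": "on-device",
    "Qwen3-ASR": "on-device",
    "Volcengine": "cloud",
}


def is_fully_local(asr_engine: str, llm_runs_locally: bool) -> bool:
    """True only when both pipeline stages stay on the Mac."""
    if asr_engine not in ASR_ENGINES:
        raise ValueError(f"unknown ASR engine: {asr_engine}")
    return ASR_ENGINES[asr_engine] == "on-device" and llm_runs_locally


# WhisperKit plus a local LLM (for example via NovaMLX) keeps everything local.
print(is_fully_local("WhisperKit", llm_runs_locally=True))
print(is_fully_local("Volcengine", llm_runs_locally=True))
```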
VoiceVibeCode is purpose-built for Vibe Coding, but it's also an excellent voice input tool for everyday productivity. Smart punctuation, multi-language support, and direct text insertion work great in any application.
Designed for the AI-powered development workflow.