"Natural voice conversations with AI"
Runs on: Linux • macOS • Windows (WSL) | Python: 3.10+ | Tested: Ubuntu 24.04 LTS, Fedora 42
Overview
Voice Mode brings natural voice conversations to AI assistants like Claude and ChatGPT. Built on the Model Context Protocol (MCP), it provides a clean, reliable interface for adding voice capabilities to your AI workflows.
See It In Action
Voice Mode with Gemini CLI
Watch Voice Mode in action with Google's Gemini CLI, an open-source AI coding agent similar to Claude Code. This demo shows the seamless voice interaction that Voice Mode brings to AI coding assistants.
Quick Start
```bash
claude mcp add --scope user voice-mode uvx voice-mode
export OPENAI_API_KEY=your-openai-key
claude
> /converse
```
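For MCP clients that are configured through a JSON file (such as Claude Desktop), an equivalent server entry might look like the sketch below. This follows the standard MCP configuration format; check your client's documentation for the exact file location.

```json
{
  "mcpServers": {
    "voice-mode": {
      "command": "uvx",
      "args": ["voice-mode"],
      "env": {
        "OPENAI_API_KEY": "your-openai-key"
      }
    }
  }
}
```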
Featured LiveKit Integration
Enable room-based voice communication with LiveKit for distributed teams and advanced voice workflows. Perfect for multi-participant voice interactions and production deployments.
Optional Self-Hosted ASR & TTS
For complete privacy, run speech recognition and text-to-speech locally instead of using OpenAI. Both whisper.cpp and Kokoro-FastAPI provide OpenAI-compatible APIs for seamless integration.
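Because both services expose the OpenAI audio API, any OpenAI-compatible client can talk to them directly. Below is a minimal stdlib sketch of a speech-synthesis request against a local Kokoro-FastAPI instance; the port and base URL are assumptions, so adjust them to your deployment.

```python
import json
import urllib.request

# Base URL for a locally running Kokoro-FastAPI server.
# Port 8880 is an assumption; point this at your own instance.
BASE_URL = "http://localhost:8880/v1"

# Standard OpenAI speech-synthesis payload; any OpenAI-compatible
# server should accept this same request shape.
payload = {
    "model": "tts-1",
    "input": "Hello from a local text-to-speech server",
    "voice": "alloy",
}

req = urllib.request.Request(
    f"{BASE_URL}/audio/speech",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)

# Uncomment to run the request against your local server and save the audio:
# with urllib.request.urlopen(req) as resp:
#     with open("hello.mp3", "wb") as f:
#         f.write(resp.read())
```

The same pattern works for speech recognition against a local whisper.cpp server: only the endpoint path and payload change, not the client code.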
Features
Voice Conversations
Natural voice interactions with Claude through your microphone and speakers
Multiple Transports
Local microphone access or LiveKit rooms for distributed voice communication
OpenAI Compatible
Works with OpenAI's API and compatible services for speech processing
Simple Integration
Clean MCP protocol implementation that works seamlessly with Claude Desktop
Requirements
- OpenAI API key (or self-hosted STT & TTS services)
- Microphone and speakers for voice I/O (or LiveKit)
- Python 3.10+ (handled by uvx)