A minimalist, privacy-focused desktop application for offline speech-to-text on Linux. It types voice input directly into any active window (editors, browsers, IDEs, AI assistants) and runs the Whisper model locally for speech recognition. Written in Go.
- Offline speech-to-text and voice typing
- Portable: AppImage package
- Display server support: works on both X11 and Wayland
- Desktop Environment Support: Native integration with GNOME, KDE, and other Linux DEs
- Privacy-first: all processing happens on your desktop, no data sent to external servers
- Support: multi-language, global hotkeys, automatic typing & clipboard, system tray integration, visual notifications, WebSocket API (optional)
Download the latest AppImage from Releases:
Help us test different desktop environments:
📋 Desktop Environment Support Guide
For system tray integration on GNOME, install the AppIndicator extension
KDE and other DEs have built-in system tray support out of the box
For automatic typing on Wayland (GNOME and others), set up ydotool
X11 has native typing support with xdotool out of the box
If automatic typing doesn't work, the app falls back to clipboard mode (paste with Ctrl + V)
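The fallback order described above can be sketched in Go. This is only an illustration under stated assumptions, not the project's actual code: the function names `chooseOutputMethod` and `toolOnPath` are hypothetical, the session type is assumed to come from the `XDG_SESSION_TYPE` environment variable, and a tool is treated as usable if it is found on `PATH`.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
)

// chooseOutputMethod returns the name of the output path to use.
// sessionType is expected to be "x11" or "wayland" (as in XDG_SESSION_TYPE);
// hasTool reports whether a command-line tool is available.
// Hypothetical helper for illustration only.
func chooseOutputMethod(sessionType string, hasTool func(string) bool) string {
	switch sessionType {
	case "x11":
		if hasTool("xdotool") {
			return "xdotool"
		}
	case "wayland":
		if hasTool("ydotool") {
			return "ydotool"
		}
	}
	// No usable typing tool for this session: fall back to the clipboard.
	return "clipboard"
}

// toolOnPath checks whether a binary can be found on PATH.
func toolOnPath(name string) bool {
	_, err := exec.LookPath(name)
	return err == nil
}

func main() {
	method := chooseOutputMethod(os.Getenv("XDG_SESSION_TYPE"), toolOnPath)
	fmt.Println("output method:", method)
}
```

Finding `ydotool` on `PATH` does not guarantee that typing will work, since it also depends on its daemon and on input device permissions. A real check would likely need to attempt the typing and fall back on error.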
The AppImage release is the main distribution format. I'd appreciate feedback about your experience on your system!
Flatpak bundle is planned.
For issues and bug reports: GitHub Issues
See changes: CHANGELOG.md
Start onboarding with:
- ARCHITECTURE.md — system architecture and component design
- DEVELOPMENT.md — development workflow and build instructions
- CONTRIBUTING.md — contribution guidelines and how to help improve the project
- docker/README.md — Docker-based development
- OS: Linux (Ubuntu 20.04+, Fedora 35+, or similar)
- Desktop: X11 or Wayland environment
- Audio: Microphone/recording capability
- Storage: 277.2 MB (Whisper small q5 model, dependencies, Go binary)
- Memory: ~300 MB RAM during operation
- CPU: AVX-capable processor (Intel/AMD 2011+)
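To check the CPU requirement before installing, one option is to look for the `avx` flag in `/proc/cpuinfo`. The Go sketch below is a hedged example and not part of the application: `hasAVX` is a hypothetical helper, and it assumes an x86 Linux system where `/proc/cpuinfo` contains `flags` lines.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// hasAVX reports whether the given /proc/cpuinfo text lists the "avx" flag.
// It matches the flag exactly, so "avx2" alone does not count as "avx".
func hasAVX(cpuinfo string) bool {
	for _, line := range strings.Split(cpuinfo, "\n") {
		if !strings.HasPrefix(line, "flags") {
			continue
		}
		_, flags, found := strings.Cut(line, ":")
		if !found {
			continue
		}
		for _, f := range strings.Fields(flags) {
			if f == "avx" {
				return true
			}
		}
	}
	return false
}

func main() {
	data, err := os.ReadFile("/proc/cpuinfo")
	if err != nil {
		fmt.Println("could not read /proc/cpuinfo:", err)
		return
	}
	fmt.Println("AVX supported:", hasAVX(string(data)))
}
```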
- whisper.cpp for the excellent C++ implementation of OpenAI Whisper
- fyne.io/systray for cross-platform system tray support
- OpenAI for the original Whisper model
Shared with the community for privacy-conscious Linux users.
MIT — see LICENSE.
If you find Speak-to-AI useful, please consider supporting development.
