Press a keybind, speak, and get text output. waystt is a background speech-to-text tool that transcribes audio using OpenAI Whisper (default) or Google Speech-to-Text, then either types the result directly or copies it to the clipboard.
- Signal-driven: Press keybind → speak → get text (no GUI needed)
- Dual output modes: Direct typing or clipboard copy
- Background operation: Runs continuously, always ready
- Audio feedback: Beeps confirm recording start/stop and success
- Wayland native: Works with modern Linux desktops (Hyprland, Niri, etc.)
- Wayland desktop (Hyprland, Niri, GNOME, KDE, etc.)
- OpenAI API key (for Whisper transcription), or a Google Cloud service account key if you use the Google provider
- System packages:
Set up ydotool permissions:
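A minimal sketch of this step, built around the `usermod` command quoted in the troubleshooting notes of this README. The `in_input_group` helper is a hypothetical name introduced here to check the result; it is not part of waystt or ydotool.

```shell
# Add the current user to the input group (needed for ydotool).
# Requires root and takes effect after logging out and back in:
#   sudo usermod -a -G input "$USER"

# Hypothetical helper: succeeds if a space-separated group list contains "input".
in_input_group() {
    case " $1 " in
        *" input "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Check the groups of the current login session.
if in_input_group "$(id -nG)"; then
    echo "input group: ok"
else
    echo "input group: missing (run usermod, then log in again)"
fi
```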
- Download from GitHub Releases
- Install:
- Set up configuration:
- Start the service:
- Use with signals:
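As a rough sketch of the signal step: the two signals named in this README (SIGUSR1 to type, SIGUSR2 to copy to clipboard) can be sent with `pkill`. The `waystt_send` wrapper and its `type`/`copy` mode names are hypothetical conveniences, and the process name `waystt` is an assumption based on the project name.

```shell
# Hypothetical wrapper: map a mode name to the signal waystt listens for.
waystt_send() {
    case "$1" in
        type) sig=USR1 ;;   # transcribe and type directly
        copy) sig=USR2 ;;   # transcribe and copy to clipboard
        *) echo "usage: waystt_send type|copy" >&2; return 2 ;;
    esac
    # -x matches the exact process name; assumes the binary is called "waystt".
    pkill -"$sig" -x waystt
}
```

With waystt already running, `waystt_send type` would request typed output and `waystt_send copy` would request clipboard output.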
Add to your ~/.config/hypr/hyprland.conf:
These keybindings will:
- Super+R: Start waystt if not running, or send SIGUSR1 to transcribe and type directly
- Super+Shift+R: Start waystt if not running, or send SIGUSR2 to transcribe and copy to clipboard
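The start-or-signal behaviour described for these keybindings can be sketched as a small shell helper. This is an illustration, not the project's own binding: the function name `waystt_start_or_signal` is hypothetical, and it assumes the process is named `waystt`.

```shell
# Hypothetical helper for a keybinding: start waystt if it is not running,
# otherwise send it the given signal (USR1 = type, USR2 = clipboard).
waystt_start_or_signal() {
    sig="${1:-USR1}"
    if pgrep -x waystt >/dev/null 2>&1; then
        pkill -"$sig" -x waystt
    else
        # Not running yet: launch in the background and discard its output.
        waystt >/dev/null 2>&1 &
    fi
}
```

Checking with `pgrep` first means the same key can both launch the tool and trigger it, so no separate startup step is needed.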
Add to your ~/.config/niri/config.kdl:
Configuration is read from ~/.config/waystt/.env by default. You can override this location using the --envfile flag:
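For example, a second configuration could be kept next to the default one and selected at startup. The file name `work.env` is a hypothetical choice for illustration; only the `--envfile` flag and the default path come from this README.

```shell
# Hypothetical alternate config file; the default is ~/.config/waystt/.env.
ENVFILE="$HOME/.config/waystt/work.env"

# Start waystt with the alternate file only if the binary and the file exist.
if command -v waystt >/dev/null 2>&1 && [ -f "$ENVFILE" ]; then
    waystt --envfile "$ENVFILE" &
else
    echo "skipping: waystt or $ENVFILE not found"
fi
```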
waystt supports two transcription providers: OpenAI Whisper (default) and Google Speech-to-Text. Choose the one that best fits your needs.
OpenAI Whisper offers excellent accuracy and supports automatic language detection.
Required: Create ~/.config/waystt/.env with your OpenAI API key:
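One way to create the file from a shell is sketched below. The variable name `OPENAI_API_KEY` is an assumption based on the conventional name for OpenAI keys, so check it against the project's example configuration. The `write_openai_env` helper is hypothetical and overwrites any existing `.env` in the target directory.

```shell
# Hypothetical helper: write a one-line .env file holding the API key.
# Usage: write_openai_env DIR KEY
write_openai_env() {
    dir="$1"
    key="$2"
    mkdir -p "$dir" || return 1
    # Create the file inside a subshell with a strict umask so that
    # only the current user can read the key.
    ( umask 077 && printf 'OPENAI_API_KEY=%s\n' "$key" > "$dir/.env" )
}

# Typical call (replace the placeholder with a real key):
#   write_openai_env "$HOME/.config/waystt" "your_api_key_here"
```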
Optional OpenAI settings:
Google Speech-to-Text provides fast, accurate transcription with support for many languages and dialects.
Setup steps:
- Enable Google Cloud Speech-to-Text API:
  - Go to Google Cloud Console
  - Create a new project or select an existing one
  - Enable the "Cloud Speech-to-Text API"
  - Create a service account and download the JSON key file
- Configure waystt for Google:
Popular Google language codes:
- en-US - English (United States)
- en-GB - English (United Kingdom)
- es-ES - Spanish (Spain)
- fr-FR - French (France)
- de-DE - German (Germany)
- ja-JP - Japanese
- zh-CN - Chinese (Simplified)
Audio and system settings (apply to both providers):
If audio recording fails:
- Ensure PipeWire is running: systemctl --user status pipewire
- Check microphone permissions
- Verify microphone is not muted
If direct text typing (SIGUSR1) fails:
- Ensure ydotool is installed and your user is in the input group
- If it is not, add it: sudo usermod -a -G input $USER (requires re-login)
- Verify ydotool daemon is running: systemctl --user status ydotool
If clipboard operations (SIGUSR2) fail:
- Ensure you're running under Wayland: echo $WAYLAND_DISPLAY
- Install wtype, which is required for pasting from the clipboard
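The two clipboard checks above can be combined into a quick preflight script. This is a sketch only; `check_clipboard_prereqs` is a hypothetical name, and it tests nothing beyond the two items listed here.

```shell
# Hypothetical preflight check for the clipboard (SIGUSR2) path.
# Prints one line per problem found and returns non-zero if any exist.
check_clipboard_prereqs() {
    rc=0
    if [ -z "${WAYLAND_DISPLAY:-}" ]; then
        echo "WAYLAND_DISPLAY is not set (not a Wayland session?)"
        rc=1
    fi
    if ! command -v wtype >/dev/null 2>&1; then
        echo "wtype not found in PATH"
        rc=1
    fi
    return "$rc"
}
```

Running `check_clipboard_prereqs && echo "clipboard prerequisites look fine"` in the same session waystt runs in gives a quick yes/no answer.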
OpenAI Provider:
- Verify your OpenAI API key is valid and has sufficient credits
- Check internet connectivity
- Review logs for specific error messages
Google Provider:
- Verify your service account JSON file path is correct
- Ensure the Speech-to-Text API is enabled in your Google Cloud project
- Check that your service account has the necessary permissions
- Verify your Google Cloud project has billing enabled
- Review logs for specific error messages
Licensed under GPL v3.0 or later. Source code: https://github.com/sevos/waystt
See LICENSE for full terms.

