Show HN: Dumb STT/dictation script for sway-Linux


A simple, hotkey-driven voice transcription script designed for the Sway window manager. Captures audio, transcribes via API, and inserts text at cursor position. Nvidia Parakeet backend included if you want it.
Purely for personal use, satisfaction not guaranteed.

Install the required dependencies:

    # Ubuntu/Debian
    sudo apt install alsa-utils curl wtype libnotify-bin jq

    # Arch Linux
    sudo pacman -S alsa-utils curl wtype libnotify jq

Create your own configuration by copying the example file with cp config.env.example config.env, then edit config.env:

API_ENDPOINT="http://localhost:8000/transcribe" # Or wherever your OpenAI-compatible STT API lives
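For orientation, here is a rough sketch of how a script could pick up config.env and send a recording to that endpoint. It is not the actual voice-to-text.sh: the function names load_config and transcribe are made up for illustration, and the multipart field name "file" and the "text" key in the JSON response are assumptions based on the OpenAI transcription API shape, so check what your backend really expects.

```shell
#!/usr/bin/env bash
# Sketch only: load_config and transcribe are hypothetical helper names.

load_config() {
    local config_file="${1:-config.env}"
    # Source the file only if it exists, so a missing config is not fatal.
    if [ -f "$config_file" ]; then
        # shellcheck disable=SC1090
        . "$config_file"
    fi
    # Fall back to the local address used in the example config.
    API_ENDPOINT="${API_ENDPOINT:-http://localhost:8000/transcribe}"
}

transcribe() {
    local wav="$1"
    # ASSUMPTION: the server accepts a multipart upload in a field named "file"
    # and answers with JSON that has a top-level "text" key.
    curl -sS -X POST -F "file=@${wav}" "$API_ENDPOINT" | jq -r '.text // empty'
}
```

With those in place, something like load_config followed by transcribe /tmp/recording.wav would print the transcript, provided the backend matches the assumptions above.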

Set up hotkey in Sway: Add to your ~/.config/sway/config:

bindsym $mod+Shift+v exec /path/to/steno/voice-to-text.sh

Start Recording: Press your configured hotkey
- Shows "🎤 Recording started..." notification
Stop Recording: Press the same hotkey again
- Shows "🔄 Transcribing..." notification
- Transcribes audio and inserts text at cursor
- Shows "✅ Text inserted..." confirmation
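The press-once-to-start, press-again-to-stop behaviour above can be approximated with a PID file. This is a minimal sketch under stated assumptions, not the script's real code: the /tmp paths, the function names, and the 16 kHz mono recording format are all picked for illustration.

```shell
#!/usr/bin/env bash
# Sketch only: paths and function names are hypothetical.
PID_FILE="${PID_FILE:-/tmp/steno-sketch.pid}"
WAV_FILE="${WAV_FILE:-/tmp/steno-sketch.wav}"

record_audio() {
    # exec replaces the background subshell, so $! below is arecord's own PID.
    # ASSUMPTION: 16 kHz, 16-bit, mono WAV is acceptable to the backend.
    exec arecord -q -t wav -f S16_LE -r 16000 -c 1 "$WAV_FILE"
}

is_recording() {
    # A recording is in progress if the PID file exists and the process is alive.
    [ -f "$PID_FILE" ] && kill -0 "$(cat "$PID_FILE")" 2>/dev/null
}

toggle() {
    if is_recording; then
        # Second press: stop the recorder; transcription and wtype would follow here.
        kill "$(cat "$PID_FILE")"
        rm -f "$PID_FILE"
        echo "stopped"
    else
        # First press: start recording in the background and remember its PID.
        record_audio >/dev/null 2>&1 &
        echo $! > "$PID_FILE"
        echo "started"
    fi
}
```

Calling toggle from the Sway binding would then alternate between the two states; the notify-send notifications and the wtype insertion slot in at the commented points.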

I like to live dangerously, have an NVIDIA GPU (CUDA > 12.1), and containers don't scare me (in small doses)

Fine, here you go:

    git clone https://github.com/Shadowfita/parakeet-fastapi.git
    cd parakeet-fastapi
    docker build -t parakeet-stt .
    docker run -d -p 8000:8000 --gpus all parakeet-stt

    # Then go back to the dir before
    cd ../
    git clone https://github.com/winston-bosan/steno.git
    cd steno
    chmod +x voice-to-text.sh
    ./voice-to-text.sh
    # SAY YOUR STUFF
    ./voice-to-text.sh
