TUI app: give it a YouTube URL (or a path to a local video or audio file) and you get a transcription, with possible speaker identification (WIP) and an optional summary or translation, all thanks to open-source AI tooling and my lack of free time to watch content-sparse YouTube videos.
- transcribe YouTube videos by URL
- output metadata about the video
- speaker identification (WIP; probably an LLM in conjunction with a speaker diarization library)
- summarization via `summarize` (requires `OPENAI_API_KEY` to be set)
- translation via `translate <language_name>` (requires `OPENAI_API_KEY` to be set)
- accepts almost any audio or video format that ffmpeg can handle as input, not just YouTube URLs
- test suite (run it with `yt-transcriber TEST` or `TEST=1 yt-transcriber`)
- support for other video platforms (planned)
- conversion of all this into a web service or web app (planned)
Speaker identification ("diarization"), summarization, and translation will probably require an API key for Claude or OpenAI and/or one from Hugging Face.
NEW: If you have Nix installed or are running NixOS, just symlink `yt-transcriber`, `summarize`, and `translate` into any directory on your PATH (usually `~/bin`, or `XDG_BIN_HOME`, which is usually `~/.local/bin`) and you're good to go; the last two require `OPENAI_API_KEY` to be defined in your environment. The shell scripts will automatically procure all dependencies deterministically and locally, and cache them.
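The symlink setup might look like this (a sketch, assuming you run it from the repo root; the script names and the `XDG_BIN_HOME` fallback are taken from the description above):

```shell
# Link the three entry-point scripts into a directory on your PATH.
# Assumes the repo is checked out in the current directory.
BIN_DIR="${XDG_BIN_HOME:-$HOME/.local/bin}"
mkdir -p "$BIN_DIR"
for script in yt-transcriber summarize translate; do
  ln -sf "$PWD/$script" "$BIN_DIR/$script"
done
```

After this, `yt-transcriber <url>` works from any directory, provided `$BIN_DIR` is actually on your PATH.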
If you do not have Nix installed, I recommend using the Determinate Nix Installer from here: https://github.com/DeterminateSystems/nix-installer
If you refuse to use Nix, you can try installing the following dependencies manually, but I make no guarantees:
(`glow` is optional; when using the `--markdown`/`-md` argument with `summarize`, piping to it makes things prettier in the terminal.) The Python dependencies will be installed via pip into a venv cached in `$XDG_CACHE_HOME/yt-transcriber/.venv`, where `XDG_CACHE_HOME` defaults to `~/.cache` if not set. The Whisper model will be downloaded to `$XDG_CACHE_HOME/yt-transcriber/.whisper`.
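Put together, the cache layout looks like this (a sketch; the paths simply follow the description above):

```shell
# Resolve the cache root the same way the app does: XDG_CACHE_HOME,
# falling back to ~/.cache when it is unset.
CACHE_ROOT="${XDG_CACHE_HOME:-$HOME/.cache}/yt-transcriber"
echo "pip venv:       $CACHE_ROOT/.venv"
echo "Whisper models: $CACHE_ROOT/.whisper"
# Deleting either directory is safe; the venv is rebuilt and the model
# re-downloaded on the next run, e.g.:
#   rm -rf "$CACHE_ROOT/.venv"
```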
The flake.nix file manages all dependencies, so just run `nix develop` when in the repo. `./test_flake.sh` tests whether everything is set up correctly; `./yt-transcriber TEST` tests the app itself. No API keys are needed, since Whisper runs locally. Setup has only been tested on a Mac with Nix installed so far; tests for running without Nix are next.
`./yt-transcriber` by itself will list options and usage (such as `-m <modelsize>`).
By default the app uses the small model (the second smallest); I recommend at least small for better transcription results without costing too much extra processing time. The available sizes are: base, small, medium, large, large-v2.
The transcript is written to stdout, so you can redirect it to a file or pipe it to another program, such as the provided `./summarize [--markdown]` or `./translate <language>` scripts (see below).
If you set the DEBUG env var (to anything), you'll get additional logging/debug info to stderr.
For a full debug run, try something like this:
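A hypothetical invocation (the URL is a placeholder; the `command -v` guard is only there so the snippet fails gracefully if the script isn't on your PATH yet):

```shell
# DEBUG can be set to any value; debug output goes to stderr while the
# transcript still arrives on stdout, so the two can be captured separately.
if command -v yt-transcriber >/dev/null 2>&1; then
  DEBUG=1 yt-transcriber -m medium 'https://www.youtube.com/watch?v=...' \
    > transcript.txt 2> debug.log
else
  echo "yt-transcriber not found on PATH; see the setup notes above" >&2
fi
```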
*(Screenshot: a full debug run transcribing the "New Glenn Mission NG-2 Webcast" video.)*
