Transcribe videos from YouTube, local files, or stream links locally

CLI tool for extracting searchable transcriptions from YouTube videos, local files, and M3U8 streams using local Whisper models.

Multi-source: YouTube, local videos (.mp4, .avi, .mov, .mkv, .m4v), M3U8 streams
Local transcription: MLX Whisper (macOS) or OpenAI Whisper (cross-platform)
SQLite storage: Searchable database with Datasette web interface
Batch processing: YAML configuration for multiple videos

Prerequisites: Python 3.9+, FFmpeg

    # macOS
    brew install ffmpeg
    # Ubuntu/Debian
    sudo apt install ffmpeg
    # Windows
    # Download from https://ffmpeg.org/download.html

    vid2text youtube "https://youtu.be/VIDEO_ID"
    vid2text local "/path/to/video.mp4"
    vid2text m3u8 "https://example.com/stream.m3u8"
    # With options
    vid2text --model small.en --verbose youtube "https://youtu.be/..."
    vid2text --dry-run local video.mp4  # Preview only
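The three subcommands correspond to the three source types. As an illustration only, a wrapper script could pick the subcommand from the input string. The helper name `guess_subcommand` and the set of YouTube hostnames are assumptions made for this sketch, not part of vid2text.

```python
from pathlib import Path
from urllib.parse import urlparse

# Extensions the README lists for local videos.
VIDEO_EXTENSIONS = {".mp4", ".avi", ".mov", ".mkv", ".m4v"}
# Hostnames treated as YouTube in this sketch (an assumption, not from vid2text).
YOUTUBE_HOSTS = {"youtu.be", "youtube.com", "www.youtube.com", "m.youtube.com"}


def guess_subcommand(source: str) -> str:
    """Return 'youtube', 'm3u8', or 'local' for a source string (hypothetical helper)."""
    parsed = urlparse(source)
    if parsed.scheme in ("http", "https"):
        host = (parsed.hostname or "").lower()
        if host in YOUTUBE_HOSTS:
            return "youtube"
        if parsed.path.lower().endswith(".m3u8"):
            return "m3u8"
        raise ValueError(f"unrecognized URL: {source}")
    # Anything that is not an http(s) URL is treated as a filesystem path.
    if source.endswith("/") or Path(source).suffix.lower() in VIDEO_EXTENSIONS:
        return "local"
    raise ValueError(f"unrecognized source: {source}")
```

Raising on unrecognized input, rather than guessing, keeps a typo from being sent to the wrong subcommand.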

Create config.yaml:

    videos:
      youtube:
        - url: "https://youtu.be/dQw4w9WgXcQ"
        - url: "https://youtu.be/jNQXAC9IVRw"
          title: "Custom Title"  # Optional
      local:
        - path: "/path/to/video.mp4"
        - path: "/path/to/folder/"  # Process all videos in folder
          title: "Folder Videos"
      m3u8:
        - url: "https://example.com/video.m3u8"
          title: "Live Stream"
          order: 1
    settings:  # Optional
      whisper_model: "small.en"  # Override default model
      log_level: "DEBUG"
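To make the shape of this configuration concrete, here is a minimal sketch that walks an already-parsed config (written as a plain Python dict, as a YAML parser would produce) and lists the jobs it describes. The function `expand_jobs` is hypothetical, and the sketch makes no claim about how vid2text itself reads the file or what it does with `order`.

```python
from __future__ import annotations

# Hand-written equivalent of the parsed configuration shown above.
config = {
    "videos": {
        "youtube": [
            {"url": "https://youtu.be/dQw4w9WgXcQ"},
            {"url": "https://youtu.be/jNQXAC9IVRw", "title": "Custom Title"},
        ],
        "local": [
            {"path": "/path/to/video.mp4"},
            {"path": "/path/to/folder/", "title": "Folder Videos"},
        ],
        "m3u8": [
            {"url": "https://example.com/video.m3u8", "title": "Live Stream", "order": 1},
        ],
    },
    "settings": {"whisper_model": "small.en", "log_level": "DEBUG"},
}


def expand_jobs(config: dict) -> list[tuple[str, str, str | None]]:
    """Flatten the 'videos' section into (source, target, title) tuples."""
    jobs = []
    for source, entries in config.get("videos", {}).items():
        for entry in entries or []:
            # YouTube and M3U8 entries carry 'url'; local entries carry 'path'.
            target = entry.get("url") or entry.get("path")
            if target is None:
                raise ValueError(f"{source} entry needs 'url' or 'path': {entry}")
            jobs.append((source, target, entry.get("title")))
    return jobs


for job in expand_jobs(config):
    print(job)
```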

Process:

    vid2text process config.yaml
    vid2text --dry-run process config.yaml  # Preview

    vid2text stats                       # Show video count
    vid2text --db-path custom.db stats   # Custom database
    vid2text view                        # Launch web interface (requires datasette)
    vid2text view --port 8080            # Custom port
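Because the storage is a regular SQLite file, it can also be inspected directly from Python. The sketch below lists every table with its row count and deliberately assumes nothing about the schema, which this README does not document. The function name `table_row_counts` is made up for the example.

```python
import sqlite3
from pathlib import Path


def table_row_counts(db_path: str) -> dict[str, int]:
    """Return {table_name: row_count} for a SQLite file, opened read-only."""
    # mode=ro prevents SQLite from creating an empty file if the path is wrong.
    uri = Path(db_path).expanduser().resolve().as_uri() + "?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        names = [
            row[0]
            for row in conn.execute(
                "SELECT name FROM sqlite_master "
                "WHERE type = 'table' AND name NOT LIKE 'sqlite_%'"
            )
        ]
        # Table names come from sqlite_master, so quoting them is sufficient here.
        return {
            name: conn.execute(f'SELECT COUNT(*) FROM "{name}"').fetchone()[0]
            for name in names
        }
    finally:
        conn.close()
```

Calling `table_row_counts("~/.vid2text/knowledge.db")` would report on the default database location.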

Variable              Default                    Description
VIDEO_DB_PATH         ~/.vid2text/knowledge.db   Database file location
LOG_LEVEL             INFO                       Logging verbosity
TRANSCRIPTION_ENGINE  Auto-detected              mlx-whisper or openai-whisper
WHISPER_MODEL         Auto-selected              Model name (see below)
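As a rough sketch of how these variables and their defaults fit together, the function below reads them from an environment mapping. It is an illustration under the assumption that an unset engine or model means "auto"; it is not vid2text's own configuration code, and `resolve_settings` is a hypothetical name.

```python
import os
from pathlib import Path


def resolve_settings(env=None) -> dict:
    """Read the documented variables, falling back to the documented defaults."""
    env = os.environ if env is None else env
    return {
        "db_path": Path(env.get("VIDEO_DB_PATH", "~/.vid2text/knowledge.db")).expanduser(),
        "log_level": env.get("LOG_LEVEL", "INFO").upper(),
        # None stands for "auto-detected" / "auto-selected" in this sketch.
        "engine": env.get("TRANSCRIPTION_ENGINE"),
        "model": env.get("WHISPER_MODEL"),
    }


print(resolve_settings({"LOG_LEVEL": "debug", "WHISPER_MODEL": "small.en"}))
```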

macOS (Apple Silicon) - MLX Whisper:

mlx-community/whisper-medium.en-mlx (default) - Good balance
mlx-community/whisper-large-v3-mlx - Best accuracy, slower
mlx-community/whisper-small.en-mlx - Faster, less accurate

Cross-platform - OpenAI Whisper:

base.en (default) - Good balance
tiny.en - Fastest
small.en - Better accuracy
medium.en - High accuracy
large - Best accuracy
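The auto-detection described above could look roughly like the following: Apple Silicon Macs get MLX Whisper with its default model, everything else gets OpenAI Whisper with `base.en`. How vid2text actually detects the platform is not stated here, so the checks against `platform.system()` and `platform.machine()` are an assumption.

```python
import platform


def default_engine_and_model(system=None, machine=None):
    """Return (engine, model) defaults for a platform (hypothetical helper)."""
    system = system or platform.system()
    machine = machine or platform.machine()
    # macOS reports "Darwin"; Apple Silicon reports "arm64".
    if system == "Darwin" and machine == "arm64":
        return "mlx-whisper", "mlx-community/whisper-medium.en-mlx"
    return "openai-whisper", "base.en"


print(default_engine_and_model())
```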

--db-path PATH - Custom database location
--model MODEL - Override Whisper model
--verbose/-v - Increase logging (use -vv for debug)
--dry-run - Preview operations without processing
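When driving the CLI from a script, it can help to assemble the argument list programmatically. This sketch uses only the options listed above and places them before the subcommand, as the usage examples do. `build_command` is a hypothetical helper, not something vid2text ships.

```python
from __future__ import annotations


def build_command(
    subcommand: str,
    target: str | None = None,
    *,
    model: str | None = None,
    db_path: str | None = None,
    verbosity: int = 0,
    dry_run: bool = False,
) -> list[str]:
    """Build an argv list for vid2text with global options before the subcommand."""
    argv = ["vid2text"]
    if db_path:
        argv += ["--db-path", db_path]
    if model:
        argv += ["--model", model]
    if verbosity > 0:
        argv.append("-" + "v" * verbosity)  # -v, or -vv for debug
    if dry_run:
        argv.append("--dry-run")
    argv.append(subcommand)
    if target is not None:
        argv.append(target)
    return argv


print(build_command("local", "video.mp4", dry_run=True))
```

The resulting list can be passed to `subprocess.run` without going through a shell, which avoids quoting problems with URLs and paths.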

    # Custom model and database
    WHISPER_MODEL=small.en vid2text --db-path ./videos.db youtube "https://youtu.be/..."
    # Debug processing issues
    vid2text -vv local problematic_video.mp4
    # Batch process with custom settings
    VIDEO_DB_PATH=./project.db LOG_LEVEL=DEBUG vid2text process videos.yaml
    # Quick stats check
    vid2text stats | grep "Total videos"

FFmpeg not found:

    # Verify installation
    ffmpeg -version
    # Add to PATH if needed
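The same check can be done from Python, which is handy when the tool runs inside a virtual environment or a service whose PATH differs from the interactive shell. This is a small sketch using the standard library; the function name `ffmpeg_version` is made up for the example.

```python
import shutil
import subprocess


def ffmpeg_version(executable: str = "ffmpeg"):
    """Return the first line of `<executable> -version`, or None if it is not on PATH."""
    path = shutil.which(executable)
    if path is None:
        return None
    result = subprocess.run([path, "-version"], capture_output=True, text=True, check=False)
    lines = result.stdout.splitlines()
    return lines[0] if lines else ""


version = ffmpeg_version()
print(version if version is not None else "FFmpeg not found on PATH")
```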

Out of memory during transcription:

Try a smaller Whisper model: --model tiny.en
Close other applications
Use MLX Whisper on Apple Silicon for better memory efficiency

Database locked error:

Close any open Datasette instances
Check if another vid2text process is running
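To see whether another process currently holds a write lock on the database, one option is to attempt a short write transaction and observe whether SQLite refuses it. This is a generic SQLite probe, offered as a sketch. It assumes nothing about vid2text internals, and `is_write_locked` is a hypothetical name.

```python
import sqlite3


def is_write_locked(db_path: str, timeout: float = 0.1) -> bool:
    """Return True if a write transaction cannot be started within `timeout` seconds."""
    # Note: sqlite3.connect creates the file if it does not exist yet.
    conn = sqlite3.connect(db_path, timeout=timeout, isolation_level=None)
    try:
        conn.execute("BEGIN IMMEDIATE")  # asks for the write lock up front
        conn.execute("ROLLBACK")         # release it again without changing anything
        return False
    except sqlite3.OperationalError as exc:
        if "locked" in str(exc).lower():
            return True
        raise
    finally:
        conn.close()
```

The probe starts and rolls back an empty transaction, so it does not modify any data.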

vid2text youtube <url> - Process YouTube video
vid2text local <path> - Process local video/folder
vid2text m3u8 <url> - Process M3U8 stream
vid2text process <config.yaml> - Batch process from YAML
vid2text stats - Show database statistics
vid2text view [--port PORT] - Launch Datasette web interface

    git clone https://github.com/yourusername/vid2text.git
    cd vid2text
    python -m venv venv && source venv/bin/activate
    pip install -e ".[test]"
    # Run CLI
    vid2text --help
    # Run tests
    pytest
