Show HN: Open-source CLI to semantically ask your Gmail using local LLMs
Lightweight CLI agent to semantically search and ask your emails. Downloads inbox, generates embeddings using local (or external) LLMs, and stores everything in a vector database on your machine. Supports incremental sync for fast updates.
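The pipeline described above (embed emails, store vectors, search by meaning) can be sketched in a few lines of pure Python. The toy word-count "embedding" below is only illustrative; the real tool gets vectors from models like nomic-embed-text via Ollama and stores them in ChromaDB.

```python
import math

# Toy embedding: a normalized word-count vector over a shared vocabulary.
# Real embeddings come from models like nomic-embed-text and have
# hundreds of dimensions; this sketch only shows the retrieval idea.
def embed(text: str, vocab: list[str]) -> list[float]:
    words = text.lower().split()
    vec = [float(words.count(w)) for w in vocab]
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

def cosine(a: list[float], b: list[float]) -> float:
    return sum(x * y for x, y in zip(a, b))

emails = [
    "Your Amazon order has shipped",
    "Production outage postmortem notes",
    "WiFi password for the apartment is hunter2",
]
vocab = sorted({w for e in emails for w in e.lower().split()})

# "Index" each email once, then rank by similarity to the query --
# the same nearest-neighbour lookup a vector store like ChromaDB performs.
index = [(body, embed(body, vocab)) for body in emails]
query_vec = embed("what is the wifi password", vocab)
best = max(index, key=lambda pair: cosine(query_vec, pair[1]))
print(best[0])  # -> WiFi password for the apartment is hunter2
```

Incremental sync then only embeds messages that arrived since the last run, so re-indexing stays cheap.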
By default, Semantic Mail uses Ollama for local embeddings and ChromaDB for vector storage.
Semantic Mail requires Python 3.11+, Ollama, and Gmail API credentials.
# Install Ollama first
curl -fsSL https://ollama.ai/install.sh | sh
ollama serve
# Clone and install
git clone https://github.com/yahorbarkouski/semantic-mail.git
cd semantic-mail
uv pip install -e .
Sync options:
--model / -m - Embedding model (default: nomic-embed-text)
--query / -q - Gmail search query
--limit / -l - Maximum emails to sync
--incremental / -i - Only sync new emails
--clear - Clear existing data before sync
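The --incremental flag implies the tool keeps track of which messages are already stored locally. A minimal sketch of that bookkeeping (the real implementation syncs against the Gmail API; the names here are illustrative):

```python
# Illustrative incremental-sync bookkeeping: remember the message IDs
# already embedded locally and only process messages not seen before.
def incremental_sync(local_ids: set[str], remote_messages: list[dict]) -> list[dict]:
    new = [m for m in remote_messages if m["id"] not in local_ids]
    local_ids.update(m["id"] for m in new)
    return new

seen = {"msg-1", "msg-2"}
inbox = [{"id": "msg-1"}, {"id": "msg-2"}, {"id": "msg-3"}]
print(incremental_sync(seen, inbox))  # only msg-3 is new
```

Only the new messages need to be embedded, which is why incremental syncs are fast.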
Note: Each embedding model creates a separate collection. You must search using the same model you used for indexing.
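The reason for the one-collection-per-model rule is that each model emits vectors of a fixed dimension, so vectors from different models are not comparable. A sketch of that constraint (the collection-naming scheme below is an assumption, not the tool's actual one):

```python
# Dimensions as listed in the model table (nomic 768d, mxbai 1024d, minilm 384d).
DIMS = {"nomic-embed-text": 768, "mxbai-embed-large": 1024, "all-minilm": 384}

def collection_name(provider: str, model: str) -> str:
    # Illustrative: one collection per provider/model pair.
    return f"{provider}_{model}"

def check_models_match(index_model: str, query_model: str) -> None:
    # Searching with a different model than the one used for indexing
    # would compare vectors from different spaces, so refuse it.
    if index_model != query_model:
        raise ValueError(
            f"collection indexed with {index_model} ({DIMS[index_model]}d) "
            f"but queried with {query_model} ({DIMS[query_model]}d)"
        )

print(collection_name("ollama", "nomic-embed-text"))  # ollama_nomic-embed-text
check_models_match("nomic-embed-text", "nomic-embed-text")  # ok, same model
```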
The search command finds emails by semantic meaning:
# Basic search (uses default collection)
smail search "Emails I sent to Apple support team in April"

# Search with a specific embedding model (must match the indexed model)
smail search "Plump.ai project deadline" --model nomic-embed-text
smail search "Technical discussion around openmail architecture" -m mxbai-embed-large
# Search with more results, detailed
smail search "Brex transactions" --limit 20 --detailed
# Use specific provider/model combination
smail search "User/pass from slack sent from sre guy" \
--provider ollama --model nomic-embed-text
# Short syntax
smail search "Blockchain password from 2015" -p ollama -m nomic-embed-text
Options:
--provider / -p - Embedding provider (must match indexed data)
--model / -m - Embedding model (must match indexed data)
--limit / -l - Number of results (default: 10)
--detailed / -d - Show full email preview
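The --limit and --detailed flags map onto a standard top-k display step. A hedged sketch of what that might look like (the formatting here is illustrative, not the tool's actual output):

```python
# Illustrative top-k result formatting, mirroring --limit / --detailed:
# keep the `limit` highest-scoring hits; show a short preview unless
# detailed output was requested.
def format_results(hits, limit=10, detailed=False):
    lines = []
    for subject, body, score in sorted(hits, key=lambda h: h[2], reverse=True)[:limit]:
        preview = body if detailed else body[:60] + ("..." if len(body) > 60 else "")
        lines.append(f"{score:.2f}  {subject}: {preview}")
    return lines

hits = [
    ("Brex statement", "Your March transactions are attached.", 0.91),
    ("Team lunch", "Pizza on Friday?", 0.42),
]
for line in format_results(hits, limit=1):
    print(line)  # 0.91  Brex statement: Your March transactions are attached.
```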
Ask Questions About Your Emails
The ask command uses AI to answer questions based on your email content:
# Find specific information across multiple emails
smail ask "What was the total amount I spent on Amazon last month?"
smail ask "Summarize all emails about the production outage last week"
smail ask "What's the WiFi password the Airbnb host sent me?"

# Use a specific LLM model for better answers
smail ask "Create a timeline of the hiring process for the senior engineer position" --model deepseek-r1:8b
smail ask "Summarize the project status updates" --model deepseek-r1:8b
# Use custom models for both search and answers
smail ask "Find the AWS credentials that DevOps sent last month" \
--embedding-provider ollama --embedding-model nomic-embed-text \
--provider ollama --model deepseek-r1:8b
# Short syntax
smail ask "What's my United frequent flyer number?" \
-ep ollama -em nomic-embed-text \
-p ollama -m deepseek-r1:8b
# Search more emails and get longer answers
smail ask "Based on the email threads, what are the main blockers for the Q4 launch?" \
--search-limit 10 --max-tokens 1000
Options:
--provider / -p - LLM provider for answering (ollama/openai)
--model / -m - LLM model for answering (e.g., deepseek-r1:8b, gpt-4o-mini)
--embedding-provider / -ep - Provider for email search
--embedding-model / -em - Model for email search (must match your indexed model)
--search-limit / -sl - Number of emails to search (default: 5)
--max-tokens / -t - Maximum response length (default: 500)
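Putting the options together: ask is a retrieval-augmented pattern, where the top --search-limit emails are retrieved semantically and handed to the LLM as context. A sketch of the prompt-assembly step (the prompt wording is an assumption, not the tool's actual template):

```python
# Illustrative RAG prompt assembly: stuff the top-scoring emails into
# the LLM prompt as context, capped at `search_limit` messages.
def build_prompt(question: str, emails: list[str], search_limit: int = 5) -> str:
    context = "\n\n".join(emails[:search_limit])
    return (
        "Answer the question using only the emails below.\n\n"
        f"Emails:\n{context}\n\n"
        f"Question: {question}\n"
    )

prompt = build_prompt(
    "What's the WiFi password?",
    ["From: host@example.com\nThe WiFi password is hunter2."],
)
print(prompt)
```

The --max-tokens option would then cap the length of the LLM's answer, not of this prompt.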
smail stats # shows all collections, email counts, and last sync dates
smail models # shows available embedding and LLM models
Local models (via Ollama):
nomic-embed-text - 768d, fast (default)
mxbai-embed-large - 1024d, more accurate
all-minilm - 384d, lightweight
Cloud models (OpenAI is currently the only supported cloud provider):
text-embedding-3-small - 1536d, good balance
text-embedding-3-large - 3072d, best quality
Semantic Mail uses environment variables stored in .env: