You can set the directories to monitor using the `MONITOR_DIRECTORIES` environment variable (as comma-separated values):

```shell
# Monitor one or more directories
export MONITOR_DIRECTORIES="/path/to/documents,/another_path/to/documents"
```
If you want to use an alternative embeddings provider (Ollama being the default), you will need to set the provider details through environment variables:
By default:
```shell
EMBEDDINGS_PROVIDER="ollama"
EMBEDDINGS_MODEL="mxbai-embed-large"  # or any other model
EMBEDDINGS_VECTOR_DIM=1024
```
For VoyageAI:
```shell
EMBEDDINGS_PROVIDER="voyageai"
EMBEDDINGS_MODEL="voyage-3.5"  # or any other model
EMBEDDINGS_VECTOR_DIM=1024
VOYAGE_API_KEY="your-api-key"
```
For OpenAI:
```shell
EMBEDDINGS_PROVIDER="openai"
EMBEDDINGS_MODEL="text-embedding-3-small"  # or text-embedding-3-large
EMBEDDINGS_VECTOR_DIM=1536
OPENAI_API_KEY="your-api-key"
```
haiku.rag includes a CLI application for managing documents and performing searches from the command line:
```shell
# List all documents
haiku-rag list

# Add document from text
haiku-rag add "Your document content here"

# Add document from file or URL
haiku-rag add-src /path/to/document.pdf
haiku-rag add-src https://example.com/article.html

# Get and display a specific document
haiku-rag get 1

# Delete a document by ID
haiku-rag delete 1

# Search documents
haiku-rag search "machine learning"

# Search with custom options
haiku-rag search "python programming" --limit 10 --k 100

# Start file monitoring & MCP server (default HTTP transport)
haiku-rag serve  # --stdio for stdio transport or --sse for SSE transport
```
All commands support the `--db` option to specify a custom database path. Run any command with `--help` to see its additional parameters.
## File Monitoring & MCP server
You can start the server (using Streamable HTTP, stdio, or SSE transports) with:
```shell
# Start with default HTTP transport
haiku-rag serve  # --stdio for stdio transport or --sse for SSE transport
```
You must set the `MONITOR_DIRECTORIES` environment variable for monitoring to take place.
haiku.rag can watch directories for changes and automatically update the document store:
- Startup: Scan all monitored directories and add any new files
- File Added/Modified: Automatically parse and add/update the document in the database
- File Deleted: Remove the corresponding document from the database
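Putting the two requirements together, a minimal way to enable monitoring (the directory path here is illustrative) is to export the variable before starting the server:

```shell
# Tell haiku.rag which directories to watch (comma-separated)
export MONITOR_DIRECTORIES="/path/to/documents"

# Start the server; monitored directories are scanned on startup
haiku-rag serve
```

This is a configuration fragment: `haiku-rag serve` runs until interrupted, watching the exported directories while it is up.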
haiku.rag includes a Model Context Protocol (MCP) server that exposes RAG functionality as tools for AI assistants like Claude Desktop. The MCP server provides the following tools:
- `add_document_from_file` - Add documents from local file paths
- `add_document_from_url` - Add documents from URLs
- `add_document_from_text` - Add documents from raw text content
- `search_documents` - Search documents using hybrid search
- `get_document` - Retrieve specific documents by ID
- `list_documents` - List all documents with pagination
- `delete_document` - Delete documents by ID
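As an illustration, an MCP client such as Claude Desktop can be pointed at the server over the stdio transport with a config entry like the following sketch (the config file location and exact schema depend on the client, and the `"haiku-rag"` server name is an arbitrary label):

```json
{
  "mcpServers": {
    "haiku-rag": {
      "command": "haiku-rag",
      "args": ["serve", "--stdio"]
    }
  }
}
```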
## Using haiku.rag from Python
```python
from haiku.rag.client import HaikuRAG

# Use as async context manager (recommended)
async with HaikuRAG("path/to/database.db") as client:
    # Create document from text
    doc = await client.create_document(
        content="Your document content here",
        uri="doc://example",
        metadata={"source": "manual", "topic": "example"},
    )

    # Create document from file (auto-parses content)
    doc = await client.create_document_from_source("path/to/document.pdf")

    # Create document from URL
    doc = await client.create_document_from_source("https://example.com/article.html")

    # Retrieve documents
    doc = await client.get_document_by_id(1)
    doc = await client.get_document_by_uri("file:///path/to/document.pdf")

    # List all documents with pagination
    docs = await client.list_documents(limit=10, offset=0)

    # Update document content
    doc.content = "Updated content"
    await client.update_document(doc)

    # Delete document
    await client.delete_document(doc.id)

    # Search documents using hybrid search (vector + full-text)
    results = await client.search("machine learning algorithms", limit=5)
    for chunk, score in results:
        print(f"Score: {score:.3f}")
        print(f"Content: {chunk.content}")
        print(f"Document ID: {chunk.document_id}")
        print("---")
```
```python
async with HaikuRAG("database.db") as client:
    results = await client.search(
        query="machine learning",
        limit=5,  # Maximum results to return, defaults to 5
        k=60,     # RRF parameter for reciprocal rank fusion, defaults to 60
    )

    # Process results
    for chunk, relevance_score in results:
        print(f"Relevance: {relevance_score:.3f}")
        print(f"Content: {chunk.content}")
        print(f"From document: {chunk.document_id}")
```
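The `k` parameter above feeds reciprocal rank fusion (RRF), which merges the vector and full-text rankings into one score. A minimal stdlib sketch of the idea (not haiku.rag's internal implementation; document IDs and rankings are made up):

```python
def rrf(rankings: list[list[str]], k: int = 60) -> list[tuple[str, float]]:
    """Fuse ranked lists: each item scores sum(1 / (k + rank)) over the
    lists it appears in; higher fused scores rank first."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

# A document ranked well by both searches beats one that only
# a single search ranked highly.
vector_hits = ["doc2", "doc1", "doc3"]
fulltext_hits = ["doc2", "doc3"]
fused = rrf([vector_hits, fulltext_hits])
print(fused[0][0])  # doc2 appears near the top of both lists, so it fuses first
```

A larger `k` flattens the contribution of rank positions, so deep results from either ranking matter relatively more; the default of 60 is the value commonly used in the RRF literature.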