Self-hosted Retrieval Augmented Generation (RAG) Platform
Nosia is a platform that allows you to run AI models on your own data with complete privacy and control. It is designed to be easy to install and use, providing OpenAI-compatible APIs that work seamlessly with existing AI applications.
- 🔒 Private & Secure - Your data stays on your infrastructure
- 🤖 OpenAI-Compatible API - Drop-in replacement for OpenAI clients
- 📚 RAG-Powered - Augment AI responses with your documents
- 🔄 Real-time Streaming - Server-sent events for live responses
- 📄 Multi-format Support - PDFs, text files, websites, and Q&A pairs
- 🎯 Semantic Search - Vector similarity search with pgvector
- 🐳 Easy Deployment - Docker Compose with one-command setup
- 🔑 Multi-tenancy - Account-based isolation for secure data separation
- 📖 Nosia Guides - Step-by-step tutorials
- 🏗️ Architecture Documentation - Technical deep dive
- 💬 Community Support - Get help
- 📐 Architecture - Detailed system design and implementation
- 📊 System Diagrams - Visual representations of system components
- 🚀 Deployment Guide - Production deployment strategies and best practices
- 📋 Documentation Index - Complete documentation overview
- 🤝 Code of Conduct - Community guidelines
- Quickstart
- Configuration
- Using Nosia
- Managing Your Installation
- Troubleshooting
- Contributing
- License
Get Nosia up and running in minutes on macOS, Debian, or Ubuntu.
- macOS, Debian, or Ubuntu operating system
- Internet connection
- sudo/root access (for Docker installation if needed)
The installation script will:
- Install Docker and Docker Compose if not already present
- Download Nosia configuration files
- Generate a secure .env file
- Pull all required Docker images
Once the script completes, start all services with:
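With the default Docker Compose setup, that is:

```
docker compose up -d
```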
Once started, access Nosia at:
- Web Interface: https://nosia.localhost
- API Endpoint: https://nosia.localhost/v1
Note: The default installation uses a self-signed SSL certificate. Your browser will show a security warning on first access. For production deployments, see the Deployment Guide for proper SSL certificate configuration.
By default, Nosia uses:
- Completion model: ai/granite-4.0-h-tiny
- Embeddings model: ai/granite-embedding-multilingual
You can use any completion model available on Docker Hub AI by setting the LLM_MODEL environment variable during installation.
Example with Granite 4.0 32B:
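For instance, export the variable before running the installation script (the script invocation itself is omitted here):

```
LLM_MODEL=ai/granite-4.0-h-small
```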
Model options:
- ai/granite-4.0-h-micro - 3B long-context instruct model by IBM
- ai/granite-4.0-h-tiny - 7B long-context instruct model by IBM (default)
- ai/granite-4.0-h-small - 32B long-context instruct model by IBM
- ai/mistral - Efficient open model (7B) with top-tier performance and fast inference by Mistral AI
- ai/magistral-small-3.2 - 24B multimodal instruction model by Mistral AI
- ai/devstral-small - Agentic coding LLM (24B) fine-tuned from Mistral-Small 3.1 by Mistral AI
- ai/llama3.3 - Meta's Llama 3.3 model
- ai/gemma3 - Google's Gemma 3 model
- ai/qwen3 - Alibaba's Qwen 3 model
- ai/deepseek-r1-distill-llama - DeepSeek's distilled Llama model
- Browse more at Docker Hub AI
By default, Nosia uses ai/granite-embedding-multilingual for generating document embeddings.
To change the embeddings model:
- Update the environment variables in your .env file:

  ```
  EMBEDDING_MODEL=your-preferred-embedding-model
  EMBEDDING_DIMENSIONS=768 # Adjust based on your model's output dimensions
  ```

- Restart Nosia to apply changes:

  ```
  docker compose down
  docker compose up -d
  ```

- Update existing embeddings (if you have documents already indexed):

  ```
  docker compose run web bin/rails embeddings:change_dimensions
  ```
Important: Different embedding models produce vectors of different dimensions. Ensure EMBEDDING_DIMENSIONS matches your model's output size, or vector search will fail.
Docling provides enhanced document processing capabilities for complex PDFs and documents.
To enable Docling:
- Start Nosia with the Docling compose file:

  ```
  docker compose -f docker-compose-docling.yml up
  ```

- Configure the Docling URL in your .env file:

  ```
  DOCLING_SERVE_BASE_URL=http://localhost:5001
  ```
This starts a Docling serve instance on port 5001 that Nosia will use for advanced document parsing.
Enable Retrieval Augmented Generation to enhance AI responses with relevant context from your documents.
To enable RAG:
Add to your .env file:
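Based on the AUGMENTED_CONTEXT variable documented in the configuration reference:

```
AUGMENTED_CONTEXT=true
```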
When enabled, Nosia will:
- Search your document knowledge base for relevant chunks
- Include the most relevant context in the AI prompt
- Generate responses grounded in your specific data
Additional RAG configuration:
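For example, the retrieval and sampling knobs from the configuration reference (defaults shown):

```
RETRIEVAL_FETCH_K=3    # number of document chunks retrieved per query
LLM_TEMPERATURE=0.1    # lower = more factual
```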
Nosia validates required environment variables at startup to prevent runtime failures. If any required variables are missing or invalid, the application will fail to start with a clear error message.
Required variables (validated at startup):

| Variable | Description | Example |
|---|---|---|
| SECRET_KEY_BASE | Rails secret key for session encryption | Generate with `bin/rails secret` |
| AI_BASE_URL | Base URL for OpenAI-compatible API | http://model-runner.docker.internal/engines/llama.cpp/v1 |
| LLM_MODEL | Language model identifier | ai/mistral, ai/granite-4.0-h-tiny |
| EMBEDDING_MODEL | Embedding model identifier | ai/granite-embedding-multilingual |
| EMBEDDING_DIMENSIONS | Embedding vector dimensions | 768, 384, 1536 |

Optional variables:

| Variable | Description | Default | Valid values |
|---|---|---|---|
| AI_API_KEY | API key for the AI service | empty | Any string |
| LLM_TEMPERATURE | Model creativity (lower = more factual) | 0.1 | 0.0 - 2.0 |
| LLM_TOP_K | Top K sampling parameter | 40 | 1 - 100 |
| LLM_TOP_P | Top P (nucleus) sampling | 0.9 | 0.0 - 1.0 |
| RETRIEVAL_FETCH_K | Number of document chunks to retrieve for RAG | 3 | 1 - 10 |
| AUGMENTED_CONTEXT | Enable RAG for chat completions | false | true, false |
| DOCLING_SERVE_BASE_URL | Docling document processing service URL | empty | http://localhost:5001 |
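To build intuition for how LLM_TEMPERATURE, LLM_TOP_K, and LLM_TOP_P interact, here is a self-contained sketch of standard temperature/top-k/nucleus sampling (an illustration only, not Nosia's implementation):

```python
import math
import random

def sample_next_token(logits, temperature=0.1, top_k=40, top_p=0.9, rng=None):
    """Pick the next token from raw scores using temperature, top-k and top-p."""
    rng = rng or random.Random(0)
    # 1. Temperature scaling: lower temperature sharpens the distribution.
    scaled = {tok: score / temperature for tok, score in logits.items()}
    # 2. Softmax (shifted by the max for numerical stability).
    m = max(scaled.values())
    exp = {tok: math.exp(v - m) for tok, v in scaled.items()}
    total = sum(exp.values())
    probs = {tok: v / total for tok, v in exp.items()}
    # 3. Top-k: keep only the k most likely tokens.
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)[:top_k]
    # 4. Top-p (nucleus): keep the smallest prefix whose cumulative mass >= top_p.
    kept, mass = [], 0.0
    for tok, p in ranked:
        kept.append((tok, p))
        mass += p
        if mass >= top_p:
            break
    # 5. Renormalize the survivors and sample one.
    total = sum(p for _, p in kept)
    r, acc = rng.random() * total, 0.0
    for tok, p in kept:
        acc += p
        if acc >= r:
            return tok
    return kept[-1][0]

# At the default temperature of 0.1, the highest-scoring token dominates:
print(sample_next_token({"yes": 2.0, "no": 1.0, "maybe": 0.5}))  # -> yes
```

With a low temperature the nucleus collapses to a single token, which is why lower settings produce more deterministic, factual-sounding answers.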
See .env.example for a complete list of configuration options.
The installation script automatically generates a .env file. To customize:
- Edit the .env file in your installation directory.
- Update values as needed and restart:

  ```
  docker compose down
  docker compose up -d
  ```
- Copy the example environment file (.env.example) to .env.
- Generate a secure secret key:

  ```
  SECRET_KEY_BASE=$(bin/rails secret)
  echo "SECRET_KEY_BASE=$SECRET_KEY_BASE" >> .env
  ```

- Update other required values in .env:

  ```
  AI_BASE_URL=http://your-ai-service:11434/v1
  LLM_MODEL=ai/mistral
  EMBEDDING_MODEL=ai/granite-embedding-multilingual
  EMBEDDING_DIMENSIONS=768
  ```

- Test your configuration:

  ```
  bin/rails runner "puts 'Configuration valid!'"
  ```
If validation fails, you'll see a detailed error message indicating which variables are missing or invalid.
After starting Nosia, access the web interface at https://nosia.localhost:
- Create an account or log in
- Upload documents - PDFs, text files, or add website URLs
- Create Q&A pairs - Add domain-specific knowledge
- Start chatting - Ask questions about your documents
Nosia provides an OpenAI-compatible API that works with existing OpenAI client libraries.
- Log in to Nosia web interface
- Navigate to https://nosia.localhost/api_tokens
- Click "Generate Token" and copy your API key
- Store it securely - it won't be shown again
Configure your OpenAI client to use Nosia:
Python Example:
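A minimal sketch using the official openai Python package, pointing the client at Nosia's OpenAI-compatible endpoint (the token and prompt are placeholders):

```python
from openai import OpenAI

# Use your Nosia instance instead of api.openai.com.
# Note: with the default self-signed certificate you may need to
# disable TLS verification via a custom HTTP client.
client = OpenAI(
    base_url="https://nosia.localhost/v1",
    api_key="your-api-token",  # generated at https://nosia.localhost/api_tokens
)

response = client.chat.completions.create(
    model="ai/granite-4.0-h-tiny",  # the default completion model
    messages=[{"role": "user", "content": "Summarize my uploaded documents."}],
)
print(response.choices[0].message.content)
```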
cURL Example:
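A sketch with curl (the `-k` flag accepts the default self-signed certificate; replace `your-api-token`):

```
curl -k https://nosia.localhost/v1/chat/completions \
  -H "Authorization: Bearer your-api-token" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "ai/granite-4.0-h-tiny",
    "messages": [{"role": "user", "content": "Hello"}]
  }'
```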
Node.js Example:
For more API examples and details, see the API Guide.
Start all Nosia services:
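With the default Compose setup:

```
docker compose up -d
```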
Check that all services are running:
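For example:

```
docker compose ps
```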
Stop all running services:
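For example:

```
docker compose down
```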
Keep Nosia up to date with the latest features and security fixes:
Upgrade checklist:
- Backup your data before upgrading (see Deployment Guide)
- Review release notes for breaking changes
- Pull latest images
- Restart services
- Verify functionality
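The pull-and-restart steps above typically amount to the standard Docker Compose commands (check the release notes first):

```
docker compose pull
docker compose up -d
```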
View logs for troubleshooting:
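For example, follow logs for all services, or a single one such as web:

```
docker compose logs -f
docker compose logs -f web
```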
Verify Nosia is running correctly:
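A quick check (`-k` skips verification of the default self-signed certificate):

```
docker compose ps
curl -k https://nosia.localhost
```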
Docker not found: re-run the installation script, or install Docker manually following the official Docker documentation.
Permission denied: add your user to the docker group (`sudo usermod -aG docker $USER`), then log out and back in.
Services won't start: inspect `docker compose logs` to identify the failing container.
Slow AI responses:
- Check background jobs: https://nosia.localhost/jobs
- View job logs:

  ```
  docker compose logs -f solidq
  ```

- Ensure your hardware meets minimum requirements (see Deployment Guide)
Can't access web interface: confirm all containers are running (`docker compose ps`) and that no other service is bound to ports 80/443.
Database connection errors: check the PostgreSQL container logs (`docker compose logs postgres-db`).
Documents not processing:
- Check background jobs: https://nosia.localhost/jobs
- View processing logs:

  ```
  docker compose logs -f web
  ```

- Verify the embedding service is running:

  ```
  docker compose ps embedding
  ```
Embedding errors: verify that EMBEDDING_DIMENSIONS matches your embedding model's output size (see Configuration).
| Area | Location | Command |
|---|---|---|
| Installation | ./log/production.log | `tail -f log/production.log` |
| Runtime errors | Docker logs | `docker compose logs -f web` |
| Background jobs | Jobs dashboard | Visit https://nosia.localhost/jobs |
| Database | PostgreSQL logs | `docker compose logs postgres-db` |
| AI model | LLM container logs | `docker compose logs llm` |
If you need further assistance:
- Check Documentation:
  - Architecture Guide - Understand how Nosia works
  - Deployment Guide - Advanced configuration
- Search Existing Issues:
  - GitHub Issues
  - Someone may have encountered the same problem
- Open a New Issue:
  - Include your Nosia version: `docker compose images | grep web`
  - Describe the problem with steps to reproduce
  - Include relevant logs (remove sensitive information)
  - Specify your OS and Docker version
- Community Support:
  - GitHub Discussions
  - Share your use case and get advice from the community
We welcome contributions! Here's how you can help:
- Report bugs - Open an issue with details and reproduction steps
- Suggest features - Share your ideas in GitHub Discussions
- Improve documentation - Submit PRs for clarity and accuracy
- Write code - Fix bugs or implement new features
- Share your experience - Write blog posts or tutorials
See CONTRIBUTING.md if available, or start by opening an issue to discuss your ideas.
Nosia is open source software. See LICENSE for details.
- Website: nosia.ai
- Documentation: guides.nosia.ai
- Source Code: github.com/nosia-ai/nosia
- Docker Hub: hub.docker.com/u/ai
Built with ❤️ by the Nosia community