Status: Beta - Under active development
Byte-Vision is a privacy-first document intelligence platform that transforms static documents into an interactive, searchable knowledge base. Built on Elasticsearch with RAG (Retrieval-Augmented Generation) capabilities, it offers document parsing, OCR processing, and conversational AI interfaces—all running locally to ensure complete data privacy.
- 📄 Universal Document Processing - Parse PDFs, text files, and CSVs with built-in OCR for image-based content
- 🔍 AI-Enhanced Search - Semantic search powered by Elasticsearch and vector embeddings
- 💬 Conversational AI - Document-specific Q&A and free-form chat with local LLM integration
- 📊 Research Management - Automatically save and organize insights from document analysis
- 🔒 Privacy-First - Runs entirely locally with no external data transmission
- 🖥️ Intuitive Interface - Full-featured UI that simplifies complex document operations
For detailed setup instructions, see Installation Guide.
- Interface Tour
- Installation
- Configuration
- Usage
- Troubleshooting
- Development
- Contributing
- Roadmap
- License
- Contact
The main "Document Search" screen allows you to locate and analyze documents after they have been parsed and indexed in Elasticsearch.
Click the "View" button to display the original parsed document.
View previously saved question-answer history items for the selected document.
Enter your questions about the document using this interface.
The system processes your question and searches through the document.
View the AI-generated answers based on your document content.
Export your question-answer sessions to PDF format for documentation.
Parse PDF, text, and CSV files for processing and analysis.
View the results of document parsing and chunking operations.
Configure OCR settings for processing scanned documents.
Review extracted text from image-based documents.
Primary inference screen for general AI conversations.
View previous conversations and responses.
Export inference conversations to PDF format.
| Component | Version | Purpose |
|-----------|---------|---------|
| Go | 1.23+ | Backend services |
| Node.js | 18+ | Frontend build system |
| Elasticsearch | 8.x | Document indexing and search |
| Wails | v2 | Desktop application framework |
- OS: Windows 10+, macOS 10.13+, or Linux
- RAM: 8GB minimum (16GB recommended)
- Storage: 5GB free space
- CPU: Multi-core processor recommended
- CUDA: Enables GPU acceleration for AI models
- Docker: Containerize Elasticsearch for easier deployment
Option A: Docker (Recommended)
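A minimal single-node setup for local development, assuming any Elasticsearch 8.x image (the tag and security settings below are illustrative; adjust them to your environment):

```bash
# Single-node Elasticsearch 8.x for local development.
# Security is disabled here for simplicity; enable it and set a password for real use.
docker run -d --name elasticsearch \
  -p 9200:9200 \
  -e "discovery.type=single-node" \
  -e "xpack.security.enabled=false" \
  docker.elastic.co/elasticsearch/elasticsearch:8.14.0
```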
Option B: Local Installation
- Download from Elasticsearch Downloads
- Extract and run:
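  ```bash
  # Paths and version are illustrative; run the binary from the extracted directory.
  cd elasticsearch-8.x.x
  ./bin/elasticsearch        # Windows: bin\elasticsearch.bat
  ```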
Option A: Download Pre-built Binaries (Recommended)
- Visit LlamaCpp releases
- Download for your platform:
- Windows: llama-*-bin-win-x64.zip (CPU) or llama-*-bin-win-cuda-cu*.zip (GPU)
- Linux: llama-*-bin-ubuntu-x64.tar.gz
- macOS: brew install llama.cpp
- Extract to llamacpp/ directory
Option B: Build from Source
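A sketch of the upstream CMake build (see the llama.cpp repository for the authoritative steps; assumes git and CMake are installed):

```bash
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
cmake -B build
cmake --build build --config Release
# Copy the binaries you need (e.g. llama-cli, llama-embedding) from build/bin/ into llamacpp/
```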
Download and install xpdf-tools for PDF processing:
Option A: Download Pre-built Binaries (Recommended)
- Visit Xpdf downloads
- Download the appropriate version for your platform:
- Windows: xpdf-tools-win-*-setup.exe
- Linux: xpdf-tools-linux-*-static.tar.gz
- macOS: xpdf-tools-mac-*-setup.dmg
- Extract or install to the xpdf-tools/ directory in your project root
Option B: Package Manager Installation
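The package names below are assumptions; verify them against your package manager before relying on them:

```bash
# macOS (Homebrew)
brew install xpdf

# Debian/Ubuntu (some distributions ship the command-line tools via poppler-utils instead)
sudo apt-get install xpdf
```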
Install Tesseract-OCR for optical character recognition:
Windows:
- Download from Tesseract releases
- Install the executable
- Add Tesseract to your system PATH:
- Add C:\Program Files\Tesseract-OCR to your PATH environment variable
- Or add custom path in byte-vision-cfg.env: TESSERACT_PATH=C:\path\to\tesseract.exe
macOS:
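```bash
brew install tesseract
```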
Linux (Ubuntu/Debian):
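```bash
sudo apt-get install tesseract-ocr
```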
Verify Installation:
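```bash
tesseract --version
```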
Create byte-vision-cfg.env:
```env
# Elasticsearch
ELASTICSEARCH_URL=http://localhost:9200
ELASTICSEARCH_USERNAME=elastic
ELASTICSEARCH_PASSWORD=your_password

# LlamaCpp binaries
LLAMA_CLI_PATH=./llamacpp/llama-cli
LLAMA_EMBEDDING_PATH=./llamacpp/llama-embedding

# Models
MODEL_PATH=./models
DEFAULT_INFERENCE_MODEL=llama-2-7b-chat.Q4_K_M.gguf
DEFAULT_EMBEDDING_MODEL=all-MiniLM-L6-v2.gguf

# Processing
MAX_CHUNK_SIZE=1000
CHUNK_OVERLAP=200
LOG_LEVEL=INFO
```
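To run the app in development mode, the standard Wails workflow applies (assuming the Wails v2 CLI is installed):

```bash
wails dev
```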
The application will launch with hot reload enabled.
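For a production build, the standard Wails command applies:

```bash
wails build
```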
The built application will be in the build/ directory.
The application uses environment variables defined in byte-vision-cfg.env:
| Variable | Description | Default |
|----------|-------------|---------|
| ELASTICSEARCH_URL | Elasticsearch server URL | http://localhost:9200 |
| ELASTICSEARCH_USERNAME | Elasticsearch username | elastic |
| ELASTICSEARCH_PASSWORD | Elasticsearch password | - |
| LLAMA_CLI_PATH | Path to llama-cli executable | ./llamacpp/llama-cli |
| LLAMA_EMBEDDING_PATH | Path to llama-embedding executable | ./llamacpp/llama-embedding |
| MODEL_PATH | Directory containing AI models | ./models |
| DEFAULT_INFERENCE_MODEL | Default model for inference | - |
| DEFAULT_EMBEDDING_MODEL | Default model for embeddings | - |
| MAX_CHUNK_SIZE | Maximum text chunk size | 1000 |
| CHUNK_OVERLAP | Overlap between chunks | 200 |
| LOG_LEVEL | Application log level | INFO |
- Start Elasticsearch: Ensure Elasticsearch is running
- Launch Byte-Vision: Run the application
- Configure Models: Go to Settings → LlamaCpp Settings and set paths
- Test Connection: Verify Elasticsearch connection in Settings
- Upload Documents: Use the document parser to upload and process files
- Configure Chunking: Adjust text chunking settings for optimal search (see the sketch after this list)
- Index Documents: Process documents for embedding and search
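As a starting point for tuning, adjust the chunking variables in byte-vision-cfg.env; the values below are the documented defaults:

```env
# Larger chunks carry more context per search hit; overlap preserves continuity
# across chunk boundaries so passages aren't cut off mid-thought.
MAX_CHUNK_SIZE=1000
CHUNK_OVERLAP=200
```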
- Select a document from the search results
- Click "Ask Questions" to open the Q&A interface
- Enter your questions and receive AI-generated answers
- View answer sources and confidence scores
- Export Q&A sessions to PDF
- Ask Questions: Use the document question modal to query your documents
- Export Results: Export chat history to PDF for documentation
- Compare Responses: Use the comparison feature to evaluate different model outputs
- Access the AI Inference screen for general conversations
- Chat with your local LLM models
- Export conversation history
- Compare different model responses
Symptoms: Cannot connect to Elasticsearch service
Solutions:
- Verify Elasticsearch is running:
  ```bash
  curl http://localhost:9200
  ```
- Check if port 9200 is available:
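  ```bash
  # Assumed commands; pick the one for your platform.
  # Windows
  netstat -ano | findstr :9200
  # macOS/Linux
  lsof -i :9200
  ```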
- Verify configuration in byte-vision-cfg.env
- Check firewall settings
- For Docker: Ensure the container is running:
  ```bash
  docker ps | grep elasticsearch
  ```
Symptoms: Model fails to load or produces errors
Solutions:
- Verify model file exists in models/ directory
- Check model format (must be .gguf)
- Ensure sufficient RAM for model size
- Verify LLAMA_CLI_PATH in configuration
- Test LlamaCpp directly:
  ```bash
  ./llamacpp/llama-cli --model ./models/your-model.gguf --prompt "Hello"
  ```
Symptoms: npm install or build failures
Solutions:
- Clear the npm cache and reinstall dependencies:
  ```bash
  cd frontend
  rm -rf node_modules package-lock.json
  npm cache clean --force
  npm install
  ```
- Check Node.js version: node --version
- Update npm: npm install -g npm@latest
Symptoms: Application fails to start due to port conflicts
Solutions:
- Find the process using the port:
  ```bash
  # Windows
  netstat -ano | findstr :3000

  # macOS/Linux
  lsof -ti:3000
  ```
- Kill the process:
  ```bash
  # Windows
  taskkill /PID <PID> /F

  # macOS/Linux
  kill -9 <PID>
  ```
- GPU Acceleration: Install CUDA/ROCm for faster model inference
- Model Selection: Use smaller quantized models for better performance
- Memory Management: Adjust Elasticsearch heap size for large document collections
- Chunking Optimization: Tune MAX_CHUNK_SIZE and CHUNK_OVERLAP for your use case
Enable debug logging:
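```env
# In byte-vision-cfg.env (DEBUG is an assumed value; use the levels your build accepts)
LOG_LEVEL=DEBUG
```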
Check logs in ./logs/ directory for detailed error information.
- Wails - Desktop application framework
- Go - Backend services and APIs
- React - Frontend user interface
- Elasticsearch - Document indexing and search
- Llama.cpp - Local AI model inference
- React Bootstrap - UI components
- Bootstrap 5 - CSS framework
- React PDF - PDF generation and viewing
- Vite - Build tooling
- Application logs: ./logs/
- Elasticsearch logs: Check Elasticsearch installation directory
- Debug mode: wails dev -debug
- Frontend logs: Browser developer console
- Backend logs: Terminal output during development
Contributions are what make the open source community such an amazing place to learn, inspire, and create. Any contributions you make are greatly appreciated.
If you have a suggestion that would make this better, please fork the repo and create a pull request. You can also open an issue with the tag "enhancement." Remember to give the project a star! Thanks again!
- Fork the Project
- Create your Feature Branch (git checkout -b feature/AmazingFeature)
- Commit your Changes (git commit -m 'Add some AmazingFeature')
- Push to the Branch (git push origin feature/AmazingFeature)
- Open a Pull Request
- Follow Go formatting standards (go fmt)
- Write tests for new features
- Update documentation for API changes
- Use semantic commit messages
- Ensure all tests pass before submitting
- Settings persistence for llama-cli configuration
- Settings persistence for llama-embedding configuration
- Enhanced documentation and examples
- Additional document format support (DOCX, PPT, etc.)
- Advanced search filters and operators
- Batch document processing capabilities
- RESTful API for external integrations
- Docker deployment configuration
- User authentication and access control
- Cloud storage integration (S3, Google Drive, etc.)
- Multi-language support
- Advanced analytics and reporting
- Distributed processing for large document collections
- Plugin architecture for custom processors
- Integration with external AI services
- Mobile application companion
See open issues for detailed feature requests and bug reports.
This project is licensed under the terms of the MIT license.
Kevin Brisson - LinkedIn - [email protected]
Project Link: https://github.com/kbrisso/byte-vision
⭐ Star this project if you find it helpful!