Status: Beta - Under active development
Byte-Vision is a privacy-first document intelligence platform that transforms static documents into an interactive, searchable knowledge base. Built on Elasticsearch with RAG (Retrieval-Augmented Generation) capabilities, it offers document parsing, OCR processing, and conversational AI interfaces—all running locally to ensure complete data privacy.
- 📄 Universal Document Processing - Parse PDFs, text files, and CSVs with built-in OCR for image-based content
- 🔍 AI-Enhanced Search - Semantic search powered by Elasticsearch and vector embeddings
- 💬 Conversational AI - Document-specific Q&A and free-form chat with local LLM integration
- 📊 Research Management - Automatically save and organize insights from document analysis
- 🔒 Privacy-First - Runs entirely locally with no external data transmission
- 🖥️ Intuitive Interface - Full-featured UI that simplifies complex document operations
For detailed setup instructions, see Installation Guide.
- Interface Tour
- Installation
- Configuration
- Usage
- Troubleshooting
- Development
- Contributing
- Roadmap
- License
- Contact
The main "Document Search" screen allows you to locate and analyze documents after they have been parsed and indexed in Elasticsearch.
Click the "View" button to display the original parsed document.
View previously saved question-answer history items for the selected document.
Enter your questions about the document using this interface.
The system processes your question and searches through the document.
View the AI-generated answers based on your document content.
Export your question-answer sessions to PDF format for documentation.
Parse PDF, text, and CSV files for processing and analysis.
View the results of document parsing and chunking operations.
Configure OCR settings for processing scanned documents.
Review extracted text from image-based documents.
Primary inference screen for general AI conversations.
View previous conversations and responses.
Export inference conversations to PDF format.
| Component | Version | Purpose |
|-----------|---------|---------|
| Go | 1.23+ | Backend services |
| Node.js | 18+ | Frontend build system |
| Elasticsearch | 8.x | Document indexing and search |
| Wails | v2 | Desktop application framework |
- OS: Windows 10+, macOS 10.13+, or Linux
- RAM: 8GB minimum (16GB recommended)
- Storage: 5GB free space
- CPU: Multi-core processor recommended
- CUDA: Enables GPU acceleration for AI models
- Docker: Containerize Elasticsearch for easier deployment
Option A: Docker (Recommended)
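A minimal single-node setup for local development, assuming any Elasticsearch 8.x image (the tag and security settings below are illustrative; adjust them to your environment):

```bash
# Single-node Elasticsearch 8.x for local development.
# Security is disabled here for simplicity; enable it and set a password for real use.
docker run -d --name elasticsearch \
  -p 9200:9200 \
  -e "discovery.type=single-node" \
  -e "xpack.security.enabled=false" \
  docker.elastic.co/elasticsearch/elasticsearch:8.14.0
```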
Option B: Local Installation
- Download from Elasticsearch Downloads
- Extract and run:
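  ```bash
  # Paths and version are illustrative; run the binary from the extracted directory.
  cd elasticsearch-8.x.x
  ./bin/elasticsearch        # Windows: bin\elasticsearch.bat
  ```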
Option A: Download Pre-built Binaries (Recommended)
- Visit LlamaCpp releases
- Download for your platform:
- Windows: llama-*-bin-win-x64.zip (CPU) or llama-*-bin-win-cuda-cu*.zip (GPU)
- Linux: llama-*-bin-ubuntu-x64.tar.gz
- macOS: brew install llama.cpp
- Extract to llamacpp/ directory
Option B: Build from Source
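A sketch of the upstream CMake build (see the llama.cpp repository for the authoritative steps; assumes git and CMake are installed):

```bash
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
cmake -B build
cmake --build build --config Release
# Copy the binaries you need (e.g. llama-cli, llama-embedding) from build/bin/ into llamacpp/
```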
Download and install xpdf-tools for PDF processing:
Option A: Download Pre-built Binaries (Recommended)
- Visit Xpdf downloads
- Download the appropriate version for your platform:
- Windows: xpdf-tools-win-*-setup.exe
- Linux: xpdf-tools-linux-*-static.tar.gz
- macOS: xpdf-tools-mac-*-setup.dmg
- Extract or install to the xpdf-tools/ directory in your project root
Option B: Package Manager Installation
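The package names below are assumptions; verify them against your package manager before relying on them:

```bash
# macOS (Homebrew)
brew install xpdf

# Debian/Ubuntu (some distributions ship the command-line tools via poppler-utils instead)
sudo apt-get install xpdf
```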
Install Tesseract-OCR for optical character recognition:
Windows:
- Download from Tesseract releases
- Install the executable
- Add Tesseract to your system PATH:
- Add C:\Program Files\Tesseract-OCR to your PATH environment variable
- Or add custom path in byte-vision-cfg.env: TESSERACT_PATH=C:\path\to\tesseract.exe
macOS:
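```bash
brew install tesseract
```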
Linux (Ubuntu/Debian):
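```bash
sudo apt-get install tesseract-ocr
```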
Verify Installation:
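```bash
tesseract --version
```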
Create byte-vision-cfg.env:
```env
# Elasticsearch
ELASTICSEARCH_URL=http://localhost:9200
ELASTICSEARCH_USERNAME=elastic
ELASTICSEARCH_PASSWORD=your_password

# LlamaCpp binaries
LLAMA_CLI_PATH=./llamacpp/llama-cli
LLAMA_EMBEDDING_PATH=./llamacpp/llama-embedding

# Models
MODEL_PATH=./models
DEFAULT_INFERENCE_MODEL=llama-2-7b-chat.Q4_K_M.gguf
DEFAULT_EMBEDDING_MODEL=all-MiniLM-L6-v2.gguf

# Processing
MAX_CHUNK_SIZE=1000
CHUNK_OVERLAP=200
LOG_LEVEL=INFO
```
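To run the app in development mode, the standard Wails workflow applies (assuming the Wails v2 CLI is installed):

```bash
wails dev
```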
The application will launch with hot reload enabled.
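For a production build, the standard Wails command applies:

```bash
wails build
```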
The built application will be in the build/ directory.
The application uses environment variables defined in byte-vision-cfg.env:
| Variable | Description | Default |
|----------|-------------|---------|
| ELASTICSEARCH_URL | Elasticsearch server URL | http://localhost:9200 |
| ELASTICSEARCH_USERNAME | Elasticsearch username | elastic |
| ELASTICSEARCH_PASSWORD | Elasticsearch password | - |
| LLAMA_CLI_PATH | Path to llama-cli executable | ./llamacpp/llama-cli |
| LLAMA_EMBEDDING_PATH | Path to llama-embedding executable | ./llamacpp/llama-embedding |
| MODEL_PATH | Directory containing AI models | ./models |
| DEFAULT_INFERENCE_MODEL | Default model for inference | - |
| DEFAULT_EMBEDDING_MODEL | Default model for embeddings | - |
| MAX_CHUNK_SIZE | Maximum text chunk size | 1000 |
| CHUNK_OVERLAP | Overlap between chunks | 200 |
| LOG_LEVEL | Application log level | INFO |
- Start Elasticsearch: Ensure Elasticsearch is running
- Launch Byte-Vision: Run the application
- Configure Models: Go to Settings → LlamaCpp Settings and set paths
- Test Connection: Verify Elasticsearch connection in Settings
- Upload Documents: Use the document parser to upload and process files
- Configure Chunking: Adjust text chunking settings for optimal search (see the sketch after this list)
- Index Documents: Process documents for embedding and search
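As a starting point for tuning, adjust the chunking variables in byte-vision-cfg.env; the values below are the documented defaults:

```env
# Larger chunks carry more context per search hit; overlap preserves continuity
# across chunk boundaries so passages aren't cut off mid-thought.
MAX_CHUNK_SIZE=1000
CHUNK_OVERLAP=200
```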
- Select a document from the search results
- Click "Ask Questions" to open the Q&A interface
- Enter your questions and receive AI-generated answers
- View answer sources and confidence scores
- Export Q&A sessions to PDF
- Ask Questions: Use the document question modal to query your documents
- Export Results: Export chat history to PDF for documentation
- Compare Responses: Use the comparison feature to evaluate different model outputs
- Access the AI Inference screen for general conversations
- Chat with your local LLM models
- Export conversation history
- Compare different model responses
Symptoms: Cannot connect to Elasticsearch service
Solutions:
- Verify Elasticsearch is running:
  ```bash
  curl http://localhost:9200
  ```
- Check if port 9200 is available:
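  ```bash
  # Assumed commands; pick the one for your platform.
  # Windows
  netstat -ano | findstr :9200
  # macOS/Linux
  lsof -i :9200
  ```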
- Verify configuration in byte-vision-cfg.env
- Check firewall settings
- For Docker: Ensure the container is running:
  ```bash
  docker ps | grep elasticsearch
  ```
Symptoms: Model fails to load or produces errors
Solutions:
- Verify model file exists in models/ directory
- Check model format (must be .gguf)
- Ensure sufficient RAM for model size
- Verify LLAMA_CLI_PATH in configuration
- Test LlamaCpp directly:
  ```bash
  ./llamacpp/llama-cli --model ./models/your-model.gguf --prompt "Hello"
  ```
Symptoms: npm install or build failures
Solutions:
- Clear the npm cache and reinstall dependencies:
  ```bash
  cd frontend
  rm -rf node_modules package-lock.json
  npm cache clean --force
  npm install
  ```
- Check Node.js version: node --version
- Update npm: npm install -g npm@latest
Symptoms: Application fails to start due to port conflicts
Solutions:
- Find the process using the port:
  ```bash
  # Windows
  netstat -ano | findstr :3000

  # macOS/Linux
  lsof -ti:3000
  ```
- Kill the process:
  ```bash
  # Windows
  taskkill /PID <PID> /F

  # macOS/Linux
  kill -9 <PID>
  ```
- GPU Acceleration: Install CUDA/ROCm for faster model inference
- Model Selection: Use smaller quantized models for better performance
- Memory Management: Adjust Elasticsearch heap size for large document collections
- Chunking Optimization: Tune MAX_CHUNK_SIZE and CHUNK_OVERLAP for your use case
Enable debug logging:
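```env
# In byte-vision-cfg.env (DEBUG is an assumed value; use the levels your build accepts)
LOG_LEVEL=DEBUG
```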
Check logs in ./logs/ directory for detailed error information.
- Wails - Desktop application framework
- Go - Backend services and APIs
- React - Frontend user interface
- Elasticsearch - Document indexing and search
- Llama.cpp - Local AI model inference
- React Bootstrap - UI components
- Bootstrap 5 - CSS framework
- React PDF - PDF generation and viewing
- Vite - Build tooling
- Application logs: ./logs/
- Elasticsearch logs: Check Elasticsearch installation directory
- Debug mode: wails dev -debug
- Frontend logs: Browser developer console
- Backend logs: Terminal output during development
Contributions are what make the open source community such an amazing place to learn, inspire, and create. Any contributions you make are greatly appreciated.
If you have a suggestion that would make this better, please fork the repo and create a pull request. You can also open an issue with the tag "enhancement." Remember to give the project a star! Thanks again!
- Fork the Project
- Create your Feature Branch (git checkout -b feature/AmazingFeature)
- Commit your Changes (git commit -m 'Add some AmazingFeature')
- Push to the Branch (git push origin feature/AmazingFeature)
- Open a Pull Request
- Follow Go formatting standards (go fmt)
- Write tests for new features
- Update documentation for API changes
- Use semantic commit messages
- Ensure all tests pass before submitting
- Settings persistence for llama-cli configuration
- Settings persistence for llama-embedding configuration
- Enhanced documentation and examples
- Additional document format support (DOCX, PPT, etc.)
- Advanced search filters and operators
- Batch document processing capabilities
- RESTful API for external integrations
- Docker deployment configuration
- User authentication and access control
- Cloud storage integration (S3, Google Drive, etc.)
- Multi-language support
- Advanced analytics and reporting
- Distributed processing for large document collections
- Plugin architecture for custom processors
- Integration with external AI services
- Mobile application companion
See open issues for detailed feature requests and bug reports.
This project is licensed under the terms of the MIT license.
Kevin Brisson - LinkedIn - [email protected]
Project Link: https://github.com/kbrisso/byte-vision
⭐ Star this project if you find it helpful!