OpenDeepWiki – the open-source multi-repo AI chat Copilot wishes it were


OpenDeepWiki is an advanced AI-powered tool that helps you understand and interact with multiple codebases simultaneously. It automatically analyzes repositories, generates comprehensive documentation, and provides an intelligent chat interface where you can ask questions about your code across multiple projects.


🔄 Multi-Repository Support

  • Multiple Repository Management: Load and manage multiple repositories simultaneously
  • Unified Chat Interface: Ask questions across all your repositories in a single conversation
  • Optimized Pipeline: Efficient processing with individual context retrieval but unified AI response generation
  • Repository Session Management: Thread-safe handling of multiple repository sessions
  • Smart Repository Toggling: Activate/deactivate repositories for targeted queries

🎨 Modern UI

  • Glass-morphism Design: Beautiful modern interface with backdrop blur effects
  • Animated Interactions: Smooth hover effects, transitions, and loading animations
  • Smart Status System: Context-aware status messages with emoji indicators
  • Professional Repository Cards: Modern card design with gradient borders and hover effects
  • Intuitive Repository Manager: Easy-to-use interface for adding, removing, and managing repositories

🧠 Advanced AI Capabilities

  • 🔍 Intelligent Code Analysis: Automatically classifies and analyzes code files, documentation, and configuration files
  • 💬 Multi-Repository AI Chat: Ask questions about your codebase and get contextual answers from AI models that understand your specific code across multiple projects
  • 📚 Cross-Repository Documentation: Extracts and processes docstrings, README files, and documentation from all loaded repositories
  • 🤖 Dynamic Model Selection: Type any model name from any provider - supports Gemini, Claude, and OpenAI models with automatic routing
  • ⚡ Optimized Context Caching: Gemini Context Caching with 30-minute TTL for cost-effective AI responses
  • 🎯 Universal Model Support: Use cutting-edge models like gpt-4.1, claude-4-sonnet, o3, gemini-2.5-pro and more
  • 🌐 Modern Web UI: Clean, responsive React interface with conversation history, markdown rendering, and syntax highlighting
  • 🔗 Flexible Input: Supports both GitHub repositories (via URL) and local repositories (via ZIP upload)
  • 🐋 Containerized: Fully containerized with Docker for easy deployment
  • 📊 Advanced Conversation Management: Save, load, and manage multiple conversation threads with repository context

OpenDeepWiki uses an optimized microservice architecture designed for multi-repository processing:

```
┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
│    Frontend     │    │   Controller    │    │     Indexer     │
│     (React)     │◄──►│     (Flask)     │◄──►│    (FastAPI)    │
│   Port: 7860    │    │   Port: 5050    │    │   Port: 8002    │
└─────────────────┘    └─────────────────┘    └─────────────────┘
                                │
                                ▼
                       ┌─────────────────┐
                       │    Repo Chat    │
                       │    (FastAPI)    │
                       │   Port: 8001    │
                       └─────────────────┘
```

Optimized Multi-Repository Pipeline

The architecture implements an efficient multi-repository processing pipeline:

  1. Individual Repository Processing: Each repository runs through steps 1-8 (query rewriting, context caching, retrieval) independently
  2. Unified Response Generation: All retrieved contexts are combined for a single call to the Final Response Generator
  3. Cost Optimization: Reduces AI API calls while maintaining comprehensive multi-repository awareness
  4. Session Management: Thread-safe handling of multiple repository sessions with conflict resolution
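
As a concrete illustration of point 4, here is a minimal sketch of lock-guarded session handling (the `SessionRegistry` class is hypothetical, not OpenDeepWiki's actual implementation):

```python
import threading

# Illustrative sketch: a lock-guarded registry so concurrent requests can
# add, look up, and remove repository sessions without racing each other.
class SessionRegistry:
    def __init__(self):
        self._lock = threading.Lock()
        self._sessions = {}  # repo_id -> session state

    def add(self, repo_id, session):
        with self._lock:
            if repo_id in self._sessions:
                raise ValueError(f"Session for {repo_id} already exists")
            self._sessions[repo_id] = session

    def get(self, repo_id):
        with self._lock:
            return self._sessions.get(repo_id)

    def remove(self, repo_id):
        with self._lock:
            self._sessions.pop(repo_id, None)
```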

Core Services

  • Frontend (React + Vite): Modern web interface with TypeScript support and glass-morphism design
  • Controller (Flask): Enhanced API gateway with multi-repository session management
  • Indexer Service (FastAPI): Analyzes and classifies repository files, extracts documentation with conflict resolution
  • Repo Chat Service (FastAPI): Provides AI-powered responses using multi-repository context aggregation

Prerequisites

  • Docker and Docker Compose
  • Git (for cloning repositories)

1. Clone the Repository

```bash
git clone https://github.com/Flopsky/OpenDeepWiki.git
cd OpenDeepWiki
```

2. Configure Environment

Create a .env file with your API keys:

```bash
# Copy the example environment file
make env

# Edit .env with your API keys
GEMINI_API_KEY=your_gemini_api_key_here
ANTHROPIC_API_KEY=your_anthropic_api_key_here  # Optional
OPENAI_API_KEY=your_openai_api_key_here        # Optional

# Optional: Langfuse tracing
LANGFUSE_PUBLIC_KEY=your_langfuse_public_key
LANGFUSE_SECRET_KEY=your_langfuse_secret_key
LANGFUSE_HOST=https://cloud.langfuse.com
```

3. Build and Run

Using the Makefile (recommended):

```bash
# Setup everything (environment, build, and run)
make setup

# Or step by step:
make build   # Build Docker image
make run     # Run the container
```

Or using Docker directly:

```bash
# Build the Docker image
docker build -t opendeepwiki .

# Run the container
docker run -d --name opendeepwiki_app \
  -p 7860:7860 \
  -p 5050:5050 \
  -p 8001:8001 \
  -p 8002:8002 \
  --env-file .env \
  opendeepwiki
```

4. Access the Application

Open your browser and navigate to: http://localhost:7860

Managing Multiple Repositories

  1. Adding Repositories:

    • GitHub Repository: Click "Add Repository", paste the GitHub URL (e.g., https://github.com/username/repo)
    • Local Repository: Click "Upload ZIP" and select your zipped repository
  2. Repository Management:

    • View All Repositories: See all loaded repositories in the modern repository manager
    • Toggle Active/Inactive: Use the toggle button to activate/deactivate repositories for queries
    • Remove Repositories: Hover over repository cards to reveal the delete button
    • Status Monitoring: Visual indicators show repository status (Ready, Loading, Error)

Dynamic AI Model Selection

  1. Choose Any Model: Type any model name directly in the model selector

    • OpenAI: gpt-4o, gpt-4-turbo, gpt-3.5-turbo, o1-preview, o1-mini, o3-mini
    • Anthropic: claude-3.5-sonnet-20241022, claude-3-haiku-20240307, claude-3-opus-20240229
    • Google: gemini-2.5-pro-preview-03-25, gemini-1.5-flash-8b-001, gemini-1.5-pro-002
  2. Smart Auto-Complete: Get suggestions for popular models while typing

  3. Automatic Routing: The system automatically detects which provider to use based on model name

  4. API Key Management: Configure API keys for each provider in the settings
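
For instance, the auto-complete in step 2 can be approximated with a simple substring filter (a minimal sketch; the model list and the suggest helper are illustrative, not the project's actual code):

```python
# Illustrative sketch: suggest popular model names as the user types.
POPULAR_MODELS = [
    "gpt-4o", "gpt-4-turbo", "o1-preview", "o3-mini",
    "claude-3.5-sonnet-20241022", "claude-3-haiku-20240307",
    "gemini-2.5-pro-preview-03-25", "gemini-1.5-pro-002",
]

def suggest(typed: str, limit: int = 5) -> list[str]:
    query = typed.lower()
    return [m for m in POPULAR_MODELS if query in m.lower()][:limit]
```

For example, suggest("claude") would return the two Claude entries above.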

Multi-Repository Chat Experience

  1. Repository Selection:

    • Activate the repositories you want to query by toggling them "on"
    • The active repository count is displayed in the header
    • Blue indicators show which repositories are active for queries
  2. Cross-Repository Queries:

    • Ask questions that span multiple repositories: "Compare the authentication systems in my projects"
    • Get unified responses that understand relationships between different codebases
    • Responses automatically indicate which repositories contributed to the answer
  3. Smart Context Management:

    • Each repository maintains its own optimized context cache
    • Queries intelligently combine context from all active repositories
    • Single AI call processes all repository contexts for cost efficiency
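
To make the "single AI call" idea concrete, here is a hedged sketch of how per-repository contexts might be stitched into one prompt (the helper name and prompt wording are assumptions, not the project's actual code):

```python
# Hypothetical sketch: merge per-repository contexts into one prompt so a
# single model call can answer across all active repositories with attribution.
def build_combined_prompt(query: str, contexts: dict[str, str]) -> str:
    sections = [f"### Repository: {name}\n{ctx}" for name, ctx in contexts.items()]
    return (
        "Answer the question using the repository contexts below, and state "
        "which repositories informed the answer.\n\n"
        + "\n\n".join(sections)
        + f"\n\nQuestion: {query}"
    )
```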

Example Multi-Repository Queries

  • "How do the authentication systems differ between my frontend and backend repositories?"
  • "What are the common patterns used across all my projects?"
  • "Show me how to integrate the API from repo A with the frontend from repo B"
  • "Compare the database schemas in my different microservices"
  • "What dependencies are shared across my repositories?"

Conversation Management

  • New Chat: Click "New Chat" to start a fresh conversation
  • Switch Conversations: Click on any saved conversation in the sidebar
  • Delete Conversations: Use the trash icon next to conversations
  • Repository Context: Conversations remember which repositories were active
  • Persistent History: All conversations are automatically saved with repository context
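
For intuition, a minimal sketch of persisting a conversation together with its repository context (the file name and record fields are assumptions, not the actual storage format):

```python
import json
import time
import uuid

# Illustrative sketch: save a conversation along with the repositories that
# were active, so reloading it can restore the same context.
def save_conversation(messages, active_repos, path="conversations.json"):
    record = {
        "id": str(uuid.uuid4()),
        "saved_at": time.time(),
        "active_repositories": active_repos,
        "messages": messages,
    }
    try:
        with open(path) as f:
            history = json.load(f)
    except FileNotFoundError:
        history = []
    history.append(record)
    with open(path, "w") as f:
        json.dump(history, f, indent=2)
    return record["id"]
```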

🤖 Universal AI Model Support

OpenDeepWiki features Dynamic Model Selection that automatically routes requests to the appropriate AI provider based on the model name you type. This means you can use any model from any supported provider without changing settings or configuration.

  1. Type Any Model Name: Simply enter the model name in the model selector
  2. Automatic Detection: The system detects the provider based on naming patterns
  3. Smart Routing: Your request is automatically routed to the correct API
  4. Seamless Experience: All models work identically through the same interface

| Provider | Model Examples | Naming Pattern |
|-----------|----------------|----------------|
| OpenAI | gpt-4o, gpt-4-turbo, gpt-3.5-turbo, o1-preview, o1-mini, o3-mini | Contains gpt or starts with o |
| Anthropic | claude-3.5-sonnet-20241022, claude-3-haiku-20240307, claude-3-opus-20240229 | Contains claude |
| Google | gemini-2.5-pro-preview-03-25, gemini-1.5-flash-8b-001, gemini-1.5-pro-002 | Starts with gemini- |

Key Benefits

  • 🎯 Zero Configuration: No need to change settings when switching models
  • 🚀 Future-Proof: New models work automatically if they follow naming conventions
  • 💡 Intelligent: Case-insensitive detection with smart fallbacks
  • ⚡ Unified Interface: All models provide the same rich experience
  • 🔄 Easy Switching: Try different models instantly to compare results

🧠 Advanced Gemini Context Caching Technology

OpenDeepWiki leverages Gemini Context Caching with an optimized multi-repository architecture to provide efficient and cost-effective AI responses across multiple codebases.

Multi-Repository Context Caching

  1. Individual Repository Analysis: Each repository gets its own:

    • Comprehensive documentation extraction and analysis
    • Unique cached context with repository-specific display names
    • Conflict resolution for duplicate repository names
    • Independent cache lifecycle management
  2. Optimized Cache Strategy:

    • Unique Display Names: Repositories get unique identifiers using timestamps and content hashes
    • Cache Reuse: Identical repositories automatically reuse existing caches
    • Cleanup Management: Maintains 2 most recent caches per repository
    • Conflict Resolution: Handles multiple repositories with similar names gracefully
  3. Unified Query Processing:

    • Individual Processing: Steps 1-8 (query rewriting, context retrieval) run separately for each active repository
    • Combined Context: All repository contexts are aggregated for final response generation
    • Single AI Call: Only one call to Final Response Generator, reducing costs while maintaining comprehensive awareness
    • Attribution: Responses indicate which repositories contributed to the answer
```python
# Multi-repository pipeline optimization
def run_multi_repo_pipeline(query, repositories):
    contexts = []
    # Process each repository individually (steps 1-8)
    for repo in repositories:
        context = run_pipeline_up_to_context_retrieval(query, repo)
        contexts.append(context)
    # Single unified response generation (step 9)
    return generate_final_response(query, combined_contexts=contexts)

# Enhanced cache creation with unique naming
cache = caching.CachedContent.create(
    model=CONTEXT_CACHING_RETRIVER,
    display_name=f"{repo_name}_{timestamp}_{hash}",
    contents=documentation_json,
    system_instruction=system_prompt,
    ttl=datetime.timedelta(minutes=30)
)
```
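
Cache reuse can then be a lookup before creation, along these lines (a sketch assuming the same google-generativeai caching module as above; the get_or_create_cache helper is hypothetical):

```python
# Hypothetical sketch: reuse an existing Gemini cache for identical repository
# content instead of creating a new one.
def get_or_create_cache(repo_name, content_hash, create_fn):
    for cache in caching.CachedContent.list():
        name = cache.display_name or ""
        if name.startswith(f"{repo_name}_") and name.endswith(content_hash):
            return cache  # identical content already cached: reuse it
    return create_fn()  # otherwise create a fresh cache as shown above
```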

Benefits for Multi-Repository Workflows

  • ⚡ Scalable Performance: Parallel processing of repositories with optimized caching
  • 💰 Cost Efficiency: Single AI call for final response while maintaining full multi-repo context
  • 🎯 Comprehensive Understanding: AI has complete awareness of all active repository structures
  • 🔄 Smart Reuse: Automatic cache detection and reuse across sessions
  • 📊 Advanced Management: Sophisticated cache lifecycle with conflict resolution
  • 🔗 Cross-Repository Intelligence: Understands relationships and patterns across multiple codebases

Local Development

  • Python 3.12+
  • Node.js 18+
  • Docker
  1. Backend Services:

```bash
# Install Python dependencies
pip install -r requirements.txt

# Run indexer service
python -m indexer.server

# Run repo chat service
python -m repo_chat.server

# Run controller
python frontend/src/controler.py
```
  2. Frontend:

```bash
cd frontend
npm install
npm run dev
```

Testing Multi-Repository Features

```bash
# Test multi-repository API endpoints
curl -X POST http://localhost:5050/api/add_repo \
  -H "Content-Type: application/json" \
  -d '{"repo_url": "https://github.com/user/repo1"}'

curl -X GET http://localhost:5050/api/list_repos

# Test multi-repository chat
curl -X POST http://localhost:8001/multi_repo_score \
  -H "Content-Type: application/json" \
  -d '{"repositories": [...]}'
```
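
The same endpoints can be exercised from Python as well; a quick sketch using requests (payloads mirror the curl examples above):

```python
import requests

BASE = "http://localhost:5050"

# Add a repository through the controller, then list what is loaded.
resp = requests.post(
    f"{BASE}/api/add_repo",
    json={"repo_url": "https://github.com/user/repo1"},
)
resp.raise_for_status()
print(requests.get(f"{BASE}/api/list_repos").json())
```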

Makefile Commands

```bash
make help       # Show available commands
make env        # Create .env from template
make build      # Build Docker image
make run        # Run container
make stop       # Stop container
make restart    # Restart container
make logs       # View container logs
make clean      # Remove container and image
make prune-all  # Full cleanup including unused Docker objects
```

Technology Stack

Frontend

  • React 19 with TypeScript
  • Vite for build tooling
  • Modern CSS with glass-morphism effects
  • React Router for navigation
  • React Markdown for rendering
  • React Syntax Highlighter for code display
  • Advanced Animations with CSS keyframes

Backend

  • FastAPI for microservices (Indexer, Repo Chat)
  • Flask for API gateway (Controller) with session management
  • Pydantic for data validation
  • Python 3.12 runtime
  • Thread-safe multi-repository handling

AI Integration

  • Google Gemini with context caching (gemini-*, automatic API routing)
  • Anthropic Claude (claude-*, automatic API routing)
  • OpenAI GPT & Reasoning Models (gpt-*, o*, automatic API routing)
  • Dynamic Model Selection with intelligent provider detection
  • Langfuse (optional tracing)
  • Optimized Pipeline for multi-repository processing

Infrastructure

  • Docker for containerization
  • Supervisord for process management
  • Nginx for static file serving (in container)

Project Structure

```
OpenDeepWiki/
├── frontend/                      # React frontend application
│   ├── src/
│   │   ├── controler.py           # Flask API gateway with multi-repo support
│   │   ├── components/
│   │   │   └── RepositoryManager.jsx   # Multi-repository management
│   │   ├── services/
│   │   │   └── api.js             # Enhanced API with multi-repo endpoints
│   │   ├── styles/
│   │   │   └── opendeepwiki-theme.css  # Modern UI styles
│   │   └── ...                    # React components and pages
│   ├── package.json
│   └── vite.config.js
├── indexer/                       # File classification service
│   ├── server.py                  # FastAPI server
│   ├── service.py                 # Classification logic
│   └── schema.py                  # Data models
├── repo_chat/                     # AI chat service
│   ├── server.py                  # FastAPI server with multi-repo endpoints
│   ├── service.py                 # Enhanced chat logic with multi-repo pipeline
│   └── schema.py                  # Data models
├── src/                           # Core utilities and shared code
│   ├── core/                      # Core functionality with cache management
│   ├── utils/                     # Utility functions
│   └── schemas/                   # Shared data models
├── MULTI_REPO_ARCHITECTURE.md     # Detailed architecture documentation
├── Dockerfile                     # Container definition
├── supervisord.conf               # Process management
├── Makefile                       # Build and deployment commands
├── requirements.txt               # Python dependencies
└── README.md                      # This file
```

Environment Variables

| Variable | Required | Description |
|----------|----------|-------------|
| GEMINI_API_KEY | ✅ Yes | Google Gemini API key (required for gemini-* models) |
| ANTHROPIC_API_KEY | ❌ No | Anthropic Claude API key (required for claude-* models) |
| OPENAI_API_KEY | ❌ No | OpenAI API key (required for gpt-* and o* models) |
| LANGFUSE_PUBLIC_KEY | ❌ No | Langfuse public key for tracing |
| LANGFUSE_SECRET_KEY | ❌ No | Langfuse secret key for tracing |
| LANGFUSE_HOST | ❌ No | Langfuse host URL |

The system automatically routes to the correct provider based on model name:

  • Gemini Models: Any model starting with gemini- (e.g., gemini-2.5-pro-preview-03-25)
  • OpenAI Models: Any model containing gpt or starting with o (e.g., gpt-4o, o1-preview)
  • Claude Models: Any model containing claude (e.g., claude-3.5-sonnet-20241022)
  • Fallback: Unknown models default to Gemini for backward compatibility
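
A minimal sketch of this detection logic, following the rules above (the function name is illustrative):

```python
# Illustrative routing: detect the provider from the model name,
# case-insensitively, with Gemini as the backward-compatible fallback.
def detect_provider(model_name: str) -> str:
    name = model_name.lower()
    if name.startswith("gemini-"):
        return "google"
    if "claude" in name:
        return "anthropic"
    if "gpt" in name or name.startswith("o"):
        return "openai"
    return "google"  # unknown models default to Gemini
```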

Service Ports

| Service | Port | Description |
|---------|------|-------------|
| Frontend | 7860 | Main web interface |
| Controller | 5050 | API gateway with multi-repo support |
| Repo Chat | 8001 | AI chat service with multi-repo endpoints |
| Indexer | 8002 | File analysis service |
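
After startup, a quick way to confirm every service is listening (a hypothetical smoke test, not part of the repository):

```python
import socket

# Ports from the table above.
SERVICES = {"Frontend": 7860, "Controller": 5050, "Repo Chat": 8001, "Indexer": 8002}

for name, port in SERVICES.items():
    with socket.socket() as s:
        s.settimeout(2)
        reachable = s.connect_ex(("localhost", port)) == 0
    print(f"{name}: port {port} is {'up' if reachable else 'down'}")
```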

Contributing

We welcome contributions! Here's how you can help:

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

Development Guidelines

  • Follow Python PEP 8 style guidelines
  • Use TypeScript for frontend development
  • Maintain backward compatibility when possible
  • Add tests for new features
  • Update documentation as needed
  • Consider multi-repository implications for new features

Roadmap

  • ✅ Basic repository analysis and indexing
  • ✅ AI-powered chat interface
  • ✅ Multiple LLM support (Gemini, Claude, OpenAI)
  • ✅ Dynamic model selection with automatic provider routing
  • ✅ Universal model support (gpt-4o, claude-3.5-sonnet, o1-preview, etc.)
  • ✅ Conversation history management
  • ✅ Local repository upload via ZIP
  • ✅ Modern React UI with TypeScript
  • ✅ Docker containerization
  • ✅ Multi-repository support with optimized pipeline
  • ✅ Modern glass-morphism UI with animations
  • ✅ Enhanced Gemini Context Caching with conflict resolution
  • ✅ Thread-safe session management
  • 🔄 Add support for Anthropic extended context caching
  • 🔄 More advanced RAG techniques for better cross-repository context
  • 🔄 File browser for multi-repository exploration
  • 🔄 Code generation and modification capabilities across repositories
  • 🔄 Integration with IDEs and editors
  • 🔄 Team collaboration features with shared repository collections
  • 🔄 Repository dependency analysis and visualization
  • 🔄 Advanced repository comparison and diff features

Troubleshooting

Common Issues

  1. Services not starting: Check that all required ports are available
  2. API errors: Verify your API keys are correctly set in .env
  3. Repository analysis fails: Ensure the repository URL is accessible
  4. Docker build fails: Make sure you have sufficient disk space
  5. Multi-repository conflicts: Check repository manager for status indicators
  6. Context caching errors: Verify Gemini API key and check cache management
  1. "API key required" errors:

    • For gpt-* or o* models: Configure OPENAI_API_KEY
    • For claude-* models: Configure ANTHROPIC_API_KEY
    • For gemini-* models: Configure GEMINI_API_KEY
  2. Model not recognized:

    • Check the model name spelling
    • Verify the model follows supported naming patterns
    • Unknown models automatically default to Gemini
  3. Model switching not working:

    • Clear your browser cache
    • Check the model selector shows your typed value
    • Verify the correct API key is configured for the model type

Multi-Repository Specific Issues

  1. Repository not appearing: Check the repository manager status and error messages
  2. Queries not working across repositories: Ensure repositories are toggled "active"
  3. Cache conflicts: Repository names are automatically made unique with timestamps
  4. Performance issues: Consider reducing the number of active repositories for large queries

Getting Help

  • Check the Issues page
  • Review the logs: make logs
  • Test services: python test_services.py
  • Review the detailed MULTI_REPO_ARCHITECTURE.md for technical details

This project is licensed under the terms specified in the license file.

  • Built with ❤️ using modern web technologies and advanced AI capabilities
  • Powered by Google Gemini Context Caching for optimal performance
  • Inspired by the need for better multi-repository code documentation and understanding
  • Special thanks to the open-source community for the amazing tools and frameworks

Happy Multi-Repository Coding! 🚀

Transform your development workflow with OpenDeepWiki's powerful multi-repository AI assistance. Whether you're working on microservices, multiple projects, or complex codebases, OpenDeepWiki helps you understand and navigate your code like never before.

For questions, issues, or contributions, please visit our GitHub repository.
