A semantic code search tool for intelligent, cross-repo context retrieval.
- AST-Based Chunking: Intelligent code parsing using Abstract Syntax Trees for optimal chunk boundaries
- Embedding & Semantic Search: Using OpenAI's text-embedding-3-small model (support for voyage-code-3 planned)
- Vector Database: PostgreSQL with pgvector extension for efficient similarity search
- Multi-Language Support: TypeScript, JavaScript, and extensible for other languages
- Multi-Project Support: Index and search multiple projects
- MCP Integration: Seamlessly connects with AI coding assistants through Model Context Protocol
h-codex can be integrated with AI assistants through the Model Context Protocol.
Edit your claude_mcp_settings.json file:
{
"mcpServers": {
"h-codex": {
"command": "npx",
"args": ["@hpbyte/h-codex-mcp"],
"env": {
"OPENAI_API_KEY": "your_openai_api_key_here",
"DB_CONNECTION_STRING": "postgresql://postgres:password@localhost:5432/h-codex"
}
}
}
}
- Node.js (v18+)
- pnpm - Package manager
- Docker - For running PostgreSQL with pgvector
- OpenAI API key for embeddings
-
Clone the repository
git clone https://github.com/hpbyte/h-codex.git cd h-codex -
Set up environment variables
cp packages/core/.env.example packages/core/.envEdit the .env file with your OpenAI API key and other configuration options.
-
Install dependencies
-
Start PostgreSQL database
cd dev && docker compose up -d -
Set up the database
-
Start development server
OPENAI_API_KEY | OpenAI API key for embeddings | Required |
EMBEDDING_MODEL | OpenAI model for embeddings | text-embedding-3-small |
CHUNK_SIZE | Maximum chunk size in characters | 1000 |
SEARCH_RESULTS_LIMIT | Max search results returned | 10 |
SIMILARITY_THRESHOLD | Minimum similarity for results | 0.5 |
DB_CONNECTION_STRING | PostgreSQL connection string | postgresql://postgres:password@localhost:5432/h-codex |
graph TD
subgraph "Core Package"
subgraph "Ingestion Pipeline"
Explorer["Explorer<br/>(file discovery)"]
Chunker["Chunker<br/>(AST parsing & chunking)"]
Embedder["Embedder<br/>(semantic embeddings)"]
Indexer["Indexer<br/>(orchestration)"]
Explorer --> Chunker
Chunker --> Embedder
Embedder --> Indexer
end
subgraph "Storage Layer"
Repository["Repository"]
end
Indexer --> Repository
Repository --> Database[(PostgreSQL Vector Database)]
end
subgraph "MCP Package"
MCPServer["MCP Server"]
CodeIndexTool["Code Index Tool"]
CodeSearchTool["Code Search Tool"]
MCPServer --> CodeIndexTool
MCPServer --> CodeSearchTool
end
CodeIndexTool --> Indexer
CodeSearchTool --> Repository
- Support for additional embedding providers (Voyage AI)
- Enhanced language support with more tree-sitter parsers
This project is licensed under the MIT License