semcache is a semantic caching layer for your LLM applications.
Start the Semcache Docker image:
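A minimal invocation, assuming the image is published as `semcache/semcache` and serves on port 8080 (check the official docs for the actual image name and port):

```shell
# Pull and run Semcache, exposing its HTTP port on localhost:8080.
# Image name and port are assumptions; adjust to your deployment.
docker run -p 8080:8080 semcache/semcache:latest
```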
Configure your application, e.g. with the OpenAI Python SDK:
Node.js follows a similar pattern of changing the base URL to point to your Semcache host:
- 🧠 Completely in-memory - Prompts, responses, and the vector database are stored in memory
- 🎯 Flexible by design - Can work with your custom or private LLM APIs
- 🔌 Support for major LLM APIs - OpenAI, Anthropic, Gemini, and more
- ⚡ HTTP proxy mode - Drop-in replacement that reduces costs and latency
- 📈 Prometheus metrics - Full observability out of the box
- 📊 Built-in dashboard - Monitor cache performance at /admin
- 📤 Smart eviction - LRU cache eviction policy
Semcache is still in beta and being actively developed.
Semcache accelerates LLM applications by caching responses based on semantic similarity.
When you make a request, Semcache first searches for previously cached answers to similar prompts and delivers them immediately. This eliminates redundant API calls, reducing both latency and costs.
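The lookup can be pictured as an embedding nearest-neighbour search with a similarity threshold. A toy sketch, where the dummy embedding function and the threshold value are illustrative stand-ins for a real embedding model and tuned cutoff:

```python
import math

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

class SemanticCache:
    """Toy in-memory semantic cache: store (embedding, response) pairs
    and serve a hit when a new prompt's embedding is close enough."""

    def __init__(self, embed, threshold=0.9):
        self.embed = embed          # prompt -> vector
        self.threshold = threshold  # minimum similarity for a hit
        self.entries = []           # list of (vector, response)

    def get(self, prompt):
        v = self.embed(prompt)
        best = max(self.entries, key=lambda e: cosine(v, e[0]), default=None)
        if best and cosine(v, best[0]) >= self.threshold:
            return best[1]          # cache hit: skip the LLM call
        return None                 # cache miss: call the provider, then put()

    def put(self, prompt, response):
        self.entries.append((self.embed(prompt), response))
```

On a miss, the application (or the proxy, in Semcache's case) calls the upstream LLM and stores the new pair with `put()`.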
Semcache also operates in a "cache-aside" mode, allowing you to load prompts and responses yourself.
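In cache-aside mode you populate the cache over HTTP yourself rather than letting the proxy fill it. A sketch with `requests`, where the endpoint path and JSON shape are placeholders only; the real ones are defined in the Semcache docs:

```python
import requests

SEMCACHE = "http://localhost:8080"  # assumed local instance

def put_entry(prompt: str, response: str) -> None:
    """Pre-load a prompt/response pair into the cache.

    The path and payload schema below are hypothetical; consult your
    Semcache version's documentation for the actual write endpoint.
    """
    requests.put(
        f"{SEMCACHE}/write",  # placeholder path
        json={"prompt": prompt, "response": response},
        timeout=5,
    ).raise_for_status()
```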
For comprehensive provider configuration and detailed code examples, visit our LLM Providers & Tools documentation.
Point your existing SDK to Semcache instead of the provider's endpoint.
OpenAI
Anthropic
LangChain
LiteLLM
Install with:
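For example via Docker (the image name is an assumption; see the docs for the published artifact):

```shell
# Pull the Semcache image (name assumed to be semcache/semcache).
docker pull semcache/semcache:latest
```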
Configure via environment variables or config.yaml:
Environment variables (prefix with SEMCACHE_):
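For instance, in a shell (the variable names below are illustrative; only the `SEMCACHE_` prefix is fixed, and the same keys can live in `config.yaml`):

```shell
# Illustrative settings; consult the docs for the real key names.
# Each SEMCACHE_* variable overrides the matching config.yaml key.
export SEMCACHE_PORT=8080
export SEMCACHE_SIMILARITY_THRESHOLD=0.85
```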
Semcache emits comprehensive Prometheus metrics for production monitoring.
Check out our /monitoring directory for our custom Grafana dashboard.
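A minimal Prometheus scrape job, assuming metrics are exposed on the conventional `/metrics` path of the proxy port:

```yaml
# prometheus.yml fragment: scrape a local Semcache instance.
scrape_configs:
  - job_name: semcache
    static_configs:
      - targets: ["localhost:8080"]  # /metrics path assumed
```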
Access the admin dashboard at /admin to monitor cache performance.
Our managed version of Semcache provides you with semantic caching as a service.
Features we offer:
- Custom text embedding models for your specific business
- Persistent storage allowing you to build application memory over time
- In-depth analysis of your LLM responses
- SLA support and dedicated engineering resources
Contact us at [email protected]
Interested in contributing? Contributions to Semcache are welcome! Feel free to open a PR.
Built with ❤️ in Rust • Documentation • GitHub Issues