This project demonstrates an approach to semantic data access: using a "virtual ontology" layer that enables natural language REPLs (like Claude Code) to directly translate business questions into SQL. By combining traditional ontology concepts with modern LLM capabilities, we can query existing databases without the overhead of formal semantic systems.
An alternative data access pattern:
Traditional: Raw Data → ETL → Reports → Insights
Virtual Ontology: Raw Data → Semantic Layer → Natural Language Querying → Actionable Insights
Clone the repo, open a Claude Code session, and load `@sys_prompt.md` in the chat.
The virtual ontology approach:
- Keep data in place - Query existing SQL databases directly
- Define business concepts - Lightweight ontology (TBox/RBox) maps concepts to schema
- Leverage agent capabilities - Natural language REPLs handle the translation
- Learn from usage - Capture successful patterns for reuse
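To make the "lightweight ontology" idea concrete, here is a minimal sketch of a TBox (classes) and RBox (relationships) held as plain Python dictionaries. The class and relationship names echo the ones listed in this README, but the table names, the dictionary layout, and the `undefined_classes` helper are assumptions for illustration; the repository's own ontology file may be organized differently.

```python
# Hypothetical layout: the table names and structure are illustrative assumptions.
TBOX = {  # business class -> backing table (assumed table names)
    "Equipment": "equipment",
    "Product": "products",
    "Event": "mes_events",
    "DowntimeReason": "downtime_reasons",
}

RBOX = {  # relationship -> (domain class, range class)
    "isUpstreamOf": ("Equipment", "Equipment"),
    "produces": ("Equipment", "Product"),
}


def undefined_classes(tbox, rbox):
    """Return class names used by relationships but missing from the TBox."""
    used = {cls for pair in rbox.values() for cls in pair}
    return sorted(used - set(tbox))


# An empty list means every relationship refers to a defined class.
print(undefined_classes(TBOX, RBOX))
```

A check like this is cheap to run before handing the ontology to the agent, so that a typo in a relationship definition surfaces early rather than as a confusing SQL error.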
Through testing with real manufacturing data:
- Identified significant improvement opportunities through systematic analysis
- 86% query success rate on the first attempt, 98% after a single refinement
- Handles temporal patterns, cascading failures, financial calculations
Questions like these translate effectively to SQL:
- "Which equipment is the bottleneck on each line?"
- "What's the financial impact of material jams?"
- "Show cascade failures from upstream equipment"
- "Find quality issues specific to morning shifts"
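As a rough illustration of what such a translation might look like, the sketch below runs one plausible SQL rendering of "What's the financial impact of material jams?" against an in-memory SQLite database. The `mes_events` table, its columns, and the sample rows are all invented for this example and do not reflect the repository's actual schema or data.

```python
import sqlite3

# Hypothetical schema and rows, invented for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE mes_events (
        equipment_id     TEXT,
        downtime_reason  TEXT,
        downtime_minutes REAL,
        cost_per_minute  REAL
    )
    """
)
conn.executemany(
    "INSERT INTO mes_events VALUES (?, ?, ?, ?)",
    [
        ("FILLER-1", "Material Jam", 10.0, 5.0),
        ("FILLER-1", "Material Jam", 20.0, 5.0),
        ("PACKER-1", "Material Jam", 5.0, 8.0),
        ("PACKER-1", "Changeover", 30.0, 8.0),
    ],
)

# One possible translation of the question; an agent could reasonably
# produce a different but equivalent query.
MATERIAL_JAM_COST_SQL = """
    SELECT equipment_id,
           SUM(downtime_minutes * cost_per_minute) AS jam_cost
    FROM mes_events
    WHERE downtime_reason = 'Material Jam'
    GROUP BY equipment_id
    ORDER BY jam_cost DESC
"""

jam_costs = conn.execute(MATERIAL_JAM_COST_SQL).fetchall()
for equipment_id, jam_cost in jam_costs:
    print(f"{equipment_id}: {jam_cost:.2f}")
```

The ontology's role here is to tell the agent that "material jams" corresponds to a value of the downtime reason column and that "financial impact" means downtime multiplied by a cost rate.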
- Intent capture with every query
- Successful pattern extraction and reuse
- Gradual improvement through usage
- Handles 36,000+ record datasets efficiently
- Supports complex aggregations and window functions
- Integrates well with Python (visualization and advanced analysis)
This repository includes a working implementation using Manufacturing Execution System (MES) data that demonstrates the approach:
- Capacity Analysis: Compare best vs current performance
- Bottleneck Detection: Identify constraining equipment
- Quality Investigation: Find patterns in defect rates
- Downtime Impact: Calculate financial cost of stoppages
- Cascade Analysis: Trace upstream/downstream effects
Defines business concepts, relationships, and rules:
- Classes: Equipment, Products, Events, Downtime Reasons
- Relationships: upstream/downstream, belongs-to, produces
- Business Rules: OEE calculation, cascade patterns, quality impact
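The OEE business rule can be sketched as a small function. This uses the commonly cited definition (availability × performance × quality); whether the repository's rule matches it exactly is an assumption, and the `ShiftTotals` fields are hypothetical names chosen for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ShiftTotals:
    """Hypothetical per-shift totals; field names are illustrative."""
    planned_minutes: float      # scheduled production time
    downtime_minutes: float     # stoppages within the scheduled time
    ideal_cycle_minutes: float  # ideal minutes needed per unit
    total_units: int
    good_units: int


def oee(t: ShiftTotals) -> float:
    """Overall Equipment Effectiveness as availability * performance * quality."""
    run_minutes = t.planned_minutes - t.downtime_minutes
    # Guard against division by zero for empty or fully stopped shifts.
    if t.planned_minutes <= 0 or run_minutes <= 0 or t.total_units <= 0:
        return 0.0
    availability = run_minutes / t.planned_minutes
    performance = (t.ideal_cycle_minutes * t.total_units) / run_minutes
    quality = t.good_units / t.total_units
    return availability * performance * quality


shift = ShiftTotals(
    planned_minutes=480,
    downtime_minutes=80,
    ideal_cycle_minutes=0.5,
    total_units=640,
    good_units=608,
)
print(f"OEE: {oee(shift):.1%}")
```

Encoding the rule once in the ontology layer means every generated query computes OEE the same way, instead of each question re-deriving it.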
Maps ontological concepts to database structure:
- Column mappings with business names
- Data types and constraints
- Indexed fields for performance
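A column mapping of this kind might look like the sketch below, which pairs business names with physical columns and builds an aliased `SELECT` clause from them. The `ColumnMapping` type, the `mes_events` table, and the column names are assumptions made for illustration, not the repository's actual mapping format.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class ColumnMapping:
    """Hypothetical mapping from a business name to a physical column."""
    business_name: str
    table: str
    column: str
    sql_type: str
    indexed: bool = False


# Illustrative entries only; real table and column names will differ.
MAPPINGS = [
    ColumnMapping("Equipment", "mes_events", "equipment_id", "TEXT", indexed=True),
    ColumnMapping("Downtime Reason", "mes_events", "downtime_reason", "TEXT"),
    ColumnMapping("Event Time", "mes_events", "event_ts", "TEXT", indexed=True),
]


def select_clause(business_names: list[str]) -> str:
    """Build a SELECT clause that aliases each column to its business name."""
    by_name = {m.business_name: m for m in MAPPINGS}
    parts = []
    for name in business_names:
        m = by_name[name]  # raises KeyError for an unmapped business name
        parts.append(f'{m.table}.{m.column} AS "{name}"')
    return "SELECT " + ", ".join(parts)


print(select_clause(["Equipment", "Downtime Reason"]))
```

Keeping the `indexed` flag next to each mapping gives the agent a hint about which fields are cheap to filter on.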
Intent-aware query execution with pattern learning:
- Captures business intent (≤140 chars)
- Logs successful patterns
- Builds reusable query library
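The three behaviors above could be combined in a wrapper along these lines. This is a sketch under stated assumptions: `run_with_intent`, the in-memory `pattern_log` list, and the use of SQLite are hypothetical choices for the example, and the repository's own tool may store patterns differently.

```python
import sqlite3

MAX_INTENT_CHARS = 140  # the intent limit stated in this README


def run_with_intent(conn, intent, sql, pattern_log):
    """Run a query and record it as a reusable pattern only if it succeeds.

    Hypothetical helper: a failing query raises sqlite3.Error and is not logged.
    """
    if len(intent) > MAX_INTENT_CHARS:
        raise ValueError(f"intent must be at most {MAX_INTENT_CHARS} characters")
    rows = conn.execute(sql).fetchall()  # an exception here skips the logging below
    pattern_log.append({"intent": intent, "sql": sql, "row_count": len(rows)})
    return rows


# Usage with a throwaway in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (x INTEGER)")
conn.execute("INSERT INTO t VALUES (1)")

log = []
rows = run_with_intent(conn, "Count rows in t", "SELECT COUNT(*) FROM t", log)
print(rows, log)
```

Logging only after a successful execution keeps the pattern library free of queries that never ran, which matters if the agent later reuses logged SQL as a starting point.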
| Traditional Ontology | Virtual Ontology |
|---|---|
| Requires ETL to RDF/OWL | Works with existing SQL databases |
| SPARQL expertise needed | Natural language queries |
| Complex triple stores | Simple SQL execution |
| Static ontology definitions | Evolving pattern library |
| Slow iteration cycles | Rapid exploration |
| High implementation cost | Low barrier to entry |
This project explores the intersection of semantic technologies and modern LLM capabilities. We welcome:
- Use case implementations in new domains
- Ontology improvements and extensions
- Pattern learning enhancements
- Performance optimizations
- Documentation and tutorials
If you use this virtual ontology approach in your work:
[To be determined]
This project demonstrates that combining traditional ontology concepts with modern agent-based tools can create practical solutions for semantic data access.