Datus-agent, Claude-Code-like CLI for data engineers


Datus is an open-source data engineering agent that builds evolvable context for your data system.

Data engineering needs a shift from "building tables and pipelines" to "delivering scoped, domain-aware agents for analysts and business users."

Datus Architecture

  • Datus-CLI: An AI-powered command-line interface for data engineers—think "Claude Code for data engineers." Write SQL, build subagents, and construct context interactively.
  • Datus-Chat: A web chatbot providing multi-turn conversations with built-in feedback mechanisms (upvotes, issue reports, success stories) for data analysts.
  • Datus-API: APIs for other agents or applications that need stable, accurate data services.

🧩 Contextual Data Engineering

Automatically builds a living semantic map of your company’s data — combining metadata, metrics, SQL history, and external knowledge — so engineers and analysts collaborate through context instead of raw SQL.

A Claude-Code-like CLI for data engineers.
Chat with your data, recall tables or metrics instantly, and run agentic actions — all in one terminal.

🧠 Subagents for Every Domain

Turn data domains into domain-aware chatbots.
Each subagent encapsulates the right context, tools, and rules — making data access accurate, reusable, and safe.

🔁 Continuous Learning Loop

Every query and every piece of feedback improves the model.
Datus learns from success stories and user corrections to improve its reasoning accuracy over time.


Requirements: Python >= 3.12
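Before installing, it can help to confirm that the active interpreter satisfies this requirement. The snippet below is a minimal sketch using only the standard library; the `meets_requirement` helper is a name made up for this example and is not part of datus-agent.

```python
import sys

# Taken from the stated requirement: Python >= 3.12.
MIN_PYTHON = (3, 12)


def meets_requirement(version_info=None):
    """Return True if a (major, minor, ...) tuple satisfies MIN_PYTHON.

    Hypothetical helper for illustration; defaults to the running interpreter.
    """
    if version_info is None:
        version_info = sys.version_info
    # Compare only (major, minor); patch level does not matter here.
    return tuple(version_info[:2]) >= MIN_PYTHON


if __name__ == "__main__":
    current = ".".join(str(part) for part in sys.version_info[:3])
    status = "OK" if meets_requirement() else "too old for datus-agent"
    print(f"Python {current}: {status}")
```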

pip install datus-agent==0.2.1
datus-agent init

For detailed installation instructions, see the Quickstart Guide.

A Data Engineer (DE) starts by chatting with the database using /chat. They run simple questions, test joins, and refine prompts using @table or @file. Each round of feedback (e.g., "Join table1 and table2 by PK") helps the model improve accuracy.

datus-cli --namespace demo
/Check the top 10 bank by assets lost @Table duckdb-demo.main.bank_failures
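To make the @Table reference in the example prompt concrete, here is a rough sketch of how such a reference could be pulled out of a prompt string. It is not Datus's actual parser. The regular expression, the `TableRef` type, and the reading of the dotted name as database, schema, and table are all assumptions made for illustration.

```python
import re
from typing import NamedTuple


class TableRef(NamedTuple):
    # Assumed interpretation of the three dotted parts; not confirmed by Datus docs.
    database: str
    schema: str
    table: str


# Hypothetical pattern: "@Table" (any case), whitespace, then a three-part dotted name.
_TABLE_REF = re.compile(r"@table\s+([^\s.]+)\.([^\s.]+)\.([^\s.]+)", re.IGNORECASE)


def extract_table_refs(prompt: str) -> list[TableRef]:
    """Return every @Table reference found in the prompt, in order of appearance."""
    return [TableRef(*match.groups()) for match in _TABLE_REF.finditer(prompt)]


prompt = "/Check the top 10 bank by assets lost @Table duckdb-demo.main.bank_failures"
refs = extract_table_refs(prompt)
print(refs)
```

Splitting the reference into named parts shows why pinning a table in the prompt narrows what the agent has to reason about: the question is scoped to one known table instead of the whole catalog.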

Learn more: CLI Introduction

The DE imports SQL history and generates summaries or semantic models:

/gen_semantic_model xxx @subject

They edit or refine models in @subject, combining AI-generated drafts with human corrections. Now, /chat can reason using both SQL history and semantic context.

Learn more: Knowledge Base Introduction

When the context matures, the DE defines a domain-specific chatbot (Subagent):

.subagent add mychatbot

They describe its purpose, add rules, choose tools, and limit scope (e.g., 5 tables). Each subagent becomes a reusable, scoped assistant for a specific business area.
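One way to picture what a subagent definition holds is a small data structure with a purpose, rules, tools, and a bounded table scope. The sketch below is illustrative only: `SubagentSpec`, its fields, and the tool name are hypothetical, and Datus's real configuration format may look quite different. The limit of 5 tables mirrors the example in the text.

```python
from dataclasses import dataclass, field


@dataclass
class SubagentSpec:
    """Hypothetical model of a scoped subagent; not the Datus schema."""

    name: str
    purpose: str
    rules: list[str] = field(default_factory=list)
    tools: list[str] = field(default_factory=list)
    tables: list[str] = field(default_factory=list)
    max_tables: int = 5  # scope limit, following the "5 tables" example

    def __post_init__(self):
        # Reject definitions that exceed the declared scope.
        if len(self.tables) > self.max_tables:
            raise ValueError(
                f"{self.name}: {len(self.tables)} tables exceeds limit of {self.max_tables}"
            )

    def can_access(self, table: str) -> bool:
        """Return True only for tables explicitly placed in scope."""
        return table in self.tables


spec = SubagentSpec(
    name="mychatbot",
    purpose="Answer questions about bank failures",
    rules=["Join table1 and table2 by PK"],
    tools=["run_sql"],  # made-up tool name
    tables=["duckdb-demo.main.bank_failures"],
)
print(spec.can_access("duckdb-demo.main.bank_failures"))
print(spec.can_access("duckdb-demo.main.salaries"))
```

Keeping the scope as an explicit allow-list is what makes the assistant "safe" in the sense described here: anything not listed is out of reach by default.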

Learn more: Subagent Introduction

4️⃣ Delivering to Analysts

The Subagent is deployed to a web interface: http://localhost:8501/?subagent=mychatbot
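The link above selects a subagent through the `subagent` query parameter. As a small sketch, a link like this can be generated for any subagent name with the standard library. The `subagent_url` helper is a hypothetical name, and the base address is simply the localhost URL shown in the text.

```python
from urllib.parse import urlencode


def subagent_url(name: str, base: str = "http://localhost:8501/") -> str:
    """Build a chatbot link for a subagent (hypothetical helper).

    urlencode escapes characters that are not safe in a query string.
    """
    return f"{base}?{urlencode({'subagent': name})}"


print(subagent_url("mychatbot"))
```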

Analysts chat directly, upvote correct answers, or report issues for feedback. Results can be saved via !export.

Learn more: Web Chatbot Introduction

5️⃣ Refinement & Iteration

Feedback from analysts loops back to improve the subagent: engineers fix SQL, add rules, and update context. Over time, the chatbot becomes more accurate, self-evolving, and domain-aware.
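The triage step in this loop can be sketched as a simple tally: collect upvotes and issue reports per question, then surface the questions where issues outweigh upvotes. This is an illustrative heuristic, not how Datus processes feedback. The event format and the `questions_needing_review` function are assumptions made for this example.

```python
from collections import Counter


def questions_needing_review(events, min_reports=1):
    """Return questions whose issue reports outnumber their upvotes.

    events: iterable of (question, kind) pairs, kind being "upvote" or "issue".
    Hypothetical format; results are ordered by most-reported first.
    """
    upvotes, issues = Counter(), Counter()
    for question, kind in events:
        if kind == "upvote":
            upvotes[question] += 1
        elif kind == "issue":
            issues[question] += 1
    flagged = [
        q for q, n in issues.items()
        if n >= min_reports and n > upvotes[q]
    ]
    # Sort by issue count descending, then alphabetically for stable output.
    return sorted(flagged, key=lambda q: (-issues[q], q))


events = [
    ("q1", "upvote"), ("q1", "issue"),
    ("q2", "issue"), ("q2", "issue"),
    ("q3", "upvote"),
    ("q4", "issue"),
]
print(questions_needing_review(events))
```

A list like this gives engineers a starting point for deciding which SQL to fix or which rules to add first.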

For detailed guidance, please follow our tutorial.
