Hi HN,
I am one of the founders of Tabsdata, a new system for data integration built around publish/subscribe for tables.
Tabsdata is a self-managed system you can deploy in your own environment. It is built in Rust for performance and safety, with Python bindings for expressing data flows. You get both a browser-based UI and a CLI that covers the full surface area of the platform.
We started this project after spending nearly two decades in data integration. Over that time, we have seen the modern data stack emerge in layers: ingestion, preparation, quality, productization, and so on. Many of those layers were designed to handle semi-structured data and schema drift. But that complexity often left data teams with brittle workflows and limited visibility.
One of the key lessons for us was this: no matter how data starts out, it always turns into tables. APIs, logs, Parquet files, event streams - everything eventually lands in tabular form. So we made tables the first-class unit in the system, right from ingestion.
The second lesson was about semantics and metadata. Every layer between producer and consumer tends to dilute meaning: reshaping schemas, masking lineage, or dropping ownership context. This forces teams to build new layers just to patch over what was lost. We believe a better approach is to carry the table, its metadata, and its semantic intent together from start to finish. Tabsdata makes that the default.
Instead of stitching together ingestion, transformation, orchestration, and governance across fragmented tools and platforms, Tabsdata provides a single declarative system where producers publish, transformers enrich, and consumers subscribe. Every step is versioned and every change is explainable.
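To make the publish/transform/subscribe shape concrete, here is a toy model in plain Python. This is illustration only, not the actual Tabsdata API; all names (tables, transformer, subscribe, publish) are made up for this sketch:

```python
# Toy model of publish/subscribe for tables (NOT the Tabsdata API):
# producers publish tables, transformers derive new tables from them,
# and consumers subscribe to the results. Updates propagate immediately.
from typing import Callable

tables: dict[str, list[dict]] = {}                  # table name -> rows
transformers: list[tuple[str, str, Callable]] = []  # (input, output, fn)
subscribers: dict[str, list[Callable]] = {}         # table name -> callbacks

def transformer(src: str, dst: str):
    """Register a function that derives table `dst` from table `src`."""
    def wrap(fn):
        transformers.append((src, dst, fn))
        return fn
    return wrap

def subscribe(name: str, fn: Callable) -> None:
    subscribers.setdefault(name, []).append(fn)

def publish(name: str, rows: list[dict]) -> None:
    """Store the table, notify subscribers, and re-run dependents."""
    tables[name] = rows
    for fn in subscribers.get(name, []):
        fn(rows)
    for src, dst, fn in transformers:   # propagate through the graph
        if src == name:
            publish(dst, fn(rows))

@transformer("orders", "order_totals")
def totals(rows):
    return [{"total": sum(r["amount"] for r in rows)}]

seen = []
subscribe("order_totals", seen.append)
publish("orders", [{"amount": 10}, {"amount": 5}])
# seen now holds the derived table: [[{"total": 15}]]
```

The point of the sketch is the shape of the flow: nobody schedules a job; publishing an input is what triggers everything downstream.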
If you are curious, here is a short overview and demo we shared on Practical Data Community:
https://www.youtube.com/watch?v=qCZIRC9khmA
You can also learn more here:
Typical Use Cases
* Instant ETL - No jobs to schedule. Every update flows through the dependency graph immediately, so consumer-facing tables are always in sync with the latest from producers.
* Instant Lineage - Each table refresh produces a lineage graph that captures the versions of its inputs and outputs and the transformations that connected them.
* Declarative DataOps - No pipelines to wrangle. Transformations are attached to tables directly, using declarative steps with clear inputs and outputs. All changes to tables are fully reproducible.
* Legacy ETL Modernization - Tabsdata maps cleanly to Informatica- and Talend-style mappings. You can migrate without rewriting logic from scratch or disrupting downstream users.
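The lineage idea above, where every refresh records exactly which input versions it consumed, can be modeled in a few lines. Again, this is a toy sketch for illustration, not Tabsdata's implementation; the names (versions, lineage, refresh) are invented here:

```python
# Toy sketch of per-refresh lineage (NOT Tabsdata's implementation):
# every table refresh bumps the table's version and records the exact
# (table, version) pairs it was built from, so any output version can
# be traced back to the inputs that produced it.
versions: dict[str, int] = {}                 # table -> latest version
lineage: dict[tuple[str, int], list] = {}     # (table, version) -> inputs

def refresh(name: str, inputs: list[tuple[str, int]]) -> int:
    versions[name] = versions.get(name, 0) + 1
    lineage[(name, versions[name])] = inputs
    return versions[name]

v1 = refresh("orders", [])                               # first publish
refresh("order_totals", [("orders", v1)])                # derived refresh
v2 = refresh("orders", [])                               # producer publishes again
refresh("order_totals", [("orders", v2)])                # derived refresh

# order_totals v2 traces back to orders v2, v1 to orders v1:
assert lineage[("order_totals", 2)] == [("orders", 2)]
assert lineage[("order_totals", 1)] == [("orders", 1)]
```

Because the lineage record is written as a side effect of the refresh itself, it can never drift out of date the way externally maintained lineage catalogs do.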
What Tabsdata Is Not
* It is not a message broker or event system. You should still use Kafka, Kinesis, or equivalent if you are building real-time streaming applications.
* It is not a database or lakehouse. Tabsdata works with the data platforms you already have, delivering versioned, trusted tables into them as outputs.
We are still early and are looking for feedback. If you have dealt with pipeline sprawl, slow reconciliation, or unclear ownership across data teams, we would love to hear your perspective. What's missing? What doesn't make sense? What would make this more useful in your setup?
Best regards,
Arvind