Mochi v0.10.5: A LINQ-style query language with a bytecode VM written in Go

Mochi v0.10.5: Dataset Querying

Mochi v0.10.5 introduces a streamlined and expressive way to query structured data—whether you’re working with in-memory lists or loading from files like JSON, YAML, CSV, or JSONL. Queries are fully type-checked and compiled, and the new syntax is familiar to anyone who’s used SQL or LINQ.

1. Loading, Saving, and Declaring Structured Data

Mochi supports structured data input/output with strong typing. You can define a schema and load external files directly into typed collections.

    type Person {
      name: string
      age: int
    }

    let people = load "people.yaml" as Person
    save people to "people.json"

This works well for ETL pipelines, file processing scripts, and quick data transformations.
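For readers who think in Python, a typed load and save round trip can be pictured roughly as follows. This is only an illustrative sketch: the helper names `load_people` and `save_people` are hypothetical, it reads JSON rather than YAML to stay within the standard library, and it says nothing about how Mochi implements `load` or `save`.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class Person:
    name: str
    age: int


def load_people(text: str) -> list[Person]:
    """Parse a JSON array into typed records, rejecting mismatched fields."""
    people = []
    for raw in json.loads(text):
        person = Person(**raw)  # TypeError on missing or unexpected fields
        if not isinstance(person.name, str) or not isinstance(person.age, int):
            raise TypeError(f"record does not match Person: {raw!r}")
        people.append(person)
    return people


def save_people(people: list[Person]) -> str:
    """Serialize typed records back to a JSON array."""
    return json.dumps([asdict(p) for p in people])


# Hypothetical input standing in for the contents of a data file.
source = '[{"name": "Alice", "age": 30}, {"name": "Bob", "age": 17}]'
people = load_people(source)
saved = save_people(people)
```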

2. Filtering and Selecting Rows

Use the from ... where ... select pattern to filter data and extract fields cleanly and concisely.

    let adults = from p in people
                 where p.age >= 18
                 select { name: p.name, age: p.age }

    save adults to "adults.json"

The syntax is designed to be readable and type-safe, with no boilerplate.
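As a point of comparison, the from ... where ... select pattern maps closely onto a Python list comprehension. The sketch below reuses the inline sample records that appear later in this article; it shows what the query computes, not how Mochi compiles it.

```python
people = [
    {"name": "Alice", "age": 30, "city": "Hanoi"},
    {"name": "Bob", "age": 17, "city": "Saigon"},
    {"name": "Charlie", "age": 45, "city": "Hanoi"},
]

# from p in people where p.age >= 18 select { name: p.name, age: p.age }
adults = [
    {"name": p["name"], "age": p["age"]}  # select: project two fields
    for p in people                       # from
    if p["age"] >= 18                     # where
]
```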

3. Sorting, Skipping, and Taking Results

You can sort data by any field, then skip or limit rows to support pagination or ranking logic.

    let top = from p in products
              sort by -p.price
              skip 1
              take 3
              select p

This is useful for dashboards, leaderboard views, and preview screens.
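The order of operations matters here: rows are sorted first, then skipped, then limited. A rough Python equivalent, using a made-up `products` list purely for illustration, looks like this:

```python
# Hypothetical sample data; the article does not define `products`.
products = [
    {"name": "Laptop", "price": 1500},
    {"name": "Phone", "price": 900},
    {"name": "Tablet", "price": 600},
    {"name": "Monitor", "price": 300},
    {"name": "Mouse", "price": 50},
]

SKIP, TAKE = 1, 3

# sort by -p.price: highest price first
ranked = sorted(products, key=lambda p: -p["price"])

# skip 1, take 3: drop the first row, then keep the next three
top = ranked[SKIP:SKIP + TAKE]
```

With this data, `top` holds the second through fourth most expensive products, which is the shape a "page 2" or "runners-up" view would need.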

4. Joining Datasets

Mochi supports joining two collections on matching keys. This makes it easy to express relationships between records, such as orders and customers:

    let result = from o in orders
                 join from c in customers on o.customerId == c.id
                 select {
                   orderId: o.id,
                   customerName: c.name,
                   total: o.total
                 }

Joins are readable, type-checked, and integrate cleanly into the rest of the query pipeline.
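To make the join's result concrete, here is a small Python sketch of the same orders and customers query. It assumes inner-join semantics (orders without a matching customer are dropped), which the phrase "matching keys" suggests but the article does not spell out. The sample rows are invented, and the hash index is just one common way to evaluate an equality join, not a description of Mochi's VM.

```python
# Hypothetical sample data for illustration.
customers = [{"id": 1, "name": "Alice"}, {"id": 2, "name": "Bob"}]
orders = [
    {"id": 100, "customerId": 1, "total": 250},
    {"id": 101, "customerId": 2, "total": 125},
    {"id": 102, "customerId": 3, "total": 80},  # no matching customer
]

# Build an index on the join key so each order looks up its matches directly.
customers_by_id = {}
for c in customers:
    customers_by_id.setdefault(c["id"], []).append(c)

# from o in orders join c in customers on o.customerId == c.id select {...}
result = [
    {"orderId": o["id"], "customerName": c["name"], "total": o["total"]}
    for o in orders
    for c in customers_by_id.get(o["customerId"], [])
]
```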

5. Grouping and Aggregation

Mochi adds first-class support for group by, with standard aggregation functions like count, sum, avg, min, and max.

Single-Key Grouping

    let stats = from p in people
                group by p.city
                select {
                  city: p.city,
                  count: count(p),
                  averageAge: avg(p.age),
                  oldest: max(p.age)
                }

Multi-Key Grouping

    let summary = from order in orders
                  group by order.country, order.status
                  select {
                    country: order.country,
                    status: order.status,
                    totalOrders: count(order),
                    totalAmount: sum(order.total)
                  }
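Multi-key grouping can be thought of as grouping by a tuple of the key fields. The Python sketch below illustrates that idea with invented order rows; listing groups in first-seen order is a property of Python dictionaries and an assumption here, not something the article states about Mochi.

```python
from collections import defaultdict

# Hypothetical sample data for illustration.
orders = [
    {"country": "VN", "status": "paid", "total": 100},
    {"country": "VN", "status": "paid", "total": 50},
    {"country": "VN", "status": "pending", "total": 70},
    {"country": "JP", "status": "paid", "total": 200},
]

# group by order.country, order.status: the composite key is a tuple
groups = defaultdict(list)
for order in orders:
    groups[(order["country"], order["status"])].append(order)

summary = [
    {
        "country": country,
        "status": status,
        "totalOrders": len(rows),                      # count(order)
        "totalAmount": sum(r["total"] for r in rows),  # sum(order.total)
    }
    for (country, status), rows in groups.items()
]
```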

Group Filtering (having)

    let popular = from p in people
                  group by p.city
                  select { city: p.city, residents: count(p) }
                  having residents > 100

Aggregation functions are first-class—usable inside select blocks, conditionals, or expressions.
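The difference between where and having is when the filter runs: where sees individual rows, having sees one aggregated row per group. A Python sketch of single-key grouping followed by a having-style filter, using the sample people from this article and a threshold lowered from 100 to 1 so the tiny dataset produces output:

```python
from collections import defaultdict

people = [
    {"name": "Alice", "age": 30, "city": "Hanoi"},
    {"name": "Bob", "age": 17, "city": "Saigon"},
    {"name": "Charlie", "age": 45, "city": "Hanoi"},
]

# group by p.city
by_city = defaultdict(list)
for p in people:
    by_city[p["city"]].append(p)

# select with aggregates computed per group
stats = [
    {
        "city": city,
        "count": len(rows),
        "averageAge": sum(r["age"] for r in rows) / len(rows),
        "oldest": max(r["age"] for r in rows),
    }
    for city, rows in by_city.items()
]

# having residents > 1: filter on the aggregated value, after grouping
MIN_RESIDENTS = 1  # assumption: the article's example uses 100
popular = [
    {"city": s["city"], "residents": s["count"]}
    for s in stats
    if s["count"] > MIN_RESIDENTS
]
```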

6. Nested Queries and Inline Data

Mochi supports nesting queries and working with inline datasets, making it easy to prototype and test.

Inline Lists

    let people = [
      { name: "Alice", age: 30, city: "Hanoi" },
      { name: "Bob", age: 17, city: "Saigon" },
      { name: "Charlie", age: 45, city: "Hanoi" }
    ]

Nested Selects

    let result = from c in customers
                 select {
                   name: c.name,
                   orders: from o in orders
                           where o.customerId == c.id
                           select o.total
                 }
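A nested select behaves like a correlated subquery: the inner query is evaluated once per outer row and can refer to it. In Python terms this is a comprehension inside a comprehension. The sample rows below are made up for illustration.

```python
# Hypothetical sample data for illustration.
customers = [{"id": 1, "name": "Alice"}, {"id": 2, "name": "Bob"}]
orders = [
    {"customerId": 1, "total": 250},
    {"customerId": 1, "total": 40},
    {"customerId": 3, "total": 80},
]

result = [
    {
        "name": c["name"],
        # inner query: re-evaluated for each customer `c`
        "orders": [o["total"] for o in orders if o["customerId"] == c["id"]],
    }
    for c in customers
]
```

Unlike a join, every customer appears exactly once in the output; a customer with no orders simply gets an empty list.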

Flattening Nested Structures

    let people = load "people_with_hobbies.json"

    let rows = from p in people
               from h in p.hobbies
               select { name: p.name, hobby: h }
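Two consecutive from clauses act like nested loops, producing one output row per (person, hobby) pair. A Python sketch with invented records shows the effect, including what happens to a person with no hobbies:

```python
# Hypothetical stand-in for the contents of people_with_hobbies.json.
people = [
    {"name": "Alice", "hobbies": ["chess", "running"]},
    {"name": "Bob", "hobbies": []},
    {"name": "Charlie", "hobbies": ["cooking"]},
]

# from p in people from h in p.hobbies select { name: p.name, hobby: h }
rows = [
    {"name": p["name"], "hobby": h}
    for p in people
    for h in p["hobbies"]
]
```

In this sketch Bob contributes no rows because his hobby list is empty, which is the usual behavior of a flattening query.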

7. Schema Inference and Type Hints

If you don’t have a predefined type, you can inline a structural hint during load:

    let logs = load "logs.jsonl" as {
      timestamp: string,
      userId: string,
      action: string
    }

Mochi will use the hint for type-checking and completion, improving safety and editor integration.
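One way to picture a structural hint is as a mapping from field names to expected types that each loaded record is checked against. The Python sketch below illustrates that idea for JSONL input. The article does not say whether or when Mochi validates records against the hint, so the load-time check, the `load_jsonl` helper, and the sample log lines are all assumptions.

```python
import json

# Structural hint: field name -> expected Python type.
LOG_HINT = {"timestamp": str, "userId": str, "action": str}


def load_jsonl(text, hint):
    """Parse one JSON object per line and check each field against the hint."""
    rows = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # tolerate blank lines
        row = json.loads(line)
        for field, expected in hint.items():
            if not isinstance(row.get(field), expected):
                raise TypeError(
                    f"line {line_no}: field {field!r} is not {expected.__name__}"
                )
        # Keep only the hinted fields so every row has the same shape.
        rows.append({field: row[field] for field in hint})
    return rows


# Hypothetical sample standing in for logs.jsonl.
sample = (
    '{"timestamp": "2025-01-01T00:00:00Z", "userId": "u1", "action": "login"}\n'
    '{"timestamp": "2025-01-01T00:05:00Z", "userId": "u2", "action": "logout"}\n'
)
logs = load_jsonl(sample, LOG_HINT)
```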

TPC-H Query Testing

Mochi v0.10.5 includes full coverage of the TPC-H benchmark, with all 22 queries implemented as type-checked, self-contained test files.

Each query is written using Mochi’s dataset query syntax (from, join, group by, select, etc.) and includes inline test data for correctness. These tests are organized under tests/dataset/tpc-h/ and named q1.mochi through q22.mochi.

Query List

Query  File       Description
Q01    q1.mochi   Pricing summary report (lineitem aggregation by return flag + status)
Q02    q2.mochi   Minimum cost supplier in each region
Q03    q3.mochi   Revenue by customer market segment and order date
Q04    q4.mochi   Order priority check by order/commit date
Q05    q5.mochi   Revenue by region with joins across 5 tables
Q06    q6.mochi   Discounted revenue for a date range
Q07    q7.mochi   Volume of trade between two nations
Q08    q8.mochi   National market share by part type
Q09    q9.mochi   Profit by nation and part
Q10    q10.mochi  Revenue loss due to returned items
Q11    q11.mochi  Important stock by value
Q12    q12.mochi  Orders filtered by shipping mode and order priority
Q13    q13.mochi  Customer distribution by number of orders
Q14    q14.mochi  Promotion revenue effect
Q15    q15.mochi  Top supplier by revenue
Q16    q16.mochi  Suppliers who don’t sell certain parts
Q17    q17.mochi  Average quantity filtering per brand
Q18    q18.mochi  High-volume customers
Q19    q19.mochi  Discounted revenue for specific part-brand combinations
Q20    q20.mochi  Suppliers that can fulfill certain part demand
Q21    q21.mochi  Suppliers with late deliveries and no alternatives
Q22    q22.mochi  Revenue distribution by phone prefix and customer account balance

Example: q1.mochi

    let lineitem = [
      {
        l_quantity: 17,
        l_extendedprice: 1000.0,
        l_discount: 0.05,
        l_tax: 0.07,
        l_returnflag: "N",
        l_linestatus: "O",
        l_shipdate: "1998-08-01"
      },
      {
        l_quantity: 36,
        l_extendedprice: 2000.0,
        l_discount: 0.10,
        l_tax: 0.05,
        l_returnflag: "N",
        l_linestatus: "O",
        l_shipdate: "1998-09-01"
      },
      {
        l_quantity: 25,
        l_extendedprice: 1500.0,
        l_discount: 0.00,
        l_tax: 0.08,
        l_returnflag: "R",
        l_linestatus: "F",
        l_shipdate: "1998-09-03" // excluded
      }
    ]

    let result = from row in lineitem
      where row.l_shipdate <= "1998-09-02"
      group by {
        returnflag: row.l_returnflag,
        linestatus: row.l_linestatus
      } into g
      select {
        returnflag: g.key.returnflag,
        linestatus: g.key.linestatus,
        sum_qty: sum(from x in g select x.l_quantity),
        sum_base_price: sum(from x in g select x.l_extendedprice),
        sum_disc_price: sum(from x in g select x.l_extendedprice * (1 - x.l_discount)),
        sum_charge: sum(from x in g select x.l_extendedprice * (1 - x.l_discount) * (1 + x.l_tax)),
        avg_qty: avg(from x in g select x.l_quantity),
        avg_price: avg(from x in g select x.l_extendedprice),
        avg_disc: avg(from x in g select x.l_discount),
        count_order: count(g)
      }

    json(result)

    test "Q1 aggregates revenue and quantity by returnflag + linestatus" {
      expect result == [
        {
          returnflag: "N",
          linestatus: "O",
          sum_qty: 53,
          sum_base_price: 3000,
          sum_disc_price: 950.0 + 1800.0, // 2750.0
          sum_charge: (950.0 * 1.07) + (1800.0 * 1.05), // 1016.5 + 1890 = 2906.5
          avg_qty: 26.5,
          avg_price: 1500,
          avg_disc: 0.07500000000000001,
          count_order: 2
        }
      ]
    }
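The expected values in this test can be cross-checked independently. The Python sketch below recomputes the Q1 aggregates from the same three inline rows. It also shows where the odd-looking avg_disc of 0.07500000000000001 comes from: in IEEE 754 double arithmetic, 0.05 + 0.10 evaluates to 0.15000000000000002, and halving that gives the value in the test. That Mochi's VM uses doubles here is an inference from the matching result, not something the article states.

```python
from collections import defaultdict

lineitem = [
    {"l_quantity": 17, "l_extendedprice": 1000.0, "l_discount": 0.05, "l_tax": 0.07,
     "l_returnflag": "N", "l_linestatus": "O", "l_shipdate": "1998-08-01"},
    {"l_quantity": 36, "l_extendedprice": 2000.0, "l_discount": 0.10, "l_tax": 0.05,
     "l_returnflag": "N", "l_linestatus": "O", "l_shipdate": "1998-09-01"},
    {"l_quantity": 25, "l_extendedprice": 1500.0, "l_discount": 0.00, "l_tax": 0.08,
     "l_returnflag": "R", "l_linestatus": "F", "l_shipdate": "1998-09-03"},  # excluded
]


def avg(values):
    """Arithmetic mean of an iterable of numbers."""
    values = list(values)
    return sum(values) / len(values)


# where + group by: ISO-8601 dates compare correctly as plain strings.
groups = defaultdict(list)
for row in lineitem:
    if row["l_shipdate"] <= "1998-09-02":
        groups[(row["l_returnflag"], row["l_linestatus"])].append(row)

result = [
    {
        "returnflag": flag,
        "linestatus": status,
        "sum_qty": sum(x["l_quantity"] for x in g),
        "sum_base_price": sum(x["l_extendedprice"] for x in g),
        "sum_disc_price": sum(x["l_extendedprice"] * (1 - x["l_discount"]) for x in g),
        "sum_charge": sum(
            x["l_extendedprice"] * (1 - x["l_discount"]) * (1 + x["l_tax"]) for x in g
        ),
        "avg_qty": avg(x["l_quantity"] for x in g),
        "avg_price": avg(x["l_extendedprice"] for x in g),
        "avg_disc": avg(x["l_discount"] for x in g),
        "count_order": len(g),
    }
    for (flag, status), g in groups.items()
]
```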

Join Order Benchmark (JOB) Query Testing

Mochi v0.10.5 includes full support for the Join Order Benchmark (JOB), a widely used benchmark for evaluating join reordering and query planning behavior. All 33 queries from the benchmark have been implemented as inline test files in idiomatic Mochi.

Each query is saved as q1.mochi through q33.mochi under the tests/dataset/job/ directory. These test files include representative inline data, the full query logic, and an expected output block to verify correctness.

Query List

Query  File       Description
Q01    q1.mochi   Join on cast_info, movie, and actor
Q02    q2.mochi   Join with info_type and keyword
Q03    q3.mochi   Star join with cast_info, movie_keyword
Q04    q4.mochi   Disjunctive join filters with role_id and link_type
Q05    q5.mochi   Join with multiple foreign key paths
Q06    q6.mochi   Selective join with node and keyword
Q07    q7.mochi   Join with multiple join conditions
Q08    q8.mochi   Join with info_type and char_name
Q09    q9.mochi   Multiple joins with small filters
Q10    q10.mochi  Join on title with year range
Q11    q11.mochi  Join chain with 5+ hops across cast_info and info_type
Q12    q12.mochi  Join actor, movie_keyword, and keyword
Q13    q13.mochi  Join with selective filters and union-like conditions
Q14    q14.mochi  Complex role-based join
Q15    q15.mochi  Join with info_type and complicated conditions
Q16    q16.mochi  Join with node_type and multiple keys
Q17    q17.mochi  Join with cast_info and role filtering
Q18    q18.mochi  Role-based join with info_type and char_name
Q19    q19.mochi  Join with optional outer edge
Q20    q20.mochi  Join across multiple keys with filter on name
Q21    q21.mochi  Join using cast_info and keyword links
Q22    q22.mochi  Deep join with small output projection
Q23    q23.mochi  Join chain with additional filtering on role_id
Q24    q24.mochi  Selective filter with info_type
Q25    q25.mochi  Join with optional keyword filters
Q26    q26.mochi  Join on movie_companies and company_name
Q27    q27.mochi  Join with multiple filters and subkeys
Q28    q28.mochi  Join with info_type, keyword, and cast_info
Q29    q29.mochi  Join with char_name and gender-based filtering
Q30    q30.mochi  Join with deep chain including movie_keyword
Q31    q31.mochi  Star-shaped join on info_type and keyword
Q32    q32.mochi  Join with keyword and company_name filters
Q33    q33.mochi  Multi-branch join with filter and projection

Example: q1.mochi

    let company_type = [
      { id: 1, kind: "production companies" },
      { id: 2, kind: "distributors" }
    ]

    let info_type = [
      { id: 10, info: "top 250 rank" },
      { id: 20, info: "bottom 10 rank" }
    ]

    let title = [
      { id: 100, title: "Good Movie", production_year: 1995 },
      { id: 200, title: "Bad Movie", production_year: 2000 }
    ]

    let movie_companies = [
      { movie_id: 100, company_type_id: 1, note: "ACME (co-production)" },
      { movie_id: 200, company_type_id: 1, note: "MGM (as Metro-Goldwyn-Mayer Pictures)" }
    ]

    let movie_info_idx = [
      { movie_id: 100, info_type_id: 10 },
      { movie_id: 200, info_type_id: 20 }
    ]

    let filtered = from ct in company_type
      join mc in movie_companies on ct.id == mc.company_type_id
      join t in title on t.id == mc.movie_id
      join mi in movie_info_idx on mi.movie_id == t.id
      join it in info_type on it.id == mi.info_type_id
      where ct.kind == "production companies" &&
            it.info == "top 250 rank" &&
            (!mc.note.contains("(as Metro-Goldwyn-Mayer Pictures)")) &&
            (mc.note.contains("(co-production)") || mc.note.contains("(presents)"))
      select { note: mc.note, title: t.title, year: t.production_year }

    let result = {
      production_note: min(from r in filtered select r.note),
      movie_title: min(from r in filtered select r.title),
      movie_year: min(from r in filtered select r.year)
    }

    json([result])

    test "Q1 returns min note, title and year for top ranked co-production" {
      expect result == {
        production_note: "ACME (co-production)",
        movie_title: "Good Movie",
        movie_year: 1995
      }
    }
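To see why the test expects the "Good Movie" row, the same five-table join can be written as a naive nested-loop join in Python. This sketch evaluates the joins in the order they are written. A real query engine is free to choose a different join order, which is exactly what the Join Order Benchmark is meant to exercise, and nothing here describes how Mochi's VM executes the query.

```python
company_type = [
    {"id": 1, "kind": "production companies"},
    {"id": 2, "kind": "distributors"},
]
info_type = [
    {"id": 10, "info": "top 250 rank"},
    {"id": 20, "info": "bottom 10 rank"},
]
title = [
    {"id": 100, "title": "Good Movie", "production_year": 1995},
    {"id": 200, "title": "Bad Movie", "production_year": 2000},
]
movie_companies = [
    {"movie_id": 100, "company_type_id": 1, "note": "ACME (co-production)"},
    {"movie_id": 200, "company_type_id": 1,
     "note": "MGM (as Metro-Goldwyn-Mayer Pictures)"},
]
movie_info_idx = [
    {"movie_id": 100, "info_type_id": 10},
    {"movie_id": 200, "info_type_id": 20},
]

# Each `for ... if` pair mirrors one `join ... on` clause; the trailing
# conditions mirror the `where` clause.
filtered = [
    {"note": mc["note"], "title": t["title"], "year": t["production_year"]}
    for ct in company_type
    for mc in movie_companies if ct["id"] == mc["company_type_id"]
    for t in title if t["id"] == mc["movie_id"]
    for mi in movie_info_idx if mi["movie_id"] == t["id"]
    for it in info_type if it["id"] == mi["info_type_id"]
    if ct["kind"] == "production companies"
    and it["info"] == "top 250 rank"
    and "(as Metro-Goldwyn-Mayer Pictures)" not in mc["note"]
    and ("(co-production)" in mc["note"] or "(presents)" in mc["note"])
]

# min over strings is lexicographic, as in the Mochi test.
result = {
    "production_note": min(r["note"] for r in filtered),
    "movie_title": min(r["title"] for r in filtered),
    "movie_year": min(r["year"] for r in filtered),
}
```

Only the movie with id 100 survives: the second movie is linked to "bottom 10 rank" and its note contains the excluded MGM marker.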

Changelog

  • d272837 Add Dart JOB dataset tests and min/max support
  • e300b75 Add F# JOB dataset golden outputs
  • e542d9a Add Haskell golden test for JOB q1
  • 8e116e0 Add JOB dataset golden tests and extend Racket compiler
  • 6975c2c Add JOB dataset golden tests for Lua
  • 95bfa15 Add JOB q1 Go compiler test and contains support
  • 3ef28a5 Add JOB q1 Kotlin golden files and selector indexing
  • de4c7b9 Add JOB q23 example
  • e41ca85 Add JOB q24 example
  • 75797dd Add JOB query 11 example
  • 337c233 Add JOB query 16 example
  • 6ddfdb9 Add JOB query 17 example
  • d736797 Add JOB query 18 example
  • 5f809a1 Add JOB query 19 example
  • f113fbc Add JOB query 20 example
  • af21971 Add JOB query 21 example
  • cddde6c Add JOB query 22 example
  • 97b478a Add JOB query 25 example
  • 1df9145 Add JOB query 27 example
  • 54bf98e Add JOB query 28 example
  • 08e3555 Add JOB query 29 example
  • 92b4ce2 Add JOB query 30 example
  • 050700a Add JOB query 31 example
  • c02adb7 Add JOB query 32 example
  • 1dbf378 Add JOB query 33 example
  • 3c3ae75 Add Lua JOB Q1 support and tests
  • bcd0f48 Add Swift JOB q1 tests and min helper
  • 20e32ab Add Zig JOB q1 compilation support and tests
  • a273509 Add golden outputs for JOB queries 3-10 and update C compiler tasks
  • b3aa056 Add golden tests for JOB dataset and Python compiler support
  • 7a3a0be Add job query IR outputs and minor fixes
  • 3b6f8d9 Add null literal and update JOB IR
  • 59cd9bc Clean up TASK in dataset/job
  • d3a1439 Fix JOB C++ golden compile
  • ab09411 Move JOB ST output to dataset directory
  • c2a420d Move JOB q1 outputs and use dataset source
  • 431af33 Remove root TASKS.md
  • 96bdaef Revert "vm: add like operator and generate job ir"
  • e0155a3 Skip Kotlin compile errors in JOB test and run q1-10
  • c09e6f9 Skip failing Smalltalk JOB tests
  • efc5df5 Update Go compiler and golden outputs for JOB dataset
  • de916f8 Update benchmark outputs
  • 1052b27 Update job dataset golden outputs
  • 6025adc Use String.contains? in Elixir
  • da661db c: add JOB dataset tests and generated code
  • 4162c83 clj: support starts_with and string comparisons
  • 6b81bca clojure backend notes and contains stub
  • 6199a95 clojure backend: job dataset tests and fixes
  • dd7599a cobol: make formatter optional and document nondeterminism
  • ceb556b cobol: run JOB queries
  • 43663d8 compiler(ts): support in on untyped values
  • 416a481 compiler/fs: add JOB q1-10 tests
  • 774fe78 compiler/go: add JOB query tests and golden outputs
  • 6b9632d cpp: add JOB q1/q2 golden tests
  • 0ade122 cs compiler: add job golden tests and update tasks
  • e5ea5dc cs: support min/max and contains
  • e7314e7 dart compiler: add job q1-q10 tests
  • 8439d55 erlang compiler: document JOB issues and refactor test handling
  • 8e83e18 erlang: add min/max and job q1 q2 golden
  • a04297b ex compiler: handle string contains in 'in' operator
  • fd103c7 feat: add JOB q26 example
  • bc40192 fix c++ type inference for map literals
  • 83b2d5c fortran: add min builtin and test
  • 7b22300 hs: add JOB q1-10 compiler tests
  • ee0d0b6 java backend: add JOB compilation tests
  • ac40007 java: extend JOB tests and update outputs
  • efc2823 kt: add JOB golden test skeleton and update tasks
  • 799533b pascal: add JOB q1-10 tests
  • 743d455 pascal: add JOB q1-q2 golden tests
  • 8ed7e5d php backend: support job q1 and contains
  • c99f0cb php: add job dataset q1-q10 tests
  • 0822121 pl: add JOB q1-q10 golden tests
  • cd29785 prolog backend: add job q1 support and tests
  • 251d85f rb: add JOB q1-q10 golden tests
  • 68eba20 release: v0.10.5
  • dd5b4f1 ruby: add JOB q1 support
  • 4b84439 runtime/vm: add register optimization tools
  • 585c6c7 runtime/vm: support string min/max; add JOB dataset VM tests
  • 07ec926 rust: add job dataset golden tests
  • 8fd3764 rust: compile job q1-q10
  • 6a5ec29 scala: add JOB golden tests q1-q10
  • dbebf0b scala: add JOB q2 golden tests and update compiler
  • 471030a scheme: add JOB dataset tests
  • 84eb246 scheme: add JOB q2 golden tests
  • f3959fd scheme: add job q1 compilation test
  • cee2c6e st compiler: add JOB golden test and prepare globals
  • 29d40d4 swift: add job dataset golden tests
  • 6f45680 swift: improve map key inference and update JOB golden
  • 5905a7b test(ex): extend JOB compiler tests
  • 812c79a test(job): add Erlang golden test
  • 73ee1c2 tests: add C JOB golden tests
  • d1ea7cb tests: add JOB dataset golden tests for q1-10
  • c91ab67 tests: add Scala JOB q1 golden
  • af1a291 ts compiler: support contains and job q1
  • d423fda update IR for job q7
  • c7dec6a vm: add like operator and generate job ir
  • 75cf607 vm: add starts_with support and add JOB ir
  • 7accdb0 vm: fix starts_with for shorter strings
  • 36ce105 vm: handle null ordering and refresh tpch golden
  • afbf38a vm: re-enable constant folding
  • 59250ef vm: remove interpreter dependency
  • 965cde4 vm: support job dataset queries
  • e6674a6 zig: add JOB q1-q10 golden tests
  • e11f017 zig: add precedence fix and job run tests