Show HN: Selecta – Natural language analytics for BigQuery

8 hours ago 1

Natural-language analytics for BigQuery, powered by Google’s ADK and a streaming Next.js experience.

Selecta ships two tightly-integrated pieces:

Backend (backend/) – a Google ADK agent that parses English questions, generates GoogleSQL, executes BigQuery jobs, and streams structured results (rows, charts, summaries, meta).
Frontend (frontend/) – a Shadcn-flavoured Next.js 14 client that renders the conversation, reasoning trace, data visualisation, SQL, and a Meta tab with execution telemetry.

Together they deliver a “Gemini + BigQuery” workflow that feels immediate: ask a question, watch the reasoning stream, see the Vega-Lite chart materialise, and browse job details without reloading.

Structured answers – every response arrives with Summary, Results, and Business Insights sections plus a generated Vega-Lite spec.
Meta-aware streaming – the UI separates reasoning from the final answer and exposes execution metrics (job id, latency, dataset provenance) in a dedicated tab.
Drop-in backend – runs on top of the official ADK HTTP server; no custom FastAPI layer or bespoke protocol to maintain.
Modern frontend – Next.js App Router, TypeScript, Zustand, Shadcn UI, and Tailwind 4 powering a responsive glassmorphic layout.

[User] → [Next.js Frontend (SSE client)] │ ▼ [Google ADK HTTP Server] │ ▼ [BigQuery + Gemini]

The frontend posts a question to POST /run_sse.
The ADK agent streams intermediate reasoning followed by a structured payload:
- rows + columns + row count
- Vega-Lite chart suggestion
- markdown summary/results/insights
- BigQuery job metadata (executionMs, jobId, dataset info)
The frontend normalises those increments into the chat bubble, a hidden “View reasoning” accordion, and the Chart / Table / SQL / Meta tabs.

See backend/api-contract.md for the exact event schema.

Python 3.12+
uv for backend dependency management
Node.js 22+ and npm for the frontend
A Google Cloud project with BigQuery access (the repo ships with the public thelook_ecommerce dataset configured)
A Gemini API key (or Vertex AI credentials) for ADK

cd backend uv venv source .venv/bin/activate uv pip install -e . # Configure credentials export GOOGLE_API_KEY="YOUR_GEMINI_KEY" # or set Vertex AI env vars export SELECTA_DATASET_CONFIG="$(pwd)/selecta/datasets/thelook.yaml" # Run the ADK server uv run adk api_server app --allow_origins "*" --port 8080

cd frontend npm install npm run dev

Visit http://localhost:3000 and ask a question like “What are the top products by revenue in the last 150 days?”.

Use the helper script to build the container and roll out the ADK + Next.js stack:

./deploy-cloud-run.sh \ --project PROJECT_ID \ --region us-central1 \ --service selecta \ --service-account [email protected] \ --secret GOOGLE_API_KEY=selecta-gemini-key:latest \ --allow-unauthenticated

The script mirrors the verified manual command. The ADK backend listens on 8081, the Next.js frontend on 8080, and the default dataset config comes from /opt/venv/lib/python3.12/site-packages/selecta/datasets/thelook.yaml inside the image.
Ensure the chosen service account has roles/secretmanager.secretAccessor on the Gemini key secret (or pass --env GOOGLE_API_KEY=... if you prefer a literal env var).
To swap datasets, include --env SELECTA_DATASET_CONFIG=/app/path/to/custom.yaml.

Once deployed, the service URL printed by gcloud (or the script) becomes the base for the frontend; no additional proxy configuration is required.

backend/ # Google ADK agent, dataset config, tooling frontend/ # Next.js + Shadcn UI client roadmap.md # Upcoming workstreams

Backend docs – backend/README.md (setup, configuration, payload reference)
API contract – backend/api-contract.md (endpoints + streaming schema)
Frontend docs – frontend/README.md (app structure, dev commands)

Every streamed result includes:

Field Description

summary, resultsMarkdown, businessInsights	Markdown sections rendered in the chat bubble.
rows, columns, rowCount	Lightweight preview of the BigQuery result set.
chart	Vega-Lite spec used by the Chart tab.
executionMs, jobId	BigQuery latency + job identifier (shown in the Meta tab).
dataset	Active dataset descriptor (project, billing project, location, table allowlist).

Use these fields to build your own client, persist history, or trigger follow-up analyses.

We welcome issues and pull requests! A few guidelines:

Keep docs and contracts up to date (README.md, backend/README.md, backend/api-contract.md, frontend/README.md).
When changing backend payloads, update the frontend store/hooks and Meta tab.
Run npm run lint (frontend) and uv pip check / python -m compileall (backend) before opening a PR.

See roadmap.md for active workstreams and ideas.

Apache 2.0 – see LICENSE.

Read Entire Article