Natural-language analytics for BigQuery, powered by Google’s ADK and a streaming Next.js experience.
Selecta ships two tightly-integrated pieces:
- Backend (backend/) – a Google ADK agent that parses English questions, generates GoogleSQL, executes BigQuery jobs, and streams structured results (rows, charts, summaries, meta).
- Frontend (frontend/) – a Shadcn-flavoured Next.js 14 client that renders the conversation, reasoning trace, data visualisation, SQL, and a Meta tab with execution telemetry.
Together they deliver a “Gemini + BigQuery” workflow that feels immediate: ask a question, watch the reasoning stream, see the Vega-Lite chart materialise, and browse job details without reloading.
- Structured answers – every response arrives with Summary, Results, and Business Insights sections plus a generated Vega-Lite spec.
- Meta-aware streaming – the UI separates reasoning from the final answer and exposes execution metrics (job id, latency, dataset provenance) in a dedicated tab.
- Drop-in backend – runs on top of the official ADK HTTP server; no custom FastAPI layer or bespoke protocol to maintain.
- Modern frontend – Next.js App Router, TypeScript, Zustand, Shadcn UI, and Tailwind 4 powering a responsive glassmorphic layout.
- The frontend posts a question to POST /run_sse.
- The ADK agent streams intermediate reasoning followed by a structured payload:
- rows + columns + row count
- Vega-Lite chart suggestion
- markdown summary/results/insights
- BigQuery job metadata (executionMs, jobId, dataset info)
- The frontend normalises those increments into the chat bubble, a hidden “View reasoning” accordion, and the Chart / Table / SQL / Meta tabs.
See backend/api-contract.md for the exact event schema.
- Python 3.12+
- uv for backend dependency management
- Node.js 22+ and npm for the frontend
- A Google Cloud project with BigQuery access (the repo ships with the public thelook_ecommerce dataset configured)
- A Gemini API key (or Vertex AI credentials) for ADK
Visit http://localhost:3000 and ask a question like “What are the top products by revenue in the last 150 days?”.
Use the helper script to build the container and roll out the ADK + Next.js stack:
- The script mirrors the verified manual command. The ADK backend listens on 8081, the Next.js frontend on 8080, and the default dataset config comes from /opt/venv/lib/python3.12/site-packages/selecta/datasets/thelook.yaml inside the image.
- Ensure the chosen service account has roles/secretmanager.secretAccessor on the Gemini key secret (or pass --env GOOGLE_API_KEY=... if you prefer a literal env var).
- To swap datasets, include --env SELECTA_DATASET_CONFIG=/app/path/to/custom.yaml.
Once deployed, the service URL printed by gcloud (or the script) becomes the base for the frontend; no additional proxy configuration is required.
- Backend docs – backend/README.md (setup, configuration, payload reference)
- API contract – backend/api-contract.md (endpoints + streaming schema)
- Frontend docs – frontend/README.md (app structure, dev commands)
Every streamed result includes:
summary, resultsMarkdown, businessInsights | Markdown sections rendered in the chat bubble. |
rows, columns, rowCount | Lightweight preview of the BigQuery result set. |
chart | Vega-Lite spec used by the Chart tab. |
executionMs, jobId | BigQuery latency + job identifier (shown in the Meta tab). |
dataset | Active dataset descriptor (project, billing project, location, table allowlist). |
Use these fields to build your own client, persist history, or trigger follow-up analyses.
We welcome issues and pull requests! A few guidelines:
- Keep docs and contracts up to date (README.md, backend/README.md, backend/api-contract.md, frontend/README.md).
- When changing backend payloads, update the frontend store/hooks and Meta tab.
- Run npm run lint (frontend) and uv pip check / python -m compileall (backend) before opening a PR.
See roadmap.md for active workstreams and ideas.
Apache 2.0 – see LICENSE.