AI/LLM × GCP Expert Survey
Why we’re asking
We’re building Academi, an AI-powered study platform for medical students (GCP + Java backend + React frontend). Our next milestones are:
- AI-assisted flashcard generation from user docs (PDFs, PPTs, etc.)
- RAG-powered Q&A with accurate medical content
- Low-latency chat interface
We have solid file parsing/OCR but need guidance on RAG architecture, evaluation strategy, and deployment patterns.
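For context on what we mean by the retrieval half of RAG, here is a minimal sketch (plain Python, no real embedding model; the function names and toy two-dimensional vectors are purely illustrative, not our implementation): top-k cosine-similarity lookup over precomputed document embeddings.

```python
import math

def cosine(a, b):
    # Cosine similarity between two equal-length vectors.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def top_k(query_vec, doc_vecs, k=3):
    # doc_vecs: {doc_id: embedding}; returns the k best-matching doc ids.
    scored = sorted(doc_vecs.items(),
                    key=lambda kv: cosine(query_vec, kv[1]),
                    reverse=True)
    return [doc_id for doc_id, _ in scored[:k]]
```

In production this lookup would be served by a managed vector store rather than an in-memory sort; the survey questions below ask which one you would pick.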
All responses will be kept confidential.
* Indicates required question
If you had to ship an accurate RAG pipeline with <2 s p95 latency on GCP starting today, what stack would you choose and why? (Please include vector store + framework + runtime; assume ≈1 k DAU, <10 M embeddings within 12 months, and that accuracy matters more than cost.)
Which vector store would you start with?
Why would you start with that store?
Best framework for a Java team willing to add Python micro-services?
What is the key reason for that framework?
Preferred architecture pattern? (select all that apply)
Latency optimisation levers (check all you'd prioritise)
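For reference, the 2 s target above is a p95 figure; a quick sketch of how we compute it from raw latency samples (nearest-rank method, assumed for illustration):

```python
import math

def p95(latencies_ms):
    # Nearest-rank p95: smallest sample with at least 95% of values at or below it.
    s = sorted(latencies_ms)
    rank = max(1, math.ceil(0.95 * len(s)))
    return s[rank - 1]
```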
Best way to measure medical-content accuracy?
Essential metrics? (check all that apply)
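One metric we are considering for the retrieval side is recall@k against a gold set of relevant documents; a minimal sketch (the function and argument names are illustrative, not an existing eval harness):

```python
def recall_at_k(retrieved_ids, relevant_ids, k=5):
    # Fraction of gold (relevant) docs that appear in the top-k retrieved list.
    if not relevant_ids:
        return 0.0
    hits = sum(1 for doc_id in retrieved_ids[:k] if doc_id in relevant_ids)
    return hits / len(relevant_ids)
```

We'd welcome opinions on whether retrieval-level metrics like this are sufficient for medical content, or whether answer-level grading is essential.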
What is your embedding model of choice in 2025?
What surprised you most when shipping RAG?
If you had one afternoon with our repo, what would you build first to de-risk the roadmap?