Have there been significant improvements this year?
The simple flow we landed on in 2024 was:
1. Chunk and embed docs with embedding model
2. Embed query (maybe using an LLM to reformulate first)
3. Retrieve N1 docs using cosine similarity
4. Narrow to N2 using a reranking model
5. Inject these docs into context to generate answer
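For concreteness, here is a rough sketch of that flow in plain Python. Everything in it is a stand-in I made up for illustration: `embed` is a toy bag-of-words counter rather than a real embedding model, `rerank` is simple term overlap rather than a real reranking model, and the function names, sample docs, and chunk size are all hypothetical. The query reformulation and the final LLM call are left out; the sketch stops at building the prompt.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def chunk(doc, size=40):
    """Step 1a: split a document into chunks of at most `size` words."""
    words = doc.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


def embed(text):
    """Steps 1b and 2: toy bag-of-words vector (assumption: stands in for a real embedding model)."""
    return Counter(tokenize(text))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, chunks, n1):
    """Step 3: keep the N1 chunks most similar to the query."""
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:n1]


def rerank(query, candidates, n2):
    """Step 4: toy reranker (assumption: stands in for a real reranking model).

    Scores each candidate by the fraction of query terms it contains.
    """
    q_terms = set(tokenize(query))

    def score(c):
        return len(q_terms & set(tokenize(c))) / len(q_terms) if q_terms else 0.0

    return sorted(candidates, key=score, reverse=True)[:n2]


def build_prompt(query, context_chunks):
    """Step 5: inject the surviving chunks into the prompt sent to the generator."""
    context = "\n\n".join(context_chunks)
    return f"Answer using only this context:\n\n{context}\n\nQuestion: {query}"


# Hypothetical sample data, just to exercise the pipeline end to end.
docs = [
    "Cosine similarity compares the angle between two embedding vectors.",
    "A reranking model scores each query and document pair jointly.",
    "Bananas are rich in potassium.",
]
query = "how does a reranking model score documents"

chunks = [c for d in docs for c in chunk(d)]
candidates = retrieve(query, chunks, n1=2)   # N1 = 2
top = rerank(query, candidates, n2=1)        # N2 = 1
prompt = build_prompt(query, top)
```

In a real setup the chunk embeddings would be computed once and stored in a vector index instead of being recomputed on every query as they are here.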
Have there been significant advancements since then? Has anyone seen improvements using graph structures like Neo4j for more sophisticated retrieval?