What is SOTA for retrieval in RAG systems now?


Have there been significant improvements this year?

The simple flow we landed on in 2024 was:

1. Chunk and embed docs with embedding model
2. Embed query (maybe using an LLM to reformulate first)
3. Retrieve N1 docs using cosine similarity
4. Narrow to N2 using a reranking model
5. Inject these docs into context to generate answer
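For anyone who wants the flow above in concrete form, here is a rough sketch in plain Python. It is not a real implementation: `embed` is a toy bag-of-words stand-in for an embedding model, `rerank` is a toy term-overlap stand-in for a reranking model, and every function name and default (`chunk`, `retrieve`, `rag_prompt`, `n1=10`, `n2=3`) is made up for illustration. The query reformulation part of step 2 and the final LLM call in step 5 are left out; the sketch stops at building the prompt.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def chunk(doc, size=40):
    """Step 1a: split a document into chunks of at most `size` words."""
    words = doc.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


def embed(text):
    """Steps 1b and 2: ASSUMPTION, a toy stand-in for an embedding model.

    Returns sparse term counts instead of a dense learned vector.
    """
    return Counter(tokenize(text))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, chunks, n1):
    """Step 3: keep the n1 chunks most similar to the query."""
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:n1]


def rerank(query, candidates, n2):
    """Step 4: ASSUMPTION, a toy stand-in for a reranking model.

    Scores each candidate by the fraction of distinct query terms it contains.
    """
    q_terms = set(tokenize(query))

    def score(c):
        if not q_terms:
            return 0.0
        return len(q_terms & set(tokenize(c))) / len(q_terms)

    return sorted(candidates, key=score, reverse=True)[:n2]


def build_prompt(query, contexts):
    """Step 5: inject the selected chunks into the context for generation."""
    joined = "\n\n".join(f"[{i + 1}] {c}" for i, c in enumerate(contexts))
    return f"Answer using only the context below.\n\n{joined}\n\nQuestion: {query}"


def rag_prompt(query, docs, n1=10, n2=3):
    """Run steps 1 to 5 and return the prompt that would go to the LLM."""
    chunks = [c for d in docs for c in chunk(d)]
    candidates = retrieve(query, chunks, n1)
    top = rerank(query, candidates, n2)
    return build_prompt(query, top)
```

In a real setup the two stand-ins would be replaced by an actual embedding model and reranker, and the chunk embeddings would be computed once and stored in an index rather than recomputed per query as they are here.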

Have there been significant advancements since then? Has anyone seen improvements from using graph structures (for example, in a graph database like Neo4j) for more sophisticated retrieval?
