Real user queries often look different from the clean, concise ones in academic benchmarks: they are ambiguous, full of typos, and much less readable. We show that even strong RAG systems quickly break under these conditions. Awesome project led by @neelbhandari.bsky.social and @tianyucao.bsky.social!!

Apr 22, 2025

RAG systems have become popular for boosting LLMs, but they're brittle💔. Minor shifts in phrasing (✍️ style, politeness, typos) can wreck the pipeline. Even advanced components don’t fix the issue. Check out this extensive eval by @neelbhandari.bsky.social and @tianyucao.bsky.social!

Apr 18, 2025

Akari Asai

1/🚨 𝗡𝗲𝘄 𝗽𝗮𝗽𝗲𝗿 𝗮𝗹𝗲𝗿𝘁 🚨 RAG systems excel on academic benchmarks - but are they robust to variations in linguistic style? We find RAG systems are brittle. Small shifts in phrasing trigger cascading errors, driven by the complexity of the RAG pipeline 🧵

Apr 17, 2025

Neel Bhandari

Akhila Yerukola
