Real user queries often look different from the clean, concise ones in academic benchmarks: they are ambiguous, full of typos, and much less readable. We show that even strong RAG systems quickly break under these conditions. Awesome project led by @neelbhandari.bsky.social and @tianyucao.bsky.social!!
Apr 22, 2025
These days RAG systems have gotten popular for boosting LLMs, but they're brittle 💔. Minor shifts in phrasing (✍️ style, politeness, typos) can wreck the pipeline. Even advanced components don't fix the issue. Check out this extensive eval by @neelbhandari.bsky.social and @tianyucao.bsky.social!
Apr 18, 2025
Akari Asai
1/ 🚨 New paper alert 🚨 RAG systems excel on academic benchmarks - but are they robust to variations in linguistic style? We find RAG systems are brittle. Small shifts in phrasing trigger cascading errors, driven by the complexity of the RAG pipeline 🧵
Apr 17, 2025
Neel Bhandari
Akhila Yerukola
1/ 🚨 New paper alert 🚨 RAG systems excel on academic benchmarks - but are they robust to variations in linguistic style? We find RAG systems are brittle. Small shifts in phrasing trigger cascading errors, driven by the complexity of the RAG pipeline 🧵
Apr 17, 2025
Neel Bhandari