Post
🤗 Super excited to have this work out! Turns out by calculating the angles 📐 between representations, you can pick out difficult data samples! This can be very useful for assembling hard test sets or more efficient training sets. See more cool results and visuals in the 🧵
13h
Ruochen
We don’t always know what problems are hard for LLMs. So devs evaluate on tasks HUMANS find hard or on broad benchmarks. What if we could instead anticipate which scenarios a model will fail on—all without evaluating specific input examples? 🧵NEW PAPER by @jenniferlumeng.bsky.social
17h
Naomi Saphra