🇦🇹 I'll be at #ACL2025! Recently I've been thinking about:
✨ linguistically + cognitively-motivated evals (as always!)
✨ understanding multilingualism + representation learning (new!)

I'll also be presenting a poster for BehaviorBox on Wed @ Poster Session 4 (Hall 4/5, 10-11:30)!

When it comes to text prediction, where does one LM outperform another? If you've ever worked on LM evals, you know this question is a lot more complex than it seems. In our new #acl2025 paper, we developed a method to find fine-grained differences between LMs: 🧵1/9