I'll be presenting this in person at NAACL, tomorrow at 11am in Ballroom C! Come on by - I'd love to chat with folks about this and all things interp / cog sci!
Michael Hanna
Sentences are partially understood before they're fully read. How do LMs incrementally interpret their inputs?
In a new paper, @amuuueller.bsky.social and I use mech interp tools to study how LMs process structurally ambiguous sentences. We show LMs rely on both syntactic & spurious features! 1/10