Do instructions affect how LMs process and produce language? ☝️Not the way you think! 😲LMs barely change task information when processing a task sample. Instead, instructions shape how this information is accessed and expressed when producing output tokens. #interpretability #nlproc (1/🧵)
26d
I already presented some work on reference (names, pronouns, coreference resolution, pronoun fidelity, etc.) as a rich site to evaluate biases and commonsense reasoning, and our work on disentangling model behaviour and internals through aligned probing (led by @tresiwald.bsky.social).
In short, instructions act less on what models process, and more on what they emit. Behavior changes from prompting, including prompt instability and in-context learning, therefore seem to arise mainly at the production stage, with little adaptation during task-sample processing. (2/🧵)
Thanks a lot to everyone for the support, guidance, mentoring, collaboration, and great moments over the past years! 🙏 Without you, this journey wouldn't have been such a pleasure, and now I'm excited to see what the future brings! 🚀
It was a pleasure to ๐Ÿธ
Excited to present this work together with @dippedrusk.com at #EACL. Join us at poster session 1 (11:30-13:00) 🔥
More on this production-centered mechanism across models, plus implications for evaluation, interpretation, and pre-training: 🔗 instruction-probing.github.io 📄 arxiv.org/abs/2605.11206 Team effort with @lchoshen.bsky.social @yufanghou.bsky.social @yperlitz.bsky.social 🙌 Questions? Reach out! (3/3)
2mo
Andreas Waldis