More on this production-centered mechanism across models, plus implications for evaluation, interpretation, and pre-training:
instruction-probing.github.io
arxiv.org/abs/2605.11206
Team effort with @lchoshen.bsky.social @yufanghou.bsky.social @yperlitz.bsky.social
Questions? Reach out! (3/3)