So many people, CS researchers included, think that you can explore how an LLM works by simply asking it to tell you what it is doing or "thinking".
Here @jennhu.bsky.social provides an excellent illustration of how that approach fails even at the most basic level.
Carl T. Bergstrom
To researchers doing LLM evaluation: prompting is *not a substitute* for direct probability measurements. Check out the camera-ready version of our work, to appear at EMNLP 2023! (w/ @rplevy.bsky.social)
Paper: arxiv.org/abs/2305.13264
Original thread: twitter.com/_jennhu/stat...