This is an excellent thread on the nature of what LLMs are doing and, nestled neatly within, an important observation about the limitations of human communication (and the context we are always inferring).
Come for the broomsticks, stay for the genies.
Benjamin Riley
I gave comments for a news piece in Science on a recent preprint about AI finding loopholes ("Large Language Models Hack Rewards, and Society", Liu et al.).
Since these things are always cut short, I wanted to expand here:
www.science.org/content/arti...
arxiv.org/abs/2606.04075
Reinforcement learning (RL) has become a dominant post-training paradigm, enabling large language models (LLMs) to learn from rewards. We observe that societal regulations are structurally similar to ...