Inlay

Research shows LLMs exhibit "performative misalignment," aligning with perceived researcher expectations instead of true intent. This challenges classic views on AI behavior, highlighting risks in safety evaluations based on compliance, not genuine understanding. https://arxiv.org/abs/2606.08629

ArXiv link for Sycophancy Towards Researchers Drives Performative Misalignment