ManagerBench was accepted to ICLR! @iclr-conf.bsky.social #ICLR2026
LLMs are still either unsafe or completely harm-avoidant, even when the harm affects furniture 🛋️
Check out our benchmark, online or in Rio 🇧🇷
Martin Tutek
What happens when LLM agents must choose between achieving their goals and avoiding harm to humans in realistic management scenarios? Are LLMs pragmatic, or do they prefer to avoid human harm?
New paper out: ManagerBench: Evaluating the Safety-Pragmatism Trade-off in Autonomous LLMs 🧵