Post
ManagerBench was accepted to ICLR! @iclr-conf.bsky.social #ICLR2026 LLMs are still either unsafe or completely harm-avoidant - even when the harm affects furniture 🛋️ Check out our benchmark, online or in Rio 🇧🇷
4mo
Martin Tutek
🤔 What happens when LLM agents choose between achieving their goals and avoiding harm to humans in realistic management scenarios? Are LLMs pragmatic, or do they prefer to avoid human harm? 🚀 New paper out: ManagerBench: Evaluating the Safety-Pragmatism Trade-off in Autonomous LLMs 🚀🧵
8mo
Martin Tutek