ProfileReplies









To learn more:
Website: agentcoma.github.io
Preprint: arxiv.org/abs/2508.19988

A big thanks to my brilliant coauthors Lihu Chen, Ana Brassard, @joestacey.bsky.social, @rahmanidashti.bsky.social and @marekrei.bsky.social!

Note: We welcome submissions to the #AgentCoMa leaderboard from researchers 🚀
9mo
agentcoma.github.io
AgentCoMa is an Agentic Commonsense and Math benchmark where each compositional task requires both commonsense and mathematical reasoning to be solved. The tasks are set in real-world scenarios:…
AgentCoMa
Lisa Alazraki
At #NeurIPS2025 today, @lisaalaz.bsky.social is presenting our joint paper on Reverse Engineering Human Preferences with Reinforcement Learning! Demonstrating undetectable attacks on LLM-as-a-judge benchmarks. Great collaboration with @cohereforai.bsky.social and a well-deserved NeurIPS spotlight!
6mo
We also postulate that the benefits of RLRE do not end at adversarial attacks. Reverse engineering human preferences could be used for a variety of applications, including but not limited to meaningful tasks such as reducing toxicity or mitigating bias 🔥
May 22, 2025