To learn more:
Website: agentcoma.github.io
Preprint: arxiv.org/abs/2508.19988
A big thanks to my brilliant coauthors Lihu Chen, Ana Brassard, @joestacey.bsky.social, @rahmanidashti.bsky.social and @marekrei.bsky.social!
Note: We welcome submissions to the #AgentCoMa leaderboard from researchers 🚀
AgentCoMa is an Agentic Commonsense and Math benchmark in which each compositional task requires both commonsense and mathematical reasoning to solve. The tasks are set in real-world scenarios:…