[1/n] Just wrapped up 7 months interning with @pcastr.bsky.social at Google DeepMind and I'm so excited to share our work: arxiv.org/abs/2602.10324. TLDR: We used LLM-powered program synthesis to automatically model and discover differences between human and LLM strategic behavior
[2/n] LLM agents are everywhere now: customer service, negotiations, even as human proxies for social science/market research
[3/n] So how do their strategic behaviors actually differ from humans? We examined this question through the lens of behavioral game theory, using iterated rock-paper-scissors (IRPS).
[4/n] Frontier models (Gemini 2.5 Pro/Flash, GPT 5.1) win more and adapt much faster than humans, while smaller models like GPT OSS 120B actually get worse over time because they can’t integrate the long context.
[5/n] But what does the difference in win rates actually mean? To understand, we used AlphaEvolve to automatically discover interpretable behavioral models directly from gameplay data.
[6/n] Using this approach, we get actual programs that explain the behavior, which we can read and compare. Diagram for human program shown below.
[7/n] So what were the insights? Both humans and LLMs use value learning + opponent modeling, but frontier models maintain more sophisticated opponent models (3x3x3 transition matrices vs simple length-3 vectors tracking prior move frequencies).
[8/n] For me, it’s really cool that this aligns with the jump in theory-of-mind capabilities in recent LLMs (since opponent modeling in IRPS is basically a type of ToM)
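To make the contrast in [7/n] concrete, here is a minimal Python sketch of the two kinds of opponent model. It is an illustration, not the programs discovered in the paper. The class names, the `wins_against_cycler` helper, and the cycling opponent are all hypothetical, and the indexing of the 3x3x3 table as (my previous move, opponent's previous move, opponent's next move) is an assumption, since the thread does not say what the three axes are. The value-learning component is left out entirely.

```python
ROCK, PAPER, SCISSORS = 0, 1, 2


def counter(move):
    """Return the move that beats `move` (paper beats rock, and so on)."""
    return (move + 1) % 3


class FrequencyModel:
    """Length-3 vector: how often the opponent has played each move."""

    def __init__(self):
        self.counts = [0, 0, 0]

    def update(self, my_prev, opp_prev, opp_move):
        self.counts[opp_move] += 1

    def predict(self, my_prev, opp_prev):
        # Most frequent move so far; ties resolve to the lowest index.
        return self.counts.index(max(self.counts))


class TransitionModel:
    """3x3x3 table of counts, indexed as
    [my previous move][opponent's previous move][opponent's next move].
    This axis layout is an assumption made for illustration."""

    def __init__(self):
        self.counts = [[[0] * 3 for _ in range(3)] for _ in range(3)]

    def update(self, my_prev, opp_prev, opp_move):
        self.counts[my_prev][opp_prev][opp_move] += 1

    def predict(self, my_prev, opp_prev):
        row = self.counts[my_prev][opp_prev]
        return row.index(max(row))


def wins_against_cycler(model, rounds=300):
    """Count wins against a hypothetical opponent that cycles
    rock -> paper -> scissors forever."""
    wins = 0
    my_prev, opp_prev = ROCK, SCISSORS  # arbitrary starting context
    for t in range(rounds):
        opp_move = t % 3
        my_move = counter(model.predict(my_prev, opp_prev))
        if my_move == counter(opp_move):
            wins += 1
        model.update(my_prev, opp_prev, opp_move)
        my_prev, opp_prev = my_move, opp_move
    return wins


freq_wins = wins_against_cycler(FrequencyModel())
trans_wins = wins_against_cycler(TransitionModel())
print("frequency model wins:", freq_wins)
print("transition model wins:", trans_wins)
```

Against this toy opponent the move frequencies stay balanced, so the frequency vector carries almost no signal, while the transition table picks up the cycle after a few rounds. Real gameplay is far noisier, so this only shows why conditioning on the previous round can matter.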