ProfileReplies
I am humbled to join this excellent team and work on delivering the highest quality human preference LLM evals! ⚔️⚔️⚔️
curiosity discovery goofiness
I've been following this project since it first showed up in my Google Scholar notifications for papers that cite Elo in 2023, and I had fun experimenting with their data and contributing to the open source code before it was a company.
My brain is living in my head rent free
EsportsBench has been refreshed with data up through June 2025: over 61k new matches across 20 esports have been recorded in the last 3 months! huggingface.co/datasets/Esp...
Extremely excited to announce that I've joined @lmarena.bsky.social! For years I've been working on LLMs for my job and hacking on rankings and ratings for fun, so I'm beyond thrilled to be able to join this project at the intersection!
Just ran into Simpson's paradox in the wild for the first time lol. Was looking at some data and was like "that doesn't look right, all the means went up when all I did was assign groups differently, this is like Simpson's paradox or something lol"