2/ Model A may beat model B on average, yet still lose to B when each model is judged by its minimum score across several tasks. I wrote a brief blog post on this (also a good time to announce I started a Substack!): shuvom.substack.com/p/revenge-of...
Or maybe, revenge of the 1st quantile. What common AI benchmarking discourse misses.
shuvom.substack.com
Revenge of the Worst Case
2mo
Shuvom Sadhuka
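
A minimal sketch of the point in the post: ranking by the mean and ranking by the minimum can disagree. The model names and per-task scores below are hypothetical, chosen only to illustrate the arithmetic; they are not taken from the post or from any real benchmark.

```python
from statistics import mean

# Hypothetical per-task scores (illustrative only, not real benchmark data).
scores = {
    "model_a": [0.95, 0.90, 0.92, 0.40],  # strong on most tasks, one bad failure
    "model_b": [0.75, 0.72, 0.74, 0.70],  # weaker on average, but consistent
}

def summarize(task_scores):
    """Return the average-case and worst-case score for one model."""
    return {"mean": mean(task_scores), "min": min(task_scores)}

summary = {name: summarize(s) for name, s in scores.items()}

# Pick the winner under each criterion.
best_by_mean = max(summary, key=lambda name: summary[name]["mean"])
best_by_min = max(summary, key=lambda name: summary[name]["min"])

print("best by mean:", best_by_mean)
print("best by min: ", best_by_min)
```

With these made-up numbers, model_a wins on the mean while model_b wins on the minimum, so which model counts as "better" depends on whether the average case or the worst case matters for the use at hand.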