at:///app.bsky.feed.post/3mmk4x6oyys26
Post
The social scientist turned (sort of) AI researcher, turned regulator in me is very interested in this www.normaltech.ai/p/open-world...
18d
www.normaltech.ai
Introducing CRUX, a new project for evaluating AI on long, messy tasks
Open-world evaluations for measuring frontier AI capabilities
David Barnard-Wills