Post
www.anthropic.com
Demystifying evals for AI agents
I remember seeing very dubious advice on evaluation from OpenAI a few years ago, so I was happy to see quite sensible recent advice on evaluation from Anthropic: www.anthropic.com/engineering/...
18d