//
sign in
Post
by @danabra.mov
PostEmbed
by @danabra.mov
Record
by @jimpick.com
Record
by @atsui.org
+ new component
Post
www.anthropic.com
Demystifying evals for AI agents
I remember seeing very dubious advice from OpenAI a few years ago on evaluation. So I was happy to see quite sensible recent advice from Anthropic on evaluation www.anthropic.com/engineering/...
Demystifying evals for AI agents
18d