Inlay

by @danabra.mov

by @jimpick.com

Post

www.anthropic.com

Demystifying evals for AI agents

I remember seeing some very dubious advice on evaluation from OpenAI a few years ago, so I was happy to see quite sensible recent advice on the topic from Anthropic: www.anthropic.com/engineering/...

18d