//
sign in
Post
by @danabra.mov
PostEmbed
by @danabra.mov
Record
by @jimpick.com
Record
by @atsui.org
+ new component
Post
Evaluating language model responses on open-ended tasks is hard! πŸ€” We introduce EvalAgent, a framework that identifies nuanced and diverse criteria πŸ“‹βœοΈ. EvalAgent identifies πŸ‘©β€πŸ«πŸŽ“ expert advice on the web that implicitly address the user’s prompt πŸ§΅πŸ‘‡
Apr 22, 2025
Manya Wadhwa