Post
Inverse Rubric Optimization: A testbed for agent science | Discussion
4h
We propose inverse rubric optimization (IRO): tasks where an agent must learn the preferences of a black-box judge under a label budget. IRO tasks induce rich agent behavior and smooth scaling, making them a useful testbed for agent science.
fulcrum.inc
Hacker News Top Stories