DEEPRUBRIC: Evidence-Tree Rubric Supervision for Efficient Reinforcement Learning of Deep Research Agents
Builds an evidence tree to jointly derive training queries and rubrics.
arxiv.org/abs/2606.17029
zminghang.github.io/DeepRubric-C...
Deep research agents synthesize long-form reports by searching and reasoning over retrieved evidence. Reinforcement learning with rubric-based rewards improves these agents by optimizing them against ...