Post
10/ This work was co-first-authored with Jerome Han, together with @benpry.bsky.social, @satchelgrant.bsky.social, @noahdgoodman.bsky.social, and @judithfan.bsky.social.
arXiv: arxiv.org/abs/2605.28742
Code: github.com/LinasNas/cor...
Website: linasnas.github.io/core-reasoni...
5d
Language models can use verifiable rewards to improve at a wide variety of reasoning tasks. However, both parametric (e.g. RLVR) and non-parametric (e.g. prompt optimization) approaches to doing so ty...
arxiv.org
CORE: Contrastive Reflection Enables Rapid Improvements in Reasoning
Linas Nasvytis