🧵[7/n] 🔍 Potential Reasons 💡 We hypothesize that the in-distribution nature of training data is a key driver behind this sparsity 🧠 The model already "knows" a lot; RL just fine-tunes a small, relevant subnetwork rather than overhauling everything
May 21, 2025
Sagnik Mukherjee