Reasoning models are far more expensive to post-train. For our 32B model, post-training the Think variant takes 17x more datacenter energy than post-training the Instruct variant, and almost all of that gap comes from reinforcement learning.
1mo
Jacob Morrison