RL + LLM @ai2.bsky.social; main dev of https://cleanrl.dev/
Costa Huang

Congrats on the launch!
This is all. Enjoy the new model 😆
One fun thing is that our model outperformed Qwen by roughly 26 points on IFEval. What's going on? We built some nice visualization tools and found that our model basically follows instructions like "write without a comma" well.
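To make the "write without a comma" example concrete, here is a minimal sketch of how such an instruction can be checked programmatically. It is not the IFEval implementation or our tooling; the function names `follows_no_comma` and `no_comma_rate` are hypothetical and only illustrate the idea of a verifiable constraint.

```python
def follows_no_comma(response: str) -> bool:
    """Return True when the response contains no comma character."""
    return "," not in response


def no_comma_rate(responses: list[str]) -> float:
    """Fraction of responses that satisfy the no-comma constraint."""
    if not responses:
        return 0.0
    return sum(follows_no_comma(r) for r in responses) / len(responses)


if __name__ == "__main__":
    # Hypothetical model outputs, used only to exercise the checker.
    samples = ["Hello world", "Hello, world"]
    print(no_comma_rate(samples))
```

Because the check is a plain string test, it can be scored automatically over many responses, which is what makes this kind of instruction easy to visualize and compare across models.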
Our 1B model achieves impressive performance. See our official tweet for more details! bsky.app/profile/ai2....
The model checkpoints are available at huggingface.co/collections/.... As always, we uploaded all the intermediate RL checkpoints.
🥘 Excited to share our latest OLMo 1B models! Almost summer RL time. We did another two-stage RL:
* The first RLVR run uses allenai/RLVR-GSM-MATH-IF-Mixed-Constraints
* The final RLVR run uses allenai/RLVR-MATH for targeted MATH improvement
Short 🧵
We streamlined our release process to include the RLVR intermediate checkpoints as well. They are available in the revisions if you want to check them out. See our updated collection here: huggingface.co/collections/...