
Introducing OLMo-2-0325-32B-Instruct! It's spring RL curve time. This time, we used GRPO for RLVR and trained a pretty nice, fully open-source model!
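For readers new to the acronyms: GRPO scores each sampled completion relative to the other completions drawn for the same prompt, and RLVR uses a reward that can be checked programmatically. The post does not describe the actual training setup, so the snippet below is only a generic sketch of that idea, not the OLMo training code. The function names, the exact-match reward, and the `eps` constant are all hypothetical choices made for illustration.

```python
from statistics import mean, pstdev


def verifiable_reward(completion: str, reference: str) -> float:
    """Hypothetical binary reward: 1.0 on an exact match with the reference answer."""
    return 1.0 if completion.strip() == reference.strip() else 0.0


def group_relative_advantages(rewards: list[float], eps: float = 1e-6) -> list[float]:
    """Normalize rewards within one prompt's group of samples (GRPO-style).

    Each advantage is (reward - group mean) / (group std + eps), so no
    learned value function is needed as a baseline.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std over the group
    return [(r - mu) / (sigma + eps) for r in rewards]


# One prompt, four sampled completions, reference answer "42" (made-up data).
completions = ["42", "41", "42 ", "7"]
rewards = [verifiable_reward(c, "42") for c in completions]
advantages = group_relative_advantages(rewards)
```

Correct completions end up with positive advantages and incorrect ones with negative advantages. When every sample in a group gets the same reward, all advantages are zero and that prompt contributes no learning signal.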

8mo

🥘 Excited to share our latest OLMo 1B models! Almost summer RL time. We did another two-stage RL:

* The first RLVR run uses allenai/RLVR-GSM-MATH-IF-Mixed-Constraints
* The final RLVR run uses allenai/RLVR-MATH for targeted MATH improvement

Short 🧵

That's all. Enjoy the new model 😆