🥳 Accepted to ICML, by the way!
Old paper here: arxiv.org/abs/2510.25897 (new version up soon)
Going to pressure @nicolasdufour.bsky.social to release the code and models! 😅
David Picard
We introduce MIRO: a new paradigm for T2I model alignment integrating reward conditioning into pretraining, eliminating the need for separate fine-tuning/RL stages. This single-stage approach offers unprecedented efficiency and control.
- 19x faster convergence ⚡
- 370x fewer FLOPs than FLUX-dev 📉