
Thrilled to share that MIRO has been accepted to ICML 2026 @icmlconf.bsky.social! 🎉 By training on the reward scores, we can simply condition the model on high rewards at inference time to steer generation toward top-tier, aligned outputs. We’ve updated our paper with some additional results!

The default paradigm of post-training text-to-image generators includes post-hoc selection of generated images, and subsequent training with one reward model to align the generator to the reward, typi...

We introduce MIRO: a new paradigm for T2I model alignment that integrates reward conditioning into pretraining, eliminating the need for separate fine-tuning/RL stages. This single-stage approach offers unprecedented efficiency and control.
- 19x faster convergence ⚡
- 370x fewer FLOPs than FLUX-dev 📉
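For readers who want a concrete picture of reward conditioning, here is a minimal Python sketch of the general idea, not MIRO's actual implementation. During training, each sample is paired with its own normalized reward scores. At inference, the same conditioning slot is filled with high target values. The reward names, score ranges, and function names are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical reward names; the post does not list which rewards are used.
REWARD_NAMES: List[str] = ["aesthetic", "text_alignment"]


def normalize(score: float, lo: float, hi: float) -> float:
    """Map a raw reward score into [0, 1], clamping out-of-range values."""
    if hi <= lo:
        raise ValueError("hi must be greater than lo")
    return min(1.0, max(0.0, (score - lo) / (hi - lo)))


def training_condition(
    scores: Dict[str, float], ranges: Dict[str, Tuple[float, float]]
) -> List[float]:
    """Conditioning vector for one training sample: its own normalized rewards."""
    return [normalize(scores[name], *ranges[name]) for name in REWARD_NAMES]


def inference_condition(target: float = 1.0) -> List[float]:
    """Conditioning vector at sampling time: request a high value for every reward."""
    return [target for _ in REWARD_NAMES]


# Assumed score ranges, for illustration only.
ranges = {"aesthetic": (0.0, 10.0), "text_alignment": (0.0, 1.0)}

# Training: the generator sees each image together with the rewards it earned.
train_vec = training_condition({"aesthetic": 6.5, "text_alignment": 0.8}, ranges)

# Inference: ask for the high-reward end of every axis.
infer_vec = inference_condition()
```

In a real generator these vectors would be embedded and fed in alongside the text prompt. The sketch only shows how the conditioning signal differs between training and sampling.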