Surflo: Consistent 3D Surface Flow Model with Global State
@antoine-guedon.bsky.social, Shu Nakamura, @nicolasdufour.bsky.social, Jiahui Lei, Ko Nishino, @akanazawa.bsky.social
arxiv.org/abs/2606.13644
The default paradigm of post-training text-to-image generators includes post-hoc selection of generated images, and subsequent training with one reward model to align the generator to the reward, typi...
Open-source fueled the LLM revolution, but Physical AI hasn't fully benefited from this flywheel yet. Today, we're launching kesai.eu, our mission to democratize robotics research! First milestone: training a frontier-level self-driving policy using significantly less data than typically required.
It's not just for training from scratch. Our new results show that MIRO functions beautifully as a post-training framework!
Applying this multi-reward conditioning during the fine-tuning phase of an existing base model yields the exact same controllable alignment.
Why does MIRO work so reliably? Our paper introduces a theorem proving that conditioning on the joint reward distribution mathematically guarantees that the model steers toward high-reward regions while preserving sample diversity and avoiding single-metric hacking.
Are all rewards useful? Yes! Our new "leave-one-out" ablation shows that removing even a single reward drops performance. Even though these rewards are quite entangled, each one still provides unique, useful bits of information that the model needs to succeed.
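For readers who want to try a similar "leave-one-out" check on their own setup, here is a minimal sketch of the loop. It is not the paper's evaluation code: `leave_one_out` and the `evaluate` callback are hypothetical names, and `evaluate` stands in for whatever "train with these rewards, then score the model" procedure you use.

```python
from typing import Callable, Dict, Sequence, Tuple


def leave_one_out(
    reward_names: Sequence[str],
    evaluate: Callable[[Tuple[str, ...]], float],
) -> Dict[str, float]:
    """Return, for each reward, how much the score drops when it is removed.

    `evaluate` is a hypothetical callback: it receives the tuple of rewards
    that are kept and returns a single quality score (higher is better).
    """
    # Score with every reward present; this is the reference point.
    baseline = evaluate(tuple(reward_names))
    drops: Dict[str, float] = {}
    for held_out in reward_names:
        # Keep every reward except the one being ablated.
        kept = tuple(name for name in reward_names if name != held_out)
        drops[held_out] = baseline - evaluate(kept)
    return drops
```

A positive drop for every reward is what "each one still provides unique information" would look like in this setup. In practice `evaluate` is the expensive part, since each call implies a separate training or fine-tuning run.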
Everything is fully open-sourced, including the codebase, the model, and all the individual single-reward model variants!
Site: nicolas-dufour.github.io/miro
Paper: arxiv.org/abs/2510.25897
Git: github.com/nicolas-dufo...
HF: huggingface.co/nicolas-dufo...
Demo: huggingface.co/spaces/nicol...
We introduce MIRO: a new paradigm for T2I model alignment integrating reward conditioning into pretraining, eliminating the need for separate fine-tuning/RL stages. This single-stage approach offers unprecedented efficiency and control.
- 19x faster convergence
- 370x fewer FLOPs than FLUX-dev
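To make "reward conditioning" a bit more concrete, here is a rough sketch of the data side of the idea: every training example carries a vector of normalized reward scores, and at inference you ask for the scores you want. This is an illustration under assumptions, not the MIRO codebase: the function names, the min-max normalization, and the reward names used below are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Sequence


@dataclass
class ConditionedExample:
    caption: str
    rewards: List[float]  # one normalized score per reward, in a fixed order


def normalize(scores: Sequence[float]) -> List[float]:
    """Min-max scale raw scores to [0, 1] (an assumed normalization)."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        # Constant scores carry no ranking information; map them to the middle.
        return [0.5 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]


def build_conditioned_dataset(
    captions: Sequence[str],
    raw_scores: Dict[str, Sequence[float]],
    reward_names: Sequence[str],
) -> List[ConditionedExample]:
    """Attach a reward vector to each caption so a model can condition on it."""
    normed = {name: normalize(raw_scores[name]) for name in reward_names}
    return [
        ConditionedExample(caption, [normed[name][i] for name in reward_names])
        for i, caption in enumerate(captions)
    ]


def inference_condition(
    reward_names: Sequence[str],
    targets: Optional[Dict[str, float]] = None,
) -> List[float]:
    """Build the reward vector to request at sampling time.

    Rewards default to the maximum (1.0); `targets` overrides individual
    ones, which is where the per-reward control would come from.
    """
    targets = targets or {}
    return [targets.get(name, 1.0) for name in reward_names]
```

For example, `inference_condition(["aesthetic", "alignment"], {"alignment": 0.2})` asks for the top aesthetic score while dialing the second reward down. The generator itself is left out here; it would take the caption and this vector as joint conditioning.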
The model is so fast and easy to use that I vibe-coded a small game with it in 1h. Runs flawlessly on a consumer GPU if you're looking for a small local model to tinker with.
Nicolas Dufour
Playing with some post-processing on a small in-house-but-soon-to-be-released model.
Kashyap Chitta
KE:SAI is a Franco-German non-profit open science lab for scalable autonomous intelligence.