Postdoc at Kyutai http://nicolas-dufour.github.io
Nicolas Dufour
Surflo: Consistent 3D Surface Flow Model with Global State @antoine-guedon.bsky.social, Shu Nakamura, @nicolasdufour.bsky.social, Jiahui Lei, Ko Nishino, @akanazawa.bsky.social arxiv.org/abs/2606.13644
We introduce MIRO: a new paradigm for T2I model alignment integrating reward conditioning into pretraining, eliminating the need for separate fine-tuning/RL stages. This single-stage approach offers unprecedented efficiency and control. - 19x faster convergence ⚡ - 370x less FLOPS than FLUX-dev 📉
Thrilled to share that MIRO is accepted to ICML 2026 @icmlconf.bsky.social! 🎉 By training on the reward scores, we can simply condition the model on high rewards at inference time to guarantee top-tier, aligned outputs. We've updated our paper with some additional results!
1d
7mo
24d
Zhenjun Zhao
Nicolas Dufour
Nicolas Dufour
The default paradigm of post-training text-to-image generators includes post-hoc selection of generated images, and subsequent training with one reward model to align the generator to the reward, typi...
arxiv.org
MIRO: MultI-Reward cOnditioned pretraining improves T2I quality and efficiency
Open-source fueled the LLM revolution, but Physical AI hasn't fully benefited from this flywheel yet. Today, we're launching kesai.eu, our mission to democratize robotics research! First milestone: training a frontier-level self-driving policy using significantly less data than typically required.
It's not just for training from scratch. Our new results show that MIRO functions beautifully as a post-training framework! Applying this multi-reward conditioning during the fine-tuning phase of an existing base model yields the exact same controllable alignment.
Why does MIRO work so reliably? Our paper introduces a theorem proving that conditioning on the joint reward distribution mathematically guarantees that the model steers toward high-reward regions while preserving sample diversity and avoiding single-metric hacking.
Are all rewards useful? Yes! Our new "leave-one-out" ablation shows that removing even a single reward drops performance. Even though these rewards are quite entangled, each one still provides unique, useful bits of information that the model needs to succeed.
Everything is fully open-sourced, including the codebase, the model + all individual single reward model variants! 🌐 Site: nicolas-dufour.github.io/miro 📄 Paper: arxiv.org/abs/2510.25897 🛠️ Git: github.com/nicolas-dufo... 🤗 HF: huggingface.co/nicolas-dufo... 🎨 Demo: huggingface.co/spaces/nicol...
24d
24d
24d
24d
24d
7mo
The model is so fast and easy to use that I vibe-coded a small game with it in 1h 😅 Runs flawlessly on a consumer GPU if you're looking for a small local model to tinker with.
Nicolas Dufour
Nicolas Dufour
Playing with some post-processing on a small in-house-but-soon-to-be-released model.
Nicolas Dufour
Nicolas Dufour
Nicolas Dufour
Kashyap Chitta
24d
KE:SAI is a Franco-German non-profit open science lab for scalable autonomous intelligence.
kesai.eu
KE:SAI - Open Science Autonomy Lab
25d
David Picard
David Picard
24d
Nicolas Dufour