Can we extend the power of world models beyond just online model-based learning? Absolutely! We believe the true potential of world models lies in enabling agents to reason at test time. Introducing DINO-WM: World Models on Pre-trained Visual Features for Zero-shot Planning.
Overall, DINO-WM takes a step toward bridging the gap between task-agnostic world modeling and reasoning and control, offering promising prospects for generic world models in real-world applications.
Huge thanks to all my collaborators who made this project possible: @hengkaipan.bsky.social, @yann-lecun.bsky.social, and @lerrelpinto.com. We have open-sourced our code and data. For more details, check out the paper and website. Website: dino-wm.github.io arXiv: arxiv.org/abs/2411.04983
DINO-WM consists of: 1️⃣ An out-of-the-box DINOv2 model as the observation model. 2️⃣ A causal ViT as the predictor. 3️⃣ A decoder that is optional for visualization. DINO-WM plans entirely in latent space, without the need to reconstruct pixel images.
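To make the "plans entirely in latent space" point concrete, here is a minimal sketch of the data flow. It is not the released code: `encode`, `predict`, and `rollout` are hypothetical names, the encoder and predictor are toy stand-ins for DINOv2 and the causal ViT, and the toy predictor looks only at the current latent rather than a history. The optional decoder is left out because nothing in the rollout needs pixels.

```python
from typing import List, Sequence

Vector = List[float]


def encode(observation: Sequence[float]) -> Vector:
    """Hypothetical stand-in for the frozen observation model (DINOv2 in the post)."""
    # Toy "features": scale raw pixel values into the range [0, 1].
    return [x / 255.0 for x in observation]


def predict(latent: Vector, action: Vector) -> Vector:
    """Hypothetical stand-in for the learned predictor (a causal ViT in the post)."""
    # Toy dynamics (an assumption for illustration): the action shifts the latent.
    return [z + a for z, a in zip(latent, action)]


def rollout(first_observation: Sequence[float], actions: List[Vector]) -> List[Vector]:
    """Encode once, then predict future latents without decoding to pixels."""
    latent = encode(first_observation)
    trajectory = [latent]
    for action in actions:
        latent = predict(latent, action)
        trajectory.append(latent)
    return trajectory


if __name__ == "__main__":
    for step, z in enumerate(rollout([0, 255], [[0.5, 0.0], [0.5, -1.0]])):
        print(step, z)
```

The only image that is ever touched is the first observation; every later step operates on feature vectors.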
Unlike previous works that couple world model learning with behavior learning, we train a dynamics-only model and infer actions only at test time. This allows zero-shot goal-reaching by reasoning through the dynamics—no expert demonstrations, no rewards, no online interactions.
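A rough sketch of what "infer actions only at test time" can look like: given a fixed dynamics model, search for an action sequence whose predicted final latent lands near the goal latent. The optimizer below is the cross-entropy method, chosen here as one common option for illustration; the 1-D additive `dynamics`, the squared-distance cost, and all function names and hyperparameters are assumptions, not the project's implementation.

```python
import random
from typing import List


def dynamics(state: float, action: float) -> float:
    """Hypothetical pre-trained dynamics in a 1-D latent space (toy: additive)."""
    return state + action


def rollout_cost(start: float, goal: float, actions: List[float]) -> float:
    """Squared distance between the predicted final latent and the goal latent."""
    state = start
    for action in actions:
        state = dynamics(state, action)
    return (state - goal) ** 2


def plan_cem(start: float, goal: float, horizon: int = 5, samples: int = 200,
             elites: int = 20, iterations: int = 10, seed: int = 0) -> List[float]:
    """Cross-entropy method: sample action sequences, refit a Gaussian to the best."""
    rng = random.Random(seed)
    mean = [0.0] * horizon
    std = [1.0] * horizon
    for _ in range(iterations):
        candidates = [[rng.gauss(m, s) for m, s in zip(mean, std)]
                      for _ in range(samples)]
        # Rank candidates by how close the model predicts they get to the goal.
        candidates.sort(key=lambda acts: rollout_cost(start, goal, acts))
        best = candidates[:elites]
        mean = [sum(c[t] for c in best) / elites for t in range(horizon)]
        std = [max(1e-3, (sum((c[t] - mean[t]) ** 2 for c in best) / elites) ** 0.5)
               for t in range(horizon)]
    return mean


if __name__ == "__main__":
    plan = plan_cem(start=0.0, goal=3.0)
    print("planned actions:", plan)
    print("predicted cost:", rollout_cost(0.0, 3.0, plan))
```

The search uses only the dynamics model and a goal: no demonstrations, no reward function, and no further environment interaction, which matches the setting the post describes.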
Jan 31, 2025
The object and spatial understanding priors of DINOv2 features enable robust scene understanding, essential for navigation and manipulation tasks. With this prior, DINO-WM outperforms state-of-the-art world models by 45% in downstream task performance on our hardest tasks.