I love SfM, but it's way less useful than it should be because of a handful of characteristic failures. @zhengqi_li's new paper basically solves them all:
- No parallax? ✅
- No calibration? ✅
- Dynamic scenes? ✅
- Dense geometry? ✅
Best of all, it's super fast.
I know, it's hard to believe. But this thing really works. Check out the website, there are a couple dozen interactive results and over 80 video examples in the gallery. No cherry-picking here. mega-sam.github.io
Check out CAT4D: our new paper that turns (text, sparse images, videos) => (dynamic 3D scenes)! I can't get over how cool the interactive demo is. Try it out for yourself on the project page: cat-4d.github.io
Dec 6, 2024
🎥 Introducing MultiFoley, a video-aware audio generation method with multimodal controls! 🔊 We can:
⌨️ Make a typewriter sound like a piano 🎹
🐱 Make a cat meow like a lion roars! 🦁
⏱️ Perfectly time existing SFX 💥 to a video.
arXiv: arxiv.org/abs/2411.17698
website: ificl.github.io/MultiFoley/
Dec 6, 2024
Nov 28, 2024
Nov 27, 2024
Introducing MegaSaM! Accurate, fast, & robust structure + camera estimation from casual monocular videos of dynamic scenes! MegaSaM outputs camera parameters and consistent video depth, scaling to long videos with unconstrained camera paths and complex scene dynamics!
Aleksander Hołyński
Quark is out! Come check our work on generalized realtime 3D reconstruction. quark-3d.github.io PS: We're looking for interns!
Aleksander Hołyński
Dec 6, 2024
We're very excited to introduce TAPNext: a model that sets a new state of the art for Tracking Any Point in videos by formulating the task as Next Token Prediction. For more, see: tap-next.github.io
Ziyang Chen
Nov 28, 2024
Apr 9, 2025
We just dropped CAT4D, text to dynamic 3D models that you can render in real time. Not posting a video because Bluesky is garbage in this respect; go straight to the real time viewer on a desktop browser and look around. The cat kneading dough is my favorite. cat-4d.github.io
Stop watching videos, start interacting with worlds. Stoked to share CAT4D, our new method for turning videos into dynamic 3D scenes that you can move through in real-time! cat-4d.github.io arxiv.org/abs/2411.18613
Dec 6, 2024
Nov 28, 2024
Nov 28, 2024
We present CAT4D, a method for creating 4D (dynamic 3D) scenes from monocular video. CAT4D leverages a multi-view video diffusion model trained on a diverse combination of datasets to enable novel vie...
cat-4d.github.io
CAT4D: Create Anything in 4D with Multi-View Video Diffusion Models
Clément Godard
Ben Poole
Jon Barron