PhD in 3D Vision @Stanford | MSc CS @ETH | Ex @Qualcomm, @MercedesBenz
W: sayands.github.io
Sayan Deb Sarkar
arXiv: arxiv.org/abs/2502.15011
Project page: sayands.github.io/crossover/
Joint work with Ondrej Miksik, @marcpollefeys.bsky.social, @danielbarath.bsky.social and @ir0armeni.bsky.social
Excited to head off to Nashville for #CVPR2025
Catch me at the poster sessions or just come say hi to grab a coffee
Friday 13 June 4:00 p.m. - 6:00 p.m. CDT
Poster Session #2, Exhibit Hall D, Highlight Poster #346
(1/3)
CrossOver is accepted as a Highlight at #CVPR2025!
Fully open-sourced code with all pre-trained checkpoints: github.com/GradientSpac...
Stay tuned for a deep-dive thread and what else we are cooking
Excited to share our latest work, CrossOver: 3D Scene Cross-Modal Alignment, accepted to #CVPR2025
We learn a unified, modality-agnostic embedding space, enabling seamless scene-level alignment across multiple modalities, with no semantic annotations needed!
Paper: arxiv.org/abs/2502.15011
Project Page: sayands.github.io/crossover/
Codebase: github.com/GradientSpaces…
Work w/ Ondrej Miksik, @marcpollefeys.bsky.social, @danielbarath.bsky.social and @ir0armeni.bsky.social
(3/3)
PaliGemma 2 is our updated and improved PaliGemma release using the Gemma 2 models and providing new pre-trained checkpoints for the full cross product of {224px,448px,896px} resolutions and {3B,10B,28B} model sizes.
1/7
Multi-modal 3D object understanding has gained significant attention, yet current approaches often assume complete data availability and rigid alignment across all modalities. We present CrossOver, a ...