Inlay

1/ 📄 Paper 1: "DUNE: Distilling a UNiversal Encoder from Heterogeneous 2D and 3D Teachers". We propose DUNE, a ViT-based encoder distilled from multiple specialized 2D and 3D foundation models to unify visual tasks across 2D, 3D, and human understanding.