Doing research in 3D Computer Vision. Ph.D. student at the University of Amsterdam. Previously at TUM. https://vladimiryugay.github.io/

Vladimir Yugay

📽️ Check out Visual Odometry Transformer! VoT is an end-to-end model for getting accurate metric camera poses from monocular videos. vladimiryugay.github.io/vot/

⏩ GaME code release! github.com/VladimirYuga...
Grab components for your 3D reconstruction pipeline:
🔹 Purely geometric out-of-view scene change detection
🔹 Outdated observations filtering
🔹 Evaluation videos of changing scenes
Contributions welcome 🚀

We experimented with different backbones, camera pose representations, scalability, and attention mechanisms. Our evaluation spans hundreds of full-length videos across various metrics, without aligning the predicted trajectory to the ground truth, to simulate a real-world application.
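
To make "without aligning the predicted trajectory to the ground truth" concrete, here is a minimal sketch of an unaligned position error. It assumes an RMSE over per-frame camera positions, which is one common choice and not necessarily a metric used in this evaluation; the function name `ate_rmse_unaligned` and the toy trajectories are hypothetical.

```python
import math


def ate_rmse_unaligned(pred_positions, gt_positions):
    """RMSE of per-frame position error, computed with no alignment step.

    Hypothetical helper: both inputs are sequences of (x, y, z) camera
    positions expressed in the same metric frame.
    """
    if len(pred_positions) != len(gt_positions):
        raise ValueError("trajectories must have the same number of frames")
    if not pred_positions:
        raise ValueError("trajectories must be non-empty")
    # Squared Euclidean distance between predicted and true position per frame.
    squared = [
        sum((p - g) ** 2 for p, g in zip(pred, gt))
        for pred, gt in zip(pred_positions, gt_positions)
    ]
    return math.sqrt(sum(squared) / len(squared))


# Toy data: the prediction has the right shape but twice the true scale.
gt = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
pred = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (4.0, 0.0, 0.0)]
error = ate_rmse_unaligned(pred, gt)
```

A scale-fitting alignment would map this prediction exactly onto the ground truth and report zero error, while the unaligned error stays well above zero, so skipping alignment penalizes wrong metric scale the way a deployed system would experience it.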

VoT requires no calibration or post-optimization and runs in real time, processing thousands of frames. It is trained on a vast amount of real-world indoor data but also works well in outdoor scenarios. It uses only camera poses as supervision, making it broadly accessible.