Doing research in 3D Computer Vision.
Ph.D. student at the University of Amsterdam. Previously at TUM.
https://vladimiryugay.github.io/
Vladimir Yugay
📽️ Check out Visual Odometry Transformer! VoT is an end-to-end model for getting accurate metric camera poses from monocular videos.
vladimiryugay.github.io/vot/
⏩ GaME code release!
github.com/VladimirYuga...
Grab components for your 3D reconstruction pipeline:
🔹 Purely geometric out-of-view scene change detection
🔹 Filtering of outdated observations
🔹 Evaluation videos of changing scenes
Contributions welcome 🚀
We experimented with different backbones, camera pose representations, scalability, and attention mechanisms. Our evaluation spans hundreds of full-length videos and a range of metrics. We do not align the predicted trajectory to the ground truth, which simulates a real-world application.
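To illustrate what evaluating without alignment means, here is a minimal sketch of a translation error computed directly on the raw predicted positions. The post does not say which metrics were used, so the function name `unaligned_ate_rmse`, the RMSE formulation, and the assumption that both trajectories start in the same reference frame are all hypothetical choices made for illustration.

```python
import math


def unaligned_ate_rmse(pred_positions, gt_positions):
    """RMSE of per-frame position error, with no alignment step.

    Hypothetical helper: both inputs are sequences of (x, y, z) camera
    positions in metres, assumed to share the same starting frame.
    No rotation, translation, or scale is fitted before comparing.
    """
    if len(pred_positions) != len(gt_positions) or len(gt_positions) == 0:
        raise ValueError("trajectories must be non-empty and of equal length")
    squared_errors = [
        math.dist(p, g) ** 2 for p, g in zip(pred_positions, gt_positions)
    ]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Toy ground truth: the camera moves 1 m per frame along x.
gt = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]

# A prediction with the right shape but twice the metric scale.
pred_wrong_scale = [(2.0 * x, 2.0 * y, 2.0 * z) for x, y, z in gt]

print(unaligned_ate_rmse(gt, gt))
print(unaligned_ate_rmse(pred_wrong_scale, gt))
```

A metric that first fits a similarity transform would absorb the scale error in the second trajectory and report near-zero error, while the unaligned version penalizes it. That is why skipping alignment is a closer proxy for deployment, where no ground truth is available to align against.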
VoT does not require calibration or post-optimization and runs in real time, capable of processing thousands of frames. It is trained on a vast amount of real-world indoor data, but works just fine in outdoor scenarios too. It uses only camera poses as supervision, making it broadly accessible.