Research Scientist at Google DeepMind. Working on Gemini reasoning models.
PhD from UofT and Vector Institute
www.paulvicol.com
Paul Vicol
Veo 3 can understand which objects fit in a backpack, and can categorize objects (putting all the toys in a bucket). Veo 3 can also draw, and understands the effects of gravity and air resistance on falling objects.
Paul Vicol
🎨 Introducing the TMLR Beyond PDF Editor, a web-based editor for creating Beyond PDF submissions!
🚀 Zero setup + live preview + download & import.
🤝 Compatible with local compilation.
🎉 Thanks to Hugo Larochelle, @gautamkamath.com, Naila Murray, Nihar B. Shah, and Laurent Charlin!
Veo 3 understands material properties, including buoyancy, reflections, flammability, and soft body physics.
🖋️ Play with it here: tmlr-beyond-pdf.org/editor
📃 Submission instructions: tmlr-beyond-pdf.org/submission-i...
Paul Vicol
Veo 3 can reason, filling in the next element in a sequence of images.
🔥 Veo 3 has emergent zero-shot learning and reasoning capabilities!
This multitalented model can do a huge range of interesting tasks.
It understands physical properties, can manipulate objects, and can even reason.
arxiv.org/abs/2509.20328
video-zero-shot.github.io
Examples in this thread!
Paul Vicol
🚀 All this and more in our paper!
arXiv: arxiv.org/abs/2509.20328
By @thwiedemer.bsky.social, Yuxuan Li, @paulvicol.bsky.social, @nmatares.bsky.social, Been Kim, Kevin Swersky, Priyank Jaini, and Robert Geirhos.
🚀 Introducing TMLR Beyond PDF!
🎬 This is a new, HTML-based submission format for TMLR that supports interactive figures and videos, along with the usual LaTeX and images.
🎉 Thanks to TMLR Editors in Chief: Hugo Larochelle, @gautamkamath.com, Naila Murray, Nihar B. Shah, and Laurent Charlin!
🌐 Check out TMLR Beyond PDF at tmlr-beyond-pdf.org!
💻 Submissions are made on OpenReview just like standard TMLR submissions.
🎨 Create your TMLR Beyond PDF submission today!
Are we experiencing a 'GPT moment' in vision?
In our new preprint, we show that generative video models can solve a wide range of tasks across the entire vision stack without being explicitly trained for them.
🌐 video-zero-shot.github.io
1/n