Research Scientist at Google DeepMind. Working on Gemini reasoning models.
PhD from UofT and Vector Institute
www.paulvicol.com
Paul Vicol
Play with it here: tmlr-beyond-pdf.org/editor
Submission instructions: tmlr-beyond-pdf.org/submission-i...
Introducing the TMLR Beyond PDF Editor, a web-based editor for creating Beyond PDF submissions!
Zero setup + live preview + download & import.
Compatible with local compilation.
Thanks to Hugo Larochelle, @gautamkamath.com, Naila Murray, Nihar B. Shah, and Laurent Charlin!
Check out TMLR Beyond PDF at tmlr-beyond-pdf.org!
Submissions are made on OpenReview just like standard TMLR submissions.
Create your TMLR Beyond PDF submission today!
Introducing TMLR Beyond PDF!
This is a new, HTML-based submission format for TMLR that supports interactive figures and videos, along with the usual LaTeX and images.
Thanks to TMLR Editors in Chief: Hugo Larochelle, @gautamkamath.com, Naila Murray, Nihar B. Shah, and Laurent Charlin!
All this and more in our paper!
arXiv: arxiv.org/abs/2509.20328
By
@thwiedemer.bsky.social, Yuxuan Li, @paulvicol.bsky.social, @nmatares.bsky.social, Been Kim, Kevin Swersky, Priyank Jaini, and Robert Geirhos.
Veo 3 understands material properties, including buoyancy, reflections, flammability, and soft body physics.
Veo 3 can understand which objects fit in a backpack, and can categorize objects (putting all the toys in a bucket). Veo 3 can also draw, and understands the effects of gravity and air resistance on falling objects.
Veo 3 can reason, filling in the next element in a sequence of images.
Veo 3 has emergent zero-shot learning and reasoning capabilities!
This multitalented model can do a huge range of interesting tasks.
It understands physical properties, can manipulate objects, and can even reason.
arxiv.org/abs/2509.20328
video-zero-shot.github.io
Examples in this thread!
Are we experiencing a 'GPT moment' in vision?
In our new preprint, we show that generative video models can solve a wide range of tasks across the entire vision stack without being explicitly trained for them.
video-zero-shot.github.io
1/n