Research Scientist at Google DeepMind. Working on Gemini reasoning models. PhD from UofT and Vector Institute. www.paulvicol.com

Paul Vicol

Veo 3 can understand which objects fit in a backpack, and can categorize objects (putting all the toys in a bucket). Veo 3 can also draw, and understands the effects of gravity and air resistance on falling objects.

8mo

Paul Vicol

🎨 Introducing the TMLR Beyond PDF Editor, a web-based editor for creating Beyond PDF submissions! 🚀 Zero setup + live preview + download & import. 🤝 Compatible with local compilation. 🎉 Thanks to Hugo Larochelle, @gautamkamath.com, Naila Murray, Nihar B. Shah, and Laurent Charlin!

3mo

Veo 3 understands material properties, including buoyancy, reflections, flammability, and soft body physics.

8mo

🖋️ Play with it here: tmlr-beyond-pdf.org/editor 📃 Submission instructions: tmlr-beyond-pdf.org/submission-i...

Paul Vicol

Veo 3 can reason, filling in the next element in a sequence of images.

🔥 Veo 3 has emergent zero-shot learning and reasoning capabilities! This multitalented model can do a huge range of interesting tasks. It understands physical properties, can manipulate objects, and can even reason. arxiv.org/abs/2509.20328 video-zero-shot.github.io Examples in this thread!

3mo

Paul Vicol

🚀 All this and more in our paper! arXiv: arxiv.org/abs/2509.20328 By @thwiedemer.bsky.social, Yuxuan Li, @paulvicol.bsky.social, @nmatares.bsky.social, Been Kim, Kevin Swersky, Priyank Jaini, and Robert Geirhos.

🚀 Introducing TMLR Beyond PDF! 🎬 This is a new, HTML-based submission format for TMLR that supports interactive figures and videos, along with the usual LaTeX and images. 🎉 Thanks to the TMLR Editors-in-Chief: Hugo Larochelle, @gautamkamath.com, Naila Murray, Nihar B. Shah, and Laurent Charlin!

8mo

🌐 Check out TMLR Beyond PDF at tmlr-beyond-pdf.org! 💻 Submissions are made on OpenReview just like standard TMLR submissions. 🎨 Create your TMLR Beyond PDF submission today!

8mo

Paul Vicol

Are we experiencing a 'GPT moment' in vision? In our new preprint, we show that generative video models can solve a wide range of tasks across the entire vision stack without being explicitly trained for them. 🌐 video-zero-shot.github.io 1/n

8mo

Thaddäus Wiedemer