Introducing the TMLR Beyond PDF Editor, a web-based editor for creating Beyond PDF submissions!
Zero setup + live preview + download & import.
Compatible with local compilation.
Thanks to Hugo Larochelle, @gautamkamath.com, Naila Murray, Nihar B. Shah, and Laurent Charlin!
Paul Vicol
Veo 3 can reason, filling in the next element in a sequence of images.
Veo 3 can understand which objects fit in a backpack, and can categorize objects (putting all the toys in a bucket). Veo 3 can also draw, and understands the effects of gravity and air resistance on falling objects.
Veo 3 understands material properties, including buoyancy, reflections, flammability, and soft body physics.
All this and more in our paper!
arXiv: arxiv.org/abs/2509.20328
By @thwiedemer.bsky.social, Yuxuan Li, @paulvicol.bsky.social, @nmatares.bsky.social, Been Kim, Kevin Swersky, Priyank Jaini, and Robert Geirhos.
Introducing TMLR Beyond PDF!
This is a new HTML-based submission format for TMLR that supports interactive figures and videos, along with the usual LaTeX and images.
Thanks to the TMLR Editors in Chief: Hugo Larochelle, @gautamkamath.com, Naila Murray, Nihar B. Shah, and Laurent Charlin!
Play with it here: tmlr-beyond-pdf.org/editor
Submission instructions: tmlr-beyond-pdf.org/submission-i...
Check out TMLR Beyond PDF at tmlr-beyond-pdf.org!
Submissions are made on OpenReview just like standard TMLR submissions.
Create your TMLR Beyond PDF submission today!
Veo 3 has emergent zero-shot learning and reasoning capabilities!
This multitalented model can do a huge range of interesting tasks.
It understands physical properties, can manipulate objects, and can even reason.
arxiv.org/abs/2509.20328
video-zero-shot.github.io
Examples in this thread!
Are we experiencing a 'GPT moment' in vision?
In our new preprint, we show that generative video models can solve a wide range of tasks across the entire vision stack without being explicitly trained for them.
video-zero-shot.github.io
1/n