Excited about this new work from @haoyuhe.bsky.social. TLDR: Diffusion language models treat learning and inference differently, which lowers performance. RL can be used to overcome this issue for certain problems.
Andreas Geiger
New Paper: "Solving Spatial Supersensing Without Spatial Supersensing"
Huge credit to the Cambrian-S team for tackling one of the hardest open problems in video understanding: spatial supersensing. In our paper, we take a closer look at their benchmarks & methods.
For VSI-Super-Counting (VSC), we run a sanity check:
VSC-Repeat: we concatenate each video with itself 1-5×
Unique object count stays the same
Cambrian-S accuracy drops from 42% → 0%
A genuine supersensing system should be robust here.
Presenting A Sober Look at Progress in LM Reasoning at @colmweb.org today 🇨🇦 #COLM2025
Today
11:00 AM – 1:00 PM
Room 710 - Poster #31
We find that many “reasoning” gains fall within variance and show how to make evaluation reproducible again.
bethgelab.github.io/sober-reasoning