On Alyosha's (and R2's) favorite question of whether we can prompt our way out of this problem: we find that mode collapse is actually worse for longer, AI-generated prompts, and noise optimization is valuable here.
Hey everyone, super happy to share our work on quantum algorithms for heterogeneous partial differential equations (PDEs)! (1/4)
scirate.com/arxiv/2604.0...
Shyamgopal Karthik
👋🇧🇷 If you are at #ICLR2026 today, you should talk to @antonbaumann.bsky.social who is presenting our paper about turning pre-trained VLMs into probabilistic models without retraining or fine-tuning.
Poster Session 3
⌚: 10:30am - 1:00pm (local time)
📍: Pavilion 3 P3 - #313
@iclr-conf.bsky.social
What do we optimize for? Naively, we'd assume that the average pairwise similarity between the set of output samples would be enough to quantify diversity. However, this allows "reward-hacking" where one sample can become very different while the remaining samples stay the same.
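A toy sketch of why the average alone can mislead. The 4-dimensional unit vectors and the helper names below are made up for illustration and are not from the paper: three identical samples plus one outlier pointing the opposite way reach the same average pairwise cosine similarity as four mutually orthogonal samples.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity between two plain-Python vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def mean_pairwise_similarity(samples):
    """Average cosine similarity over all unordered pairs of samples."""
    pairs = list(combinations(samples, 2))
    return sum(cosine(u, v) for u, v in pairs) / len(pairs)


# Hypothetical embeddings: three clones and one sample pushed far away.
collapsed_with_outlier = [[1.0, 0.0, 0.0, 0.0]] * 3 + [[-1.0, 0.0, 0.0, 0.0]]

# Hypothetical embeddings: four genuinely different (orthogonal) samples.
diverse = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
]

print(mean_pairwise_similarity(collapsed_with_outlier))  # 0.0
print(mean_pairwise_similarity(diverse))                 # 0.0
```

Both sets score 0.0, even though the first one still contains three identical samples, so an optimizer chasing this average has no reason to fix the clones.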
There's nothing more satisfying than watching the right noise do its magic with diffusion models! A few interesting takeaways I had from this work
🧵
Marcus Klasson
It turns out that the eigenvalues of the pairwise similarity matrix are the key to measuring diversity. Both Determinantal Point Processes and the Vendi Score (arxiv.org/abs/2210.02410) formalize this intuition and allow differentiable optimization. It is also well correlated with human judgement!
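A minimal NumPy sketch of the Vendi Score, assuming a cosine-similarity kernel on toy embeddings. The function name and the example vectors are illustrative and are not the paper's implementation. The score is the exponential of the Shannon entropy of the eigenvalues of K/n, so it reads as an effective number of distinct samples.

```python
import numpy as np


def vendi_score(features):
    """Vendi Score under a cosine kernel: exp(entropy of eigenvalues of K/n)."""
    x = np.asarray(features, dtype=float)
    x = x / np.linalg.norm(x, axis=1, keepdims=True)  # unit-normalize each row
    k = x @ x.T                                       # n x n similarity matrix, K_ii = 1
    eigvals = np.linalg.eigvalsh(k / len(x))          # eigenvalues sum to 1
    eigvals = eigvals[eigvals > 1e-12]                # use the convention 0 * log 0 = 0
    return float(np.exp(-np.sum(eigvals * np.log(eigvals))))


# Hypothetical embeddings: three clones plus one orthogonal outlier.
clones_plus_outlier = [[1, 0, 0, 0], [1, 0, 0, 0], [1, 0, 0, 0], [0, 1, 0, 0]]

# Hypothetical embeddings: four mutually orthogonal samples.
all_different = np.eye(4)

print(vendi_score(clones_plus_outlier))  # roughly 1.75 effective samples
print(vendi_score(all_different))        # 4 effective samples (up to float error)
```

Moving a single sample away lifts the score only from 1 to about 1.75 here, while four distinct samples reach 4, so the clones are still penalized. In an actual optimization loop the same computation would be written in an autodiff framework so that gradients can flow back to the noise.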
From an implementation standpoint, backprop through multiple steps of the sampling process turns out to be expensive (as always). I'm glad that we finally have a working implementation of activation checkpointing to solve the GPU memory bottleneck with Flux.2 github.com/anneharringt...
This has now been accepted at @iclr-conf.bsky.social !
New paper: Back into Plato’s Cave
Are vision and language models converging to the same representation of reality? The Platonic Representation Hypothesis says yes. BUT we find the evidence for this is more fragile than it looks.
Project page: akoepke.github.io/cave_umwelten/
1/9