On Alyosha's (and R2's) favorite question of whether we can prompt our way out of this problem: we find that mode collapse is actually worse for longer, AI-generated prompts, and noise optimization is valuable here.

Hey everyone, super happy to share our work on quantum algorithms for heterogeneous partial differential equations (PDEs)! (1/4) scirate.com/arxiv/2604.0...

11d

2mo

Shyamgopal Karthik

👋🇧🇷 If you are at #ICLR2026 today, you should talk to @antonbaumann.bsky.social who is presenting our paper about turning pre-trained VLMs into probabilistic models without retraining or fine-tuning. Poster Session 3 ⌚: 10:30am - 1:00pm (local time) 📍: Pavilion 3 P3 - #313 @iclr-conf.bsky.social

1mo

What do we optimize for? Naively, we'd assume that the average pairwise similarity among the output samples would be enough to quantify diversity. However, this allows "reward hacking", where one sample can become very different while the remaining samples stay the same.
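A toy illustration of this failure mode, as a sketch only: it assumes cosine similarity on hand-made 4-dimensional "embeddings", and the function and variable names are hypothetical rather than taken from the paper. Three identical samples plus one outlier reach the same average pairwise similarity as four genuinely different samples, so the average alone cannot tell the two sets apart.

```python
import numpy as np

def mean_pairwise_similarity(x):
    """Average cosine similarity over all distinct pairs of rows in x."""
    x = x / np.linalg.norm(x, axis=1, keepdims=True)  # unit-normalize rows
    sim = x @ x.T                                     # cosine similarity matrix
    upper = np.triu_indices(len(x), k=1)              # each pair counted once
    return float(sim[upper].mean())

# "Hacked" set (assumed toy data): three identical samples and one outlier
# pointing in the opposite direction.
hacked = np.array([[1.0, 0.0, 0.0, 0.0]] * 3 + [[-1.0, 0.0, 0.0, 0.0]])

# Genuinely diverse set: four mutually orthogonal samples.
spread = np.eye(4)

print(mean_pairwise_similarity(hacked))  # three pairs at +1, three at -1
print(mean_pairwise_similarity(spread))  # all pairs at 0
```

Both sets average out to the same value even though the first one still contains three duplicates.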

There's nothing more satisfying than watching the right noise do its magic with diffusion models! A few interesting takeaways I had from this work 🧵

Marcus Klasson

11d

It turns out that the eigenvalues of the pairwise similarity matrix are the key to measuring diversity. Both Determinantal Point Processes and the Vendi Score (arxiv.org/abs/2210.02410) formalize this intuition and allow differentiable optimization. This also correlates well with human judgement!
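For reference, a minimal NumPy sketch of the Vendi Score as defined in the linked paper: the exponential of the Shannon entropy of the eigenvalues of K/n, where K is an n×n positive semidefinite similarity matrix with ones on the diagonal. The toy matrices below are assumptions for illustration. This version is for evaluation only; optimizing through it would need the same computation in an autodiff framework.

```python
import numpy as np

def vendi_score(K):
    """exp(Shannon entropy of the eigenvalues of K / n)."""
    n = K.shape[0]
    eigvals = np.linalg.eigvalsh(K / n)   # K is symmetric, so eigvalsh applies
    eigvals = eigvals[eigvals > 1e-12]    # drop zeros (0 * log 0 is taken as 0)
    entropy = -np.sum(eigvals * np.log(eigvals))
    return float(np.exp(entropy))

# Four mutually dissimilar samples: the score equals the sample count.
all_different = np.eye(4)

# Four identical samples: the score collapses to 1.
all_same = np.ones((4, 4))

# Three identical samples plus one unrelated outlier (assumed toy data).
one_outlier = np.array([
    [1.0, 1.0, 1.0, 0.0],
    [1.0, 1.0, 1.0, 0.0],
    [1.0, 1.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
])

for name, K in [("all_different", all_different),
                ("all_same", all_same),
                ("one_outlier", one_outlier)]:
    print(name, vendi_score(K))
```

The score can be read as an effective number of distinct samples: a single outlier lifts it only slightly above 1, well short of the 4 reached by a fully diverse set.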

From an implementation standpoint, backprop through multiple steps of the sampling process turns out to be expensive (as always). I'm glad that we finally have a working implementation of activation checkpointing to solve the GPU memory bottleneck with Flux.2 github.com/anneharringt...
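A rough sketch of the general idea using PyTorch's `torch.utils.checkpoint`, not the implementation in the linked repository: the tiny MLP "denoiser", the Euler-like update, and all function names are hypothetical stand-ins for a real model such as Flux.2. Each sampling step is wrapped in `checkpoint`, so its intermediate activations are recomputed during the backward pass instead of being stored, trading extra compute for lower memory.

```python
import torch
from torch.utils.checkpoint import checkpoint

torch.manual_seed(0)

# Hypothetical stand-in for a large denoising network.
denoiser = torch.nn.Sequential(
    torch.nn.Linear(16, 64),
    torch.nn.GELU(),
    torch.nn.Linear(64, 16),
)

def sampling_step(x):
    # Assumed toy update rule; a real sampler would also take a timestep.
    return x - 0.1 * denoiser(x)

def sample(noise, num_steps=8, use_checkpointing=True):
    x = noise
    for _ in range(num_steps):
        if use_checkpointing:
            # Activations inside the step are freed after the forward pass
            # and recomputed when gradients are needed.
            x = checkpoint(sampling_step, x, use_reentrant=False)
        else:
            x = sampling_step(x)
    return x

def noise_gradient(noise, use_checkpointing):
    """Gradient of a placeholder loss with respect to the initial noise."""
    noise = noise.detach().clone().requires_grad_(True)
    loss = sample(noise, use_checkpointing=use_checkpointing).pow(2).mean()
    loss.backward()
    return noise.grad

initial_noise = torch.randn(4, 16)
grad_ckpt = noise_gradient(initial_noise, use_checkpointing=True)
grad_plain = noise_gradient(initial_noise, use_checkpointing=False)
print(torch.allclose(grad_ckpt, grad_plain, atol=1e-6))
```

The gradients with and without checkpointing should agree up to floating-point error; only the memory/compute trade-off changes.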

11d

Shyamgopal Karthik

#CVPR2026 paper: It's Never Too Late: Noise Optimization for Collapse Recovery in Trained Diffusion Models. Text-to-image models often collapse to near-identical samples. Our fix: optimize the noise. Start from pink 🩷, not white noise. 🔗 akoepke.github.io/divgen/index... 1/6

Shyamgopal Karthik

This has now been accepted at @iclr-conf.bsky.social!

New paper: Back into Plato’s Cave. Are vision and language models converging to the same representation of reality? The Platonic Representation Hypothesis says yes. BUT we find the evidence for this is more fragile than it looks. Project page: akoepke.github.io/cave_umwelten/ 1/9

11d

4mo

1mo

Martin Trapp

A. Sophia Koepke