Profile
On Alyosha's (and R2's) favorite question of whether we can prompt our way out of this problem: we find that mode collapse is actually worse for longer, AI-generated prompts, so noise optimization is valuable here.
Hey everyone, super happy to share our work on quantum algorithms for heterogeneous partial differential equations (PDEs)! (1/4) scirate.com/arxiv/2604.0...
11d
2mo
Shyamgopal Karthik
👋🇧🇷 If you are at #ICLR2026 today, you should talk to @antonbaumann.bsky.social who is presenting our paper about turning pre-trained VLMs into probabilistic models without retraining or fine-tuning. Poster Session 3 ⌚: 10:30am - 1:00pm (local time) 📍: Pavilion 3 P3 - #313 @iclr-conf.bsky.social
1mo
What do we optimize for? Naively, we'd assume that the average pairwise similarity between the set of output samples would be enough to quantify diversity. However, this allows "reward-hacking" where one sample can become very different while the remaining samples stay the same.
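To make the reward-hacking concern concrete, here is a toy sketch in plain Python. The 4-dimensional "embeddings", the function names, and the use of cosine similarity are all illustrative assumptions, not the setup from the paper. It shows that a set with three identical samples and one extreme outlier can reach the same average pairwise similarity as a set where every sample differs.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity between two vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def mean_pairwise_similarity(samples):
    """Average cosine similarity over all unordered pairs of samples."""
    pairs = list(combinations(samples, 2))
    return sum(cosine(u, v) for u, v in pairs) / len(pairs)


# Hypothetical toy embeddings (assumed for illustration only).
collapsed = [[1.0, 0.0, 0.0, 0.0] for _ in range(4)]      # four identical samples
one_outlier = [[1.0, 0.0, 0.0, 0.0] for _ in range(3)]    # three identical samples...
one_outlier.append([-1.0, 0.0, 0.0, 0.0])                 # ...plus one pushed far away
spread = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0],
          [0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0]]     # four mutually distinct samples

for name, samples in [("collapsed", collapsed),
                      ("one_outlier", one_outlier),
                      ("spread", spread)]:
    print(name, mean_pairwise_similarity(samples))
```

In this toy case the average cannot tell `one_outlier` apart from `spread`, even though three of the four samples in `one_outlier` are still identical.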
There's nothing more satisfying than watching the right noise do its magic with diffusion models! A few interesting takeaways I had from this work 🧵
Marcus Klasson
11d
11d
It turns out that the eigenvalues of the pairwise similarity matrix are the key to measuring diversity. Both Determinantal Point Processes and the Vendi Score (arxiv.org/abs/2210.02410) formalize this intuition and allow differentiable optimization. It is also well correlated with human judgement!
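As a rough illustration of the eigenvalue view, the sketch below computes a Vendi Score with NumPy, following its definition as the exponential of the Shannon entropy of the eigenvalues of K/n, where K is the similarity matrix of n samples. The cosine kernel, the toy embeddings, and the function name `vendi_score` are assumptions made for this example; it is not the differentiable implementation used in the paper.

```python
import numpy as np


def vendi_score(embeddings):
    """Vendi Score of a sample set under a cosine-similarity kernel.

    Assumes each row of `embeddings` is a non-zero feature vector.
    """
    x = np.asarray(embeddings, dtype=float)
    x = x / np.linalg.norm(x, axis=1, keepdims=True)  # unit-normalize rows
    k = x @ x.T                                       # similarity matrix, K_ii = 1
    eigvals = np.linalg.eigvalsh(k / len(x))          # eigenvalues sum to 1
    eigvals = eigvals[eigvals > 1e-12]                # drop zeros (0 * log 0 := 0)
    entropy = -np.sum(eigvals * np.log(eigvals))
    return float(np.exp(entropy))


# Hypothetical toy embeddings (assumed for illustration only).
collapsed = np.tile([1.0, 0.0, 0.0, 0.0], (4, 1))     # four identical samples
one_outlier = np.array([[1.0, 0.0, 0.0, 0.0]] * 3
                       + [[0.0, 1.0, 0.0, 0.0]])      # three identical, one different
spread = np.eye(4)                                    # four mutually orthogonal samples

for name, samples in [("collapsed", collapsed),
                      ("one_outlier", one_outlier),
                      ("spread", spread)]:
    print(name, round(vendi_score(samples), 3))
```

The score behaves like an effective number of distinct samples: close to 1 for the collapsed set, 4 for the orthogonal set, and strictly in between when only one sample has moved, so a single outlier cannot earn full credit.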
From an implementation standpoint, backprop through multiple steps of the sampling process turns out to be expensive (as always). I'm glad that we finally have a working implementation of activation checkpointing to solve the GPU memory bottleneck with Flux.2 github.com/anneharringt...
11d
11d
Shyamgopal Karthik
Shyamgopal Karthik
Shyamgopal Karthik
#CVPR2026 paper: It's Never Too Late: Noise Optimization for Collapse Recovery in Trained Diffusion Models. Text-to-image models often collapse to near-identical samples. Our fix: optimize the noise. Start from pink 🩷, not white noise. 🔗 akoepke.github.io/divgen/index... 1/6
Shyamgopal Karthik
This has now been accepted at @iclr-conf.bsky.social !
New paper: Back into Plato’s Cave. Are vision and language models converging to the same representation of reality? The Platonic Representation Hypothesis says yes. BUT we find the evidence for this is more fragile than it looks. Project page: akoepke.github.io/cave_umwelten/ 1/9
11d
4mo
1mo
Martin Trapp
A. Sophia Koepke
A. Sophia Koepke