🤹 New blog post!
I wrote about our recent work on using hierarchical trees to enable sparse attention over irregular data (point clouds, meshes): the Erwin Transformer, accepted to ICML 2025
blog: maxxxzdn.github.io/blog/erwin/
paper: arxiv.org/abs/2502.17019
Compressed version in the thread below: