🚨 Go check out our most recent preprint on multilayer attention probing for Vision Transformers! It’s almost as performant as full fine-tuning while staying about as compute-efficient as standard linear probing! Plus you get interpretable attention maps! More in 🧵👇🏼 Preprint: arxiv.org/abs/2601.09322
4mo
With the rise of large-scale foundation models, efficiently adapting them to downstream tasks remains a central challenge. Linear probing, which freezes the backbone and trains a lightweight head, is ...
arxiv.org
Beyond the final layer: Attentive multilayer fusion for vision transformers
Lukas Muttenthaler
Why you should probe more than just the final layer of your Vision Transformer to maximize performance. 🧵👇