//
sign in
Profile
by @danabra.mov
Profile
by @dansshadow.bsky.social
Profile
by @jimpick.com
AviHandle
by @danabra.mov
AviHandle
by @dansshadow.bsky.social
AviHandle
by @katherine.computer
EventsList
by @katherine.computer
ProfileHeader
by @dansshadow.bsky.social
ProfileHeader
by @danabra.mov
ProfileMedia
by @danabra.mov
ProfilePlays
by @danabra.mov
ProfilePosts
by @danabra.mov
ProfilePosts
by @dansshadow.bsky.social
ProfileReplies
by @danabra.mov
Record
by @atsui.org
Skircle
by @danabra.mov
StreamPlacePlaylist
by @katherine.computer
+ new component
Profile
Loading...
Assistant professor in Natural Language Processing at the University of Edinburgh and visiting professor at NVIDIA | A Kleene star shines on the hour of our meeting.
Edoardo Ponti






Loading...
Up next on stage, Dr. @edoardo-ponti.bsky.social ( @edinburgh-uni.bsky.social / NVIDIA) 🎤 “Adaptive Units of Computation: Towards Sublinear-Memory and Tokenizer-Free Foundation Models” Fascinating glimpse into the next gen of foundation models. #FoundationModels #NLP #TokenizerFree #ADSAI2025
Thanks to the amazing collaborators Adrian Łańcucki, Konrad Staniszewski, and Piotr Nawrot! It was amazing to spend a year at NVIDIA as a visiting professor! arXiv: arxiv.org/pdf/2506.05345 Code and models coming soon!
🏆 We evaluate inference-time hyper-scaling on DeepSeek R1-distilled models of different sizes, increasing accuracy on maths, science, and coding by up to 15 points for a given budget.
💡The idea behind DMS is to *train* existing LLMs to evict tokens from the KV cache, while delaying the eviction some time after the decision. This allows LLMs to preserve information while reducing latency and memory size.