How do we make LLMs faster and lighter? Don’t force the GPU to adapt to sparsity. Reshape the sparsity to fit the GPU! Our latest work with NVIDIA introduces new CUDA kernels & data formats for faster inference and training of sparse transformer language models: Blog: pub.sakana.ai/sparser-fast...
1mo
Reproducing all of Jürgen Schmidhuber’s papers (1990-2025) using an AI coding assistant. Cool project by Yaroslav! It even reproduced the “World Models” paper by me and Schmidhuber (2018) using a toy environment, with a full VAE + RNN world model implementation. Project: github.com/cybertronai/...
1mo
One of the most frustrating paradoxes in deep learning: making a model do less math often makes it run slower. Why? Because unstructured sparsity introduces irregular memory access, and GPUs are built for predictable, dense blocks of math.
The human brain is incredibly efficient because it only activates the specific neurons needed for a thought. Modern LLMs naturally try to do this too (over 95% of neurons in feedforward layers stay silent for any given word), but our hardware punishes them for it.
1mo
1mo