Reproducing all of Jürgen Schmidhuber’s papers (1990-2025) using an AI coding assistant. Cool project by Yaroslav! It even reproduced the “World Models” paper by me and Schmidhuber (2018) using a toy environment, with a full VAE + RNN world model implementation. Project: github.com/cybertronai/...

1mo

hardmaru

How do we make LLMs faster and lighter? Don’t force the GPU to adapt to sparsity. Reshape the sparsity to fit the GPU! Our latest work with NVIDIA introduces new CUDA kernels & data formats for faster inference and training of sparse transformer language models: Blog: pub.sakana.ai/sparser-fast...

1mo

With TwELL and a new set of custom CUDA kernels for both LLM inference and training, we translated theoretical sparsity into actual wall-clock speedups: over 20% faster training and inference on H100 GPUs, along with lower energy consumption and memory requirements.