Post
xLSTM Distillation: arxiv.org/abs/2603.15590

Near-lossless distillation of quadratic Transformer LLMs into linear-time xLSTM architectures enables cost- and energy-efficient alternatives without sacrificing performance.

Efficient xLSTM variants of instruction-tuned Llama, Qwen, and Olmo models.
2mo