How much does a language model forget when finetuned on new tasks? We show that both model size and optimization matter, and that forgetting can be nearly eliminated with self-generated replay! arxiv.org/abs/2605.26097 w/ Martin Marek, Dongkyu Cho, Shikai Qiu, Rumi Chunara, and Pavel Izmailov. 1/8
15d
Andrew Gordon Wilson