Inlay

by @danabra.mov

by @jimpick.com

Post

We view forgetting as drift in the model's predictions on old data. So the fix is simple: use a KL penalty on past (pretraining) data to keep old outputs fixed while the model fits the new data. 2/8

15d

Andrew Gordon Wilson
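
The idea in the post can be sketched in a few lines. This is a minimal illustration and not the authors' implementation: the function names, the `kl_weight` parameter, and the KL direction (frozen old model as the reference distribution) are all assumptions made for the example. The loss is a fit term on new data plus a KL term that penalizes drift in predictions on past data.

```python
import math

def softmax(logits):
    # Numerically stable softmax over a list of logits.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q, eps=1e-12):
    # KL(p || q) for two discrete distributions of equal length.
    return sum(pi * (math.log(pi + eps) - math.log(qi + eps))
               for pi, qi in zip(p, q))

def cross_entropy(logits, label):
    # Negative log-likelihood of the true label under the current model.
    return -math.log(softmax(logits)[label] + 1e-12)

def regularized_loss(new_logits, new_label,
                     old_logits_on_past, current_logits_on_past,
                     kl_weight=1.0):
    # Fit term: how well the current model fits one new example.
    fit = cross_entropy(new_logits, new_label)
    # Drift term: how far current predictions on a past example have moved
    # from the frozen old model's predictions (assumed direction: old || current).
    drift = kl_divergence(softmax(old_logits_on_past),
                          softmax(current_logits_on_past))
    # kl_weight is a hypothetical hyperparameter trading off the two terms.
    return fit + kl_weight * drift
```

When the current model's outputs on past data match the old model's, the drift term is zero and the loss reduces to the ordinary fit on new data. Any change in those outputs adds a positive penalty.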