Control experiment: Are gains just from a larger network?
We scaled up baselines to match parameters. Result: Bigger networks alone do not solve continual learning. The gains arise from what is remembered and how memory is organized across timescales.
12/15