🧵 [5/n] 📊 🧪 Training the Subnetwork Reproduces the Full Model
1️⃣ When trained in isolation, the sparse subnetwork recovers almost exactly the same weights as the full model.
2️⃣ It achieves comparable (or better) end-task performance.
3️⃣ 🧮 Even the training loss converges more smoothly.
May 21, 2025
Sagnik Mukherjee
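
One way to read "trained in isolation" is gradient masking: only the parameters in the sparse subnetwork are updated and everything else stays at its initial value. The toy sketch below illustrates that mechanism on a small least-squares problem. The masking approach, the function name `train_masked`, the data, and the hyperparameters are all assumptions made for illustration, not the setup from the thread. The target is built to differ from the initial weights only inside the mask, so the sketch demonstrates the mechanics and does not reproduce the reported result.

```python
import numpy as np

def mse(w, X, y):
    """Mean squared error of the linear model X @ w against y."""
    return float(np.mean((X @ w - y) ** 2))

def train_masked(w0, X, y, mask, lr=0.1, steps=200):
    """Gradient descent on MSE that only updates entries where mask is True."""
    w = w0.copy()
    for _ in range(steps):
        grad = 2.0 * X.T @ (X @ w - y) / len(y)
        w -= lr * grad * mask  # updates outside the subnetwork are zeroed
    return w

rng = np.random.default_rng(0)
X = rng.normal(size=(64, 6))
w0 = rng.normal(size=6)  # stands in for the pretrained weights

# Hypothetical sparse subnetwork: 3 of the 6 parameters.
mask = np.array([True, False, True, False, False, True])

# Toy target that differs from w0 only inside the subnetwork (by construction).
y = X @ (w0 + mask * rng.normal(size=6))

w_sub = train_masked(w0, X, y, mask)                    # subnetwork only
w_full = train_masked(w0, X, y, np.ones(6, dtype=bool)) # all parameters

print("max |w_sub - w_full|:", np.max(np.abs(w_sub - w_full)))
print("loss (subnetwork):", mse(w_sub, X, y))
print("loss (full):", mse(w_full, X, y))
```

Comparing `w_sub` with `w_full` mirrors the kind of check point 1️⃣ describes, while the frozen entries of `w_sub` remain identical to `w0`.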