
This is cool and makes a lot of sense. Reminds me of the theory of neutral networks in evolutionary theory, where networks of neutral genotype changes let populations traverse the fitness landscape without getting stuck at local optima.

NEW PAPER. Why do larger networks train better? "Because they contain more candidate *sub*networks that can learn the task" → lottery tickets. This popular explanation uses an appealing but misleading metaphor 🧵 We propose an intuitive alternative grounded in theory: escape dimensions.