At the edge of this regime (where η ∝ 1/√m), there exists a well-defined infinite-width limit where feature learning persists in all hidden layers. This Feature Learning Limit closely matches the behavior of optimally tuned finite-width networks under CE loss. (6/10)
6mo
Leena C Vankadara
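
A minimal sketch of the η ∝ 1/√m scaling mentioned in the post, for readers who want to see it as arithmetic. The post states only the proportionality; the function name `scaled_lr`, the reference width `base_width`, and the reference rate `base_lr` are hypothetical, and the constant of proportionality would still need to be tuned at some reference width.

```python
import math


def scaled_lr(base_lr: float, base_width: int, width: int) -> float:
    """Rescale a learning rate tuned at base_width to a new width m.

    Assumes eta is proportional to 1/sqrt(m), so
    eta(m) = base_lr * sqrt(base_width / m).
    """
    if base_width <= 0 or width <= 0:
        raise ValueError("widths must be positive")
    return base_lr * math.sqrt(base_width / width)


if __name__ == "__main__":
    # Hypothetical values: a rate of 0.1 tuned at width 256.
    for m in (256, 1024, 4096):
        print(m, scaled_lr(0.1, 256, m))
```

Under this rule, quadrupling the width halves the learning rate.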