The AdamW videos compress to 1/6th the size of the Muon videos. Something AdamW is doing makes its crease visualisation compress well, while Muon's does not. This is the weirdest observation ever.
The weirdest observation: I generated movies visualizing the polytope boundaries for ReLU networks using Muon and AdamW. Same experiment, same data, same random seed. The difference is the "crease pattern" that the optimizers produce.
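For readers unfamiliar with the term, here is a rough sketch of what one frame of a "crease pattern" could look like for a 2D-input ReLU network. It is not the code behind the movies above. The function names (`crease_mask`, `random_relu_net`), the grid resolution, and the use of random untrained weights are all assumptions for illustration. The idea is that a ReLU network is piecewise linear, so marking the grid points where the on/off pattern of the hidden units changes traces the polytope boundaries.

```python
import numpy as np

def crease_mask(weights, biases, lo=-1.0, hi=1.0, res=64):
    """Return a (res, res) boolean image that is True where the ReLU
    activation pattern differs from the right or lower grid neighbour."""
    xs = np.linspace(lo, hi, res)
    gx, gy = np.meshgrid(xs, xs)                    # gx varies along columns
    h = np.stack([gx.ravel(), gy.ravel()], axis=1)  # (res*res, 2) inputs
    patterns = []
    for W, b in zip(weights, biases):
        pre = h @ W + b              # pre-activations of this hidden layer
        patterns.append(pre > 0)     # which units are "on"
        h = np.maximum(pre, 0.0)     # ReLU
    pat = np.concatenate(patterns, axis=1).reshape(res, res, -1)
    mask = np.zeros((res, res), dtype=bool)
    # A crease lies between two neighbours whose patterns differ.
    mask[:, :-1] |= (pat[:, :-1] != pat[:, 1:]).any(axis=-1)
    mask[:-1, :] |= (pat[:-1, :] != pat[1:, :]).any(axis=-1)
    return mask

def random_relu_net(sizes=(2, 16, 16), seed=0):
    """Hypothetical stand-in for a trained network: random hidden layers."""
    rng = np.random.default_rng(seed)
    weights = [rng.normal(size=(m, n)) for m, n in zip(sizes[:-1], sizes[1:])]
    biases = [rng.normal(size=n) for n in sizes[1:]]
    return weights, biases

if __name__ == "__main__":
    W, b = random_relu_net()
    frame = crease_mask(W, b)
    print(frame.shape, int(frame.sum()), "crease pixels")
```

Rendering one such mask per training checkpoint would give a sequence of frames. How well that sequence compresses depends on the video codec and on how the creases move between frames, neither of which this sketch models.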
The most insightful take on Mythos I've seen so far. Everyone should read this but especially those who are currently thinking through the possible regulatory responses.
www.faz.net/premium/digi... I wrote a FAZ guest article.
Confession time: I use agentic coding all day, every day. It makes me much more productive. But I am also terrified of skill atrophy. I feel like I need to break out pen & paper to force myself to "weight-lift" mentally so I don't forget how to think. How do y'all handle this?