a transformer is a machine which approximates a 1/f distribution log-linearly, relying on attention both to rescale and to transfer information between positions, so it makes sense that this would work
arxiv.org/abs/2105.03824
similarly a diffusion model is a machine which turns white noise into pink noise
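for a concrete sense of what "white noise into pink noise" means, here is a minimal sketch of plain spectral shaping. it is not a diffusion model and makes no claim about how one works internally; it only shows the target statistic, power falling off as 1/f. the names `dft` and `pinken` are made up for this example, and the naive O(n^2) transform is an assumption chosen to stay stdlib-only.

```python
import cmath
import math
import random


def dft(x, inverse=False):
    """naive O(n^2) discrete fourier transform; fine for small n."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = 0j
        for t in range(n):
            acc += x[t] * cmath.exp(sign * 2j * math.pi * k * t / n)
        # the 1/n normalisation lives on the inverse transform
        out.append(acc / n if inverse else acc)
    return out


def pinken(white):
    """scale each bin's amplitude by 1/sqrt(f), so its power scales by 1/f."""
    n = len(white)
    spectrum = dft(white)
    shaped = [0j] * n  # bin 0 (dc) stays zero, since 1/f blows up at f = 0
    for k in range(1, n):
        f = min(k, n - k)  # fold the negative-frequency half back onto |f|
        shaped[k] = spectrum[k] / math.sqrt(f)
    # the scaling is symmetric in +/- f, so the result is real up to rounding
    return [z.real for z in dft(shaped, inverse=True)]


random.seed(0)
white = [random.gauss(0.0, 1.0) for _ in range(256)]  # flat spectrum on average
pink = pinken(white)  # same phases, power spectrum tilted by 1/f
```

amplitude is divided by sqrt(f) rather than f because power is amplitude squared. zeroing the dc bin also means the output has zero mean.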
interesting and slightly ominous fact: i made Claude Opus 4.6 play the Gostak today, and it got 5/5 glauds, only needing one hint about the use of the raskable glaud to get a creature with murgous enough goaves into the delcot to leil the glaud of jenth
there are actually fifteen datacenters within a 10 km radius of me, which is nice, because i can now boil water by putting a pot out my window
honestly this is all very "japan in the 1930s": a country that has engaged in aggression towards its neighbors for decades suddenly losing its mind overnight and becoming almost cartoonishly fanatic
it's so beautiful. it's like what Dracula drives in a cyberpunk movie