Input, more input
Just like Johnny 5 in Short Circuit, our baby model is reading every single token from its pretraining dataset.
So far: 10 trillion tokens, 36 languages + code & math as their own "languages"
We're tracking progress & sharing it openly
(1/2)