In our latest blog post, Marin team member Larry Dial describes the pretraining techniques we're using as we transition from dense models to more efficient Mixture of Experts (MoE) models. The post demonstrates the stability and predictability of MoEs, giving us a promising direction for our next training run.
Open Athena is a nonprofit that accelerates academia with capabilities from the AI frontier.