In our latest blog post, Marin team member Larry Dial describes the pretraining techniques we're using as we transition from dense models to more efficient Mixture of Experts (MoE) models. The post demos the stability and predictability of MoEs, giving us a promising direction for our next training run.
13d
Open Athena is a nonprofit that accelerates academia with capabilities from the AI frontier
openathena.ai
Open Athena | Improving our LLM Pretraining Efficiency
Open Athena