
I've seen a number of claims that it's 10T or 6T parameters and MoE, but without sources. Simon Willison has some good points about why it certainly feels big: simonwillison.net/2026/Jun/9/c...

Long thread on exciting work on how vocal learning (thought to be crucial also for human language) works in the brains of seals and sea lions. Massive effort to scan very many brains & species. In evolution, it may have started with volitional control over breathing! www.science.org/doi/10.1126/...

Interested in how AI models can achieve flexible, robust, human-like reasoning? Me too! I am recruiting for a PhD position in neurosymbolic AI to investigate this question. If you are interested, please take a look here: werkenbij.uva.nl/en/vacancies...

(The OpenMythos repository does not look like *plausible* speculation to me, so that doesn't count)

of known techniques: (better) synthetic data, MoE routing, post-training, harness optimization, etc. Does anyone know more?

There's a broken cuneiform tablet from the Old Babylonian period, nearly 4,000 years ago, which preserves a tiny portion of a dialogue between two friends. It feels a bit like the conversations I've been having for the past week, so I wanted to share it.

My own speculation is that, in addition to even more scaling up, Anthropic has been able to better optimize the core LLM for use inside a highly optimized harness, thanks to all the data it has collected during its surge in popularity in recent months.

Does anyone know of some plausible speculations on *technical innovations* driving the impressive performance of Mythos/Fable? The model card only talks about evaluations (interestingly, mostly in biology). The interpretability work on Mythos Preview suggests it's essentially all based on versions


📢 PhD position in Developmental Language Modelling (PLZ RT) What can human language acquisition teach us about training language models? Join us as a PhD! mpi.nl/career-education/vacancies/vacancy/fully-funded-4-year-phd-position-developmental-language @carorowland.bsky.social @mpi-nl.bsky.social


Jelle Zuidema 🟥