
Pretraining launched! 🚀 Our 9B/10TT baby model is taking its first steps on Leonardo (CINECA). 🐣 Everyone involved is eager to see and share the results of the effort it took to get here. 👀 And we are already pushing hard toward the next cycle. 🦾 #goOpenEuroLLM

Wrapping up our 3rd general meeting, hosted by AI Sweden in sunny Stockholm ☀️ A full room makes the final decisions before training the first OpenEuroLLM model. Sharing updates, ideas, and future plans. Two more days of tight collaboration. Full speed mode. 🚀 #goOpenEuroLLM

HPLT is one of the datasets we are sharing in our world-readable catalogue across HPCs. Interesting talk at #LREC2026 in 15 min, at 16:20 in room Menorca 1!!!

All ready to share information about #OpenEuroLLM with the #LREC2026 crowd. Let's talk data, infra, evals and open multilingual LLMs together! Come to booth #5 in poster area 1, Elyxir Building. #multingualLLMs #openLLMs #diverseLLMs #safeLLMs

Quite a nice "representation" of the OpenEuroLLM crowd will be at the International Conference on Learning Representations (ICLR) this week. On Friday 24, come to the poster "OpenThoughts: Data Recipes for Reasoning Models", work partially supported by our project, and meet us! 👋

Experimenting with model-based annotation for better data selection? A candidate to consider is propella-1, a fully open-source multilingual, multi-property annotator partially funded by #OpenEuroLLM. 👍 Models, annotations and paper ready! See: huggingface.co/collections/...

🎉 One year of OpenEuroLLM! 🇪🇺 We’re building Europe’s next-gen open-source LLMs to boost digital sovereignty. More about our achievements and next steps for infrastructure, data, models and evaluation at openeurollm.eu/blog/first-y.... Year 2 = full speed ahead. 🚀 Go #OpenEuroLLM!


Also today: learn more about the impact of benchmark contamination by visiting the poster of our colleagues from the universities of Helsinki and Turku and the ELLIS Institute Finland.

Input, more input 🤖⚡ Just like Johnny 5 in Short Circuit, our baby model is reading every single token from its pretraining dataset. The full diet: 10 trillion tokens, 36 languages + code & math as their own "languages" 📚🌍💻 We’re tracking progress & sharing it openly 👇 (1/2)

A series of foundation models for transparent AI in Europe

openeurollm.eu

As of this morning: 🧠 425.49B tokens seen 📊 4.25% completed This eager reader wants more input, one token at a time. Follow along. 🔍 (2/2) #PreTraining #LLM #MultilingualAI #TransparentAI #goOpenEuroLLM
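
For anyone following along at home, the percentage above can be reproduced from the two figures in the thread (425.49B tokens seen, 10 trillion tokens in total). A minimal sketch; the function name `percent_complete` is ours for illustration and is not part of any OpenEuroLLM tooling:

```python
def percent_complete(tokens_seen: int, tokens_total: int) -> float:
    """Return pretraining progress as a percentage of the token budget."""
    if tokens_total <= 0:
        raise ValueError("tokens_total must be positive")
    return 100 * tokens_seen / tokens_total


# Figures quoted in the posts above.
TOKENS_SEEN = 425_490_000_000        # 425.49B tokens seen
TOKENS_TOTAL = 10_000_000_000_000    # 10 trillion tokens in the dataset

progress = percent_complete(TOKENS_SEEN, TOKENS_TOTAL)
remaining = TOKENS_TOTAL - TOKENS_SEEN

print(f"{progress:.2f}% completed")          # matches the 4.25% quoted above
print(f"{remaining:,} tokens still to read")
```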

