Yesterday we launched the Gemma 4 12B model, and the reception from the community has been great! 🎉
Did you know it has vision and audio capabilities, but no dedicated vision or audio encoders?
How?
We call this architecture Encoder Free.
newsletter.maartengrootendorst.com/p/a-visual-g...
An in-depth explainer to Gemma 4 12B; a unified, encoder-free multimodal model!