Andreas Steiner
Researching #ComputerVision at #GoogleDeepMind using JAX/Flax (http://github.com/google/flax). Views are my own.

Looking for a small or medium sized VLM? PaliGemma 2 spans more than 150x of compute! Not sure yet if you want to invest the time 🪄finetuning🪄 on your data? Give it a try with our ready-to-use "mix" checkpoints: 🤗 huggingface.co/blog/paligem... 🎤 developers.googleblog.com/en/introduci...
Attending #NeurIPS2024? If you're interested in multimodal systems, building inclusive & culturally aware models, and how fractals relate to LLMs, we have 3 posters for you. I look forward to presenting them on behalf of our GDM team @ Zurich & collaborators. Details below (1/4)
🚀🚀 PaliGemma 2 is our updated and improved PaliGemma release using the Gemma 2 models and providing new pre-trained checkpoints for the full cross product of {224px,448px,896px} resolutions and {3B,10B,28B} model sizes. 1/7
Adding this new "model size" dimension unlocks substantial improvements for some tasks (blue, e.g. AI2D), and compounds with improvements from increased resolution for most tasks (green, e.g. InfoVQA). 2/7
Like the original PaliGemma, the pre-trained PaliGemma 2 models have segmentation and detection capabilities, and excel at OCR – which makes them extremely versatile for 🪄finetuning🪄. The original demo hf.co/spaces/big-v... gives you an idea of the capabilities. 3/7
After 🪄finetuning🪄 on your data, you can expect to see great results, like the SOTA results we got on recognizing table structures, music scores, molecular structures, and text, and on radiography report generation. 4/7
In addition to the pre-trained checkpoints, we also release two checkpoints fine-tuned on the DOCCI dataset, which generate fine-grained captions with a great quality/compute trade-off – and no yapping! 5/7
If you want to know more, now is a good time to head over to the 31-page tech report. Brought to you by an amazing team of collaborators from @GoogleDeepMind and @GoogleAI. arxiv.org/abs/2412.03555 6/7
Want to get started using PaliGemma 2? 🎤 developers.googleblog.com/en/introduci... 🤗 huggingface.co/blog/paligem... 💾 kaggle.com/models/googl... 🔧 github.com/google-resea... 7/7
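For readers who want to see what "the full cross product" of resolutions and model sizes in post 1/7 amounts to, here is a minimal sketch that enumerates the nine combinations. The variable and function names are made up for illustration and are not actual checkpoint identifiers; only the resolution and size values come from the post.

```python
from itertools import product

# Values quoted in post 1/7; the names below are illustrative only.
RESOLUTIONS_PX = (224, 448, 896)
MODEL_SIZES = ("3B", "10B", "28B")


def checkpoint_grid():
    """Return every (resolution, model size) pair in the cross product."""
    return list(product(RESOLUTIONS_PX, MODEL_SIZES))


grid = checkpoint_grid()
for resolution, size in grid:
    # Prints one line per combination, e.g. "224px / 3B".
    print(f"{resolution}px / {size}")
```

Three resolutions times three sizes gives nine pre-trained configurations, which is the grid over which the thread compares tasks.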
Posted: Feb 19, 2025; Dec 7, 2024; Dec 5, 2024 (7 posts)
Authors: Andreas Steiner (8 posts), Ibrahim Alabdulmohsin (1 post)