Microsoft Research's Lens: detailed captions beat raw training scale for image generation quality. MAI-Image-2.5 ranked 3rd on Arena.ai, behind only OpenAI's ChatGPT Images 2.0. Annotation investment can substitute for GPU hours.
3h
Microsoft Research presents Lens, a text-to-image model with just 3.8 billion parameters that matches much larger rivals on benchmarks, at a fraction of the training cost. The secret sauce: 800 million detailed image captions generated by GPT-4.1 instead of vague web alt-text. Code and weights are o
the-decoder.com
Microsoft Research's Lens proves detailed captions matter more than raw scale for training efficient
AI Founders ONLINE