Post
How useful are self-generated 'mental images' (visual aids) in MLLM/UMM reasoning? Turns out: currently not very. Visualizations have small errors that compound in multi-step problems, and models often ignore correct visual aids in their decision making.
4mo
Thaddäus Wiedemer
Can AI reason by “imagining” — not just by seeing or reading? We introduce Mentis Oculi, a benchmark for machine mental imagery: multi-step visual puzzles that require maintaining and updating visual states over time. 📄 arxiv.org/abs/2602.02465 🌐 jana-z.github.io/mentis-oculi/ 🧵⬇️
4mo