//
sign in
Post
by @danabra.mov
PostEmbed
by @danabra.mov
Record
by @jimpick.com
Record
by @atsui.org
+ new component
Post
Across all tasks, state-of-the-art multimodal models often perform at or near chance, even at relatively low difficulty. Performance degrades rapidly as soon as reasoning requires sequential visual state updates, rather than long-horizon planning or complex rules.
4mo