Post
How good are LLMs at 🔭 scientific computing and visualization 🔭? AstroVisBench tests how well LLMs implement scientific workflows in astronomy and visualize results. SOTA models like Gemini 2.5 Pro & Claude 4 Opus only match ground truth scientific utility 16% of the time. 🧵
Jun 2, 2025
Sebastian Joseph