Post
Inspired by this research, I unleashed OpenAI’s agentic mode upon classic Zork. On the one hand, it’s impressive that it could play this sim at all! On the other, after figuring out basic gameplay, it got stuck looping in circles and randomly dropping stuff, like a clumsy lobotomized hamster.
4h
Benjamin Riley
LLMs suck at Zork
How Do State-Of-The-Art Large Language Models Perform in the 1977 Text-Based Adventure Game Zork?
"all tested models achieve less than 10% completion on average, with even the best-performing model (Claude Opus 4.5) reaching only approximately 75 out of 350 possible points"
11h
In this positioning paper, we evaluate the problem-solving and reasoning capabilities of contemporary Large Language Models (LLMs) through their performance in Zork, the seminal text-based adventure g...
arxiv.org
Playing With AI: How Do State-Of-The-Art Large Language Models Perform in the 1977 Text-Based Adventure Game Zork?
René Walter