Interesting Emergence World thing where they stuck agents from five different models in the same simulation, left it running for weeks, and watched what happened, with predictable results (hi grok don’t stab me pls)
www.emergence.ai/blog/emergen...
Most evaluations of AI agents look like exams: a discrete task, a clean environment, a score in minutes or hours. Emergence World is built for the opposite question—what happens when you let agents ru...