PerspectiveGap shows that LLMs struggle with role-specific prompts for multi-agent orchestration, reaching an average pass rate of just 14.9%. The study, which covers 110 real-world scenarios, points to a gap in current model capabilities that matters for multi-agent system design. https://arxiv.org/abs/2606.08878
arXiv link for PerspectiveGap: A Benchmark for Multi-Agent Orchestration Prompting