ClawArena: Benchmarking AI Agents in Evolving Information Environments
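This paper defines the first benchmark for evolving information environments, testing how AI agents maintain their beliefs and handle contradictory evidence as the information around them changes.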
arXiv:2604.04202v2 (Announce Type: replace)
Abstract: AI agents deployed as persistent assistants must maintain correct beliefs as their information env…