1
BioBlue: Systematic runaway-optimiser-like LLM failure modes on biologically and economically aligned AI safety benchmarks for LLMs with simplified observation format
系统性揭示LLM在生物与经济安全基准上的类失控优化器失败模式,视角新颖,观察格式简化。
arXiv:2509.02655v3 Announce Type: replace-cross Abstract: Many AI alignment discussions of "runaway optimisation" focus on RL agents: unbounded utilit…