LEAP: Trajectory-Level Evaluation of LLMs in Iterative Scientific Design
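Proposes LEAP, a trajectory-level evaluation framework that is the first to quantify how LLMs learn across iterations in scientific design, rather than looking only at snapshots of the final outcome.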
arXiv:2605.15341v1 Announce Type: cross Abstract: LLMs are increasingly deployed in autonomous laboratories, under the assumption that their domain pr…