The Silent Hyperparameter: Quantifying the Impact of Inference Backends on LLM Reproducibility
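Is the inference backend a "hidden hyperparameter" of LLM benchmarking? A new study quantifies its impact on reproducibility and reminds researchers to consider the underlying sources of score differences.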
arXiv:2605.19537v1 Announce Type: new Abstract: Progress in LLMs is increasingly measured through standardized benchmarks, where state-of-the-art impr…