Instance-Optimal Estimation with Multiple LLM Judges on a Budget
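Studies how to achieve instance-optimal estimation with multiple LLM judges under a limited budget, improving evaluation efficiency and accuracy.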
arXiv:2605.23362v1 Announce Type: new Abstract: Evaluating large language models increasingly relies on LLM-as-a-judge protocols, but such evaluations…