SCOPE: Selective Conformal Optimized Pairwise LLM Judging
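Accepted at ICML 2026. Proposes selective conformal optimization to improve the reliability of LLM pairwise evaluation, addressing the problem of scoring bias.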
arXiv:2602.13110v3. Abstract: Large language models (LLMs) are increasingly used as scalable judges in pairwise evaluation…