Nonparametric LLM Evaluation from Preference Data
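A nonparametric approach to evaluating LLM performance that moves beyond parametric assumptions and provides reliable uncertainty quantification.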
arXiv:2601.21816v2 Announce Type: replace Abstract: Evaluating the performance of large language models (LLMs) from human preference data is crucial f…
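High-quality data for literary translation is scarce. A new framework iteratively generates reference and preference data along multiple dimensions, improving the fluency and literary effect of LLM translation.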
arXiv:2606.05924v1 Announce Type: cross Abstract: Literary translation poses unique challenges due to the scarcity of high-quality annotated data and …
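A new method uses the DPO implicit reward gap to measure sample difficulty and automatically select high-quality preference data, improving model training efficiency.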
arXiv:2508.04149v2 Announce Type: replace-cross Abstract: Aligning large language models (LLMs) with human preferences is a critical challenge in AI r…