Unified Data Selection for LLM Reasoning
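Proposes a unified data selection framework that efficiently filters high-quality training data for LLM reasoning tasks, significantly improving reasoning ability.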
arXiv:2605.22389v1 Announce Type: new Abstract: Effectively training Large Language Models (LLMs) for complex, long-CoT reasoning is often bottlenecke…
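A new method uses the DPO implicit reward gap to measure sample difficulty and automatically select high-quality preference data, improving model training efficiency.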
arXiv:2508.04149v2 Announce Type: replace-cross Abstract: Aligning large language models (LLMs) with human preferences is a critical challenge in AI r…
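As a rough illustration of the quantity named in the summary above, the sketch below computes an implicit reward gap for a preference pair using the standard DPO definition of the implicit reward, beta * (log pi_theta(y|x) - log pi_ref(y|x)). The `PreferencePair` container, its field names, the `beta` default, and the `rank_by_gap` helper are hypothetical names chosen for this example. The truncated abstract does not state the paper's actual selection rule, so the code only computes and sorts by the gap and does not claim which end of the ranking should be kept.

```python
from typing import List, NamedTuple


class PreferencePair(NamedTuple):
    """Sequence log-probabilities for one (chosen, rejected) pair (hypothetical container)."""
    policy_logp_chosen: float
    policy_logp_rejected: float
    ref_logp_chosen: float
    ref_logp_rejected: float


def implicit_reward_gap(pair: PreferencePair, beta: float = 0.1) -> float:
    """Implicit reward of the chosen response minus that of the rejected one.

    Uses the standard DPO implicit reward: beta * (log pi_theta - log pi_ref).
    The beta value here is an assumed placeholder, not taken from the paper.
    """
    r_chosen = beta * (pair.policy_logp_chosen - pair.ref_logp_chosen)
    r_rejected = beta * (pair.policy_logp_rejected - pair.ref_logp_rejected)
    return r_chosen - r_rejected


def rank_by_gap(pairs: List[PreferencePair], beta: float = 0.1) -> List[int]:
    """Return pair indices sorted by ascending implicit reward gap."""
    return sorted(range(len(pairs)), key=lambda i: implicit_reward_gap(pairs[i], beta))
```

A selection step would then keep some slice of this ranking; which slice counts as appropriately difficult is a design choice that the abstract snippet leaves unspecified.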
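Proposes Learning-Zone Energy, a method that selects data online to improve the efficiency of RL post-training and avoid the compute wasted by uniform allocation.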
arXiv:2605.17003v1 Announce Type: new Abstract: Reinforcement Learning (RL) post-training has emerged as the dominant paradigm for eliciting mathemati…
Proposes a convex dataset valuation method that addresses the cost-performance trade-off of dataset selection in LLM post-training.
arXiv:2605.16704v1 Announce Type: new Abstract: Improving LLM performance on downstream tasks sometimes requires leveraging auxiliary datasets during …
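The paper proposes a plug-and-play oscillating data-volume scheduling method that goes beyond conventional sample selection and significantly improves model training efficiency.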
arXiv:2605.14773v1 Announce Type: cross Abstract: Data selection accelerates training by identifying representative training data while preserving mod…
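Builds a weighted independent set from a similarity graph to balance sample quality and diversity, offering a new framework for efficient data selection.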
arXiv:2605.15691v1 Announce Type: new Abstract: Data selection seeks to identify a compact yet informative subset from large-scale training corpora, b…