Strong Teacher Not Needed? On Distillation in LLM Pretraining
Counterintuitive? Weak teacher models can also distill LLMs effectively; teacher strength is not the key factor during pretraining.
arXiv:2605.23857v1 Announce Type: new Abstract: Knowledge distillation generally assumes a strong-to-weak relationship where stronger teachers yield b…
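Projection-guided cross-tokenizer knowledge distillation that needs no auxiliary components and effectively resolves vocabulary incompatibility.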
arXiv:2605.21699v1 Announce Type: cross Abstract: Cross-tokenizer knowledge distillation allows a student model to learn from teachers with incompatib…
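Must multi-turn conversational agents be distilled in a one-size-fits-all way? This paper offers a strategy for choosing when to distill and what to distill.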
arXiv:2605.19447v1 Announce Type: new Abstract: Reinforcement learning can train LLM agents from sparse task rewards, but long-horizon credit assignme…
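How can complementary self-distillation preserve contextual integrity in large models? This study proposes a new two-model collaboration scheme, offering a fresh approach to LLM safety alignment.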
arXiv:2605.20258v1 Announce Type: new Abstract: Contextual Integrity (CI) defines privacy not merely as keeping information hidden, but as governing i…
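New research: post-training an MoE model with self-distillation lets it skip half of its experts, with no pretraining from scratch and a significant reduction in compute.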
arXiv:2605.18643v1 Announce Type: new Abstract: Mixture-of-Experts (MoE) scales language models efficiently through sparse expert activation, and its …
A look at pruning and distillation techniques in MoE LLM pretraining: SlimQwen optimizes efficiency and performance.
arXiv:2605.08738v2 Announce Type: replace Abstract: Structured pruning and knowledge distillation (KD) are typical techniques for compressing large la…
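Proposes a sparse-to-dense reward principle: a four-stage post-training pipeline makes more efficient use of scarce labeled data, offering a new paradigm for LLM reasoning optimization.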
arXiv:2605.12483v2 Announce Type: replace-cross Abstract: When labeled verifiable training data is scarce, each checked example should be used where i…
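Diffusion LLMs need no external teacher: an "unroll and roll back" strategy lets them improve their own inference efficiency, opening a new direction for model acceleration.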
arXiv:2605.16941v1 Announce Type: new Abstract: Diffusion Large Language Models (DLLMs) promise fast parallel generation, yet open-source DLLMs still …
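The first survey of on-policy distillation for large models, systematically covering methods, challenges, and future directions; suited to researchers who want to dig deeper.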
arXiv:2604.00626v3 Announce Type: replace Abstract: As Large Language Models (LLMs) continue to grow in both capability and cost, transferring frontie…
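A new advance for multimodal large models: a self-distillation strategy teaches the model to capture visual details, significantly improving fine-grained understanding.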
arXiv:2605.18740v1 Announce Type: cross Abstract: Multimodal Large Language Models (MLLMs) still struggle with fine-grained visual understanding, wher…
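Starting from the challenges of learning in long-horizon agents, proposes a goal-hindsight self-distillation method that improves performance on complex tasks.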
arXiv:2605.17873v1 Announce Type: new Abstract: Training long-horizon LLM agents with reinforcement learning is challenging because sparse outcome rew…
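An LLM distillation paper that unifies the SFT, DAgger, offline RL, and OPD perspectives, decoupling KL from trajectories and offering a new theoretical framework for model optimization.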
arXiv:2605.16826v1 Announce Type: new Abstract: Knowledge distillation is central to LLM post-training, yet its design space remains poorly understood…
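Uses epistemic uncertainty to guide knowledge distillation, addressing data sparsity and blurred class boundaries in student misconception classification.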
arXiv:2605.14752v1 Announce Type: cross Abstract: Accurately identifying student misconceptions is crucial for personalized education but faces three …
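A bidirectional federated knowledge distillation framework that tackles the privacy and efficiency challenges of non-IID, long-tailed ECG monitoring.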
arXiv:2605.14886v1 Announce Type: new Abstract: Electrocardiogram (ECG) monitoring in Internet of Medical Things (IoMT) networks is constrained by str…
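The highlight: a backtracking mechanism mitigates the dual exposure bias in LLM reasoning distillation, improving the efficiency of long chain-of-thought transfer.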
arXiv:2605.19433v1 Announce Type: new Abstract: Large language models (LLMs) have achieved remarkable success in complex reasoning tasks via long chai…
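Proposes a lossless anti-distillation sampling method, offering a new approach to protecting the intellectual property of large models.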
arXiv:2605.18829v1 Announce Type: new Abstract: Frontier commercial generative models face a growing threat from distillation, whereby a distiller har…
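Proposes a proximal on-policy distillation method that resolves the conflict between knowledge injection and knowledge retention in LLM post-training, validated both theoretically and experimentally.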
arXiv:2603.01683v2 Announce Type: replace Abstract: Injecting new reasoning knowledge into Large Language Models (LLMs) via post-training often induce…
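Introduces a critic-driven Voronoi quantization method that efficiently distills deep reinforcement learning policies into interpretable models, addressing the performance-interpretability trade-off.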
arXiv:2605.14897v1 Announce Type: cross Abstract: Despite many successful attempts at explaining Deep Reinforcement Learning policies using distillati…
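Frontier paper: anti-distillation fingerprinting for detecting unauthorized distillation of LLMs, balancing robustness and model performance.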
arXiv:2602.03812v2 Announce Type: replace-cross Abstract: Model distillation enables efficient emulation of frontier large language models (LLMs), cre…
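OpenAI's official guide to model distillation with its API: fine-tune a high-performing small model at low cost, a practical guide to cutting costs and improving efficiency.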
Fine-tune a cost-efficient model with the outputs of a large frontier model, all on the OpenAI platform