Strong Teacher Not Needed? On Distillation in LLM Pretraining
Counterintuitive? Weak teacher models can also distill LLMs effectively; teacher strength is not the key factor during pretraining.
arXiv:2605.23857v1 Announce Type: new Abstract: Knowledge distillation generally assumes a strong-to-weak relationship where stronger teachers yield b…
Compiles agent workflows into LLM weights to reach near-frontier quality at very low cost, proposing a disruptive path for model optimization.
arXiv:2605.22502v1 Announce Type: new Abstract: Agent orchestration frameworks have proliferated, collectively exceeding 290,000 GitHub stars across L…
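Projection-guided cross-tokenizer knowledge distillation with no auxiliary components, effectively addressing vocabulary incompatibility.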
arXiv:2605.21699v1 Announce Type: cross Abstract: Cross-tokenizer knowledge distillation allows a student model to learn from teachers with incompatib…
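Proposes LEAP, a learnable end-to-end adaptive pruning method that compresses large language models efficiently while preserving their performance.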
arXiv:2605.17289v1 Announce Type: new Abstract: Unstructured sparsity is now natively accelerated by recent GPU kernels and dataflow hardware, shiftin…
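A new method that mixes full fine-tuning with low-rank adaptation, optimized for post-training and combining efficiency with performance.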
arXiv:2605.18822v1 Announce Type: new Abstract: Post-training has become essential for adapting large language models (LLMs) to complex downstream beh…
Explores how K-Quantization affects model output performance; an in-depth look at a new quantization technique.
arXiv:2605.19645v1 Announce Type: new Abstract: Recent advancements in large language models (LLMs) have shown their remarkable capacities in many NLP…
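Quantization lets machine learning models run efficiently in low-resource medical imaging settings, greatly lowering the compute barrier and accelerating AI adoption in primary care.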
arXiv:2605.19207v1 Announce Type: cross Abstract: Deep learning models have shown strong performance in medical image analysis, but deploying them in …
A flatness-based, theoretically optimal quantization method that offers a new approach to deep learning model compression.
arXiv:2605.18800v1 Announce Type: new Abstract: Post-training quantization has emerged as a widely adopted technique for compressing and accelerating …
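Uses the hidden representations of large models for per-task quantization, greatly improving efficiency while preserving performance; a technical advance worth watching.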
arXiv:2511.06516v3 Announce Type: replace Abstract: Many LLM applications require only narrow capabilities, yet standard post-training quantization (P…
Proposes a runtime-adaptive pruning method that lets LLM inference memory adjust dynamically, greatly improving efficiency.
arXiv:2505.17138v5 Announce Type: replace Abstract: Large language models (LLMs) excel at language understanding and generation, but their enormous co…
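A new binary symmetric circulant matrix structure that offers a breakthrough approach to deploying deep learning efficiently on resource-constrained platforms.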
arXiv:2605.16443v1 Announce Type: new Abstract: Despite the success of deep neural networks in vision, medical diagnosis, and IoT scenarios, their dep…
Investigates how task-aware pruning improves model performance on out-of-distribution data and reveals the underlying mechanism.
arXiv:2605.14738v1 Announce Type: cross Abstract: Recent work has promoted task-aware layer pruning as a way to improve model performance on particula…
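Proposes a new structured pruning method that compresses large models efficiently while maintaining robustness; suited to researchers working on model optimization.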
arXiv:2605.18331v1 Announce Type: new Abstract: Large Language Models (LLMs) have experienced significant growth and development in recent years. Howe…
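Proposes GSQ, which uses Gumbel-Softmax sampling to achieve high-accuracy, low-precision scalar quantization of LLMs, breaking through existing quantization bottlenecks.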
arXiv:2604.18556v2 Announce Type: replace-cross Abstract: Quantization has become a standard tool for efficient LLM deployment, especially for local i…
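The new Mosaic-of-Motifs algorithm greatly simplifies neural networks, compressing parameters with minimal performance loss and shedding light on why deep learning models are easy to compress.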
arXiv:2602.14896v2 Announce Type: replace Abstract: Large-scale deep learning models are well-suited for compression. Across a variety of tasks, metho…
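A new approach to 1-bit quantization of large models that revisits output alignment strategies, supporting efficient inference on low-resource devices.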
arXiv:2512.21651v3 Announce Type: replace Abstract: Large Language Models (LLMs) deliver strong performance across a wide range of NLP tasks, but thei…
Proposes an input-output whitening SVD method for adaptive-rank compression of large language models, improving inference efficiency.
arXiv:2605.15626v1 Announce Type: new Abstract: Large language models deliver strong performance across language and reasoning tasks, but their storag…
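A new type of perforated neural network that both improves accuracy and compresses the model on keyword spotting, overcoming edge deployment bottlenecks.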
arXiv:2605.15647v1 Announce Type: new Abstract: Edge machine learning presents a unique set of constraints not encountered in cloud-scale model deploy…