A Free Lunch in LLM Compression: Revisiting Retraining after Pruning
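Re-examines whether fine-tuning is necessary after pruning large language models, challenges complex pruning criteria, and proposes a more efficient compression strategy.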
arXiv:2510.14444v3. Abstract: Post-training pruning can substantially reduce LLM inference costs, but it often degrades quality …