When Data Is Scarce: Scaling Sparse Language Models with Repeated Training
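How can sparse language models be scaled when data is scarce? This ICML 2026 paper proposes a repeated-training approach that may help overcome the data bottleneck.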
arXiv:2606.01155v1. Abstract: Scaling laws for dense LLMs under infinite data are well explored, but how sparsity interacts with lim…