Scaling Laws for Mixture Pretraining Under Data Constraints
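Scaling laws for mixture pretraining under data constraints, revealing the optimal mixing ratio between scarce target data and general-purpose data.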
arXiv:2605.12715v2 Announce Type: replace Abstract: As language models scale, the amount of data they require grows -- yet many target data sources, s…