Predictable Scaling Laws of Optimal Hyperparameters for LLM Continued Pre-training
Predictable scaling laws for hyperparameter configuration in LLM continued pre-training, replacing heuristic search and its high cost
arXiv:2606.05610v1 Announce Type: new Abstract: The efficacy of continued pre-training for Large Language Models (LLMs) hinges upon hyperparameter con…