1
Model Collapse as Cultural Evolution
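Explains model collapse caused by LLM self-training through the lens of cultural evolution theory, proposes five falsifiable predictions, and fills a gap in linguistics.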
arXiv:2605.23054v1 Announce Type: cross Abstract: Model collapse, the progressive degradation of LLMs trained on their own outputs, has been character…
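Tackles the model collapse problem in multi-reward reinforcement learning and proposes RLIF, a new training framework that achieves stable convergence.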
arXiv:2605.22620v1 Announce Type: cross Abstract: Reinforcement learning with verifiable rewards (RLVR) has substantially improved the reasoning abili…