Niu Ge's Picks (牛哥精选) · Past Six Months

📝 Deep Tech arXiv AI 2026-06-23

Self-Improvement Can Self-Regress: The Rise-and-Collapse Failure Mode of LLM Self-Training

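Reveals the "rise-then-collapse" failure mode in LLM self-training and offers a mechanistic analysis of why models degrade after optimization.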

arXiv:2606.21090v1 Announce Type: new Abstract: Self-improvement can self-regress. In REINFORCE post-training for code, a model can quickly improve on…

Tags: LLM, self-training, post-training, model collapse, REINFORCE, optimization pitfalls

🤖 AI · Large Models arXiv Machine Learning 2026-06-02

Entropy Minimization without Model Collapse: Mitigating Prediction Bias in Medical Imaging

A new approach for medical-imaging AI: entropy minimization without model collapse, aimed at reducing prediction bias.

arXiv:2606.02339v1 Announce Type: new Abstract: Entropy minimization (EM) is the dominant objective for test-time adaptation, yet its failure mode, mo…

Tags: entropy minimization, model collapse, medical imaging, prediction bias, deep learning

📝 Deep Tech arXiv AI 2026-05-25

Model Collapse as Cultural Evolution

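Explains the model collapse caused by LLM self-training through cultural evolution theory, proposes five falsifiable predictions, and fills a gap on the linguistics side.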

arXiv:2605.23054v1 Announce Type: cross Abstract: Model collapse, the progressive degradation of LLMs trained on their own outputs, has been character…

Tags: model collapse, cultural evolution, LLM, iterated learning, theory, language degradation

🤖 AI · Large Models arXiv NLP 2026-05-22

Two is better than one: A Collapse-free Multi-Reward RLIF Training Framework

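Tackles model collapse in multi-reward reinforcement learning by proposing a new RLIF training framework that achieves stable convergence.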

arXiv:2605.22620v1 Announce Type: cross Abstract: Reinforcement learning with verifiable rewards (RLVR) has substantially improved the reasoning abili…

Tags: reinforcement learning, imitation learning, multi-reward, model collapse, RLIF
