Can RL Teach Long-Horizon Reasoning to LLMs? Expressiveness Is Key
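Examines whether reinforcement learning can teach large language models long-horizon reasoning. The key factor is expressiveness, which offers a new perspective on extending LLM capabilities.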
arXiv:2605.06638v3 (announce type: replace-cross)
Abstract: Reinforcement learning (RL) has been applied to improve large language model (LLM) reasoning…