1
Policy and World Modeling Co-Training for Language Agents
Proposes a new method for jointly training a policy and a world model so that language agents perform better on complex tasks.
arXiv:2606.02388v1 Announce Type: new Abstract: Reinforcement learning (RL) improves large language model (LLM) agents by teaching them which actions …