EnvFactory: Scaling Tool-Use Agents via Executable Environments Synthesis and Robust RL
Combining reinforcement learning with environment synthesis offers a robust new approach to scaling tool-use agents.
arXiv:2605.18703v1 (cross-listed). Abstract: Equipping LLMs with tool-use capabilities via Agentic Reinforcement Learning (Agentic RL) is bottlen…