Libra: Efficient Resource Management for Agentic RL Post-Training
Libra is a new approach to efficiently managing resources for agentic RL post-training, lowering training cost and improving performance.
arXiv:2606.03077v1 Announce Type: cross
Abstract: Reinforcement learning (RL) has become a standard post-training paradigm for large language models (…