CuSearch: Curriculum Rollout Sampling via Search Depth for Agentic RAG
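Proposes CuSearch, a curriculum sampling method that uses search depth to optimize reinforcement learning training for Agentic RAG and improve training efficiency.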
arXiv:2605.11611v2 Announce Type: replace Abstract: Reinforcement Learning with Verifiable Rewards (RLVR) has emerged as a promising paradigm for trai…