ComplexConstraints and Beyond: Expert Rubrics for RLVR
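Proposes expert rubrics to address complex-constraint problems in RLVR, offering a new paradigm for reward design in reinforcement learning.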
arXiv:2606.09118v1 Announce Type: new Abstract: As LLM capabilities advance rapidly, the evaluation methods used to assess them increasingly lag behin…