TemplateRL: Structured Template-Guided Reinforcement Learning for LLM Reasoning
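Structured templates guide reinforcement learning so that LLM reasoning training no longer relies on inefficient self-sampling, improving policy transferability.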
arXiv:2505.15692v5 Announce Type: replace-cross Abstract: Reinforcement learning (RL) has emerged as an effective paradigm for enhancing model reasoni…