VRPRM: Process Reward Modeling via Visual Reasoning
Improves the accuracy of process reward modeling through visual reasoning, offering a new approach to training on complex tasks.
arXiv:2508.03556v3 Announce Type: replace Abstract: Process Reward Model (PRM) is widely used in the post-training of Large Language Model (LLM) becau…
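Accepted at ICML 2026. Proposes a step-wise confidence attribution method that pinpoints why black-box large models fail at multi-step reasoning.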
arXiv:2605.19228v1 Announce Type: cross Abstract: Large Language Models have achieved strong performance on reasoning tasks with objective answers by …
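Closed-loop verified reasoning for complex visual generation: verifiable multi-step reasoning is used to address planning hallucination, with impressive results.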
arXiv:2605.14876v1 Announce Type: cross Abstract: Despite rapid advancements, current text-to-image (T2I) models predominantly rely on a single-step g…
Enterprise AI agents at scale in practice: how Netomi uses GPT-4.1 and GPT-5.2 to achieve concurrency, governance, and multi-step reasoning
How Netomi scales enterprise AI agents using GPT-4.1 and GPT-5.2—combining concurrency, governance, and multi-step reasoning for reliable production w…