TruthRL: Incentivizing Truthful LLMs via Reinforcement Learning
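Using reinforcement learning to make large models more honest: the TruthRL method improves the truthfulness of LLM answers, and the code is open source.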
arXiv:2509.25760v2. Abstract: While large language models (LLMs) have demonstrated strong performance on factoid question …