Offline Reinforcement Learning with Universal Horizon Models
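Proposes universal horizon models that directly predict the discounted infinite-horizon future, mitigating the compounding error of model rollouts in offline RL.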
arXiv:2605.15603v1 Announce Type: cross Abstract: Model-based reinforcement learning (RL) offers a compelling approach to offline RL by enabling value…