Using Reward Uncertainty to Induce Diverse Behaviour in Reinforcement Learning
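Guiding agents to explore on their own through reward uncertainty, so that reinforcement learning yields genuinely diverse emergent behaviour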
arXiv:2606.03962v1 Announce Type: cross Abstract: Classical reinforcement learning (RL) typically seeks a deterministic policy that maximizes the expe…