Off-Policy Learning with Limited Supply
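A WWW 2026 conference paper focused on off-policy learning in resource-constrained settings, giving equal weight to theory and experiments.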
arXiv:2603.18702v4 Announce Type: replace Abstract: We study off-policy learning (OPL) in contextual bandits, which plays a key role in a wide range o…