Logging Policy Design for Off-Policy Evaluation
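The accuracy of off-policy evaluation depends on the design of the logging policy; this paper systematically studies how to optimize it.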
arXiv:2605.15108v1 Announce Type: cross Abstract: Off-policy evaluation (OPE) estimates the value of a target treatment policy (e.g., a recommender sy…