Doubly-Robust Off-Policy Evaluation with Estimated Logging Policy

Kyungbok Lee, Myunghee Cho Paik

arXiv.org Machine Learning, Apr-2-2024

In many decision-making problems, estimating the value of a policy, that is, its expected reward, is a central task. Online evaluation, which requires evaluating each policy's value directly, can be expensive and may not be practical for multiple target policies. Off-policy evaluation (OPE) instead estimates the value of a target policy from log data generated by a different logging policy. This approach has attracted considerable interest in contextual bandits (CB) [Dudík et al., 2011, Swaminathan et al., 2017] and reinforcement learning (RL) [Precup, 2000, Mahmood et al., 2014, Jiang and Li, 2016]. Several off-policy evaluation algorithms currently in use [Dudík et al., 2011, Thomas and Brunskill, 2016, Wang et al., 2017, Farajtabar et al., 2018, Su et al., 2020] rely on complete knowledge of the logging policy in order to apply inverse probability weighting (IPW).
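To make the role of the logging policy concrete, the following is a minimal sketch of the standard IPW estimator for a contextual bandit, assuming the logging policy's action probabilities are known exactly. It is an illustration only and is not taken from the paper: the function names `ipw_value` and `simulate`, the two-context, three-action setup, and all probability tables are hypothetical choices made for this example.

```python
import numpy as np


def ipw_value(actions, rewards, logging_probs, target_probs):
    """IPW estimate of a target policy's value from logged bandit data.

    actions:       (n,) int array, action chosen by the logging policy
    rewards:       (n,) float array, reward observed for that action
    logging_probs: (n, k) array, logging policy probabilities per context
    target_probs:  (n, k) array, target policy probabilities per context
    """
    idx = np.arange(len(actions))
    # Importance weight: target probability over logging probability
    # of the action that was actually logged.
    weights = target_probs[idx, actions] / logging_probs[idx, actions]
    return float(np.mean(weights * rewards))


def simulate(n=200_000, seed=0):
    """Generate synthetic logged data (all tables below are hypothetical)."""
    rng = np.random.default_rng(seed)
    contexts = rng.integers(0, 2, size=n)  # binary context, each w.p. 0.5

    logging_table = np.array([[0.5, 0.3, 0.2], [0.2, 0.3, 0.5]])
    target_table = np.array([[0.1, 0.1, 0.8], [0.8, 0.1, 0.1]])
    mean_reward = np.array([[0.2, 0.5, 0.9], [0.7, 0.4, 0.1]])

    logging_probs = logging_table[contexts]
    target_probs = target_table[contexts]

    # Sample one action per row from the logging policy (inverse CDF).
    cdf = np.cumsum(logging_probs, axis=1)
    u = rng.random(n)
    actions = (u[:, None] > cdf).sum(axis=1)
    actions = np.minimum(actions, logging_table.shape[1] - 1)  # guard rounding

    # Bernoulli rewards with context- and action-dependent means.
    rewards = rng.binomial(1, mean_reward[contexts, actions]).astype(float)

    # Ground-truth value of the target policy under the uniform context law.
    true_value = 0.5 * float((target_table * mean_reward).sum())

    return {
        "actions": actions,
        "rewards": rewards,
        "logging_probs": logging_probs,
        "target_probs": target_probs,
        "true_value": true_value,
    }


if __name__ == "__main__":
    data = simulate()
    estimate = ipw_value(
        data["actions"],
        data["rewards"],
        data["logging_probs"],
        data["target_probs"],
    )
    print("IPW estimate:", estimate, "true value:", data["true_value"])
```

The estimator divides by `logging_probs`, which is why such methods need the logging policy to be known. If those probabilities are unavailable, they would have to be estimated from the log data before the weights can be formed.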

2404.0183
