Kernel Metric Learning for In-Sample Off-Policy Evaluation of Deterministic RL Policies
