Active Offline Policy Selection
Neural Information Processing Systems
Several off-policy evaluation (OPE) techniques have been proposed to assess the value of policies using only logged data.
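As a rough illustration of what estimating a policy's value from logged data can look like, the sketch below implements ordinary importance sampling, one standard OPE estimator. It is not taken from the paper: the function name `importance_sampling_value`, the context-free bandit setting, and the toy log are all assumptions made for the example.

```python
def importance_sampling_value(logged, target_prob):
    """Estimate a target policy's value from logged data.

    logged: list of (action, reward, behavior_prob) tuples, where
        behavior_prob is the probability with which the logging
        (behavior) policy chose that action.
    target_prob: function mapping an action to the probability that
        the target policy would choose it.
    """
    total = 0.0
    for action, reward, behavior_prob in logged:
        # Reweight each logged reward by how much more (or less) likely
        # the target policy is to take the logged action.
        weight = target_prob(action) / behavior_prob
        total += weight * reward
    return total / len(logged)


# Hypothetical toy log: two actions chosen uniformly at random;
# action 0 always yields reward 1.0 and action 1 always yields 0.0.
logged = [(0, 1.0, 0.5), (1, 0.0, 0.5), (0, 1.0, 0.5), (1, 0.0, 0.5)]


def target_prob(action):
    # Hypothetical target policy: picks action 0 with probability 0.9.
    return 0.9 if action == 0 else 0.1


estimate = importance_sampling_value(logged, target_prob)
```

In this toy setting the target policy's true value is 0.9, since it picks the rewarding action 90% of the time, and the estimate recovers that value without ever running the target policy.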
Aug-17-2025, 11:06:14 GMT