Active Offline Policy Selection
Neural Information Processing Systems
Several off-policy evaluation (OPE) techniques have been proposed to assess the value of policies using only logged data.
Nov-15-2025, 17:38:46 GMT