Beyond the Return: Off-policy Function Estimation under User-specified Error-measuring Distributions
