Percentile Criterion Optimization in Offline Reinforcement Learning

Neural Information Processing Systems 

In reinforcement learning, robust policies for high-stakes decision-making problems with limited data are usually computed by optimizing the percentile criterion.
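As an illustration only, the percentile criterion can be read as choosing the policy whose return is guaranteed with probability at least 1 - delta over the uncertain model, rather than the policy with the best average return. The sketch below is not the paper's method: it assumes each candidate policy has already been evaluated under a handful of sampled models, and the names `percentile_value`, `returns_by_policy`, and `delta`, as well as all the numbers, are hypothetical.

```python
import math


def percentile_value(returns, delta):
    """Largest y such that at least a (1 - delta) fraction of returns are >= y."""
    n = len(returns)
    # Number of sampled models that must achieve at least y
    # (the small tolerance guards against floating-point round-up).
    k = max(1, math.ceil((1.0 - delta) * n - 1e-9))
    return sorted(returns, reverse=True)[k - 1]


# Hypothetical returns of two policies, one entry per sampled model.
returns_by_policy = {
    "risky": [0.0, 0.0, 10.0, 10.0, 10.0],
    "safe": [4.0, 5.0, 5.0, 5.0, 6.0],
}
delta = 0.2  # assumed confidence parameter

# Policy preferred by the percentile criterion.
best_by_percentile = max(
    returns_by_policy,
    key=lambda name: percentile_value(returns_by_policy[name], delta),
)

# Policy preferred by the plain average, for comparison.
best_by_mean = max(
    returns_by_policy,
    key=lambda name: sum(returns_by_policy[name]) / len(returns_by_policy[name]),
)
```

With these made-up numbers the average return favors the "risky" policy, while the percentile criterion favors the "safe" one, which is the kind of robustness the criterion is meant to capture.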
