Towards Instance-Optimal Offline Reinforcement Learning with Pessimism

Ming Yin 1,2 and Yu-Xiang Wang 1

1 Department of Computer Science, UC Santa Barbara

Neural Information Processing Systems 

Prior works study this problem under different data-coverage assumptions, and their learning guarantees are expressed in terms of covering coefficients, which lack an explicit characterization of system quantities.
