Offline Minimax Soft-Q-learning Under Realizability and Partial Coverage

Neural Information Processing Systems 

We tackle this by introducing two novel value-based algorithms.
