Importance Weighted Actor-Critic for Optimal Conservative Offline Reinforcement Learning

Neural Information Processing Systems 

We validate the effectiveness of the proposed algorithm through both theoretical analysis and experiments.