Supplementary Material: Supported Policy Optimization for Offline Reinforcement Learning

Neural Information Processing Systems 

We describe the details of each of the two stages in turn.