Trust Region Reward Optimization and Proximal Inverse Reward Optimization Algorithm
