OASIS: Conditional Distribution Shaping for Offline Safe Reinforcement Learning

Neural Information Processing Systems 

Offline safe reinforcement learning (RL) aims to use a pre-collected dataset to train a policy that satisfies constraints.