A Near-Optimal Algorithm for Safe Reinforcement Learning Under Instantaneous Hard Constraints
