Confident Natural Policy Gradient for Local Planning in $q_\pi$-realizable Constrained MDPs
