Neural Information Processing Systems
Several recent approaches obtain a near-optimal policy in CMDPs in either the regret-minimization or the PAC-RL setting [13, 38, 9, 19, 31, 22, 36, 12, 15, 16, 11].
Feb-7-2026, 14:34:09 GMT