VOCE: Variational Optimization with Conservative Estimation for Offline Safe Reinforcement Learning
–Neural Information Processing Systems
This arrangement is particularly important in scenarios with high sampling costs and potential dangers, such as autonomous driving and robotics.
Neural Information Processing Systems
Oct-8-2025, 20:37:18 GMT