On the Convergence of Smooth Regularized Approximate Value Iteration Schemes
Neural Information Processing Systems
A number of techniques are commonly used in the large-scale RL setting, namely entropy regularization, smoothing of Q-values, and neural network function approximation.
Nov-20-2025, 08:51:40 GMT