Continuous Deep Q-Learning in Optimal Control Problems: Normalized Advantage Functions Analysis
Neural Information Processing Systems
One of the most effective continuous deep reinforcement learning algorithms is normalized advantage functions (NAF). The main idea of NAF is to approximate the Q-function by functions that are quadratic in the action variable. This makes the algorithm applicable to continuous reinforcement learning problems, but it also raises the question of which classes of problems admit such an approximation. This paper describes one such class. We consider reinforcement learning problems obtained by discretizing certain optimal control problems.
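To make the quadratic approximation concrete, here is a minimal sketch of a Q-function that is quadratic in the action. It assumes the commonly used NAF parametrization Q(x, u) = V(x) - 1/2 (u - mu(x))^T P(x) (u - mu(x)) with P = L L^T, which the abstract does not spell out. The function name `naf_q_value` and its arguments are hypothetical; in practice V, mu, and the entries of L would be outputs of a neural network evaluated at the state x, whereas here they are passed in as plain numbers.

```python
import numpy as np

def naf_q_value(u, v, mu, l_entries):
    """Evaluate Q = v - 0.5 * (u - mu)^T P (u - mu) for one fixed state.

    Assumed (hypothetical) inputs:
      u         : action vector, shape (n,)
      v         : scalar state value V(x)
      mu        : maximizing action mu(x), shape (n,)
      l_entries : n*(n+1)/2 raw entries of a lower-triangular matrix L
    """
    u = np.asarray(u, dtype=float)
    mu = np.asarray(mu, dtype=float)
    n = mu.shape[0]

    # Fill the lower triangle of L row by row.
    L = np.zeros((n, n))
    L[np.tril_indices(n)] = l_entries

    # Exponentiate the diagonal so that P = L L^T is positive definite.
    diag = np.diag_indices(n)
    L[diag] = np.exp(L[diag])
    P = L @ L.T

    d = u - mu
    return v - 0.5 * d @ P @ d

# With all raw entries zero, L is the identity and P = I.
q_at_mu = naf_q_value([0.0, 0.0], 2.0, [0.0, 0.0], [0.0, 0.0, 0.0])
q_off_mu = naf_q_value([1.0, 0.0], 2.0, [0.0, 0.0], [0.0, 0.0, 0.0])
```

Because P is positive definite, the quadratic term is never positive, so the maximum over actions is attained at u = mu and equals v. This is what lets a Q-learning update take the maximum over a continuous action space in closed form.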
continuous deep q-learning, normalized advantage function analysis, optimal control problem
Jan-17-2025, 18:42:56 GMT