Reviews: Zap Q-Learning

Jan-20-2025, 04:49:48 GMT–Neural Information Processing Systems

The paper proposes a variant of Q-learning, called Zap Q-learning, that is more stable than its precursor. Specifically, the authors show that, in the tabular case, their method minimises the asymptotic covariance of the parameter vector by applying approximate second-order updates based on the stochastic Newton-Raphson method. The behaviour of the algorithm is analysed for the particular case of a tabular representation, and experiments are presented showing the empirical performance of the method in its most general form. This is an interesting paper that addresses a core issue in RL. I have some comments regarding both its content and its presentation.
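To make the summary above concrete, here is a minimal sketch of a single stochastic Newton-Raphson style update for tabular Q-learning, written from the description rather than taken from the paper. Several things are assumptions of this sketch: the cost-minimising formulation, the names `zap_step` and `solve`, the flat indexing of state-action pairs, the initial matrix estimate of minus the identity, and the use of a separate, larger step size `gamma` for the matrix estimate than the step size `alpha` for the parameters.

```python
def solve(M, rhs):
    """Solve M g = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(M[r]) + [rhs[r]] for r in range(n)]
    for col in range(n):
        # Swap in the row with the largest pivot for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= f * aug[col][c]
    g = [0.0] * n
    for r in reversed(range(n)):
        s = sum(aug[r][c] * g[c] for c in range(r + 1, n))
        g[r] = (aug[r][n] - s) / aug[r][r]
    return g


def zap_step(theta, A_hat, x, u, cost, x_next, n_actions, beta, alpha, gamma):
    """One hypothetical Zap-style update; A_hat is modified in place.

    theta holds one Q-value per (state, action) pair, flattened so that
    pair (x, u) lives at index x * n_actions + u (an assumption of this sketch).
    """
    dim = len(theta)
    i = x * n_actions + u
    # Greedy (cost-minimising) action at the next state under the current theta.
    candidates = [x_next * n_actions + a for a in range(n_actions)]
    j = min(candidates, key=lambda k: theta[k])
    # Temporal-difference error for the observed transition.
    d = cost + beta * theta[j] - theta[i]
    # Running average of A = e_i (beta * e_j - e_i)^T, the sampled linearisation.
    for r in range(dim):
        for c in range(dim):
            a_rc = 0.0
            if r == i:
                a_rc = (beta if c == j else 0.0) - (1.0 if c == i else 0.0)
            A_hat[r][c] += gamma * (a_rc - A_hat[r][c])
    # Newton-Raphson style step: theta <- theta - alpha * A_hat^{-1} (e_i * d).
    rhs = [0.0] * dim
    rhs[i] = d
    g = solve(A_hat, rhs)
    return [t - alpha * gk for t, gk in zip(theta, g)]


# Toy usage: 2 states x 2 actions, all Q-values start at zero.
dim = 4
theta0 = [0.0] * dim
A_hat = [[-1.0 if r == c else 0.0 for c in range(dim)] for r in range(dim)]
theta1 = zap_step(theta0, A_hat, x=0, u=1, cost=1.0, x_next=1,
                  n_actions=2, beta=0.5, alpha=1.0, gamma=0.5)
```

The matrix gain is what separates this from an ordinary Q-learning step, which would scale the temporal-difference error by a scalar step size alone. Starting `A_hat` at minus the identity and keeping `gamma` below one is a choice made here so that the linear system stays solvable before every state-action pair has been visited.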


