Reviews: Non-delusional Q-learning and value-iteration

Oct-7-2024, 11:11:55 GMT–Neural Information Processing Systems

The paper defines a new type of reinforcement learning algorithm that accounts for the imperfections of the function approximator and tries to obtain the best policy available given those imperfections, rather than assuming none exist. This avoids the pathologies that arise when a flawed approximator is assumed to be perfect. The quality of this paper is really good. It introduces a new type of RL algorithm that is clearly motivated and solid. The weaker points are: 1. The complexity of the defined algorithm seems too high for it to be immediately applicable to interesting problems.

algorithm, non-delusional q-learning and value-iteration, review, (2 more...)

