Neural Information Processing Systems
In this work, we identify a novel set of conditions that ensure convergence with probability 1 of Q-learning with linear function approximation, by proposing a two time-scale variation thereof.
Feb-10-2026, 19:41:58 GMT