AITopics | double q-learning

double q-learning

If you are looking for an answer to the question "What is Artificial Intelligence?" and you only have a minute, then here is the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

54e8912427a8d007ece906c577fdca60-Supplemental.pdf

Neural Information Processing Systems, Apr-25-2026, 23:22:42 GMT

artificial intelligence, machine learning, q-learning, (15 more...)

Industry: Leisure & Entertainment (0.47)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Appendix A: Network Architectures

Neural Information Processing Systems, Apr-25-2026, 21:58:26 GMT

In this section, we describe the details of the network architectures used in Sec. 4 and 5. We mainly used 4 GPUs (NVIDIA V100; 16 GB) for the experiments in Sec. 4 and 5, and each seed took about 4 hours (for 3M steps). We conducted exhaustive evaluations across a large number of experiments, and we hope our empirical observations and recommendations help practitioners explore the very large configuration space.

Adam Adam
Learning rate (policy) | 1e-4 | 5e-5 | 3e-4 | 3e-4
Learning rate (value) | 1e-4 | 1e-2 | 3e-4 | 3e-4
Weight initialization | Uniform | Xavier Uniform | Xavier Uniform | Xavier Uniform
Initial output scale (policy) | 1.0 | 1e-4 | 1e-2 | 1e-2
Target update | Hard | - | Soft (5e-3) | Soft (5e-3)
Clipped Double Q | False | - | True | True

Table 7: Details of each network architecture. We refer to the original implementations of each algorithm, which are available online [23, 14, 48, 27, 42].
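
The "Target update" and "Clipped Double Q" settings listed for Table 7 can be illustrated with a minimal Python sketch. It assumes that "Soft (5e-3)" denotes a Polyak averaging coefficient tau = 5e-3 and that "Clipped Double Q" means bootstrapping from the minimum of two target critics, as is common in actor-critic implementations; the excerpt confirms neither reading. All function names and the discount gamma = 0.99 are hypothetical.

```python
def hard_update(target_params, online_params):
    """Hard update: replace the target parameters with a copy of the online ones."""
    return list(online_params)


def soft_update(target_params, online_params, tau=5e-3):
    """Soft (Polyak) update: move each target parameter a fraction tau
    toward the corresponding online parameter. tau = 5e-3 is assumed to be
    what "Soft (5e-3)" refers to."""
    return [(1.0 - tau) * t + tau * o
            for t, o in zip(target_params, online_params)]


def clipped_double_q_target(reward, done, q1_next, q2_next, gamma=0.99):
    """TD target that bootstraps from the smaller of two target-critic
    estimates (assumed meaning of "Clipped Double Q"). gamma is illustrative."""
    bootstrap = min(q1_next, q2_next)
    not_done = 0.0 if done else 1.0
    return reward + gamma * not_done * bootstrap


if __name__ == "__main__":
    print(soft_update([0.0, 1.0], [1.0, 1.0]))
    print(clipped_double_q_target(1.0, False, q1_next=2.0, q2_next=3.0))
```

Taking the minimum of the two estimates trades a possible underestimate for protection against the overestimation that a single bootstrapped critic tends to accumulate.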

artificial intelligence, machine learning, training step, (17 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)

Faster Non-asymptotic Convergence for Double Q-learning

Neural Information Processing Systems, Apr-25-2026, 12:42:34 GMT

Double Q-learning (Hasselt, 2010) has gained significant success in practice due to its effectiveness in overcoming the overestimation issue of Q-learning. However, the theoretical understanding of double Q-learning is rather limited. The only existing finite-time analysis was recently established in (Xiong et al., 2020), where the polynomial learning rate adopted in the analysis typically yields a slower convergence rate. This paper tackles the more challenging case of a constant learning rate, and develops new analytical tools that improve the existing convergence rate by orders of magnitude.
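
For readers unfamiliar with the algorithm, the following is a minimal sketch of one tabular double Q-learning step with a constant learning rate, the setting the abstract refers to. The function name, argument names, and the list-of-lists table layout are assumptions made for illustration; the sketch does not reproduce the paper's analysis.

```python
def double_q_update(q_a, q_b, state, action, reward, next_state,
                    alpha, gamma, update_a):
    """Apply one double Q-learning step in place.

    q_a and q_b are tables indexed as q[state][action]. The table being
    updated chooses the greedy next action; the other table evaluates it.
    alpha is a constant learning rate and gamma is the discount factor.
    """
    learner, evaluator = (q_a, q_b) if update_a else (q_b, q_a)
    next_values = learner[next_state]
    # Greedy action according to the table that is being updated.
    best_next = max(range(len(next_values)), key=next_values.__getitem__)
    # Value of that action according to the other table.
    target = reward + gamma * evaluator[next_state][best_next]
    learner[state][action] += alpha * (target - learner[state][action])


if __name__ == "__main__":
    q_a = [[0.0, 0.0], [1.0, 2.0]]
    q_b = [[0.0, 0.0], [5.0, 3.0]]
    # In practice update_a is typically chosen by a fair coin flip per step.
    double_q_update(q_a, q_b, state=0, action=0, reward=1.0, next_state=1,
                    alpha=0.5, gamma=0.9, update_a=True)
    print(q_a[0][0])
```

Decoupling action selection from action evaluation in this way is what removes the systematic upward bias of the plain Q-learning target.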

artificial intelligence, machine learning, reinforcement learning, (16 more...)

Genre: Research Report (0.47)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

3b712de48137572f3849aabd5666a4e3-Paper.pdf

Neural Information Processing Systems, Feb-19-2026, 00:52:00 GMT

complexity, double q-learning, q-learning, (13 more...)

Country:

North America > United States > Ohio (0.04)
Asia > Singapore (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.47)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

54e8912427a8d007ece906c577fdca60-Supplemental.pdf

Neural Information Processing Systems, Feb-8-2026, 17:44:51 GMT

denote, double q-learning, q-learning, (13 more...)

Industry: Leisure & Entertainment (0.47)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

On the Estimation Bias in Double Q-Learning

Neural Information Processing Systems, Feb-8-2026, 17:44:47 GMT

One of the phenomena of interest is that Q-learning (Watkins, 1989) is known to suffer from overestimation issues, since it takes a maximum operator over a set of estimated action-values.
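
The effect of taking a maximum over noisy estimates can be seen in a small simulation. The sketch below is an illustration under simple assumptions, not an experiment from the paper: every action has a true value of 0, each estimate is independent standard Gaussian noise, and the names `max_bias_demo`, `num_actions`, and `trials` are hypothetical.

```python
import random


def max_bias_demo(num_actions=10, trials=5000, seed=0):
    """Compare the single estimator (max over one set of noisy estimates)
    with the double estimator (argmax on one set, value read from a second,
    independent set). All true action values are 0."""
    rng = random.Random(seed)
    single_total = 0.0
    double_total = 0.0
    for _ in range(trials):
        est_a = [rng.gauss(0.0, 1.0) for _ in range(num_actions)]
        est_b = [rng.gauss(0.0, 1.0) for _ in range(num_actions)]
        # Single estimator: the max picks up positive noise.
        single_total += max(est_a)
        # Double estimator: select with est_a, evaluate with est_b.
        best = max(range(num_actions), key=est_a.__getitem__)
        double_total += est_b[best]
    return single_total / trials, double_total / trials


if __name__ == "__main__":
    single, double = max_bias_demo()
    print(f"single estimator: {single:.3f}, double estimator: {double:.3f}")
```

Although the true maximum is 0, the single estimator averages well above zero, while the double estimator stays close to zero because the noise used for selection is independent of the noise used for evaluation.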

artificial intelligence, machine learning, reinforcement learning, (16 more...)

Country: North America > United States > Illinois (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

4bfbd52f4e8466dc12aaf30b7e057b66-Supplemental.pdf

Neural Information Processing Systems, Feb-8-2026, 08:45:59 GMT

double q-learning, eigenvalue, q-learning, (17 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

The Mean-Squared Error of Double Q-Learning

Neural Information Processing Systems, Feb-8-2026, 08:45:52 GMT

Our result builds upon an analysis for linear stochastic approximation based on Lyapunov equations and applies both to the tabular setting and to linear function approximation, provided that the optimal policy is unique and the algorithms converge.
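
For orientation, a textbook form of the Lyapunov equation that arises in linear stochastic approximation is sketched below. The recursion, the step size g/n, and the symbols A, b, \Delta, g, \Sigma_\Delta and \Sigma are illustrative assumptions, not notation taken from the paper.

```latex
\theta_{n+1} = \theta_n + \frac{g}{n}\bigl(A\theta_n + b + \Delta_{n+1}\bigr),
\qquad
\Bigl(gA + \tfrac{1}{2}I\Bigr)\Sigma
  + \Sigma\Bigl(gA + \tfrac{1}{2}I\Bigr)^{\top}
  + g^{2}\,\Sigma_{\Delta} = 0 .
```

Assuming every eigenvalue of gA has real part below -1/2 and \Delta is zero-mean martingale-difference noise with covariance \Sigma_\Delta, the solution \Sigma is the limit of n times the covariance of \theta_n. As a sanity check, in the scalar case A = -1, b = 0, g = 1 the recursion is a running sample mean, and the equation gives \Sigma = \Sigma_\Delta, which matches the familiar variance of an average of n samples multiplied by n.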

artificial intelligence, machine learning, reinforcement learning, (16 more...)

Country: