Supplementary Materials A Numerical Example on Convergence Bounds

Nov-15-2025, 06:19:11 GMT–Neural Information Processing Systems

We use the following numerical experiment to further illustrate our finite-time bounds on the convergence of double Q-learning. In this experiment, the optimal Q-function can be calculated explicitly, so the learning errors can be tracked. We choose γ = 0.8, α ... We prove Lemma 1 by induction. First, it is easy to verify that the initial case is satisfied, i.e., ... In this appendix, we provide a detailed proof of Theorem 1.
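The kind of experiment described above, where the optimal Q-function is known and the learning error is tracked, can be sketched roughly as follows. This is a minimal illustration rather than the authors' setup: only γ = 0.8 comes from the text, while the two-state MDP, the polynomial step size 1/t^ω, the synchronous update scheme, and the uniform reward noise are all assumptions made for the example.

```python
import random

GAMMA = 0.8  # discount factor stated in the text

# Hypothetical 2-state, 2-action MDP (not from the source):
# NEXT[s][a] is the deterministic next state, REWARD[s][a] the mean reward.
NEXT = [[0, 1], [0, 1]]
REWARD = [[0.0, 1.0], [2.0, 0.0]]


def optimal_q(iters=500):
    """Compute Q* by value iteration on the known model."""
    q = [[0.0, 0.0], [0.0, 0.0]]
    for _ in range(iters):
        q = [[REWARD[s][a] + GAMMA * max(q[NEXT[s][a]]) for a in range(2)]
             for s in range(2)]
    return q


def double_q_learning(steps=20000, omega=0.6, noise=1.0, seed=0):
    """Synchronous double Q-learning; returns max-norm error of Q^A per step."""
    rng = random.Random(seed)
    q_star = optimal_q()
    qa = [[0.0, 0.0], [0.0, 0.0]]
    qb = [[0.0, 0.0], [0.0, 0.0]]
    errors = []
    for t in range(1, steps + 1):
        alpha = 1.0 / t ** omega  # assumed polynomial step size
        # Flip a fair coin to decide which estimator is updated this step.
        upd, other = (qa, qb) if rng.random() < 0.5 else (qb, qa)
        new = [row[:] for row in upd]
        for s in range(2):
            for a in range(2):
                s2 = NEXT[s][a]
                r = REWARD[s][a] + rng.uniform(-noise, noise)  # noisy reward
                # Greedy action from the updated table, value from the other.
                a_star = max(range(2), key=upd[s2].__getitem__)
                target = r + GAMMA * other[s2][a_star]
                new[s][a] = upd[s][a] + alpha * (target - upd[s][a])
        for s in range(2):
            upd[s][:] = new[s]
        errors.append(max(abs(qa[s][a] - q_star[s][a])
                          for s in range(2) for a in range(2)))
    return errors
```

Calling `double_q_learning()` returns the sequence of max-norm errors between one estimator and Q*, which can then be plotted against the iteration count and compared with a theoretical bound.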



