Asymptotic regularity of a generalised stochastic Halpern scheme with applications
Pischke, Nicholas, Powell, Thomas
–arXiv.org Artificial Intelligence
We provide abstract, general and highly uniform rates of asympto tic regularity for a generalized stochastic Halpern-style iteration, which incorpo rates a second mapping in the style of a Krasnoselskii-Mann iteration. This iteration is general in two ways: First, it incorporates stochasticity in a completely abstract way rather th an fixing a sampling method; secondly, it includes as special cases stochastic versions of variou s schemes from the optimization literature, including Halpern's iteration as well as a Krasnoselskii-Mann iteration with Tikhonov regularization terms in the sense of Bot, Csetnek and Me ier. For these particular cases, we in particular obtain linear rates of asymptotic regularity, matching (or improving) the currently best known rates for these iterations in stochastic opt imization, and quadratic rates of asymptotic regularity are obtained in the context of inner produ ct spaces for the general iteration. We utilize these rates to give bounds on the oracle complex ity of such iterations under suitable variance assumptions and batching strategies, aga in presented in an abstract style. Finally, we sketch how the schemes presented here can be ins tantiated in the context of reinforcement learning to yield novel methods for Q-learning. Keywords: Asymptotic regularity, Halpern iteration, Tikhonov regularization, Q-learning, proof mining MSC2020 Classification: 47J25, 47H09, 62L20, 03F10 1. Introduction Approximating fixed points of nonexpansive mappings is one of the mo st fundamental tasks in nonlinear analysis and optimization. The problem becomes particular ly interesting when we only have noisy versions those mappings, in which case the resulting a pproximation methods become stochastic processes. Concrete examples of this genera l situation include model-free reinforcement learning algorithms, where variants of Q-learning, for instance, can be viewed as stochastic methods for computing fixpoints of nonexpansive oper ators. To be more concrete, let p X, null null q be a seperable real-valued normed space and T,U: X Ñ X be two nonexpansive mappings on X, i.e. null Tx Ty null ď null x y null and null Ux Uy null ď null x y null for all x,y P X .
arXiv.org Artificial Intelligence
Nov-7-2024
- Country:
- Europe
- Germany > Hesse
- Darmstadt Region > Darmstadt (0.04)
- United Kingdom > England
- Cambridgeshire > Cambridge (0.04)
- Germany > Hesse
- Europe
- Genre:
- Research Report (0.84)
- Technology: