random operator
Probabilistic Contraction Analysis of Iterated Random Operators
Gupta, Abhishek, Jain, Rahul, Glynn, Peter
In many branches of engineering, Banach contraction mapping theorem is employed to establish the convergence of certain deterministic algorithms. Randomized versions of these algorithms have been developed that have proved useful in data-driven problems. In a class of randomized algorithms, in each iteration, the contraction map is approximated with an operator that uses independent and identically distributed samples of certain random variables. This leads to iterated random operators acting on an initial point in a complete metric space, and it generates a Markov chain. In this paper, we develop a new stochastic dominance based proof technique, called probabilistic contraction analysis, for establishing the convergence in probability of Markov chains generated by such iterated random operators in certain limiting regime. The methods developed in this paper provides a general framework for understanding convergence of a wide variety of Monte Carlo methods in which contractive property is present. We apply the convergence result to conclude the convergence of fitted value iteration and fitted relative value iteration in continuous state and continuous action Markov decision problems as representative applications of the general framework developed here.
- North America > United States > California > Los Angeles County > Los Angeles (0.28)
- North America > United States > New York (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- (9 more...)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.70)
Convergence of Recursive Stochastic Algorithms using Wasserstein Divergence
Gupta, Abhishek, Haskell, William B.
This paper develops a unified framework, based on iterated random operator theory, to analyze the convergence of constant stepsize recursive stochastic algorithms (RSAs) in machine learning and reinforcement learning. RSAs use randomization to efficiently compute expectations, and so their iterates form a stochastic process. The key idea is to lift the RSA into an appropriate higher-dimensional space and then express it as an equivalent Markov chain. Instead of determining the convergence of this Markov chain (which may not converge under constant stepsize), we study the convergence of the distribution of this Markov chain. To study this, we define a new notion of Wasserstein divergence. We show that if the distribution of the iterates in the Markov chain satisfy certain contraction property with respect to the Wasserstein divergence, then the Markov chain admits an invariant distribution. Inspired by the SVRG algorithm, we develop a method to convert any RSA to a variance reduced RSA that converges to the optimal solution with in almost sure sense or in probability. We show that convergence of a large family of constant stepsize RSAs can be understood using this framework. We apply this framework to ascertain the convergence of mini-batch SGD, forward-backward splitting with catalyst, SVRG, SAGA, empirical Q value iteration, synchronous Q-learning, enhanced policy iteration, and MDPs with a generative model. We also develop two new algorithms for reinforcement learning and establish their convergence using this framework.
- North America > United States > Ohio > Franklin County > Columbus (0.04)
- North America > United States > Illinois > Champaign County > Champaign (0.04)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)
Some Limit Properties of Markov Chains Induced by Stochastic Recursive Algorithms
Gupta, Abhishek, Tendolkar, Gaurav, Chen, Hao, Pi, Jianzong
Recursive stochastic algorithms have gained significant attention in the recent past due to data driven applications. Examples include stochastic gradient descent for solving large-scale optimization problems and empirical dynamic programming algorithms for solving Markov decision problems. These recursive stochastic algorithms approximates certain contraction operators and can be viewed within the framework of iterated random maps. Accordingly, we consider iterated random maps over a Polish space that simulates a contraction operator over that Polish space. Assume that the iterated maps are indexed by $n$ such that as $n\rightarrow\infty$, each realization of the random map converges (in some sense) to the contraction map it is simulating. We show that starting from the same initial condition, the distribution of the random sequence generated by the iterated random maps converge weakly to the trajectory generated by the contraction operator. We further show that under certain conditions, the time average of the random sequence converge to the spatial mean of the invariant distribution. We then apply these results to logistic regression, empirical value iteration, empirical Q value iteration, and empirical relative value iteration for finite state finite action MDPs.
- Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.67)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.58)