On Markov Chain Gradient Descent
Sun, Tao, Sun, Yuejiao, Yin, Wotao
Stochastic gradient methods are the workhorse (algorithms) of large-scale optimization problems in machine learning, signal processing, and other computational sciences and engineering. This paper studies Markov chain gradient descent, a variant of stochastic gradient descent where the random samples are taken on the trajectory of a Markov chain. Existing results of this method assume convex objectives and a reversible Markov chain and thus have their limitations. We establish new non-ergodic convergence under wider step sizes, for nonconvex problems, and for non-reversible finite-state Markov chains. Nonconvexity makes our method applicable to broader problem classes. Non-reversible finite-state Markov chains, on the other hand, can mix substatially faster. To obtain these results, we introduce a new technique that varies the mixing levels of the Markov chains. The reported numerical results validate our contributions.
Sep-11-2018
- Country:
- Asia
- China > Hunan Province (0.04)
- Middle East > Jordan (0.04)
- Europe > Montenegro (0.04)
- North America > United States
- California > Los Angeles County > Los Angeles (0.14)
- Asia
- Genre:
- Research Report (1.00)