On Markov Chain Gradient Descent
Sun, Tao, Sun, Yuejiao, Yin, Wotao
–Neural Information Processing Systems
Stochastic gradient methods are the workhorse (algorithms) of large-scale optimization problems in machine learning, signal processing, and other computational sciences and engineering. This paper studies Markov chain gradient descent, a variant of stochastic gradient descent where the random samples are taken on the trajectory of a Markov chain. Existing results of this method assume convex objectives and a reversible Markov chain and thus have their limitations. We establish new non-ergodic convergence under wider step sizes, for nonconvex problems, and for non-reversible finite-state Markov chains. Nonconvexity makes our method applicable to broader problem classes. Non-reversible finite-state Markov chains, on the other hand, can mix substatially faster. To obtain these results, we introduce a new technique that varies the mixing levels of the Markov chains. The reported numerical results validate our contributions.
Neural Information Processing Systems
Dec-31-2018
- Country:
- Asia
- China (0.05)
- Middle East > Jordan (0.04)
- Europe > Montenegro (0.04)
- North America
- Canada > Quebec
- Montreal (0.04)
- United States > California
- Los Angeles County > Los Angeles (0.29)
- Canada > Quebec
- Asia
- Genre:
- Research Report (0.67)