linear convergence rate
Sub-sampled Newton Methods with Non-uniform Sampling
Peng Xu, Jiyan Yang, Fred Roosta, Christopher Ré, Michael W. Mahoney
We consider the regime where nd. We propose randomized Newton-type algorithms that exploit non-uniform sub-sampling of { 2fi(w)}ni=1, as well as inexact updates, as means to reduce the computational complexity, and are applicable to a wide range of problems in machine learning. Two non-uniform sampling distributions based on block norm squares and block partial leverage scores are considered. Under certain assumptions, we show that our algorithms inherit a linear-quadratic convergence rate in w and achieve a lower computational complexity compared to similar existing methods. In addition, we show that our algorithms exhibit more robustness and better dependence on problem specific quantities, such as the condition number. We empirically demonstrate that our methods are at least twice as fast as Newton's methods on several real datasets.
Asynchronous Parallel Greedy Coordinate Descent
Yang You, Xiangru Lian, Ji Liu, Hsiang-Fu Yu, Inderjit S. Dhillon, James Demmel, Cho-Jui Hsieh
In this paper, we propose and study an Asynchronous parallel Greedy Coordinate Descent (Asy-GCD) algorithm for minimizing a smooth function with bounded constraints. At each iteration, workers asynchronously conduct greedy coordinate descent updates on a block of variables. In the first part of the paper, we analyze the theoretical behavior of Asy-GCD and prove a linear convergence rate. In the second part, we develop an efficient kernel SVM solver based on Asy-GCD in the shared memory multi-core setting. Since our algorithm is fully asynchronous--each core does not need to idle and wait for the other cores--the resulting algorithm enjoys good speedup and outperforms existing multi-core kernel SVM solvers including asynchronous stochastic coordinate descent and multi-core LIBSVM.