Reviews: Stochastic Spectral and Conjugate Descent Methods
–Neural Information Processing Systems
This paper focuses on quadratic minimization and proposes a novel stochastic gradient descent algorithm by interpolating randomized coordinate descent (RCD) and stochastic spectral descent (SSD). Those two algorithms were studied in prior but they have drawbacks to apply for practical usage. Hence, authors combine them by random coin-tossing (i.e., with probability \alpha run RCD and with probability 1-\alpha run SSD) and provide the optimal parameters to achieve the best convergence rate. In addition, they study the rate of their algorithm for some special cases, e.g., a matrix in objective has clustered or exponentially decaying eigenvalues. In experiments, they report practical behavior of their algorithm under synthetic settings. The proposed algorithm takes advantages of two prior works, i.e., efficient gradient update from RCD and fast convergence from SSD.
Neural Information Processing Systems
Oct-8-2024, 08:12:24 GMT
- Technology: