Breaking the Span Assumption Yields Fast Finite-Sum Minimization
Robert Hannah, Yanli Liu, Daniel O'Connor, Wotao Yin
Neural Information Processing Systems
In this paper, we show that SVRG and SARAH can be modified to be fundamentally faster than all of the other standard algorithms that minimize the sum of n smooth functions, such as SAGA, SAG, SDCA, and SDCA without duality. Most finite sum algorithms follow what we call the "span assumption": Their updates are in the span of a sequence of component gradients chosen in a random IID fashion. In the big data regime, where the condition number κ = O(n), the span assumption prevents algorithms from converging to an approximate solution of accuracy ɛ in less than n ln(1/ɛ) iterations. SVRG and SARAH do not follow the span assumption since they are updated with a hybrid of full-gradient and component-gradient information.
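To make the "hybrid of full-gradient and component-gradient information" concrete, here is a minimal sketch of vanilla SVRG on an assumed least-squares finite sum. It is illustrative only, not the paper's modified variant; the objective, step size, and epoch length are assumptions chosen for readability.

```python
# Minimal SVRG sketch on a least-squares finite sum (illustrative only;
# not the modified SVRG variant analyzed in the paper).
import numpy as np

def svrg(A, b, step=0.1, epochs=20, inner=None, seed=0):
    """Minimize (1/n) * sum_i 0.5 * (a_i @ x - b_i)**2 with vanilla SVRG."""
    rng = np.random.default_rng(seed)
    n, d = A.shape
    m = inner or n                               # inner-loop length per epoch
    x = np.zeros(d)
    for _ in range(epochs):
        x_ref = x.copy()
        full_grad = A.T @ (A @ x_ref - b) / n    # full gradient at the snapshot
        for _ in range(m):
            i = rng.integers(n)
            gi_x   = A[i] * (A[i] @ x - b[i])      # component gradient at x
            gi_ref = A[i] * (A[i] @ x_ref - b[i])  # component gradient at snapshot
            # Hybrid direction: component gradients plus a full-gradient
            # correction, so the iterates are not confined to the span of
            # the IID-sampled component gradients alone.
            x = x - step * (gi_x - gi_ref + full_grad)
    return x
```

The periodic full-gradient computation at the snapshot point is what places SVRG (and, similarly, SARAH) outside the span assumption described above.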