Decomposable Non-Smooth Convex Optimization with Nearly-Linear Gradient Oracle Complexity

Jan-18-2025, 20:22:20 GMT–Neural Information Processing Systems

One common approach to this problem, exemplified by stochastic gradient descent, involves sampling one f_i term at every iteration to make progress. This approach crucially relies on a notion of uniformity across the f_i's, formally captured by their condition number. In this work, we give an algorithm that minimizes the above convex formulation to \epsilon -accuracy in \widetilde{O}(\sum_{i 1} n d_i \log (1 /\epsilon)) gradient computations, with no assumptions on the condition number. The previous best algorithm independent of the condition number is the standard cutting plane method, which requires O(nd \log (1/\epsilon)) gradient computations. As a corollary, we improve upon the evaluation oracle complexity for decomposable submodular minimization by [Axiotis, Karczmarz, Mukherjee, Sankowski and Vladu, ICML 2021].

condition number, decomposable non-smooth convex optimization, nearly-linear gradient oracle complexity, (2 more...)

Neural Information Processing Systems

Jan-18-2025, 20:22:20 GMT

Conferences Web Page

Add feedback

Technology:
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.62)