Decomposable Non-Smooth Convex Optimization with Nearly-Linear Gradient Oracle Complexity

Dong, Sally, Jiang, Haotian, Lee, Yin Tat, Padmanabhan, Swati, Ye, Guanghao

Aug-7-2022–arXiv.org Artificial Intelligence

Many fundamental problems in machine learning can be formulated by the convex program \[ \min_{\theta\in R^d}\ \sum_{i=1}^{n}f_{i}(\theta), \] where each $f_i$ is a convex, Lipschitz function supported on a subset of $d_i$ coordinates of $\theta$. One common approach to this problem, exemplified by stochastic gradient descent, involves sampling one $f_i$ term at every iteration to make progress. This approach crucially relies on a notion of uniformity across the $f_i$'s, formally captured by their condition number. In this work, we give an algorithm that minimizes the above convex formulation to $\epsilon$-accuracy in $\widetilde{O}(\sum_{i=1}^n d_i \log (1 /\epsilon))$ gradient computations, with no assumptions on the condition number. The previous best algorithm independent of the condition number is the standard cutting plane method, which requires $O(nd \log (1/\epsilon))$ gradient computations. As a corollary, we improve upon the evaluation oracle complexity for decomposable submodular minimization by Axiotis et al. (ICML 2021). Our main technical contribution is an adaptive procedure to select an $f_i$ term at every iteration via a novel combination of cutting-plane and interior-point methods.

algorithm, convex, equation, (16 more...)

arXiv.org Artificial Intelligence

Aug-7-2022

arXiv.org PDF

Add feedback

Country:
- Asia > Russia (0.04)
- North America > United States
  - Washington > King County
    - Seattle (0.04)
  - Massachusetts > Middlesex County
    - Cambridge (0.04)
- Europe
  - United Kingdom > England
    - Cambridgeshire > Cambridge (0.04)
  - Russia > Central Federal District
    - Moscow Oblast > Moscow (0.04)

Genre:
- Research Report (0.64)

Technology:
- Information Technology > Artificial Intelligence
  - Representation & Reasoning (1.00)
  - Machine Learning > Statistical Learning
    - Gradient Descent (0.55)

Duplicate Docs Excel Report

Title
None found

Similar Docs Excel Report more

Title	Similarity	Source
None found