Mirror Descent Under Generalized Smoothness

Yu, Dingzhi, Jiang, Wei, Wan, Yuanyu, Zhang, Lijun

Feb-2-2025–arXiv.org Artificial Intelligence

Smoothness is crucial for attaining fast rates in first-order optimization. However, many optimization problems in modern machine learning involve non-smooth objectives. Recent studies relax the smoothness assumption by allowing the Lipschitz constant of the gradient to grow with the gradient norm, which accommodates a broad range of objectives in practice. Despite this progress, existing generalizations of smoothness are restricted to Euclidean geometry with the $\ell_2$-norm and only provide theoretical guarantees for optimization in Euclidean space. In this paper, we address this limitation by introducing a new $\ell*$-smoothness concept that measures the norm of the Hessian in terms of a general norm and its dual, and we establish convergence for mirror-descent-type algorithms, matching the rates under classical smoothness. Notably, we propose a generalized self-bounding property that facilitates bounding the gradients by controlling suboptimality gaps, serving as a principal component of the convergence analysis. Beyond deterministic optimization, we establish anytime convergence for stochastic mirror descent based on a new bounded noise condition that encompasses the widely adopted bounded or affine noise assumptions.
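For readers unfamiliar with the algorithm family named in the abstract, the following is a minimal sketch of textbook mirror descent on the probability simplex with the negative-entropy mirror map (the exponentiated-gradient update). It is not the method or step-size rule analyzed in the paper; the function name `entropic_mirror_descent`, the constant step size, and the linear test objective are all illustrative assumptions.

```python
import numpy as np


def entropic_mirror_descent(grad, x0, step_size, num_steps):
    """Textbook mirror descent on the simplex (illustrative sketch only).

    With the negative-entropy mirror map, the mirror step reduces to a
    multiplicative update followed by renormalization.
    """
    x = np.asarray(x0, dtype=float)
    for _ in range(num_steps):
        g = grad(x)
        # Gradient step in the dual (log) space.
        logits = np.log(x) - step_size * g
        # Shifting by the max does not change the normalized result
        # and keeps exp() from overflowing.
        logits -= logits.max()
        x = np.exp(logits)
        # Map back onto the simplex.
        x /= x.sum()
    return x


# Hypothetical example: minimize the linear objective <c, x> over the simplex.
c = np.array([3.0, 1.0, 2.0])
x_final = entropic_mirror_descent(
    grad=lambda x: c,          # gradient of <c, x> is the constant vector c
    x0=np.ones(3) / 3.0,       # start at the uniform distribution
    step_size=0.5,             # assumed constant step size
    num_steps=200,
)
```

In this toy run the iterate concentrates on the coordinate with the smallest cost, which is the minimizer of a linear objective over the simplex. The entropy mirror map is used here only because it is the standard non-Euclidean example; the abstract's setting covers general norms and their duals.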
