AITopics | provable acceleration


Provable Acceleration of Nesterov's Accelerated Gradient for Asymmetric Matrix Factorization and Linear Neural Networks

Neural Information Processing Systems, May-26-2025, 21:49:20 GMT

We study the convergence rate of first-order methods for rectangular matrix factorization, a canonical nonconvex optimization problem. We prove that Nesterov's accelerated gradient (NAG) attains an iteration complexity of $O(\kappa\log\frac{1}{\epsilon})$, which is the best-known bound for first-order methods on rectangular matrix factorization. Unlike the small balanced random initialization used in the existing literature, we adopt an unbalanced initialization, where $\mathbf{X}_0$ is large and $\mathbf{Y}_0$ is $0$. Moreover, our initialization and analysis extend to linear neural networks, where we prove that NAG also attains an accelerated linear convergence rate. In particular, we only require the width of the network to be greater than or equal to the rank of the output label matrix. In contrast, previous results achieving the same rate require excessive widths that additionally depend on the condition number and the rank of the input data matrix.
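The unbalanced initialization described in the abstract can be illustrated with a small NumPy sketch. Everything concrete here is an assumption made for illustration and is not taken from the abstract: the objective $\frac{1}{2}\|\mathbf{X}\mathbf{Y}^\top - \mathbf{A}\|_F^2$, the choice $\mathbf{X}_0 = c\,\mathbf{A}\boldsymbol{\Phi}$ with a Gaussian $\boldsymbol{\Phi}$ (one simple way to make $\mathbf{X}_0$ large while keeping its column space aligned with that of $\mathbf{A}$), the scale `c`, and the step size and momentum, which are tuned from the singular values of $\mathbf{X}_0$ as one would for a fixed quadratic. The function name `nag_matrix_factorization` is hypothetical.

```python
import numpy as np

def nag_matrix_factorization(A, d, r, c=10.0, iters=1000, seed=0):
    """NAG on 0.5 * ||X Y^T - A||_F^2 with X0 large and Y0 = 0 (illustrative sketch)."""
    rng = np.random.default_rng(seed)
    m, n = A.shape
    Phi = rng.standard_normal((n, d))
    X = c * (A @ Phi)          # assumed form of the "large" factor X0
    Y = np.zeros((n, d))       # Y0 = 0, as stated in the abstract

    # Assumption: tune step size and momentum as for the quadratic in Y
    # obtained by freezing X at X0 (curvature given by X0's singular values).
    s = np.linalg.svd(X, compute_uv=False)
    L, mu = s[0] ** 2, s[r - 1] ** 2
    eta = 1.0 / L
    sqrt_kappa = np.sqrt(L / mu)
    beta = (sqrt_kappa - 1.0) / (sqrt_kappa + 1.0)

    Xh, Yh = X.copy(), Y.copy()    # extrapolated (look-ahead) iterates
    losses = []
    for _ in range(iters):
        R = Xh @ Yh.T - A                  # residual at the look-ahead point
        X_new = Xh - eta * (R @ Yh)        # gradient step in X
        Y_new = Yh - eta * (R.T @ Xh)      # gradient step in Y
        Xh = X_new + beta * (X_new - X)    # Nesterov extrapolation
        Yh = Y_new + beta * (Y_new - Y)
        X, Y = X_new, Y_new
        losses.append(0.5 * np.linalg.norm(X @ Y.T - A) ** 2)
    return X, Y, losses

# Synthetic rank-r target with singular values 2, 1.5, 1 (illustrative data).
rng = np.random.default_rng(1)
m, n, r, d = 30, 20, 3, 8      # factor width d >= rank r
U, _ = np.linalg.qr(rng.standard_normal((m, r)))
V, _ = np.linalg.qr(rng.standard_normal((n, r)))
A = U @ np.diag([2.0, 1.5, 1.0]) @ V.T

X, Y, losses = nag_matrix_factorization(A, d=d, r=r)
print("first loss:", losses[0], "final loss:", losses[-1])
```

Because $\mathbf{Y}_0 = 0$, the gradient with respect to $\mathbf{X}$ vanishes at the start, so early iterations mostly move $\mathbf{Y}$ while $\mathbf{X}$ stays close to its large initial value. That is the intuition this sketch is meant to convey; it makes no claim to reproduce the constants or guarantees of the paper.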

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.64)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.41)

Provable Acceleration of Heavy Ball beyond Quadratics for a Class of Polyak-Łojasiewicz Functions when the Non-Convexity is Averaged-Out

Jun-Kun Wang, Chi-Heng Lin, Andre Wibisono, Bin Hu

arXiv.org Artificial Intelligence, Aug-29-2023

Heavy Ball (HB) is nowadays one of the most popular momentum methods in non-convex optimization. It has been widely observed that incorporating the Heavy Ball dynamic in gradient-based methods accelerates the training of modern machine learning models. However, progress on establishing a theoretical foundation for this acceleration lags far behind its empirical success. Existing provable acceleration results cover quadratic or close-to-quadratic functions, because current techniques for showing HB's acceleration are limited to the case where the Hessian is fixed. In this work, we develop new techniques that help show acceleration beyond quadratics, by analyzing how the change of the Hessian between two consecutive time points affects the convergence speed. Based on our technical results, we identify a class of Polyak-Łojasiewicz (PL) optimization problems for which provable acceleration can be achieved via HB. Moreover, our analysis demonstrates a benefit of adaptively setting the momentum parameter. (Update: 08/29/2023) An erratum is added in Appendix J. This updated version fixes an issue in the previous version: an additional condition must be satisfied for the acceleration result of HB beyond quadratics in this work, which naturally holds when the dimension is one or, more broadly, when the Hessian is diagonal. We elaborate on the issue in Appendix J.
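For readers unfamiliar with the method, the Heavy Ball update is $x_{t+1} = x_t - \eta \nabla f(x_t) + \beta (x_t - x_{t-1})$. The sketch below runs it on an ill-conditioned quadratic with a diagonal Hessian, which is the classical fixed-Hessian setting that the abstract says existing results cover; it does not implement the PL class or the adaptive momentum studied in the paper. The test problem, the helper name `heavy_ball`, and the use of Polyak's classical step size and momentum for quadratics are assumptions chosen for illustration.

```python
import numpy as np

def heavy_ball(grad, x0, eta, beta, iters):
    """Heavy Ball iteration; beta = 0 reduces it to plain gradient descent."""
    x_prev = np.array(x0, dtype=float)
    x = x_prev.copy()
    for _ in range(iters):
        x_next = x - eta * grad(x) + beta * (x - x_prev)
        x_prev, x = x, x_next
    return x

# Illustrative quadratic f(x) = 0.5 * sum(lam * x**2) with condition number 100.
lam = np.linspace(1.0, 100.0, 10)

def f(x):
    return 0.5 * np.sum(lam * x ** 2)

def grad(x):
    return lam * x

mu, L = lam[0], lam[-1]
x0 = np.ones_like(lam)

# Polyak's classical tuning for quadratics with curvature in [mu, L].
eta_hb = 4.0 / (np.sqrt(L) + np.sqrt(mu)) ** 2
beta_hb = ((np.sqrt(L) - np.sqrt(mu)) / (np.sqrt(L) + np.sqrt(mu))) ** 2

x_hb = heavy_ball(grad, x0, eta_hb, beta_hb, iters=200)
x_gd = heavy_ball(grad, x0, 2.0 / (L + mu), 0.0, iters=200)  # gradient descent baseline

print("f after 200 HB steps:", f(x_hb))
print("f after 200 GD steps:", f(x_gd))
```

On this problem the per-step contraction factor is roughly $(\sqrt{\kappa}-1)/(\sqrt{\kappa}+1)$ for Heavy Ball versus $(\kappa-1)/(\kappa+1)$ for gradient descent, which is the kind of gap that "acceleration" refers to.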

arXiv:2206.11872

Country:

North America > United States > Illinois (0.04)
Asia > Middle East > Jordan (0.04)
North America > United States > Maryland > Baltimore (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.48)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.48)