AITopics | matrix optimization

Faster proximal algorithms for matrix optimization using Jacobi-based eigenvalue methods

Neural Information Processing SystemsDec-24-2025, 04:38:56 GMT

We consider proximal splitting algorithms for convex optimization problems over matrices. A significant computational bottleneck in many of these algorithms is the need to compute a full eigenvalue or singular value decomposition at each iteration for the evaluation of a proximal operator.In this paper we propose to use an old and surprisingly simple method due to Jacobi to compute these eigenvalue and singular value decompositions, and we demonstrate that it can lead to substantial gains in terms of computation time compared to standard approaches. We rely on three essential properties of this method: (a) its ability to exploit an approximate decomposition as an initial point, which in the case of iterative optimization algorithms can be obtained from the previous iterate; (b) its parallel nature which makes it a great fit for hardware accelerators such as GPUs, now common in machine learning, and (c) its simple termination criterion which allows us to trade-off accuracy with computation time. We demonstrate the efficacy of this approach on a variety of algorithms and problems, and show that, on a GPU, we can obtain 5 to 10x speed-ups in the evaluation of proximal operators compared to standard CPU or GPU linear algebra routines. Our findings are supported by new theoretical results providing guarantees on the approximation quality of proximal operators obtained using approximate eigenvalue or singular value decompositions.

algorithm, faster proximal algorithm, matrix optimization, (9 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.59)

Add feedback

The Ky Fan Norms and Beyond: Dual Norms and Combinations for Matrix Optimization

Kravatskiy, Alexey, Kozyrev, Ivan, Kozlov, Nikolai, Vinogradov, Alexander, Merkulov, Daniil, Oseledets, Ivan

arXiv.org Artificial IntelligenceDec-11-2025

In this article, we explore the use of various matrix norms for optimizing functions of weight matrices, a crucial problem in training large language models. Moving beyond the spectral norm underlying the Muon update, we leverage duals of the Ky Fan $k$-norms to introduce a family of Muon-like algorithms we name Fanions, which are closely related to Dion. By working with duals of convex combinations of the Ky Fan $k$-norms with either the Frobenius norm or the $l_\infty$ norm, we construct the families of F-Fanions and S-Fanions, respectively. Their most prominent members are F-Muon and S-Muon. We complement our theoretical analysis with an extensive empirical study of these algorithms across a wide range of tasks and settings, demonstrating that F-Muon and S-Muon consistently match Muon's performance, while outperforming vanilla Muon on a synthetic linear least squares problem.

artificial intelligence, machine learning, natural language, (16 more...)

arXiv.org Artificial Intelligence

2512.09678

Country: Asia > Middle East (0.15)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Add feedback

Faster proximal algorithms for matrix optimization using Jacobi-based eigenvalue methods

Neural Information Processing SystemsOct-10-2024, 16:15:43 GMT

We consider proximal splitting algorithms for convex optimization problems over matrices. A significant computational bottleneck in many of these algorithms is the need to compute a full eigenvalue or singular value decomposition at each iteration for the evaluation of a proximal operator.In this paper we propose to use an old and surprisingly simple method due to Jacobi to compute these eigenvalue and singular value decompositions, and we demonstrate that it can lead to substantial gains in terms of computation time compared to standard approaches. We rely on three essential properties of this method: (a) its ability to exploit an approximate decomposition as an initial point, which in the case of iterative optimization algorithms can be obtained from the previous iterate; (b) its parallel nature which makes it a great fit for hardware accelerators such as GPUs, now common in machine learning, and (c) its simple termination criterion which allows us to trade-off accuracy with computation time. We demonstrate the efficacy of this approach on a variety of algorithms and problems, and show that, on a GPU, we can obtain 5 to 10x speed-ups in the evaluation of proximal operators compared to standard CPU or GPU linear algebra routines. Our findings are supported by new theoretical results providing guarantees on the approximation quality of proximal operators obtained using approximate eigenvalue or singular value decompositions.

algorithm, jacobi-based eigenvalue method, singular value decomposition, (6 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.62)

Add feedback

Matrix optimization based Euclidean embedding with outliers

Zhang, Qian, Zhao, Xinyuan, Ding, Chao

arXiv.org Machine LearningDec-23-2020

Euclidean embedding from noisy observations containing outlier errors is an important and challenging problem in statistics and machine learning. Many existing methods would struggle with outliers due to a lack of detection ability. In this paper, we propose a matrix optimization based embedding model that can produce reliable embeddings and identify the outliers jointly. We show that the estimators obtained by the proposed method satisfy a non-asymptotic risk bound, implying that the model provides a high accuracy estimator with high probability when the order of the sample size is roughly the degree of freedom up to a logarithmic factor. Moreover, we show that under some mild conditions, the proposed model also can identify the outliers without any prior information with high probability. Finally, numerical experiments demonstrate that the matrix optimization-based model can produce configurations of high quality and successfully identify outliers even for large networks.

matrix, optimization, probability, (16 more...)

arXiv.org Machine Learning

2012.12772

Country:

Asia > China > Beijing > Beijing (0.04)
Asia > Singapore (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > New York (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Data Science (0.95)
Information Technology > Communications > Networks (0.88)

Add feedback

Filters

Collaborating Authors

matrix optimization

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

Faster proximal algorithms for matrix optimization using Jacobi-based eigenvalue methods

The Ky Fan Norms and Beyond: Dual Norms and Combinations for Matrix Optimization

Faster proximal algorithms for matrix optimization using Jacobi-based eigenvalue methods

Matrix optimization based Euclidean embedding with outliers