The Ky Fan Norms and Beyond: Dual Norms and Combinations for Matrix Optimization
Kravatskiy, Alexey, Kozyrev, Ivan, Kozlov, Nikolai, Vinogradov, Alexander, Merkulov, Daniil, Oseledets, Ivan
arXiv.org Artificial Intelligence
In this article, we explore the use of various matrix norms for optimizing functions of weight matrices, a crucial problem in training large language models. Moving beyond the spectral norm underlying the Muon update, we leverage duals of the Ky Fan $k$-norms to introduce a family of Muon-like algorithms we name Fanions, which are closely related to Dion. By working with duals of convex combinations of the Ky Fan $k$-norms with either the Frobenius norm or the $l_\infty$ norm, we construct the families of F-Fanions and S-Fanions, respectively. Their most prominent members are F-Muon and S-Muon. We complement our theoretical analysis with an extensive empirical study of these algorithms across a wide range of tasks and settings, demonstrating that F-Muon and S-Muon consistently match Muon's performance, while outperforming vanilla Muon on a synthetic linear least squares problem.
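To make the norms named in the abstract concrete, here is a minimal sketch based on the standard textbook definitions, not on the authors' code. The Ky Fan $k$-norm of a matrix is the sum of its $k$ largest singular values, and its dual norm is $\max(\sigma_1, \tfrac{1}{k}\sum_i \sigma_i)$. The function names `ky_fan_norm` and `ky_fan_dual_norm` are hypothetical, and the functions assume the singular values have already been computed (for example with `numpy.linalg.svd(W, compute_uv=False)`).

```python
def ky_fan_norm(singular_values, k):
    """Ky Fan k-norm: the sum of the k largest singular values."""
    s = sorted(singular_values, reverse=True)
    if not 1 <= k <= len(s):
        raise ValueError("k must be between 1 and the number of singular values")
    return sum(s[:k])


def ky_fan_dual_norm(singular_values, k):
    """Dual of the Ky Fan k-norm: max(spectral norm, nuclear norm / k)."""
    s = sorted(singular_values, reverse=True)
    if not 1 <= k <= len(s):
        raise ValueError("k must be between 1 and the number of singular values")
    return max(s[0], sum(s) / k)


# Illustrative singular values of some 3x3 matrix (made-up numbers).
sigma = [3.0, 2.0, 1.0]

for k in (1, 2, 3):
    print(k, ky_fan_norm(sigma, k), ky_fan_dual_norm(sigma, k))
```

At the two extremes the definitions recover familiar pairs: for $k = 1$ the Ky Fan norm is the spectral norm and its dual is the nuclear norm, while for $k$ equal to the number of singular values the roles swap.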
Dec-11-2025