Grassmannian



A Linearly Convergent Method for Non-Smooth Non-Convex Optimization on the Grassmannian with Applications to Robust Subspace and Dictionary Learning

Neural Information Processing Systems

Minimizing a non-smooth function over the Grassmannian appears in many applications in machine learning. In this paper we show that if the objective satisfies a certain Riemannian regularity condition with respect to some point in the Grassmannian, then a Riemannian subgradient method with appropriate initialization and geometrically diminishing step size converges at a linear rate to that point. We show that for both the robust subspace learning method Dual Principal Component Pursuit (DPCP) and the Orthogonal Dictionary Learning (ODL) problem, the Riemannian regularity condition is satisfied with respect to appropriate points of interest, namely the subspace orthogonal to the sought subspace for DPCP and the orthonormal dictionary atoms for ODL. Consequently, we obtain in a unified framework significant improvements for the convergence theory of both methods.
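
As a rough illustration of the update rule the abstract refers to, the following minimal NumPy sketch runs a Riemannian subgradient method for a DPCP-style objective f(b) = ||X^T b||_1 over unit-norm vectors (i.e., Gr(n, 1)) with a geometrically diminishing step size. The objective, step-size constants, and normalization-based retraction are illustrative assumptions, not the paper's exact algorithm or tuning.

    import numpy as np

    def dpcp_riemannian_subgradient(X, b0, mu0=1e-2, beta=0.9, n_iters=200):
        """Sketch of min_{||b|| = 1} ||X.T @ b||_1 via Riemannian subgradient steps.

        X  : (n, m) data matrix whose columns are (possibly corrupted) samples.
        b0 : (n,) initialization; the abstract stresses that initialization matters.
        The step size mu_t = mu0 * beta**t diminishes geometrically, as described above.
        """
        b = b0 / np.linalg.norm(b0)
        for t in range(n_iters):
            g = X @ np.sign(X.T @ b)          # Euclidean subgradient of ||X^T b||_1
            g_riem = g - b * (b @ g)          # project onto the tangent space at b
            b = b - (mu0 * beta ** t) * g_riem
            b = b / np.linalg.norm(b)         # retract back onto the unit sphere
        return b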


Riemannian Batch Normalization: A Gyro Approach

Chen, Ziheng, Wu, Xiao-Jun, Schölkopf, Bernhard, Sebe, Nicu

arXiv.org Artificial Intelligence

Normalization layers are crucial for deep learning, but their Euclidean formulations are inadequate for data on manifolds. On the other hand, many Riemannian manifolds in machine learning admit gyro-structures, enabling principled extensions of Euclidean neural networks to non-Euclidean domains. Inspired by this, we introduce GyroBN, a principled Riemannian batch normalization framework for gyrogroups. We establish two necessary conditions, namely pseudo-reduction and gyroisometric gyrations, that endow GyroBN with theoretical control over sample statistics, and show that these conditions hold for all known gyrogroups in machine learning. Our framework also incorporates several existing Riemannian normalization methods as special cases. We further instantiate GyroBN on seven representative geometries, including the Grassmannian, five constant curvature spaces, and the correlation manifold, and derive novel gyro and Riemannian structures to enable these instantiations. Experiments across these geometries demonstrate the effectiveness of GyroBN. The code is available at https://github.com/GitZH-Chen/GyroBN.git.
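
The abstract describes GyroBN at the level of gyrogroup axioms. Purely as an illustration of what gyro-style normalization can look like on one of the constant curvature geometries mentioned, here is a small NumPy sketch that centers a batch of Poincaré-ball points by gyro-translation with an approximate Fréchet mean; it is not the paper's GyroBN construction, and the tangent-space mean and the omitted rescaling step are simplifying assumptions.

    import numpy as np

    def mobius_add(x, y):
        """Moebius addition (the gyro-addition of the Poincare ball, curvature -1)."""
        xy, x2, y2 = x @ y, x @ x, y @ y
        return ((1 + 2 * xy + y2) * x + (1 - x2) * y) / (1 + 2 * xy + x2 * y2)

    def log0(x):  # logarithm map at the origin of the ball
        n = np.linalg.norm(x)
        return x if n < 1e-12 else np.arctanh(n) * x / n

    def exp0(v):  # exponential map at the origin of the ball
        n = np.linalg.norm(v)
        return v if n < 1e-12 else np.tanh(n) * v / n

    def gyro_center(points):
        """Center a batch of Poincare-ball points by gyro-translating with -mean."""
        mean = exp0(np.mean([log0(p) for p in points], axis=0))  # crude Frechet-mean surrogate
        return np.array([mobius_add(-mean, p) for p in points])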



To Reviewer 1 (R1)

Neural Information Processing Systems

We thank all three reviewers for their constructive comments. We address them below one by one. Q1: What makes it nontrivial to extend the regularity condition and proof technique in [11] to Riemannian optimization? The Grassmannian manifold is non-convex, which makes the analysis more complex. We will incorporate these points into a revised version of the manuscript.


Exemplar-Free Continual Learning for State Space Models

Lee, Isaac Ning, Mahmoodi, Leila, Le, Trung, Harandi, Mehrtash

arXiv.org Artificial Intelligence

State-Space Models (SSMs) excel at capturing long-range dependencies with structured recurrence, making them well-suited for sequence modeling. However, their evolving internal states pose challenges in adapting them under Continual Learning (CL). This is particularly difficult in exemplar-free settings, where the absence of prior data leaves updates to the dynamic SSM states unconstrained, resulting in catastrophic forgetting. To address this, we propose Inf-SSM, a novel and simple geometry-aware regularization method that utilizes the geometry of the infinite-dimensional Grassmannian to constrain state evolution during CL. Unlike classical continual learning methods that constrain weight updates, Inf-SSM regularizes the infinite-horizon evolution of SSMs encoded in their extended observability subspace. We show that enforcing this regularization requires solving a matrix equation known as the Sylvester equation, which typically incurs $\mathcal{O}(n^3)$ complexity. We develop a $\mathcal{O}(n^2)$ solution by exploiting the structure and properties of SSMs. This leads to an efficient regularization mechanism that can be seamlessly integrated into existing CL methods. Comprehensive experiments on challenging benchmarks, including ImageNet-R and Caltech-256, demonstrate a significant reduction in forgetting while improving accuracy across sequential tasks.
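
For concreteness, the Sylvester equation mentioned above has the form A X + X B = C; the snippet below shows the generic dense solve (SciPy's Bartels-Stewart-based routine), which is the $\mathcal{O}(n^3)$ baseline that Inf-SSM's structured $\mathcal{O}(n^2)$ solver improves on. The structured solver itself is not reproduced here, and the random matrices are placeholders.

    import numpy as np
    from scipy.linalg import solve_sylvester

    # Generic dense Sylvester solve A X + X B = C (Bartels-Stewart, O(n^3)).
    # Inf-SSM's O(n^2) solver instead exploits the structure of the SSM state
    # matrices; that specialization is not shown here.
    rng = np.random.default_rng(0)
    n = 64
    A, B, C = (rng.standard_normal((n, n)) for _ in range(3))

    X = solve_sylvester(A, B, C)
    print(np.allclose(A @ X + X @ B, C))  # residual check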


Differential Evolution for Grassmann Manifold Optimization: A Projection Approach

Lesniewski, Andrew

arXiv.org Artificial Intelligence

We propose a novel evolutionary algorithm for optimizing real-valued objective functions defined on the Grassmann manifold Gr(k,n), the space of all k-dimensional linear subspaces of R^n. While existing optimization techniques on Gr(k,n) predominantly rely on first- or second-order Riemannian methods, these inherently local methods often struggle with nonconvex or multimodal landscapes. To address this limitation, we adapt the Differential Evolution algorithm, a global, population-based optimization method, to operate effectively on the Grassmannian. Our approach incorporates adaptive control parameter schemes and introduces a projection mechanism that maps trial vectors onto the manifold via QR decomposition. The resulting algorithm maintains feasibility with respect to the manifold structure while enabling exploration beyond local neighborhoods. This framework provides a flexible and geometry-aware alternative to classical Riemannian optimization methods and is well-suited to applications in machine learning, signal processing, and low-rank matrix recovery, where subspace representations play a central role. We test the methodology on a number of example optimization problems on Grassmann manifolds.
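
The projection mechanism described above can be sketched in a few lines: DE mutation and crossover act on ambient n x k matrices, and each trial point is pulled back onto the Grassmannian by orthonormalizing its columns with a thin QR decomposition. The fixed F and CR below and the DE/rand/1/bin scheme are illustrative stand-ins for the paper's adaptive control parameter schemes.

    import numpy as np

    def project_to_grassmann(Y):
        """Map an arbitrary n x k matrix to an orthonormal basis via thin QR."""
        Q, _ = np.linalg.qr(Y)
        return Q

    def de_generation(population, objective, F=0.6, CR=0.9, rng=None):
        """One DE/rand/1/bin generation with QR projection back onto Gr(k, n).

        population : list of n x k matrices with orthonormal columns.
        objective  : callable mapping such a matrix to a scalar to be minimized.
        """
        rng = rng or np.random.default_rng()
        new_pop = []
        for i, x in enumerate(population):
            others = [j for j in range(len(population)) if j != i]
            a, b, c = (population[j] for j in rng.choice(others, 3, replace=False))
            mutant = a + F * (b - c)                              # mutation in the ambient space
            trial = np.where(rng.random(x.shape) < CR, mutant, x) # binomial crossover
            trial = project_to_grassmann(trial)                   # back onto the manifold
            new_pop.append(trial if objective(trial) < objective(x) else x)
        return new_pop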


Nested subspace learning with flags

Szwagier, Tom, Pennec, Xavier

arXiv.org Machine Learning

Many machine learning methods look for low-dimensional representations of the data. The underlying subspace can be estimated by first choosing a dimension $q$ and then optimizing a certain objective function over the space of $q$-dimensional subspaces (the Grassmannian). Trying different values of $q$ generally yields non-nested subspaces, which raises an important issue of consistency between the data representations. In this paper, we propose a simple trick to enforce nestedness in subspace learning methods. It consists in lifting Grassmannian optimization problems to flag manifolds (the space of nested subspaces of increasing dimension) via nested projectors. We apply the flag trick to several classical machine learning methods and show that it successfully addresses the nestedness issue.
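
As a toy illustration of the nested-projector lifting, assume a single matrix U with orthonormal columns: its leading q columns span nested subspaces for increasing q, and a flag-lifted objective can simply average a per-dimension Grassmannian criterion over the corresponding projectors. The PCA-style reconstruction error below is only an example criterion, not one of the specific objectives treated in the paper.

    import numpy as np

    def nested_projectors(U, dims):
        """Projectors onto the nested subspaces span(U[:, :q]) for q in dims (increasing)."""
        return [U[:, :q] @ U[:, :q].T for q in dims]

    def flag_lifted_objective(U, dims, X):
        """Average an illustrative Grassmannian criterion (PCA reconstruction error
        of the rows of X) over the nested subspaces defined by U and dims."""
        return float(np.mean([np.linalg.norm(X - X @ P, 'fro') ** 2
                              for P in nested_projectors(U, dims)]))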


Reviews: A Linearly Convergent Method for Non-Smooth Non-Convex Optimization on the Grassmannian with Applications to Robust Subspace and Dictionary Learning

Neural Information Processing Systems

A number of problems in sparse learning, signal processing, etc., can be phrased as optimizing a non-smooth function over a Riemannian manifold. Many works avoid non-smooth analysis and optimization by applying smooth methods to a smoothing of the objective function, often at the cost of suboptimal convergence rates, sample complexity, and so on. This work takes a different path, directly developing methods for non-smooth Riemannian optimization. The focus on the Grassmannian limits the scope to some extent; it is unclear what in the setup requires the Grassmannian.