AITopics | low-rank structure

Collaborating Authors

low-rank structure

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Generalization Bounds for Rank-sparse Neural Networks

Neural Information Processing Systems, Jun-22-2026, 22:28:37 GMT

It has been recently observed in much of the literature that neural networks exhibit a bottleneck rank property: for larger depths, the activation and weights of neural networks trained with gradient-based methods tend to be of approximately low rank. In fact, the rank of the activations of each layer converges to a fixed value referred to as the "bottleneck rank", which is the minimum rank required to represent the training data. This perspective is in line with the observation that regularizing linear networks (without activations) with weight decay is equivalent to minimizing the Schatten p quasi norm of the neural network. In this paper we investigate the implications of this phenomenon for generalization. More specifically, we prove generalization bounds for neural networks which exploit the approximate low rank structure of the weight matrices if present. The final results rely on the Schatten p quasi norms of the weight matrices: for small p, the bounds exhibit a sample complexity $\widetilde{O}(WrL^2)$ where W and L are the width and depth of the neural network respectively and where r is the rank of the weight matrices. As p increases, the bound behaves more like a norm-based bound instead.

artificial intelligence, machine learning, neural network, (17 more...)

Neural Information Processing Systems

Country: Europe > United Kingdom (0.27)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.34)

Group-Aware Matrix Estimation and Latent Subspace Recovery

Golubovic, Hamza, Shen, Matthew, Allen, Genevera I., Zikry, Tarek M.

arXiv.org Machine Learning, May-21-2026

Modern matrix completion problems often involve heterogeneous data whose rows simultaneously belong to many meta-categories, such as demographic and age groups in recommendation systems, or region and recording session labels in neural electrophysiological experiments. Standard low-rank estimators impose a single global latent geometry, which can recover average structure but may smooth away subgroup-specific variation, especially when observations are unevenly distributed across groups. We introduce Group-Aware Matrix Estimation (GAME), a convex estimator for overlapping subgroup-wise low-rank matrix estimation. GAME regularizes category-specific submatrices through overlapping nuclear-norm penalties, allowing related groups to borrow information while preserving local latent structure in a shared coordinate system. We provide finite-sample guarantees for both reconstruction error and subgroup-specific subspace recovery, showing how performance depends on sampling density, subgroup rank, and overlap structure. Experiments on synthetic, recommendation, ecological, and neuroscience datasets show that GAME is most beneficial in structured missingness regimes, where subgroup-aware regularization improves both reconstruction accuracy and latent subspace fidelity. Across these benchmarks, GAME is competitive or best among global low-rank, side-information, and modern imputation baselines, with the largest gains when subgroups exhibit distinct low-rank structure.
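The abstract above does not spell out the GAME objective, so the following is only a minimal sketch of what an overlapping nuclear-norm penalty could look like, assuming row groups given as index lists and a squared-error fit on observed entries. The function names, the group labels, and the weight `lam` are all hypothetical and are not taken from the paper.

```python
import numpy as np

def nuclear_norm(M):
    """Sum of the singular values of M."""
    return float(np.linalg.svd(M, compute_uv=False).sum())

def overlapping_group_penalty(X, groups, lam=1.0):
    """lam times the sum, over groups, of the nuclear norm of that group's rows.

    `groups` maps a (hypothetical) group label to a list of row indices.
    A row may appear in several groups, which is what makes the penalties overlap.
    """
    return lam * sum(nuclear_norm(X[np.asarray(rows), :]) for rows in groups.values())

def penalized_objective(X, Y, mask, groups, lam=1.0):
    """Squared error on observed entries (mask == 1) plus the group penalty.

    This objective is an assumption made for illustration only.
    """
    residual = mask * (X - Y)
    return 0.5 * float(np.sum(residual ** 2)) + overlapping_group_penalty(X, groups, lam)

# Toy rank-1 matrix whose rows belong to two overlapping groups (row 1 is shared).
X = np.outer(np.ones(4), np.array([1.0, 2.0, 3.0]))
groups = {"age_18_25": [0, 1], "region_eu": [1, 2, 3]}
penalty = overlapping_group_penalty(X, groups)
```

Because each group's submatrix is penalized separately, a subgroup can keep its own low-rank structure, while shared rows couple the groups together.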

artificial intelligence, machine learning, matrix completion, (14 more...)

arXiv.org Machine Learning

2605.20559

Country: North America > United States (0.46)

Genre: Research Report (1.00)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.66)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Low-Rank Regression with Tensor Responses

Guillaume Rabusseau, Hachem Kadri

Neural Information Processing Systems, Mar-23-2026, 06:20:37 GMT

Neural Information Processing Systems http://nips.cc/

data mining, machine learning, tensor, (18 more...)

Neural Information Processing Systems

Country:

Europe (0.93)
North America > United States (0.15)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.48)

SLTrain: a sparse plus low rank approach for parameter and memory efficient pretraining

Neural Information Processing Systems, Mar-22-2026, 15:32:02 GMT

Large language models (LLMs) have shown impressive capabilities across various tasks. However, training LLMs from scratch requires significant computational power and extensive memory capacity. Recent studies have explored low-rank structures on weights for efficient fine-tuning in terms of parameters and memory, either through low-rank adaptation or factorization. While effective for fine-tuning, low-rank structures are generally less suitable for pretraining because they restrict parameters to a low-dimensional subspace. In this work, we propose to parameterize the weights as a sum of low-rank and sparse matrices for pretraining, which we call SLTrain. The low-rank component is learned via matrix factorization, while for the sparse component, we employ a simple strategy of uniformly selecting the sparsity support at random and learning only the non-zero entries with the fixed support. While being simple, the random fixed-support sparse learning strategy significantly enhances pretraining when combined with low-rank learning. Our results show that SLTrain adds minimal extra parameters and memory costs compared to pretraining with low-rank parameterization, yet achieves substantially better performance, which is comparable to full-rank training. Remarkably, when combined with quantization and per-layer updates, SLTrain can reduce memory requirements by up to 73% when pretraining the LLaMA 7B model.
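The parameterization described above, a low-rank factorization plus a sparse matrix on a random fixed support, can be sketched as follows. This is an illustrative NumPy sketch, not the SLTrain implementation: the function names, the initialization scales, and the zero initialization of the sparse values are assumptions.

```python
import numpy as np

def init_sparse_plus_low_rank(d_out, d_in, rank, density, seed=0):
    """Create factors B, A and a sparse component with a random, fixed support.

    The initialization scheme here is an assumption for illustration.
    """
    rng = np.random.default_rng(seed)
    B = rng.standard_normal((d_out, rank)) / np.sqrt(rank)
    A = rng.standard_normal((rank, d_in)) / np.sqrt(d_in)
    n_nonzero = int(density * d_out * d_in)
    # Support chosen uniformly at random, once; only `values` would be trained.
    support = rng.choice(d_out * d_in, size=n_nonzero, replace=False)
    values = np.zeros(n_nonzero)
    return B, A, support, values

def dense_weight(B, A, support, values):
    """Materialize W = B @ A + S, where S is nonzero only on `support`."""
    W = B @ A
    flat = W.reshape(-1)       # view into W, since B @ A is C-contiguous
    flat[support] += values    # support indices are unique, so += is safe
    return W

B, A, support, values = init_sparse_plus_low_rank(8, 6, rank=2, density=0.25)
W = dense_weight(B, A, support, values)
n_params = B.size + A.size + values.size   # trainable entries, versus 8 * 6 dense
```

Because the support is fixed, only the index list and the nonzero values need to be stored for the sparse part, which keeps the extra parameter and memory cost small.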

artificial intelligence, large language model, natural language, (8 more...)

Neural Information Processing Systems

Genre: Research Report > New Finding (0.59)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.82)

Spectral Entry-wise Matrix Estimation for Low-Rank Reinforcement Learning

Neural Information Processing Systems, Feb-17-2026, 22:43:16 GMT

Both algorithms yield state-of-the-art performance guarantees.

artificial intelligence, machine learning, reinforcement learning, (15 more...)

Neural Information Processing Systems

Country:

Europe > Sweden > Stockholm > Stockholm (0.04)
North America > United States > Washington > King County > Seattle (0.04)
North America > United States > California > Los Angeles County > Long Beach (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.66)

adc98a266f45005c403b8311ca7e8bd7-Paper-Conference.pdf

Neural Information Processing Systems, Feb-11-2026, 08:13:30 GMT

dataset, generalization, perturbation, (14 more...)

Neural Information Processing Systems

Country: Asia > China (0.04)

Genre: Research Report (0.93)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Spectral Entry-wise Matrix Estimation for Low-Rank Reinforcement Learning

Neural Information Processing Systems, Dec-27-2025, 05:03:58 GMT

We study matrix estimation problems arising in reinforcement learning with low-rank structure. In low-rank bandits, the matrix to be recovered specifies the expected arm rewards, and for low-rank Markov Decision Processes (MDPs), it characterizes the transition kernel of the MDP. In both cases, each entry of the matrix carries important information, and we seek estimation methods with low entry-wise prediction error. Importantly, these methods further need to account for inherent correlations in the available data (e.g., for MDPs, the data consists of system trajectories). We investigate the performance of simple spectral-based matrix estimation approaches: we show that they efficiently recover the singular subspaces of the matrix and exhibit nearly-minimal entry-wise prediction error. These new results on low-rank matrix estimation make it possible to devise reinforcement learning algorithms that fully exploit the underlying low-rank structure. We provide two examples of such algorithms: a regret minimization algorithm for low-rank bandit problems, and a best policy identification algorithm for low-rank MDPs. Both algorithms yield state-of-the-art performance guarantees.
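As a rough illustration of a spectral matrix estimate and the entry-wise error it is judged by, here is a minimal sketch using a rank-truncated SVD. It assumes independent Gaussian noise on every entry, so it ignores the trajectory correlations the abstract highlights, and it is not the paper's estimator. The function names and the toy data are hypothetical.

```python
import numpy as np

def truncated_svd_estimate(M_obs, rank):
    """Best rank-`rank` approximation of M_obs in Frobenius norm (truncated SVD)."""
    U, s, Vt = np.linalg.svd(M_obs, full_matrices=False)
    return (U[:, :rank] * s[:rank]) @ Vt[:rank, :]

def max_entrywise_error(M_hat, M_true):
    """Largest absolute error over all entries (the max norm of the difference)."""
    return float(np.max(np.abs(M_hat - M_true)))

# Toy setup (assumed): a rank-2 "reward" matrix observed with small i.i.d. noise.
rng = np.random.default_rng(0)
M_true = rng.standard_normal((30, 2)) @ rng.standard_normal((2, 20))
M_obs = M_true + 0.01 * rng.standard_normal(M_true.shape)
M_hat = truncated_svd_estimate(M_obs, rank=2)
err = max_entrywise_error(M_hat, M_true)
```

The max-norm error is the relevant metric here because, as the abstract notes, every entry (an arm's reward or a transition probability) matters individually, not only the average error.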

low-rank reinforcement learning, name change, spectral entry-wise matrix estimation, (5 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.87)

Generalization Bounds for Rank-sparse Neural Networks

Ledent, Antoine, Alves, Rodrigo, Lei, Yunwen

arXiv.org Artificial Intelligence, Nov-26-2025

It has been recently observed in much of the literature that neural networks exhibit a bottleneck rank property: for larger depths, the activation and weights of neural networks trained with gradient-based methods tend to be of approximately low rank. In fact, the rank of the activations of each layer converges to a fixed value referred to as the "bottleneck rank", which is the minimum rank required to represent the training data. This perspective is in line with the observation that regularizing linear networks (without activations) with weight decay is equivalent to minimizing the Schatten $p$ quasi norm of the neural network. In this paper we investigate the implications of this phenomenon for generalization. More specifically, we prove generalization bounds for neural networks which exploit the approximate low rank structure of the weight matrices if present. The final results rely on the Schatten $p$ quasi norms of the weight matrices: for small $p$, the bounds exhibit a sample complexity $\widetilde{O}(WrL^2)$ where $W$ and $L$ are the width and depth of the neural network respectively and where $r$ is the rank of the weight matrices. As $p$ increases, the bound behaves more like a norm-based bound instead.
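For readers unfamiliar with the quantity the bounds depend on, the Schatten $p$ quasi-norm of a matrix is $(\sum_i \sigma_i^p)^{1/p}$, where $\sigma_i$ are its singular values. The sketch below computes it with NumPy; the function name `schatten_p` is hypothetical and the code illustrates the definition only, not the paper's bounds.

```python
import numpy as np

def schatten_p(W, p):
    """Schatten p (quasi-)norm: (sum_i sigma_i(W)^p)^(1/p), for p > 0.

    p = 2 gives the Frobenius norm and p = 1 the nuclear norm.
    For 0 < p < 1 the triangle inequality fails, hence "quasi" norm.
    """
    s = np.linalg.svd(W, compute_uv=False)
    return float(np.sum(s ** p) ** (1.0 / p))

W_example = np.diag([3.0, 4.0])
frobenius = schatten_p(W_example, 2.0)   # matches np.linalg.norm(W_example)
nuclear = schatten_p(W_example, 1.0)     # sum of the singular values
quasi = schatten_p(W_example, 0.5)       # a quasi-norm, since p < 1
```

As $p$ tends to 0, the inner sum $\sum_i \sigma_i^p$ taken over the nonzero singular values tends to the rank, which gives some intuition for why small $p$ yields rank-like bounds while larger $p$ yields norm-like ones.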

artificial intelligence, machine learning, neural network, (17 more...)

arXiv.org Artificial Intelligence

2510.21945

Country: Europe (1.00)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.34)

Sequences of Logits Reveal the Low Rank Structure of Language Models

Golowich, Noah, Liu, Allen, Shetty, Abhishek

arXiv.org Machine Learning, Oct-30-2025

A major problem in the study of large language models is to understand their inherent low-dimensional structure. We introduce an approach to study the low-dimensional structure of language models at a model-agnostic level: as sequential probabilistic models. We first empirically demonstrate that a wide range of modern language models exhibit low-rank structure: in particular, matrices built from the model's logits for varying sets of prompts and responses have low approximate rank. We then show that this low-rank structure can be leveraged for generation -- in particular, we can generate a response to a target prompt using a linear combination of the model's outputs on unrelated, or even nonsensical prompts. On the theoretical front, we observe that studying the approximate rank of language models in the sense discussed above yields a simple universal abstraction whose theoretical predictions parallel our experiments. We then analyze the representation power of the abstraction and give provable learning guarantees.
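The abstract does not define "approximate rank" precisely. One common proxy, assumed here purely for illustration, is the smallest number of singular values needed to capture a given fraction of the matrix's squared Frobenius norm. The sketch applies that proxy to a synthetic matrix standing in for a prompts-by-responses logit matrix; the function name, the `energy` threshold, and the data are hypothetical.

```python
import numpy as np

def approximate_rank(M, energy=0.99):
    """Smallest k whose top-k singular values hold `energy` of the squared Frobenius norm.

    This energy-based definition is an assumption, not necessarily the paper's.
    """
    s = np.linalg.svd(M, compute_uv=False)
    cumulative = np.cumsum(s ** 2) / np.sum(s ** 2)
    k = int(np.searchsorted(cumulative, energy)) + 1
    return min(k, len(s))

# Synthetic stand-in for a logit matrix: exactly rank 3, with equal singular values.
rng = np.random.default_rng(0)
U, _ = np.linalg.qr(rng.standard_normal((50, 3)))
V, _ = np.linalg.qr(rng.standard_normal((40, 3)))
logits_like = 5.0 * U @ V.T
k = approximate_rank(logits_like)
```

With real model logits one would expect a decaying spectrum rather than an exact cutoff, so the value of `k` would depend on the chosen threshold.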

large language model, machine learning, natural language, (17 more...)

arXiv.org Machine Learning

2510.24966

Country:

Europe (1.00)
North America > United States (0.93)

Genre: Research Report (0.82)

Industry: Government > Military (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.45)
