AITopics | matrix factorization

Non-negative matrix factorization is a popular tool for decomposing data into feature and weight matrices under non-negativity constraints. It enjoys practical success but is poorly understood theoretically. This paper proposes an algorithm that alternates between decoding the weights and updating the features, and shows that, assuming a generative model of the data, it provably recovers the ground truth under fairly mild conditions. In particular, its only essential requirement on the features is linear independence. Furthermore, the algorithm uses ReLU to exploit non-negativity when decoding the weights, and thus can tolerate adversarial noise that is potentially as large as the signal, and unbiased noise much larger than the signal. The analysis relies on a carefully designed coupling between two potential functions, which we believe is of independent interest.
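
The alternation described in this abstract (decode the weights with a ReLU, then update the features) can be illustrated with a minimal sketch. This is not the paper's update rule: decoding with the transpose of the feature matrix is an assumption that only suits near-orthonormal features, the feature update is a plain gradient step on the squared reconstruction error, and the names `decode_weights` and `update_features` are hypothetical.

```python
def relu(values):
    """Clip negative entries to zero."""
    return [max(0.0, v) for v in values]


def decode_weights(features, y):
    """Estimate non-negative weights as ReLU(A^T y).

    `features` holds the columns of A as plain lists. Using A^T as the
    decoder is an assumption that only makes sense for near-orthonormal
    features; it is not taken from the paper.
    """
    return relu([sum(a_i * y_i for a_i, y_i in zip(a, y)) for a in features])


def update_features(features, y, x, step):
    """One gradient step on 0.5 * ||y - A x||^2 with respect to A."""
    recon = [sum(x[j] * features[j][i] for j in range(len(features)))
             for i in range(len(y))]
    resid = [y_i - r_i for y_i, r_i in zip(y, recon)]
    return [[a_i + step * x[j] * r_i for a_i, r_i in zip(features[j], resid)]
            for j in range(len(features))]


features = [[1.0, 0.0], [0.0, 1.0]]  # two orthonormal features
y = [2.0, -0.3]                      # signal on feature 0, negative noise on feature 1
x = decode_weights(features, y)      # the ReLU discards the negative component
features = update_features(features, y, x, step=0.5)
```

In this toy run the ReLU zeroes the weight that only negative noise would have produced, which is the intuition behind the noise tolerance the abstract mentions.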

algorithm, artificial intelligence, machine learning, (14 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.49)

SMF_NeurIPS_2023-6

Hanbaek Lyu

Neural Information Processing Systems | Apr-30-2026, 07:09:30 GMT

artificial intelligence, bioinformatics, machine learning, (16 more...)

Country:

North America > United States > Wisconsin (0.28)
North America > United States > California (0.28)

Industry:

Health & Medicine > Therapeutic Area > Oncology (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Data Science (0.95)
Information Technology > Biomedical Informatics > Translational Bioinformatics (0.94)
(2 more...)

2f2cd5c753d3cee48e47dbb5bbaed331-Supplemental.pdf

Neural Information Processing Systems | Apr-25-2026, 08:12:27 GMT

artificial intelligence, gradient descent, machine learning, (13 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.94)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.68)

2f2cd5c753d3cee48e47dbb5bbaed331-Paper.pdf

Neural Information Processing Systems | Apr-25-2026, 08:12:23 GMT

artificial intelligence, gradient descent, machine learning, (14 more...)

Country: North America > United States (0.14)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.72)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.68)

1fb2a1c37b18aa4611c3949d6148d0f8-Paper.pdf

Neural Information Processing Systems | Apr-25-2026, 01:14:55 GMT

data mining, machine learning, regime, (20 more...)

Country: Asia (0.14)

Industry:

Transportation > Ground > Road (0.47)
Transportation > Infrastructure & Services (0.47)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Modeling & Simulation (0.94)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.68)

Efficiently Factorizing Boolean Matrices using Proximal Gradient Descent

Neural Information Processing Systems | Apr-25-2026, 00:11:17 GMT

Addressing the interpretability problem of NMF on Boolean data, Boolean Matrix Factorization (BMF) uses Boolean algebra to decompose the input into low-rank Boolean factor matrices. These matrices are highly interpretable and very useful in practice, but they come at the high computational cost of solving an NP-hard combinatorial optimization problem. To reduce the computational burden, we propose to relax BMF continuously using a novel elastic-binary regularizer, from which we derive a proximal gradient algorithm. Through an extensive set of experiments, we demonstrate that our method works well in practice: On synthetic data, we show that it converges quickly, recovers the ground truth precisely, and estimates the simulated rank exactly. On real-world data, we improve upon the state of the art in recall, loss, and runtime, and a case study from the medical domain confirms that our results are easily interpretable and semantically meaningful.
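
To make the contrast with ordinary NMF concrete, the following minimal sketch computes a Boolean matrix product, where sums are replaced by logical OR, and counts reconstruction errors. It does not implement the paper's elastic-binary regularizer or its proximal gradient algorithm; the function names `boolean_product` and `reconstruction_error` are hypothetical.

```python
def boolean_product(U, V):
    """Boolean product: entry (i, j) is OR over k of (U[i][k] AND V[k][j])."""
    rows, inner, cols = len(U), len(V), len(V[0])
    return [[int(any(U[i][k] and V[k][j] for k in range(inner)))
             for j in range(cols)]
            for i in range(rows)]


def reconstruction_error(X, U, V):
    """Number of entries where X differs from the Boolean product of U and V."""
    approx = boolean_product(U, V)
    return sum(x != a
               for x_row, a_row in zip(X, approx)
               for x, a in zip(x_row, a_row))


U = [[1, 1],
     [0, 1]]
V = [[1, 0],
     [1, 1]]
X = [[1, 1],
     [1, 0]]

product = boolean_product(U, V)          # stays in {0, 1} even where factors overlap
errors = reconstruction_error(X, U, V)   # counts mismatched cells
```

With these factors the ordinary matrix product would put a 2 in the top-left cell, while the Boolean product keeps it at 1, which is why the factors remain directly readable as set memberships.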

artificial intelligence, machine learning, matrix, (17 more...)

Country:

Europe (0.93)
North America > Canada (0.68)
North America > United States > New York (0.28)
North America > United States > California (0.28)

Genre: Research Report > New Finding (0.34)

Industry:

Health & Medicine > Therapeutic Area > Oncology (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.50)

Anchor-Free Correlated Topic Modeling: Identifiability and Algorithm

Kejun Huang, Xiao Fu, Nikolaos D. Sidiropoulos

Neural Information Processing Systems | Apr-22-2026, 08:41:36 GMT

In topic modeling, many algorithms that guarantee identifiability of the topics have been developed under the premise that there exist anchor words, i.e., words that only appear (with positive probability) in one topic. Follow-up work has resorted to third- or higher-order statistics of the data corpus to relax the anchor word assumption. Reliable estimates of higher-order statistics are hard to obtain, however, and the identification of topics under those models hinges on uncorrelatedness of the topics, which can be unrealistic. This paper revisits topic modeling based on second-order moments, and proposes an anchor-free topic mining framework. The proposed approach guarantees the identification of the topics under a much milder condition compared to the anchor-word assumption, thereby exhibiting much better robustness in practice. The associated algorithm only involves one eigendecomposition and a few small linear programs. This makes it easy to implement and scale up to very large problem instances. Experiments using the TDT2 and Reuters-21578 corpora demonstrate that the proposed anchor-free approach exhibits very favorable performance (measured using coherence, similarity count, and clustering accuracy metrics) compared to the prior art.

artificial intelligence, machine learning, natural language, (18 more...)

Country: North America > United States > Minnesota > Hennepin County > Minneapolis (0.28)

Industry: Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning (0.95)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.47)

Sparse Network Inference under Imperfect Detection and its Application to Ecological Networks

Aoran Zhang, Tianyao Wei, Maria J. Guerrero, César A. Uribe

arXiv.org Machine Learning | Apr-22-2026

Recovering latent structure from count data has received considerable attention in network inference, particularly when one seeks both cross-group interactions and within-group similarity patterns in bipartite networks, which are widely used in ecology research. Such networks are often sparse and inherently imperfect in their detection. Existing models mainly focus on interaction recovery, while the induced similarity graphs are much less studied. Moreover, sparsity is often not controlled, and scale is unbalanced, leading to oversparse or poorly rescaled estimates with degraded structural recovery. We impose nonconvex $\ell_{1/2}$ regularization on the latent similarity and connectivity structures to promote sparsity in within-group similarity and cross-group connectivity with better relative scale. To solve the resulting problem, we develop an ADMM-based algorithm with adaptive penalization and scale-aware initialization and establish its asymptotic feasibility and KKT stationarity of cluster points under mild regularity conditions. Experiments on synthetic and real-world ecological datasets demonstrate improved recovery of latent factors and similarity/connectivity structure relative to existing baselines.

Index Terms: augmented Lagrangian, nonconvex nonsmooth optimization, nonnegative matrix factorization, link prediction, ecological network inference, structured sparse recovery

I. INTRODUCTION

This setting is inherent in sensing and monitoring applications [3], [4], where observations, such as counts, are obtained via an imperfect sampling process. In this paper, we are interested in ecological interaction networks describing how species associate with locations and how environments shape biodiversity patterns [5], [6].

artificial intelligence, machine learning, recovery, (17 more...)

2604.1882

Country:

North America > United States (0.14)
South America > Colombia > Santander Department (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.93)

Nonnegative Matrix Factorization in the Component-Wise L1 Norm for Sparse Data

Giovanni Seraghiti, Kévin Dubrulle, Arnaud Vandaele, Nicolas Gillis

arXiv.org Machine Learning | Apr-1-2026

Nonnegative matrix factorization (NMF) approximates a nonnegative matrix, $X$, by the product of two nonnegative factors, $WH$, where $W$ has $r$ columns and $H$ has $r$ rows. In this paper, we consider NMF using the component-wise L1 norm as the error measure (L1-NMF), which is suited for data corrupted by heavy-tailed noise, such as Laplace noise or salt-and-pepper noise, or in the presence of outliers. Our first contribution is an NP-hardness proof for L1-NMF, even when $r=1$, in contrast to the standard NMF that uses least squares. Our second contribution is to show that L1-NMF strongly enforces sparsity in the factors for sparse input matrices, thereby favoring interpretability. However, if the data is affected by false zeros, overly sparse solutions might degrade the model. Our third contribution is a new, more general, L1-NMF model for sparse data, dubbed weighted L1-NMF (wL1-NMF), where the sparsity of the factorization is controlled by adding a penalization parameter to the entries of $WH$ associated with zeros in the data. The fourth contribution is a new coordinate descent (CD) approach for wL1-NMF, denoted as sparse CD (sCD), where each subproblem is solved by a weighted median algorithm. To the best of our knowledge, sCD is the first algorithm for L1-NMF whose complexity scales with the number of nonzero entries in the data, making it efficient in handling large-scale, sparse data. We perform extensive numerical experiments on synthetic and real-world data to show the effectiveness of our newly proposed model (wL1-NMF) and algorithm (sCD).
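
The role of the weighted median can be seen in the simplest scalar case: minimizing $\sum_i |x_i - w_i h|$ over $h \ge 0$ for fixed nonnegative $w$. The sketch below covers only this plain L1 fit, not the wL1-NMF penalty on zero entries or the sparse bookkeeping of sCD, and the function name `l1_scalar_fit` is hypothetical.

```python
def l1_scalar_fit(x, w):
    """Minimize sum_i |x[i] - w[i] * h| over h >= 0.

    Since |x_i - w_i h| = w_i * |x_i / w_i - h| for w_i > 0, a minimizer is
    a weighted median of the ratios x_i / w_i with weights w_i. Entries with
    w_i == 0 do not depend on h and are skipped. The objective is convex in
    h, so clipping the unconstrained minimizer at zero handles h >= 0.
    """
    pairs = sorted((x_i / w_i, w_i) for x_i, w_i in zip(x, w) if w_i > 0)
    if not pairs:
        return 0.0
    half = sum(weight for _, weight in pairs) / 2.0
    running = 0.0
    for ratio, weight in pairs:
        running += weight
        if running >= half:
            return max(0.0, ratio)
    return max(0.0, pairs[-1][0])


robust = l1_scalar_fit([1.0, 2.0, 10.0], [1.0, 1.0, 1.0])   # outlier 10 has no pull
weighted = l1_scalar_fit([2.0, 9.0], [2.0, 3.0])            # heavier weight wins
clipped = l1_scalar_fit([-1.0, -2.0, -3.0], [1.0, 1.0, 1.0])
```

In the first call a least-squares fit would return the mean 13/3, whereas the L1 fit returns the median 2, which illustrates the robustness to outliers that motivates the L1 error measure.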

algorithm, artificial intelligence, machine learning, (18 more...)

2603.29715

Country:

Europe > United Kingdom (0.04)
Europe > Belgium (0.04)
North America > United States > Utah (0.04)
(5 more...)

Genre: Research Report (0.82)

Industry: Leisure & Entertainment > Sports (0.93)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.94)

Near-Optimal Smoothing of Structured Conditional Probability Matrices

Moein Falahatgar, Mesrob I. Ohannessian, Alon Orlitsky

Neural Information Processing Systems | Mar-23-2026, 15:51:50 GMT

Utilizing the structure of a probabilistic model can significantly increase its learning speed. Motivated by several recent applications, in particular bigram models in language processing, we consider learning low-rank conditional probability matrices under expected KL-risk. This choice makes smoothing, that is, the careful handling of low-probability elements, paramount. We derive an iterative algorithm that extends classical non-negative matrix factorization to naturally incorporate additive smoothing and prove that it converges to the stationary points of a penalized empirical risk. We then derive sample-complexity bounds for the global minimizer of the penalized risk and show that it is within a small factor of the optimal sample complexity.
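
For readers unfamiliar with additive smoothing, the following minimal sketch applies classical add-constant smoothing to a matrix of bigram counts so that no conditional probability is exactly zero. It shows only the smoothing step, not the low-rank factorization or the paper's iterative algorithm; the function name `additive_smoothing` and the parameter `beta` are hypothetical.

```python
def additive_smoothing(counts, beta=1.0):
    """Turn each row of counts into a conditional distribution.

    Row i holds counts of the next symbol given context i. Adding `beta`
    to every cell keeps unseen pairs at a small positive probability,
    which matters under KL risk because a zero estimate for an event
    that does occur gives infinite loss.
    """
    k = len(counts[0])
    smoothed = []
    for row in counts:
        total = sum(row) + beta * k
        smoothed.append([(c + beta) / total for c in row])
    return smoothed


counts = [[3, 0],
          [0, 0]]
probs = additive_smoothing(counts, beta=1.0)
```

Here the unseen pair in the first row receives probability 1/5, and the context with no observations falls back to the uniform distribution.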

Recovery Guarantee of Non-negative Matrix Factorization via Alternating Updates
