AITopics | matrix decomposition

We present an algorithm based on the alternating direction method of multipliers (ADMM) for solving nonlinear matrix decompositions (NMD). Given an input matrix $X \in \mathbb{R}^{m \times n}$ and a factorization rank $r \ll \min(m, n)$, NMD seeks matrices $W \in \mathbb{R}^{m \times r}$ and $H \in \mathbb{R}^{r \times n}$ such that $X \approx f(WH)$, where $f$ is an element-wise nonlinear function. We evaluate our method on several representative nonlinear models: the rectified linear unit activation $f(x) = \max(0, x)$, suitable for nonnegative sparse data approximation; the component-wise square $f(x) = x^2$, applicable to probabilistic circuit representation; and the MinMax transform $f(x) = \min(b, \max(a, x))$, relevant for recommender systems. The proposed framework flexibly supports diverse loss functions, including least squares, the $\ell_1$ norm, and the Kullback-Leibler divergence, and can be readily extended to other nonlinearities and metrics. We illustrate the applicability, efficiency, and adaptability of the approach on real-world datasets, highlighting its potential for a broad range of applications.
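To make the model $X \approx f(WH)$ concrete, the following is a minimal NumPy sketch of the three nonlinearities and three loss functions named in the abstract. It only evaluates the objective for a given pair $(W, H)$; it is not the ADMM solver from the paper. The function names, the bounds `a=1.0` and `b=5.0`, and the `eps` guard in the KL divergence are assumptions made for illustration.

```python
import numpy as np

# Element-wise nonlinearities named in the abstract.
def relu(z):
    return np.maximum(0.0, z)

def square(z):
    return z ** 2

def minmax(z, a=1.0, b=5.0):
    # a and b are illustrative bounds (e.g. a 1-to-5 rating scale), not values from the paper.
    return np.minimum(b, np.maximum(a, z))

# Loss functions named in the abstract.
def least_squares(X, Y):
    return 0.5 * np.sum((X - Y) ** 2)

def l1_loss(X, Y):
    return np.sum(np.abs(X - Y))

def kl_divergence(X, Y, eps=1e-12):
    # Generalized KL divergence for nonnegative X and Y.
    # eps guards against log(0) and is our choice, not the paper's.
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    return np.sum(X * np.log((X + eps) / (Y + eps)) - X + Y)

def nmd_objective(X, W, H, f, loss):
    """Evaluate loss(X, f(W @ H)) for a candidate factorization."""
    return loss(X, f(W @ H))

# Synthetic data that follows the ReLU model exactly.
rng = np.random.default_rng(0)
m, n, r = 6, 5, 2
W_true = rng.standard_normal((m, r))
H_true = rng.standard_normal((r, n))
X = relu(W_true @ H_true)  # nonnegative, with zeros where W_true @ H_true < 0

exact = nmd_objective(X, W_true, H_true, relu, least_squares)
perturbed = nmd_objective(X, W_true + 0.1, H_true, relu, least_squares)
```

Passing `f` and `loss` as arguments mirrors the abstract's claim that the framework extends to other nonlinearities and metrics: only these two callables change, while the factor shapes stay the same.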

algorithm, artificial intelligence, machine learning, (16 more...)

arXiv.org Machine Learning

arXiv:2512.17473

Genre: Research Report (0.64)

Technology:

Information Technology > Data Science (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)

PT-MoE: An Efficient Finetuning Framework for Integrating Mixture-of-Experts into Prompt Tuning

Neural Information Processing Systems | Jun-18-2026, 16:49:14 GMT

Parameter-efficient fine-tuning (PEFT) methods have shown promise in adapting large language models, yet existing approaches exhibit counter-intuitive phenomena: integrating either matrix decomposition or mixture-of-experts (MoE) individually decreases performance across tasks, though decomposition improves results on specific domains despite reducing parameters, while MoE increases parameter count without corresponding decrease in training efficiency. Motivated by these observations and the modular nature of prompt tuning (PT), we propose PT-MoE, a novel framework that integrates matrix decomposition with MoE routing for efficient PT. Evaluation results across 17 datasets demonstrate that PT-MoE achieves state-of-the-art performance in both question answering (QA) and mathematical problem solving tasks, improving F1 score by 1.49 points over PT and 2.13 points over LoRA in QA tasks, while improving mathematical accuracy by 10.75 points over PT and 0.44 points over LoRA, all while using 25% fewer parameters than LoRA. Our analysis reveals that while PT methods generally excel in QA tasks and LoRA-based methods in math datasets, the integration of matrix decomposition and MoE in PT-MoE yields complementary benefits: decomposition enables efficient parameter sharing across experts while MoE provides dynamic adaptation, collectively enabling PT-MoE to demonstrate cross-task consistency and generalization abilities. These findings, along with ablation studies on routing mechanisms and architectural components, provide insights for future PEFT methods.
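The abstract does not give the architecture, so the following is only a hypothetical NumPy sketch of one way a low-rank soft prompt could be combined with MoE routing. Every detail here is an assumption for illustration: the shapes, the per-expert factors `A`, the shared factor `B`, the linear router `R`, and routing on the mean token embedding. It should not be read as the PT-MoE implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: prompt length, embedding dim, rank, number of experts.
p, d, r, E = 8, 32, 4, 3

A = rng.standard_normal((E, p, r)) * 0.1  # assumed: one low-rank factor per expert
B = rng.standard_normal((r, d)) * 0.1     # assumed: factor shared by all experts
R = rng.standard_normal((d, E)) * 0.1     # assumed: linear router weights

def softmax(z):
    z = z - z.max()  # subtract the max for numerical stability
    e = np.exp(z)
    return e / e.sum()

def routed_prompt(token_embeddings):
    """Build a soft prompt as a gate-weighted mix of low-rank expert prompts."""
    summary = token_embeddings.mean(axis=0)    # (d,) input summary for the router
    gate = softmax(summary @ R)                # (E,) routing weights
    mixed_A = np.einsum("e,epr->pr", gate, A)  # (p, r) mixed expert factor
    return mixed_A @ B, gate                   # (p, d) prompt, (E,) gate

tokens = rng.standard_normal((10, d))  # stand-in for embedded input tokens
prompt, gate = routed_prompt(tokens)

# Prompt tuning prepends the learned prompt to the input embeddings.
model_input = np.concatenate([prompt, tokens], axis=0)

# Trainable parameters in this sketch versus E dense p-by-d prompts.
n_params = A.size + B.size + R.size
n_dense_experts = E * p * d
```

In this sketch the shared factor `B` is where parameters are reused across experts, and the gate is where the prompt adapts to the input, which is the division of labor the abstract describes in general terms.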

computational linguistic, information retrieval, large language model, (17 more...)

Country:

Europe (1.00)
Asia (0.93)
North America > United States > Minnesota (0.28)

Genre:

Research Report > Experimental Study (1.00)
Overview (0.93)

Industry: Education (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.67)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.48)

PT-MoE: An Efficient Finetuning Framework for Integrating Mixture-of-Experts into Prompt Tuning

Neural Information Processing Systems | Jun-12-2026, 21:11:37 GMT

artificial intelligence, natural language, proceedings, (6 more...)

Technology: Information Technology > Artificial Intelligence > Natural Language (1.00)

b1d10e7bafa4421218a51b1e1f1b0ba2-Paper.pdf

Neural Information Processing Systems | Feb-10-2026, 18:43:31 GMT

arxiv preprint, complexity, transformer, (11 more...)

Country:

North America > United States (0.14)
Asia > Middle East > Israel (0.04)
Asia > China > Shanghai > Shanghai (0.04)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

5bd9fbb3a5a985f80c16ddd0ec1dfc43-Paper-Conference.pdf

Neural Information Processing Systems | Feb-9-2026, 06:32:42 GMT

approximation, cross approximation, decomposition, (16 more...)

Country:

North America > United States > Colorado > Jefferson County > Golden (0.04)
North America > United States > Ohio (0.04)
North America > United States > New Jersey > Hudson County > Hoboken (0.04)
Asia (0.04)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Multi-Step Stochastic ADMM in High Dimensions: Applications to Sparse Optimization and Matrix Decomposition

Hanie Sedghi, Anima Anandkumar, Edmond Jonckheere

Neural Information Processing Systems | Oct-2-2025, 17:08:13 GMT

Neural Information Processing Systems http://nips.cc/

convergence rate, decomposition, matrix decomposition, (14 more...)

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.28)
North America > United States > Illinois (0.04)
North America > United States > California > Orange County > Irvine (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)

Sparse and Low-Rank Tensor Decomposition

Parikshit Shah, Nikhil Rao, Gongguo Tang

Neural Information Processing Systems | Oct-2-2025, 07:22:04 GMT

Motivated by the problem of robust factorization of a low-rank tensor, we study the question of sparse and low-rank tensor decomposition. We present an efficient computational algorithm that modifies Leurgans' algorithm for tensor factorization. Our method relies on a reduction of the problem to sparse and low-rank matrix decomposition via the notion of tensor contraction. We use well-understood convex techniques for solving the reduced matrix sub-problem, which then allows us to perform the full decomposition of the tensor. We delineate situations where the problem is recoverable and provide theoretical guarantees for our algorithm.
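The reduction via tensor contraction can be illustrated with a small NumPy sketch. Contracting a third-order tensor along its last mode with a vector `a` is linear, so a "low-rank plus sparse" tensor maps to a "low-rank plus sparse" matrix. The sizes, the random factors, and the two hand-placed sparse entries below are assumptions for illustration. The sketch stops at the contraction and does not implement Leurgans' algorithm or the convex recovery step.

```python
import numpy as np

rng = np.random.default_rng(0)
n, rank = 6, 2

# Low-rank part: sum over r of u_r (outer) v_r (outer) w_r.
U = rng.standard_normal((n, rank))
V = rng.standard_normal((n, rank))
Wf = rng.standard_normal((n, rank))
low_rank = np.einsum("ir,jr,kr->ijk", U, V, Wf)

# Sparse part: a tensor with only two nonzero entries (illustrative corruption).
sparse = np.zeros((n, n, n))
sparse[0, 1, 2] = 5.0
sparse[3, 3, 0] = -4.0

X = low_rank + sparse

def contract(T, a):
    """Contract the third mode of T with vector a: result[i, j] = sum_k T[i, j, k] * a[k]."""
    return np.einsum("ijk,k->ij", T, a)

a = rng.standard_normal(n)
L_a = contract(low_rank, a)  # equals U @ diag(Wf.T @ a) @ V.T, so rank <= 2
S_a = contract(sparse, a)    # has at most as many nonzeros as the sparse tensor
X_a = contract(X, a)         # by linearity, X_a = L_a + S_a
```

The contracted matrix `X_a` is the kind of input that a sparse and low-rank matrix decomposition method accepts, which is what makes the reduction useful.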

artificial intelligence, decomposition, machine learning, (18 more...)

Country:

North America > United States > Colorado > Jefferson County > Golden (0.04)
North America > Mexico > Quintana Roo > Cancún (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

SOFT: Softmax-free Transformer with Linear Complexity

Jiachen Lu

Neural Information Processing Systems | Aug-16-2025, 22:39:58 GMT

Vision transformers (ViTs) have pushed the state-of-the-art for various visual recognition tasks by patch-wise image tokenization followed by self-attention.
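As a rough illustration of "patch-wise image tokenization followed by self-attention", the sketch below splits an image into non-overlapping patches and applies standard single-head softmax attention. This is the conventional baseline with an n-by-n attention matrix, not the softmax-free mechanism the paper proposes. The image size, patch size, projection matrices, and the omission of learned patch and position embeddings are all simplifying assumptions.

```python
import numpy as np

def patchify(image, patch):
    """Split an (H, W, C) image into flattened non-overlapping patch tokens."""
    H, W, C = image.shape
    assert H % patch == 0 and W % patch == 0
    gh, gw = H // patch, W // patch
    x = image.reshape(gh, patch, gw, patch, C)
    x = x.transpose(0, 2, 1, 3, 4)  # (gh, gw, patch, patch, C)
    return x.reshape(gh * gw, patch * patch * C)

def self_attention(tokens, Wq, Wk, Wv):
    """Single-head scaled dot-product attention with a row-wise softmax."""
    Q, K, V = tokens @ Wq, tokens @ Wk, tokens @ Wv
    scores = Q @ K.T / np.sqrt(Q.shape[1])            # (n, n): quadratic in token count
    scores = scores - scores.max(axis=1, keepdims=True)  # stabilize the exponentials
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=1, keepdims=True)
    return weights @ V, weights

rng = np.random.default_rng(0)
image = rng.random((32, 32, 3))    # stand-in image
tokens = patchify(image, patch=8)  # 4 x 4 grid of patches

d_model = 16  # illustrative projection width
Wq, Wk, Wv = (rng.standard_normal((tokens.shape[1], d_model)) * 0.05 for _ in range(3))
out, weights = self_attention(tokens, Wq, Wk, Wv)
```

The `(n, n)` score matrix is the term whose cost grows quadratically with the number of patches, which is the complexity the title's "linear complexity" refers to avoiding.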

artificial intelligence, machine learning, natural language, (15 more...)

Country:

North America > United States (0.14)
Asia > Middle East > Israel (0.04)
Asia > China > Shanghai > Shanghai (0.04)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

5bd9fbb3a5a985f80c16ddd0ec1dfc43-Paper-Conference.pdf

Neural Information Processing Systems | Aug-15-2025, 02:50:23 GMT

approximation, cross approximation, decomposition, (16 more...)

Country:

North America > United States > Colorado > Jefferson County > Golden (0.04)
North America > United States > Ohio (0.04)
North America > United States > New Jersey > Hudson County > Hoboken (0.04)
Asia (0.04)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Inertial Quadratic Majorization Minimization with Application to Kernel Regularized Learning

Qiang Heng, Caixing Wang

arXiv.org Machine Learning | Jul-8-2025

First-order methods in convex optimization offer low per-iteration cost but often suffer from slow convergence, while second-order methods achieve fast local convergence at the expense of costly Hessian inversions. In this paper, we highlight a middle ground: minimizing a quadratic majorant with fixed curvature at each iteration. This strategy strikes a balance between per-iteration cost and convergence speed, and crucially allows the reuse of matrix decompositions, such as Cholesky or spectral decompositions, across iterations and varying regularization parameters. We introduce the Quadratic Majorization Minimization with Extrapolation (QMME) framework and establish its sequential convergence properties under standard assumptions. The new perspective of our analysis is to center the arguments around the induced norm of the curvature matrix $H$. To demonstrate practical advantages, we apply QMME to large-scale kernel regularized learning problems. In particular, we propose a novel Sylvester equation modelling technique for kernel multinomial regression. In Julia-based experiments, QMME compares favorably against various established first- and second-order methods. Furthermore, we demonstrate that our algorithms complement existing kernel approximation techniques through more efficiently handling sketching matrices with large projection dimensions. Our numerical experiments and real data analysis are available and fully reproducible at https://github.com/qhengncsu/QMME.jl.
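The idea of minimizing a quadratic majorant with fixed curvature, and reusing one Cholesky factorization across iterations, can be sketched on ridge-penalized logistic regression. The sketch uses the classical bound that the logistic Hessian is dominated by $X^\top X / 4$; it is plain majorization-minimization without the extrapolation (inertial) step of QMME, and it is written in Python rather than the Julia used in the paper. The problem instance, `lam`, and the iteration count are assumptions for illustration.

```python
import numpy as np

def objective(beta, X, y, lam):
    z = X @ beta
    # Logistic negative log-likelihood plus a ridge penalty.
    return np.sum(np.logaddexp(0.0, z) - y * z) + 0.5 * lam * beta @ beta

def gradient(beta, X, y, lam):
    p = 1.0 / (1.0 + np.exp(-(X @ beta)))
    return X.T @ (p - y) + lam * beta

def quadratic_mm(X, y, lam, iters=50):
    d = X.shape[1]
    # Fixed curvature: X^T X / 4 + lam * I dominates the Hessian at every beta.
    H = 0.25 * X.T @ X + lam * np.eye(d)
    L = np.linalg.cholesky(H)  # factor once, reuse in every iteration
    beta = np.zeros(d)
    history = [objective(beta, X, y, lam)]
    for _ in range(iters):
        g = gradient(beta, X, y, lam)
        # Solve H @ step = g through the factor L @ L.T. np.linalg.solve is a
        # general solver; a dedicated triangular solver would exploit the structure.
        step = np.linalg.solve(L.T, np.linalg.solve(L, g))
        beta = beta - step  # exact minimizer of the quadratic majorant
        history.append(objective(beta, X, y, lam))
    return beta, history

# Synthetic classification data (illustrative only).
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 5))
beta_true = np.array([1.0, -2.0, 0.5, 0.0, 1.5])
y = (rng.random(200) < 1.0 / (1.0 + np.exp(-(X @ beta_true)))).astype(float)

beta_hat, history = quadratic_mm(X, y, lam=1.0)
```

Because the curvature matrix never changes, each iteration costs one gradient and two triangular-system solves, and the majorization property guarantees that the objective values in `history` do not increase.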

artificial intelligence, machine learning, regression, (19 more...)

arXiv:2507.04247

Country:

Europe > Russia (0.04)
Asia > Russia (0.04)
Asia > China > Hong Kong (0.04)
(2 more...)

Genre: Research Report (1.00)

Industry: Education (0.48)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.69)

Alternating Direction Method of Multipliers for Nonlinear Matrix Decompositions
