Low-Rank Plus Sparse Matrix Transfer Learning under Growing Representations and Ambient Dimensions
Chai, Jinhang, Liu, Xuyuan, Chen, Elynn, Yan, Yujun
Learning systems often expand their ambient features or latent representations over time, embedding earlier representations into larger spaces with limited new latent structure. We study transfer learning for structured matrix estimation under simultaneous growth of the ambient dimension and the intrinsic representation, where a well-estimated source task is embedded as a subspace of a higher-dimensional target task. We propose a general transfer framework in which the target parameter decomposes into an embedded source component, low-dimensional low-rank innovations, and sparse edits, and develop an anchored alternating projection estimator that preserves transferred subspaces while estimating only low-dimensional innovations and sparse modifications. We establish deterministic error bounds that separate target noise, representation growth, and source estimation error, yielding strictly improved rates when rank and sparsity increments are small. We demonstrate the generality of the framework by applying it to two canonical problems. For Markov transition matrix estimation from a single trajectory, we derive end-to-end theoretical guarantees under dependent noise. For structured covariance estimation under enlarged dimensions, we provide complementary theoretical analysis in the appendix and empirically validate consistent transfer gains.
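To make the alternation concrete, here is a minimal sketch of a low-rank-plus-sparse alternating projection around an embedded source estimate (our simplification, not the paper's anchored estimator: the embedding, the observation model, and the stopping rule are all assumed here):

```python
import numpy as np

def anchored_alt_proj(Y, A_src, r_new, s_new, n_iter=50):
    """Sketch: fit Theta ~ embed(A_src) + L_new + S_new from a noisy target
    observation Y, with L_new of rank r_new and S_new having s_new non-zeros.
    Illustrative only; not the paper's exact estimator."""
    A_emb = np.zeros_like(Y)
    A_emb[:A_src.shape[0], :A_src.shape[1]] = A_src  # embed source as a sub-block
    L, S = np.zeros_like(Y), np.zeros_like(Y)
    for _ in range(n_iter):
        # project the residual (source and sparse parts removed) onto rank r_new
        U, d, Vt = np.linalg.svd(Y - A_emb - S, full_matrices=False)
        L = (U[:, :r_new] * d[:r_new]) @ Vt[:r_new]
        # hard-threshold the remaining residual to its s_new largest entries
        R = Y - A_emb - L
        keep = np.argsort(np.abs(R), axis=None)[-s_new:]
        S = np.zeros_like(Y)
        S.flat[keep] = R.flat[keep]
    return A_emb + L + S
```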
Manifold denoising by Nonlinear Robust Principal Component Analysis
This paper extends robust principal component analysis (RPCA) to nonlinear manifolds. Suppose that the observed data matrix is the sum of a sparse component and a component drawn from some low-dimensional manifold. Is it possible to separate them using ideas similar to those of RPCA? Is there any benefit in treating the manifold as a whole, as opposed to treating each local region independently? We answer both questions affirmatively by proposing and analyzing an optimization framework that separates the sparse component from the manifold in the presence of noise. Theoretical error bounds are provided when the tangent spaces of the manifold satisfy certain incoherence conditions. We also provide a near-optimal choice of the tuning parameters for the proposed optimization formulation with the help of a new curvature estimation method. The efficacy of our method is demonstrated on both synthetic and real datasets.
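For reference, the classical linear RPCA baseline that this work generalizes can be solved by principal component pursuit; a textbook ADMM sketch (standard formulation, not the paper's manifold method):

```python
import numpy as np

def svt(X, tau):
    """Singular-value thresholding: prox of tau * nuclear norm."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return (U * np.maximum(s - tau, 0.0)) @ Vt

def soft(X, tau):
    """Entrywise soft thresholding: prox of tau * l1 norm."""
    return np.sign(X) * np.maximum(np.abs(X) - tau, 0.0)

def rpca_pcp(M, lam=None, mu=None, n_iter=200):
    """ADMM for  min ||L||_* + lam * ||S||_1  s.t.  M = L + S."""
    m, n = M.shape
    lam = lam if lam is not None else 1.0 / np.sqrt(max(m, n))
    mu = mu if mu is not None else m * n / (4.0 * np.abs(M).sum())
    L = np.zeros_like(M)
    S = np.zeros_like(M)
    Y = np.zeros_like(M)  # scaled dual variable
    for _ in range(n_iter):
        L = svt(M - S + Y / mu, 1.0 / mu)
        S = soft(M - L + Y / mu, lam / mu)
        Y += mu * (M - L - S)
    return L, S
```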
Transformed $\ell_1$ Regularizations for Robust Principal Component Analysis: Toward a Fine-Grained Understanding
Zhao, Kun, Zhang, Haoke, Wang, Jiayi, Lou, Yifei
Robust Principal Component Analysis (RPCA) aims to recover a low-rank structure from noisy, partially observed data that is also corrupted by sparse, potentially large-magnitude outliers. Traditional RPCA models rely on convex relaxations, such as nuclear norm and $\ell_1$ norm, to approximate the rank of a matrix and the $\ell_0$ functional (the number of non-zero elements) of another. In this work, we advocate a nonconvex regularization method, referred to as transformed $\ell_1$ (TL1), to improve both approximations. The rationale is that by varying the internal parameter of TL1, its behavior asymptotically approaches either $\ell_0$ or $\ell_1$. Since the rank is equal to the number of non-zero singular values and the nuclear norm is defined as their sum, applying TL1 to the singular values can approximate either the rank or the nuclear norm, depending on its internal parameter. We conduct a fine-grained theoretical analysis of statistical convergence rates, measured in the Frobenius norm, for both the low-rank and sparse components under general sampling schemes. These rates are comparable to those of the classical RPCA model based on the nuclear norm and $\ell_1$ norm. Moreover, we establish constant-order upper bounds on the estimated rank of the low-rank component and the cardinality of the sparse component in the regime where TL1 behaves like $\ell_0$, assuming that the respective matrices are exactly low-rank and exactly sparse. Extensive numerical experiments on synthetic data and real-world applications demonstrate that the proposed approach achieves higher accuracy than the classic convex model, especially under non-uniform sampling schemes.
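For reference, the standard transformed $\ell_1$ penalty and its two limiting regimes (a well-known closed form; the paper's exact normalization may differ):
$$\rho_a(x) = \frac{(a+1)\,|x|}{a+|x|}, \quad a>0, \qquad \lim_{a\to 0^+}\rho_a(x)=\mathbf{1}\{x\neq 0\}, \qquad \lim_{a\to\infty}\rho_a(x)=|x|.$$
Applied to singular values, $\sum_i \rho_a(\sigma_i(X))$ thus interpolates between $\mathrm{rank}(X)$ (as $a\to 0^+$) and the nuclear norm $\|X\|_*$ (as $a\to\infty$).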
RecIS: Sparse to Dense, A Unified Training Framework for Recommendation Models
Zong, Hua, Zeng, Qingtao, Zhou, Zhengxiong, Han, Zhihua, Yan, Zhensong, Liu, Mingjie, Sun, Hechen, Liu, Jiawei, Hu, Yiwen, Wang, Qi, Xian, YiHan, Guo, Wenjie, Xiang, Houyuan, Zeng, Zhiyuan, Sheng, Xiangrong, Yan, Bencheng, Hu, Nan, Huang, Yuheng, Lian, Jinqing, Xu, Ziru, Zhang, Yan, Huang, Ju, Yang, Siran, Yi, Huimin, Wang, Jiamang, Wang, Pengjie, Zhu, Han, Wu, Jian, Ou, Dan, Xu, Jian, Tang, Haihong, Jiang, Yuning, Zheng, Bo, Qu, Lin
In this paper, we propose RecIS, a unified sparse-dense training framework designed to achieve two primary goals: 1. Unified framework: to create a unified sparse-dense training framework based on the PyTorch ecosystem that meets the training needs of industrial-grade recommendation models integrated with large models. 2. System optimization: to optimize the sparse component, offering efficiency superior to TensorFlow-based recommendation models, while the dense component leverages existing optimization technologies within the PyTorch ecosystem. RecIS is currently used at Alibaba for numerous large-model-enhanced recommendation training tasks, and some traditional sparse models have also begun training with it.
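A toy sketch of the sparse/dense split in plain PyTorch (hypothetical module and names, not RecIS's API): a sparse-gradient embedding table trained with SparseAdam alongside a dense MLP trained with Adam.

```python
import torch
import torch.nn as nn

class SparseDenseRec(nn.Module):
    """Illustration of the sparse/dense split: sparse embedding lookups for
    categorical features, a dense MLP on top. Not RecIS itself."""
    def __init__(self, n_ids=1_000_000, emb_dim=16, n_feats=8):
        super().__init__()
        # sparse component: large embedding table with sparse gradients
        self.emb = nn.Embedding(n_ids, emb_dim, sparse=True)
        # dense component: standard PyTorch layers
        self.mlp = nn.Sequential(
            nn.Linear(n_feats * emb_dim, 64), nn.ReLU(), nn.Linear(64, 1))

    def forward(self, ids):              # ids: (batch, n_feats) int64
        x = self.emb(ids).flatten(1)     # (batch, n_feats * emb_dim)
        return self.mlp(x).squeeze(-1)

model = SparseDenseRec()
# separate optimizers: SparseAdam for the sparse grads, Adam for the dense part
opt_sparse = torch.optim.SparseAdam(model.emb.parameters(), lr=1e-3)
opt_dense = torch.optim.Adam(model.mlp.parameters(), lr=1e-3)
```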
LOST: Low-rank and Sparse Pre-training for Large Language Models
Li, Jiaxi, Yin, Lu, Shen, Li, Xu, Jinjin, Xu, Liwu, Huang, Tianjin, Wang, Wenwu, Liu, Shiwei, Wang, Xilu
While large language models (LLMs) have achieved remarkable performance across a wide range of tasks, their massive scale incurs prohibitive computational and memory costs for pre-training from scratch. Recent studies have investigated low-rank parameterization as a means of reducing model size and training cost. In this context, sparsity is often employed as a complementary technique to recover important information lost in low-rank compression by capturing salient features in the residual space. However, existing approaches typically combine low-rank and sparse components in a simplistic or ad hoc manner, often resulting in undesirable performance degradation compared to full-rank training. In this paper, we propose \textbf{LO}w-rank and \textbf{S}parse pre-\textbf{T}raining (\textbf{LOST}) for LLMs, a novel method that integrates low-rank and sparse structures to enable effective training of LLMs from scratch under strict efficiency constraints. LOST applies singular value decomposition to weight matrices, preserving the dominant low-rank components, while allocating the remaining singular values to construct channel-wise sparse components that complement the expressiveness of low-rank training. We evaluate LOST on LLM pre-training with model sizes ranging from 60M to 7B parameters. Our experiments show that LOST achieves competitive or superior performance compared to full-rank models, while significantly reducing both memory and compute overhead. Code is available at \href{https://github.com/JiaxiLi1/LOST-Low-rank-and-Sparse-Training-for-Large-Language-Models}{LOST Repo}.
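A hedged sketch of the low-rank-plus-channel-sparse split described above (function and variable names are ours; the paper's allocation rule may differ):

```python
import torch

def lowrank_plus_sparse_init(W, rank, n_channels):
    """Keep the top-`rank` singular directions as low-rank factors, then form
    a channel-wise sparse complement from the residual's largest columns.
    Illustrative only; not LOST's exact procedure."""
    U, S, Vh = torch.linalg.svd(W, full_matrices=False)
    B = U[:, :rank] * S[:rank]        # (out, rank) low-rank factor
    A = Vh[:rank]                     # (rank, in) low-rank factor
    residual = W - B @ A              # information lost by truncation
    # channel-wise sparsity: keep only columns with the largest residual energy
    keep = residual.norm(dim=0).topk(n_channels).indices
    S_sparse = torch.zeros_like(W)
    S_sparse[:, keep] = residual[:, keep]
    return B, A, S_sparse             # W ~ B @ A + S_sparse
```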
Representation Decomposition for Learning Similarity and Contrastness Across Modalities for Affective Computing
Tian, Yuanhe, Cheng, Pengsen, Jin, Guoqing, Zhang, Lei, Song, Yan
Multi-modal affective computing aims to automatically recognize and interpret human attitudes from diverse data sources such as images and text, thereby enhancing human-computer interaction and emotion understanding. Existing approaches typically rely on unimodal analysis or straightforward fusion of cross-modal information, which fails to capture the complex and conflicting evidence presented across different modalities. In this paper, we propose a novel LLM-based approach for affective computing that explicitly decomposes visual and textual representations into shared (modality-invariant) and modality-specific components. Specifically, our approach first encodes and aligns input modalities using pre-trained multi-modal encoders, then employs a representation decomposition framework to separate common emotional content from unique cues, and finally integrates these decomposed signals via an attention mechanism to form a dynamic soft prompt for a multi-modal LLM. Extensive experiments on three representative affective computing tasks, namely multi-modal aspect-based sentiment analysis, multi-modal emotion analysis, and hateful meme detection, demonstrate the effectiveness of our approach, which consistently outperforms strong baselines and state-of-the-art models.
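A minimal sketch of the shared/specific decomposition idea (hypothetical layer names; the paper's framework additionally feeds the decomposed signals into a soft prompt for a multi-modal LLM):

```python
import torch
import torch.nn as nn

class SharedPrivateDecomp(nn.Module):
    """Split each modality embedding into a modality-invariant part (shared
    projection) and a modality-specific part (private projections).
    Illustrative only; not the paper's architecture."""
    def __init__(self, dim):
        super().__init__()
        self.shared = nn.Linear(dim, dim)       # shared across modalities
        self.private_img = nn.Linear(dim, dim)  # image-specific
        self.private_txt = nn.Linear(dim, dim)  # text-specific

    def forward(self, img_emb, txt_emb):
        c_img, c_txt = self.shared(img_emb), self.shared(txt_emb)
        s_img, s_txt = self.private_img(img_emb), self.private_txt(txt_emb)
        # an alignment loss pulls the shared parts of the two modalities together
        align = 1 - nn.functional.cosine_similarity(c_img, c_txt, dim=-1).mean()
        return (c_img, s_img), (c_txt, s_txt), align
```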
Unveiling Instruction-Specific Neurons & Experts: An Analytical Framework for LLM's Instruction-Following Capabilities
Zhang, Junyan, Gao, Yubo, Yan, Yibo, Li, Jungang, Hou, Zhaorui, Tao, Sicheng, Liu, Shuliang, Dai, Song, Hei, Yonghua, Li, Junzhuo, Hu, Xuming
The fine-tuning of Large Language Models (LLMs) has significantly advanced their instruction-following capabilities, yet the underlying computational mechanisms driving these improvements remain poorly understood. This study systematically examines how fine-tuning reconfigures LLM computations by isolating and analyzing instruction-specific sparse components, i.e., neurons in dense models and both neurons and experts in Mixture-of-Experts (MoE) architectures. In particular, we introduce HexaInst, a carefully curated and balanced instructional dataset spanning six distinct categories, and propose SPARCOM, a novel analytical framework comprising three key contributions: (1) a method for identifying these sparse components, (2) an evaluation of their functional generality and uniqueness, and (3) a systematic comparison of their alterations. Through experiments, we demonstrate the functional generality and uniqueness of these components and their critical role in instruction execution. By elucidating the relationship between fine-tuning-induced adaptations and sparse computational substrates, this work provides deeper insights into how LLMs internalize instruction-following behavior for the trustworthy-LLM community.
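A hedged sketch of one way such instruction-specific neurons could be scored (illustrative only; not SPARCOM's identification method):

```python
import torch

def instruction_specific_neurons(acts_instr, acts_base, top_k=100):
    """Score each hidden unit by how much its mean activation changes on
    instruction inputs vs. generic inputs; keep the top-k as candidate
    instruction-specific neurons.
    acts_*: (n_examples, n_neurons) hidden activations."""
    score = (acts_instr.mean(dim=0) - acts_base.mean(dim=0)).abs()
    return score.topk(top_k).indices
```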
Transfer Learning for High-dimensional Reduced Rank Time Series Models
Ma, Mingliang, Safikhani, Abolfazl
The objective of transfer learning is to enhance estimation and inference on a target dataset by leveraging knowledge gained from additional sources. Recent studies have explored transfer learning for independent observations in complex, high-dimensional models assuming sparsity, yet research on time series models remains limited. Our focus is on transfer learning for sequences of observations with temporal dependencies and a more intricate model parameter structure. Specifically, we investigate the vector autoregressive (VAR) model, a widely recognized model for time series data, in which the transition matrix can be decomposed into the sum of a sparse matrix and a low-rank one. We propose a new transfer learning algorithm tailored to estimating high-dimensional VAR models characterized by low-rank and sparse structures. Additionally, we present a novel approach for selecting informative observations from auxiliary datasets. Theoretical guarantees are established, encompassing model parameter consistency, informative set selection, and the asymptotic distribution of the estimators under mild conditions; the latter facilitates the construction of entry-wise confidence intervals for model parameters. Finally, we demonstrate the empirical efficacy of our methodologies through both simulated and real-world datasets.
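A sketch of the underlying single-task estimation problem (proximal gradient for a sparse-plus-low-rank VAR(1); the paper's transfer-learning algorithm and informative-set selection are not shown):

```python
import numpy as np

def soft(X, t):   # prox of t * l1 norm
    return np.sign(X) * np.maximum(np.abs(X) - t, 0.0)

def svt(X, t):    # prox of t * nuclear norm
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return (U * np.maximum(s - t, 0.0)) @ Vt

def var_sparse_lowrank(X, lam_s, lam_l, step=None, n_iter=300):
    """Penalized least squares for a VAR(1) with transition matrix A = S + L
    (sparse plus low-rank), via alternating proximal gradient steps.
    X: (T, p) observed series. Illustrative only."""
    Y, Z = X[1:], X[:-1]                 # regress x_t on x_{t-1}
    T = Y.shape[0]
    step = step or 1.0 / np.linalg.norm(Z.T @ Z / T, 2)
    p = X.shape[1]
    S = np.zeros((p, p))
    L = np.zeros((p, p))
    for _ in range(n_iter):
        G = ((S + L) @ Z.T @ Z - Y.T @ Z) / T   # gradient of the LS loss
        S = soft(S - step * G, step * lam_s)
        L = svt(L - step * G, step * lam_l)
    return S, L
```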