AITopics | latent factor

Collaborating Authors

latent factor

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Unsupervised Identification and Removal of Spurious Correlations During Fine-Tuning

Gilligan-Lee, Ciarán M., Egan, Joseph, Zhu, Yuchen, O'Riordan, Michael

arXiv.org Machine Learning, May-28-2026

Fine-tuning a pretrained language model on a curated dataset can produce spurious correlations between the fine-tuning task and unintended latent factors -- such as misaligned personas or political slant -- that the curation procedure has entangled with the task. The model can latch onto these spurious correlations, leading to bias and reduced out-of-distribution generalisation. We prove that under reasonable assumptions on task complexity and the spurious correlation, such latent factors can be identified, without supervision, from the weights of a naive LoRA fine-tune. Existing approaches to removing bias, such as activation steering, remove identified factors from residual-stream activations, either at inference or during training. We argue, however, that the goal should be to remove the spurious correlation, not the latent factor itself, as the pretrained model may rely on it for genuine task signal. To enable this, we propose GRASP, GRadient projection of Associated Spurious Patterns, which prevents the model from acquiring new reliance on the identified latent factor while preserving any pretrained content along it. We validate on three fine-tuning tasks. The first two involve emergent misalignment, where fine-tuning on a narrow task -- in our case, writing insecure code and giving bad medical advice -- leads to misaligned responses on unrelated topics. Here our method completely removes misalignment in the insecure code case and reduces it by ~5x in the bad medical advice case, beating all baselines in the trade-off between misalignment reduction and task preservation. The last is a novel political-bias experiment, where fine-tuning on right-skewed Reddit financial-advice data causes political-lean drift on unrelated topics. Here our method reduces drift by more than half, while improving financial task performance, beating all baselines.
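
The abstract above describes projecting gradients so that training cannot build new reliance on an identified latent direction. The listing does not give the GRASP update itself, so the following is only a minimal sketch of the generic operation such a method could build on: an orthogonal projection that removes a gradient's component along one direction. The function name `project_out` and the toy vectors are hypothetical, and the sketch says nothing about how the direction is identified or where in the model the projection is applied.

```python
import math


def project_out(grad, direction):
    """Return `grad` with its component along `direction` removed.

    Generic orthogonal projection (an illustration, not the paper's
    algorithm): g - (g . u) u, where u is `direction` scaled to unit length.
    """
    norm = math.sqrt(sum(d * d for d in direction))
    if norm == 0.0:
        raise ValueError("direction must be non-zero")
    unit = [d / norm for d in direction]
    coeff = sum(g * u for g, u in zip(grad, unit))
    return [g - coeff * u for g, u in zip(grad, unit)]


# Toy example: a 3-dimensional gradient and a hypothetical spurious direction.
grad = [3.0, 4.0, 0.0]
spurious_direction = [0.0, 2.0, 0.0]
projected = project_out(grad, spurious_direction)

# The projected gradient has no component left along the spurious direction.
overlap = sum(p * s for p, s in zip(projected, spurious_direction))
print(projected, overlap)
```

Because only the update is projected, whatever the weights already encode along that direction is left untouched, which matches the abstract's stated aim of preserving pretrained content while blocking new reliance.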

artificial intelligence, machine learning, misalignment, (17 more...)

arXiv.org Machine Learning

2605.27676

Genre: Research Report (0.64)

Industry:

Health & Medicine (0.87)
Law (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Learning Nonlinear Factor Models with Unknown Monotone Links from Incomplete and Noisy Data

Chao, Yutong, Gökhan, Resat, Etesami, Jalal, Habibnia, Ali

arXiv.org Machine Learning, May-27-2026

We study a nonlinear factor model in which observed responses depend on low-rank latent factors through an unknown monotone link function. This setting is challenging and largely underexplored due to severe nonconvexity and identifiability issues. The link function is assumed to lie in a reproducing kernel Hilbert space (RKHS), enabling flexible nonparametric modeling while preserving identifiability. We formulate the problem as the joint recovery of the low-rank factors, loadings, and the nonlinear link function from possibly incomplete and noisy observations and propose a projected block coordinate descent (BCD) algorithm with explicit regularization to address scale and rotational ambiguities. Under mild incoherence of factors and standard sampling conditions, we establish convergence guarantees in both noiseless and noisy regimes, along with sublinear regret bounds for the link-function updates. Our results extend classical linear factor models to a broad nonlinear regime and provide a principled framework for learning nonlinear latent structures. We evaluate the proposed approach using controlled synthetic experiments, indicating promising performance.
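
To make the data-generating setup in the abstract above concrete, here is a small simulation sketch: low-rank latent factors and loadings, a monotone link applied to their inner product, additive noise, and entries observed only with some probability. Everything specific is an assumption made for illustration: the link is fixed to `tanh` (in the paper it is unknown and estimated), the factors are standard Gaussian, and the name `simulate` and its parameters are hypothetical. It does not implement the paper's block coordinate descent estimator.

```python
import math
import random


def simulate(n_rows, n_cols, rank, observe_prob, noise_std, seed=0):
    """Draw incomplete, noisy observations y_ij = link(<f_i, l_j>) + noise."""
    rng = random.Random(seed)
    factors = [[rng.gauss(0.0, 1.0) for _ in range(rank)] for _ in range(n_rows)]
    loadings = [[rng.gauss(0.0, 1.0) for _ in range(rank)] for _ in range(n_cols)]
    link = math.tanh  # assumed monotone link, chosen only for this sketch
    observed = {}
    for i in range(n_rows):
        for j in range(n_cols):
            if rng.random() < observe_prob:  # entry (i, j) is observed
                signal = sum(a * b for a, b in zip(factors[i], loadings[j]))
                observed[(i, j)] = link(signal) + rng.gauss(0.0, noise_std)
    return factors, loadings, observed


factors, loadings, observed = simulate(
    n_rows=6, n_cols=5, rank=2, observe_prob=0.6, noise_std=0.1, seed=1
)
print(len(observed), "of", 6 * 5, "entries observed")
```

The scale and rotational ambiguities mentioned in the abstract are visible here: rescaling the factors while inversely rescaling the loadings, or rotating both by the same orthogonal matrix, leaves every inner product and hence every observation unchanged.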

artificial intelligence, link function, machine learning, (18 more...)

arXiv.org Machine Learning

2605.26271

Country: North America > United States (0.28)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Distributed Flexible Nonlinear Tensor Factorization

Shandian Zhe, Kai Zhang, Pengyuan Wang, Kuang-chih Lee, Zenglin Xu, Yuan Qi, Zoubin Ghahramani

Neural Information Processing Systems, May-1-2026, 05:55:37 GMT

Tensor factorization is a powerful tool to analyse multi-way data. Recently proposed nonlinear factorization methods, although capable of capturing complex relationships, are computationally quite expensive and may suffer a severe learning bias in cases of extreme data sparsity. Therefore, we propose a distributed, flexible nonlinear tensor factorization model, which avoids the expensive computations and structural restrictions of the Kronecker product in the existing TGP formulations, allowing an arbitrary subset of tensorial entries to be selected for training. Meanwhile, we derive a tractable and tight variational evidence lower bound (ELBO) that enables highly decoupled, parallel computations and high-quality inference. Based on the new bound, we develop a distributed, key-value-free inference algorithm in the MapReduce framework, which can fully exploit the memory cache mechanism in fast MapReduce systems such as Spark. Experiments demonstrate the advantages of our method over several state-of-the-art approaches, in terms of both predictive performance and computational efficiency.

artificial intelligence, machine learning, tensor, (18 more...)

Neural Information Processing Systems

Country:

North America > United States (0.28)
Europe (0.28)

Genre: Research Report > Promising Solution (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Exploring Behavior-Relevant and Disentangled Neural Dynamics with Generative Diffusion Models

Neural Information Processing Systems, Apr-26-2026, 22:47:37 GMT

Understanding the neural basis of behavior is a fundamental goal in neuroscience. Current research in large-scale neuro-behavioral data analysis often relies on decoding models, which quantify behavioral information in neural data but lack details on behavior encoding. This raises an intriguing scientific question: how can we enable in-depth exploration of neural representations in behavioral tasks, revealing interpretable neural dynamics associated with behaviors? Addressing this question is challenging due to the varied behavioral encoding across different brain regions and mixed selectivity at the population level. To tackle this challenge, our approach, named BeNeDiff, first identifies a fine-grained and disentangled neural subspace using a behavior-informed latent variable model. It then employs state-of-the-art generative diffusion models to synthesize behavior videos that interpret the neural dynamics of each latent factor.

artificial intelligence, machine learning, proceedings, (6 more...)

Neural Information Processing Systems

Industry: Health & Medicine > Therapeutic Area > Neurology (0.60)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.83)

Fine-tuning Factor Augmented Neural Lasso for Heterogeneous Environments

Chai, Jinhang, Fan, Jianqing, Gao, Cheng, Yin, Qishuo

arXiv.org Machine Learning, Apr-15-2026

Fine-tuning is a widely used strategy for adapting pre-trained models to new tasks, yet its methodology and theoretical properties in high-dimensional nonparametric settings with variable selection have not yet been developed. This paper introduces the fine-tuning factor augmented neural Lasso (FAN-Lasso), a transfer learning framework for high-dimensional nonparametric regression with variable selection that simultaneously handles covariate and posterior shifts. We use a low-rank factor structure to manage high-dimensional dependent covariates and propose a novel residual fine-tuning decomposition in which the target function is expressed as a transformation of a frozen source function and other variables to achieve transfer learning and nonparametric variable selection. This augmented feature from the source predictor allows for the transfer of knowledge to the target domain and reduces model complexity there. We derive minimax-optimal excess risk bounds for the fine-tuning FAN-Lasso, characterizing the precise conditions, in terms of relative sample sizes and function complexities, under which fine-tuning yields statistical acceleration over single-task learning. The proposed framework also provides a theoretical perspective on parameter-efficient fine-tuning methods. Extensive numerical experiments across diverse covariate- and posterior-shift scenarios demonstrate that the fine-tuning FAN-Lasso consistently outperforms standard baselines and achieves near-oracle performance even under severe target sample size constraints, empirically validating the derived rates.

artificial intelligence, machine learning, natural language, (19 more...)

arXiv.org Machine Learning

2604.12288

Country:

North America > United States > New York (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.64)

Industry: Law Enforcement & Public Safety > Crime Prevention & Enforcement (0.92)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.47)

Integrative Learning of Dynamically Evolving Multiplex Graphs and Nodal Attributes Using Neural Network Gaussian Processes with an Application to Dynamic Terrorism Graphs

Rodriguez-Acosta, Jose, Guha, Sharmistha, Patel, Lekha, Shuler, Kurtis

arXiv.org Machine Learning, Mar-24-2026

Exploring the dynamic co-evolution of multiplex graphs and nodal attributes is a compelling question in criminal and terrorism networks. This article is motivated by the study of dynamically evolving interactions among prominent terrorist organizations, considering various organizational attributes such as size, ideology, leadership, and operational capacity. Statistically principled integration of multiplex graphs with nodal attributes is challenging due to the need to leverage shared information within and across layers, account for uncertainty in predicting unobserved links, and capture temporal evolution of node attributes. These difficulties increase when layers are partially observed, as in terrorism networks where connections are deliberately hidden to obscure key relationships. To address these challenges, we present a principled methodological framework to integrate the multiplex graph layers and nodal attributes. The approach employs time-varying stochastic latent factor models, leveraging shared latent factors to capture graph structure and its co-evolution with node attributes. Latent factors are modeled using Gaussian processes with an infinitely wide deep neural network-based covariance function, termed neural network Gaussian processes (NN-GP). The NN-GP framework on latent factors exploits the predictive power of Bayesian deep neural network architecture while propagating uncertainty for reliability. Simulation studies highlight superior performance of the proposed approach in achieving inferential objectives. The approach, termed the dynamic joint learner, enables predictive inference (with uncertainty) of diverse unobserved dynamic relationships among prominent terrorist organizations and their organization-specific attributes, as well as clustering behavior in terms of friend-and-foe relationships, which could be informative in counter-terrorism research.

artificial intelligence, machine learning, nodal, (19 more...)

arXiv.org Machine Learning

2603.20962

Country:

South America > Colombia (0.28)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
North America > United States > Texas (0.04)
(13 more...)

Genre: Research Report > New Finding (0.67)

Industry: Law Enforcement & Public Safety > Terrorism (1.00)

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.87)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.67)

InfoGAN: Interpretable Representation Learning by Information Maximizing Generative Adversarial Nets

Xi Chen, Yan Duan, Rein Houthooft, John Schulman, Ilya Sutskever, Pieter Abbeel

Neural Information Processing Systems, Mar-23-2026, 13:39:43 GMT

This paper describes InfoGAN, an information-theoretic extension to the Generative Adversarial Network that is able to learn disentangled representations in a completely unsupervised manner. InfoGAN is a generative adversarial network that also maximizes the mutual information between a small subset of the latent variables and the observation. We derive a lower bound of the mutual information objective that can be optimized efficiently. Specifically, InfoGAN successfully disentangles writing styles from digit shapes on the MNIST dataset, pose from lighting of 3D rendered images, and background digits from the central digit on the SVHN dataset. It also discovers visual concepts that include hair styles, presence/absence of eyeglasses, and emotions on the CelebA face dataset. Experiments show that InfoGAN learns interpretable representations that are competitive with representations learned by existing supervised methods. For an up-to-date version of this paper, please see https://arxiv.org/abs/1606.03657.
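
The lower bound mentioned in the abstract above can be written out explicitly. The notation is not part of this listing and is introduced here as an assumption, following the arXiv version of the paper: $G$ is the generator, $D$ the discriminator, $Q$ an auxiliary distribution approximating the posterior over the latent code $c$, $z$ the remaining noise, $V(D, G)$ the standard GAN objective, and $\lambda$ a weighting hyperparameter.

```latex
\begin{aligned}
I\bigl(c;\, G(z, c)\bigr)
  &\;\ge\; \mathbb{E}_{c \sim P(c),\; x \sim G(z, c)}\bigl[\log Q(c \mid x)\bigr] + H(c)
  \;=\; L_I(G, Q), \\
\min_{G,\, Q}\; \max_{D}\; V_{\text{InfoGAN}}(D, G, Q)
  &\;=\; V(D, G) - \lambda\, L_I(G, Q).
\end{aligned}
```

Since $H(c)$ is constant when the code distribution is fixed, maximizing $L_I$ reduces to training $Q$ to recover the code $c$ from the generated sample $x$, which is what makes the bound efficient to optimize.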

artificial intelligence, machine learning, representation, (16 more...)

Neural Information Processing Systems

Country: Europe (0.28)

Genre: Research Report > New Finding (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
