AITopics

doi: 10.22044/jadm.2023.13666.2482

2502.01091

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Asia > Middle East > Iran > Tehran Province > Tehran (0.05)
North America > United States > Michigan > Washtenaw County > Ann Arbor (0.04)
(7 more...)

Genre: Research Report > New Finding (0.68)

Industry: Information Technology > Services (0.66)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Information Extraction (1.00)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
(2 more...)

Piotrowski, Mateusz, Riechers, Paul M., Filan, Daniel, Shai, Adam S.

Constrained belief updates explain geometric structures in transformer representations

arXiv.org Artificial IntelligenceFeb-3-2025

What computational structures emerge in transformers trained on next-token prediction? In this work, we provide evidence that transformers implement constrained Bayesian belief updating -- a parallelized version of partial Bayesian inference shaped by architectural constraints. To do this, we integrate the model-agnostic theory of optimal prediction with mechanistic interpretability to analyze transformers trained on a tractable family of hidden Markov models that generate rich geometric patterns in neural activations. We find that attention heads carry out an algorithm with a natural interpretation in the probability simplex, and create representations with distinctive geometric structure. We show how both the algorithmic behavior and the underlying geometry of these representations can be theoretically predicted in detail -- including the attention pattern, OV-vectors, and embedding vectors -- by modifying the equations for optimal future token predictions to account for the architectural constraints of attention. Our approach provides a principled lens on how gradient descent resolves the tension between optimal prediction and architectural design.

artificial intelligence, machine learning, representation, (16 more...)

2502.01954

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > California > Alameda County > Berkeley (0.14)

Genre: Research Report (0.83)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Troiani, Emanuele, Cui, Hugo, Dandi, Yatin, Krzakala, Florent, Zdeborová, Lenka

Fundamental limits of learning in sequence multi-index models and deep attention networks: High-dimensional asymptotics and sharp thresholds

arXiv.org Artificial IntelligenceFeb-2-2025

In this manuscript, we study the learning of deep attention neural networks, defined as the composition of multiple self-attention layers, with tied and low-rank weights. We first establish a mapping of such models to sequence multi-index models, a generalization of the widely studied multi-index model to sequential covariates, for which we establish a number of general results. In the context of Bayesian-optimal learning, in the limit of large dimension $D$ and commensurably large number of samples $N$, we derive a sharp asymptotic characterization of the optimal performance as well as the performance of the best-known polynomial-time algorithm for this setting --namely approximate message-passing--, and characterize sharp thresholds on the minimal sample complexity required for better-than-random prediction performance. Our analysis uncovers, in particular, how the different layers are learned sequentially. Finally, we discuss how this sequential learning can also be observed in a realistic setup.

arxiv preprint arxiv, machine learning, natural language, (17 more...)

2502.00901

Country: North America (0.28)

Genre: Research Report (1.00)

Industry: Energy > Oil & Gas (0.45)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.92)
(2 more...)

arXiv.org Artificial IntelligenceFeb-2-2025

Blink of an eye: a simple theory for feature localization in generative models

Li, Marvin, Karan, Aayush, Chen, Sitan

Large language models (LLMs) can exhibit undesirable and unexpected behavior in the blink of an eye. In a recent Anthropic demo, Claude switched from coding to Googling pictures of Yellowstone, and these sudden shifts in behavior have also been observed in reasoning patterns and jailbreaks. This phenomenon is not unique to autoregressive models: in diffusion models, key features of the final output are decided in narrow ``critical windows'' of the generation process. In this work we develop a simple, unifying theory to explain this phenomenon. We show that it emerges generically as the generation process localizes to a sub-population of the distribution it models. While critical windows have been studied at length in diffusion models, existing theory heavily relies on strong distributional assumptions and the particulars of Gaussian diffusion. In contrast to existing work our theory (1) applies to autoregressive and diffusion models; (2) makes no distributional assumptions; (3) quantitatively improves previous bounds even when specialized to diffusions; and (4) requires basic tools and no stochastic calculus or statistical physics-based machinery. We also identify an intriguing connection to the all-or-nothing phenomenon from statistical inference. Finally, we validate our predictions empirically for LLMs and find that critical windows often coincide with failures in problem solving for various math and reasoning benchmarks.

large language model, machine learning, natural language, (20 more...)

2502.00921

Country:

North America > United States (0.14)
Asia > China (0.04)
Asia > Japan (0.04)
(5 more...)

Genre: Research Report (0.64)

Industry: Law Enforcement & Public Safety > Crime Prevention & Enforcement (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.90)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

arXiv.org Artificial IntelligenceFeb-2-2025

Disentangling Length Bias In Preference Learning Via Response-Conditioned Modeling

Cai, Jianfeng, Zhu, Jinhua, Sun, Ruopei, Wang, Yue, Li, Li, Zhou, Wengang, Li, Houqiang

Reinforcement Learning from Human Feedback (RLHF) has achieved considerable success in aligning large language models (LLMs) by modeling human preferences with a learnable reward model and employing a reinforcement learning algorithm to maximize the reward model's scores. However, these reward models are susceptible to exploitation through various superficial confounding factors, with length bias emerging as a particularly significant concern. Moreover, while the pronounced impact of length bias on preference modeling suggests that LLMs possess an inherent sensitivity to length perception, our preliminary investigations reveal that fine-tuned LLMs consistently struggle to adhere to explicit length instructions. To address these two limitations, we propose a novel framework wherein the reward model explicitly differentiates between human semantic preferences and response length requirements. Specifically, we introduce a Response-conditioned Bradley-Terry (Rc-BT) model that enhances the reward model's capability in length bias mitigating and length instruction following, through training on our augmented dataset. Furthermore, we propose the Rc-DPO algorithm to leverage the Rc-BT model for direct policy optimization (DPO) of LLMs, simultaneously mitigating length bias and promoting adherence to length instructions. Extensive evaluations demonstrate that our approach substantially improves both preference modeling and length instruction compliance, with its effectiveness validated across various foundational models and preference datasets.

large language model, length instruction, machine learning, (16 more...)

2502.00814

Country:

South America > Colombia > Meta Department > Villavicencio (0.04)
Oceania > Australia (0.04)
Europe > Italy > Calabria > Catanzaro Province > Catanzaro (0.04)
(4 more...)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Mildner, Terje, Hamelijnck, Oliver, Giampouras, Paris, Damoulas, Theodoros

Federated Generalised Variational Inference: A Robust Probabilistic Federated Learning Framework

arXiv.org Machine LearningFeb-2-2025

We introduce FedGVI, a probabilistic Federated Learning (FL) framework that is provably robust to both prior and likelihood misspecification. FedGVI addresses limitations in both frequentist and Bayesian FL by providing unbiased predictions under model misspecification, with calibrated uncertainty quantification. Our approach generalises previous FL approaches, specifically Partitioned Variational Inference (Ashman et al., 2022), by allowing robust and conjugate updates, decreasing computational complexity at the clients. We offer theoretical analysis in terms of fixed-point convergence, optimality of the cavity distribution, and provable robustness. Additionally, we empirically demonstrate the effectiveness of FedGVI in terms of improved robustness and predictive performance on multiple synthetic and real world classification data sets.

artificial intelligence, machine learning, posterior, (13 more...)

2502.00846

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
Europe > United Kingdom > England > West Midlands > Coventry (0.04)
Asia > China (0.04)

Genre: Research Report > New Finding (0.46)

Industry: Information Technology > Security & Privacy (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.92)

arXiv.org Machine LearningFeb-1-2025

LLM Safety Alignment is Divergence Estimation in Disguise

Haldar, Rajdeep, Wang, Ziyi, Song, Qifan, Lin, Guang, Xing, Yue

We propose a theoretical framework demonstrating that popular Large Language Model (LLM) alignment methods, including Reinforcement Learning from Human Feedback (RLHF) and alternatives, fundamentally function as divergence estimators between aligned (preferred or safe) and unaligned (less-preferred or harmful) distributions. This explains the separation phenomenon between safe and harmful prompts in the model hidden representation after alignment. Inspired by the theoretical results, we identify that some alignment methods are better than others in terms of separation and, introduce a new method, KLDO, and further demonstrate the implication of our theories. We advocate for compliance-refusal datasets over preference datasets to enhance safety alignment, supported by both theoretical reasoning and empirical evidence. Additionally, to quantify safety separation, we leverage a distance metric in the representation space and statistically validate its efficacy as a statistical significant indicator of LLM resilience against jailbreak attacks.

large language model, machine learning, natural language, (15 more...)

2502.00657

Country:

North America > United States > Michigan (0.04)
North America > United States > California > San Diego County > San Diego (0.04)
Europe (0.04)

Genre: Research Report > New Finding (0.46)

Industry: Information Technology > Security & Privacy (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)

arXiv.org Machine LearningFeb-1-2025

Stochastic Linear Bandits with Latent Heterogeneity

Chen, Elynn, Chen, Xi, Jing, Wenbo, Liu, Xiao

This paper addresses the critical challenge of latent heterogeneity in online decision-making, where individual responses to business actions vary due to unobserved characteristics. While existing approaches in data-driven decision-making have focused on observable heterogeneity through contextual features, they fall short when heterogeneity stems from unobservable factors such as lifestyle preferences and personal experiences. We propose a novel latent heterogeneous bandit framework that explicitly models this unobserved heterogeneity in customer responses, with promotion targeting as our primary example. Our methodology introduces an innovative algorithm that simultaneously learns latent group memberships and group-specific reward functions. Through theoretical analysis and empirical validation using data from a mobile commerce platform, we establish high-probability bounds for parameter estimation, convergence rates for group classification, and comprehensive regret bounds. Notably, our theoretical analysis reveals two distinct types of regret measures: a ``strong regret'' against an oracle with perfect knowledge of customer memberships, which remains non-sub-linear due to inherent classification uncertainty, and a ``regular regret'' against an oracle aware only of deterministic components, for which our algorithm achieves a sub-linear rate that is minimax optimal in horizon length and dimension. We further demonstrate that existing bandit algorithms ignoring latent heterogeneity incur constant average regret that accumulates linearly over time. Our framework provides practitioners with new tools for decision-making under latent heterogeneity and extends to various business applications, including personalized pricing, resource allocation, and inventory management.

data mining, heterogeneity, machine learning, (18 more...)

2502.00423

Country:

Asia > Middle East > Jordan (0.04)
North America > United States > New York (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report > New Finding (0.67)

Industry: Information Technology (0.34)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.45)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.45)

arXiv.org Artificial IntelligenceJan-31-2025

Statistical Physics of Deep Neural Networks: Generalization Capability, Beyond the Infinite Width, and Feature Learning

Ariosto, Sebastiano

Deep Neural Networks (DNNs) excel at many tasks, often rivaling or surpassing human performance. Yet their internal processes remain elusive, frequently described as "black boxes." While performance can be refined experimentally, achieving a fundamental grasp of their inner workings is still a challenge. Statistical Mechanics has long tackled computational problems, and this thesis applies physics-based insights to understand DNNs via three complementary approaches. First, by averaging over data, we derive an asymptotic bound on generalization that depends solely on the size of the last layer, rather than on the total number of parameters -- revealing how deep architectures process information differently across layers. Second, adopting a data-dependent viewpoint, we explore a finite-width thermodynamic limit beyond the infinite-width regime. This leads to: (i) a closed-form expression for the generalization error in a finite-width one-hidden-layer network (regression task); (ii) an approximate partition function for deeper architectures; and (iii) a link between deep networks in this thermodynamic limit and Student's t-processes. Finally, from a task-explicit perspective, we present a preliminary analysis of how DNNs interact with a controlled dataset, investigating whether they truly internalize its structure -- collapsing to the teacher -- or merely memorize it. By understanding when a network must learn data structure rather than just memorize, it sheds light on fostering meaningful internal representations. In essence, this thesis leverages the synergy between Statistical Physics and Machine Learning to illuminate the inner behavior of DNNs.

deep convolutional generative adversarial network, deep learning, hierarchical cluster-based deep neural network, (17 more...)

2501.19281

Country:

North America > United States > New York > New York County > New York City (0.14)
North America > Canada > Ontario > Toronto (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(15 more...)

Genre:

Research Report > New Finding (1.00)
Overview (1.00)

Industry:

Health & Medicine > Therapeutic Area (1.00)
Information Technology (0.92)
Law Enforcement & Public Safety > Fraud (0.67)
(5 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Chang, Yingshan, Bisk, Yonatan

Model Successor Functions

arXiv.org Machine LearningJan-31-2025

The notion of generalization has moved away from the classical one defined in statistical learning theory towards an emphasis on out-of-domain generalization (OODG). Recently, there is a growing focus on inductive generalization, where a progression of difficulty implicitly governs the direction of domain shifts. In inductive generalization, it is often assumed that the training data lie in the easier side, while the testing data lie in the harder side. The challenge is that training data are always finite, but a learner is expected to infer an inductive principle that could be applied in an unbounded manner. This emerging regime has appeared in the literature under different names, such as length/logical/algorithmic extrapolation, but a formal definition is lacking. This work provides such a formalization that centers on the concept of model successors. Then we outline directions to adapt well-established techniques towards the learning of model successors. This work calls for restructuring of the research discussion around inductive generalization from fragmented task-centric communities to a more unified effort, focused on universal properties of learning and computation.

artificial intelligence, generalization, machine learning, (16 more...)

2502.00197

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > Massachusetts > Suffolk County > Boston (0.04)
(12 more...)

Genre: Research Report (0.50)

Industry:

Health & Medicine (0.67)
Education > Educational Setting (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.86)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.67)
(2 more...)