AITopics | Mathematical & Statistical Methods

Mathematical & Statistical Methods

Correcting the Mythos of KL-Regularization: Direct Alignment without Overoptimization via Chi-squared Preference Optimization

Huang, Audrey, Zhan, Wenhao, Xie, Tengyang, Lee, Jason D., Sun, Wen, Krishnamurthy, Akshay, Foster, Dylan J.

arXiv.org Artificial Intelligence, Jul-18-2024

Language model alignment methods, such as reinforcement learning from human feedback (RLHF), have led to impressive advances in language model capabilities, but existing techniques are limited by a widely observed phenomenon known as overoptimization, where the quality of the language model plateaus or degrades over the course of the alignment process. Overoptimization is often attributed to overfitting to an inaccurate reward model, and while it can be mitigated through online data collection, this is infeasible in many settings. This raises a fundamental question: Do existing offline alignment algorithms make the most of the data they have, or can their sample-efficiency be improved further? We address this question with a new algorithm for offline alignment, $\chi^2$-Preference Optimization ($\chi$PO). $\chi$PO is a one-line change to Direct Preference Optimization (DPO; Rafailov et al., 2023), which only involves modifying the logarithmic link function in the DPO objective. Despite this minimal change, $\chi$PO implicitly implements the principle of pessimism in the face of uncertainty via regularization with the $\chi^2$-divergence -- which quantifies uncertainty more effectively than KL-regularization -- and provably alleviates overoptimization, achieving sample-complexity guarantees based on single-policy concentrability -- the gold standard in offline reinforcement learning. $\chi$PO's simplicity and strong guarantees make it the first practical and general-purpose offline alignment algorithm that is provably robust to overoptimization.
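
The "one-line change" described in the abstract can be illustrated with a minimal sketch. The abstract does not state the new link function; the form `z + log z` (with `z` the policy-to-reference probability ratio) used below is recalled from the paper and should be treated as an assumption, as should all function names. Any clipping of the margin that the full method may use is omitted here.

```python
import math

def log_sigmoid(x):
    # Numerically stable log(sigmoid(x)).
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))

def dpo_link(log_ratio):
    # DPO: logarithmic link, phi(z) = log z, with z = pi / pi_ref.
    return log_ratio

def chi_link(log_ratio):
    # Assumed chi-squared-style link: phi(z) = z + log z.
    return math.exp(log_ratio) + log_ratio

def preference_loss(logp_w, logp_ref_w, logp_l, logp_ref_l, beta, link):
    # Negative log-likelihood that the preferred response (w) beats the
    # rejected one (l), with the implicit reward beta * link(log pi/pi_ref).
    margin = beta * (link(logp_w - logp_ref_w) - link(logp_l - logp_ref_l))
    return -log_sigmoid(margin)

# Same log-probabilities, two different links.
print(preference_loss(-1.0, -2.0, -2.0, -2.0, 0.1, dpo_link))
print(preference_loss(-1.0, -2.0, -2.0, -2.0, 0.1, chi_link))
```

Because the assumed link grows linearly in the ratio `z` rather than logarithmically, moving far from the reference policy is penalized more heavily than under the logarithmic link, which is one way to read the abstract's claim about pessimism.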

chi-squared preference optimization, kl-regularization, overparameterization, (3 more...)

arXiv.org Artificial Intelligence

2407.13399

Genre: Research Report (0.69)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.53)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.44)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.40)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.40)

Social learning with complex contagion

Chiba-Okabe, Hiroaki, Plotkin, Joshua B.

arXiv.org Artificial Intelligence, Jul-16-2024

We introduce a mathematical model that combines the concepts of complex contagion with payoff-biased imitation, to describe how social behaviors spread through a population. Traditional models of social learning by imitation are based on simple contagion -- where an individual may imitate a more successful neighbor following a single interaction. Our framework generalizes this process to incorporate complex contagion, which requires multiple exposures before an individual considers adopting a different behavior. We formulate this as a discrete time and state stochastic process in a finite population, and we derive its continuum limit as an ordinary differential equation that generalizes the replicator equation, the most widely used dynamical model in evolutionary game theory. When applied to linear frequency-dependent games, our social learning with complex contagion produces qualitatively different outcomes than traditional imitation dynamics: it can shift the Prisoner's Dilemma from a unique all-defector equilibrium to either a stable mixture of cooperators and defectors in the population, or a bistable system; it changes the Snowdrift game from a single to a bistable equilibrium; and it can alter the Coordination game from bistability at the boundaries to two internal equilibria. The long-term outcome depends on the balance between the complexity of the contagion process and the strength of selection that biases imitation towards more successful types. Our analysis intercalates the fields of evolutionary game theory with complex contagions, and it provides a synthetic framework that describes more realistic forms of behavioral change in social systems.
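
For reference, the baseline that the abstract generalizes can be sketched as follows. This is the classical two-strategy replicator equation (simple contagion), not the paper's complex-contagion model; the payoff values and function names are illustrative assumptions.

```python
def replicator_step(x, payoff, dt):
    # x is the cooperator fraction; payoff = ((a, b), (c, d)) where row 0
    # is a cooperator's payoff against (cooperator, defector) and row 1 is
    # a defector's payoff against (cooperator, defector).
    (a, b), (c, d) = payoff
    fit_c = a * x + b * (1.0 - x)
    fit_d = c * x + d * (1.0 - x)
    # Classical replicator dynamics: dx/dt = x (1 - x) (fit_c - fit_d).
    return x + dt * x * (1.0 - x) * (fit_c - fit_d)

def simulate(x0, payoff, dt=0.01, steps=5000):
    # Forward-Euler integration of the replicator equation.
    x = x0
    for _ in range(steps):
        x = replicator_step(x, payoff, dt)
    return x

prisoners_dilemma = ((3.0, 0.0), (5.0, 1.0))  # illustrative payoffs
snowdrift = ((3.0, 1.0), (5.0, 0.0))          # illustrative payoffs

print(simulate(0.5, prisoners_dilemma))  # cooperators die out
print(simulate(0.9, snowdrift))          # settles at a single mixed state
```

Under these classical dynamics the Prisoner's Dilemma ends in all defectors and the Snowdrift game has a single interior equilibrium, which are the baseline outcomes the abstract says complex contagion can change.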

complex contagion, contagion, payoff-biased imitation, (14 more...)

arXiv.org Artificial Intelligence

2406.14922

Country:

North America > United States > Illinois > Cook County > Chicago (0.04)
North America > United States > Pennsylvania (0.04)
Europe > France > Auvergne-Rhône-Alpes > Lyon > Lyon (0.04)

Genre: Research Report (1.00)

Industry:

Leisure & Entertainment > Games (1.00)
Education > Curriculum (0.81)

Technology:

Information Technology > Game Theory (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.48)

Learning to Represent Surroundings, Anticipate Motion and Take Informed Actions in Unstructured Environments

Zhi, Weiming

arXiv.org Artificial Intelligence, Jul-14-2024

Contemporary robots have become exceptionally skilled at achieving specific tasks in structured environments. However, they often fail when faced with the limitless permutations of real-world unstructured environments. This motivates robotics methods which learn from experience, rather than follow a pre-defined set of rules. In this thesis, we present a range of learning-based methods aimed at enabling robots, operating in dynamic and unstructured environments, to better understand their surroundings, anticipate the actions of others, and take informed actions accordingly. In the first part of the thesis, we investigate methods which leverage learning to represent the structure and motion in a robot's operating environment, in a continuous manner.

computer vision and pattern recognition, sequential quadratic programming, stochastic process representation, (16 more...)

arXiv.org Artificial Intelligence

2407.10383

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.13)
Europe > Germany > Baden-Württemberg > Freiburg (0.04)
Asia > Middle East > Jordan (0.04)
(5 more...)

Genre:

Summary/Review (1.00)
Research Report > New Finding (1.00)
Instructional Material > Course Syllabus & Notes (0.92)

Industry:

Transportation (1.00)
Education (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
(12 more...)

Statistical ranking with dynamic covariates

Dong, Pinjun, Han, Ruijian, Jiang, Binyan, Xu, Yiming

arXiv.org Machine Learning, Jul-8-2024

We consider a covariate-assisted ranking model grounded in the Plackett-Luce framework. Unlike existing works focusing on pure covariates or individual effects with fixed covariates, our approach integrates individual effects with dynamic covariates. This added flexibility enhances realistic ranking yet poses significant challenges for analyzing the associated estimation procedures. This paper makes an initial attempt to address these challenges. We begin by discussing the sufficient and necessary condition for the model's identifiability. We then introduce an efficient alternating maximization algorithm to compute the maximum likelihood estimator (MLE). Under suitable assumptions on the topology of comparison graphs and dynamic covariates, we establish a quantitative uniform consistency result for the MLE with convergence rates characterized by the asymptotic graph connectivity. The proposed graph topology assumption holds for several popular random graph models under optimal leading-order sparsity conditions. A comprehensive numerical study is conducted to corroborate our theoretical findings and demonstrate the application of the proposed model to real-world datasets, including horse racing and tennis competitions.
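
As a rough illustration of the model class named in the abstract, the sketch below evaluates a Plackett-Luce log-likelihood in which each item's score is an individual effect plus a linear covariate term. The additive score form, the parameter names (`alpha`, `beta`), and the helper functions are assumptions for illustration; the paper's exact parameterization and its alternating maximization algorithm are not reproduced.

```python
import math
from itertools import permutations

def item_scores(alpha, beta, covariates):
    # Assumed score: individual effect alpha[i] plus beta . x_i, where x_i
    # is item i's covariate vector for this particular comparison.
    return [a + sum(b * x for b, x in zip(beta, xs))
            for a, xs in zip(alpha, covariates)]

def ranking_log_likelihood(ranking, scores):
    # Plackett-Luce: the item in each position is chosen with probability
    # proportional to exp(score) among the items not yet placed.
    total = 0.0
    for pos, item in enumerate(ranking):
        remaining = ranking[pos:]
        log_norm = math.log(sum(math.exp(scores[j]) for j in remaining))
        total += scores[item] - log_norm
    return total

alpha = [0.2, -0.1, 0.0]              # hypothetical individual effects
beta = [0.5]                          # hypothetical covariate coefficient
covariates = [[1.0], [0.0], [2.0]]    # covariates for one comparison
s = item_scores(alpha, beta, covariates)
print(ranking_log_likelihood((2, 0, 1), s))
```

Because the covariates are passed per comparison, the same items can receive different scores in different comparisons, which is the "dynamic" aspect the abstract refers to.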

assumption 2, covariate, exp, (15 more...)

arXiv.org Machine Learning

2406.16507

Country:

Asia > China > Hong Kong (0.04)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > New Jersey > Hudson County > Hoboken (0.04)
(2 more...)

Genre: Research Report > New Finding (0.67)

Industry: Leisure & Entertainment > Sports > Tennis (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.48)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)

Discounted Pseudocosts in MILP

Patel, Krunal Kishor

arXiv.org Artificial Intelligence, Jul-7-2024

In this article, we introduce the concept of discounted pseudocosts, inspired by discounted total reward in reinforcement learning, and explore their application in mixed-integer linear programming (MILP). Traditional pseudocosts estimate changes in the objective function due to variable bound changes during the branch-and-bound process. By integrating reinforcement learning concepts, we propose a novel approach incorporating a forward-looking perspective into pseudocost estimation. We present the motivation behind discounted pseudocosts and discuss how they represent the anticipated reward for branching after one level of exploration in the MILP problem space. Initial experiments on MIPLIB 2017 benchmark instances demonstrate the potential of discounted pseudocosts to enhance branching strategies and accelerate the solution process for challenging MILP problems.
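
The traditional pseudocosts mentioned in the abstract can be sketched as a running average of objective gain per unit of bound change. The class below shows only that textbook baseline with a product scoring rule; it does not implement the discounted variant, whose formula is not given in the abstract. The class and method names are hypothetical.

```python
import math

class Pseudocost:
    """Running per-variable pseudocost estimates for one integer variable."""

    def __init__(self):
        self.sum_unit_gain = {"down": 0.0, "up": 0.0}
        self.count = {"down": 0, "up": 0}

    def record(self, direction, objective_gain, distance):
        # Store the objective gain per unit of bound change observed
        # after branching in the given direction.
        self.sum_unit_gain[direction] += objective_gain / distance
        self.count[direction] += 1

    def average(self, direction):
        n = self.count[direction]
        return self.sum_unit_gain[direction] / n if n else 0.0

    def score(self, lp_value, eps=1e-6):
        # Predicted gains for the two children, combined with the
        # commonly used product rule.
        down = self.average("down") * (lp_value - math.floor(lp_value))
        up = self.average("up") * (math.ceil(lp_value) - lp_value)
        return max(down, eps) * max(up, eps)

pc = Pseudocost()
pc.record("up", 2.0, 0.5)
pc.record("up", 1.0, 0.25)
pc.record("down", 3.0, 0.5)
print(pc.score(2.5))
```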

experiment, pseudocost, reinforcement, (15 more...)

arXiv.org Artificial Intelligence

2407.06237

Country:

North America > Canada > Quebec > Montreal (0.05)
Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)
Europe > France > Bourgogne-Franche-Comté > Doubs > Besançon (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.75)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.70)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.56)

Can Machines Learn the True Probabilities?

Kim, Jinsook

arXiv.org Artificial Intelligence, Jul-7-2024

When there exists uncertainty, AI machines are designed to make decisions so as to reach the best expected outcomes. Expectations are based on true facts about the objective environment the machines interact with, and those facts can be encoded into AI models in the form of true objective probability functions.

The outline of the proof is as follows. After defining some main concepts, we identify the Success Criterion and the necessary condition for any machine to learn the true objective probabilities. From these conditions, we derive the theorem that learning implies the true guarantee of well-calibration. Roughly speaking, "truly guaranteed well-calibration" ...

probability, theorem 4, true probability, (12 more...)

arXiv.org Artificial Intelligence

2407.05526

Country:

Europe > Austria > Vienna (0.14)
North America > United States > New York (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(4 more...)

Genre: Research Report (0.40)

Industry:

Leisure & Entertainment > Games (0.46)
Banking & Finance (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.67)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.46)

An Adaptive Stochastic Gradient Method with Non-negative Gauss-Newton Stepsizes

Orvieto, Antonio, Xiao, Lin

arXiv.org Artificial Intelligence, Jul-5-2024

We consider the problem of minimizing the average of a large number of smooth but possibly non-convex functions. In the context of most machine learning applications, each loss function is non-negative and thus can be expressed as the composition of a square and its real-valued square root. This reformulation allows us to apply the Gauss-Newton method, or the Levenberg-Marquardt method when adding a quadratic regularization. The resulting algorithm, while being computationally as efficient as the vanilla stochastic gradient method, is highly adaptive and can automatically warm up and decay the effective stepsize while tracking the non-negative loss landscape. We provide a tight convergence analysis, leveraging new techniques, in the stochastic convex and non-convex settings. In particular, in the convex case, the method does not require access to the gradient Lipschitz constant for convergence, and is guaranteed to never diverge. The convergence rates and empirical evaluations compare favorably to the classical (stochastic) gradient method as well as to several other adaptive methods.
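
The reformulation in the abstract can be worked through in a small sketch. Writing a non-negative loss as f = r^2 with r = sqrt(f), linearizing r, and adding a quadratic penalty of weight 1/(2*sigma) on the step gives the stepsize `sigma / (1 + sigma * ||grad f||^2 / (2 f))`. This derivation and the penalty weighting are assumptions made here for illustration; the paper's exact update rule may differ, and the function names are hypothetical.

```python
def ngn_stepsize(loss, grad, sigma):
    # Stepsize from a regularized Gauss-Newton step on r = sqrt(loss).
    # The 1/(2*sigma) regularization weighting is an assumption.
    if loss <= 0.0:
        # A smooth non-negative loss has zero gradient where it vanishes.
        return 0.0
    grad_sq = sum(g * g for g in grad)
    return sigma / (1.0 + sigma * grad_sq / (2.0 * loss))

def ngn_step(x, loss_fn, grad_fn, sigma):
    # One gradient step with the adaptive stepsize above.
    grad = grad_fn(x)
    gamma = ngn_stepsize(loss_fn(x), grad, sigma)
    return [xi - gamma * gi for xi, gi in zip(x, grad)]

# Toy problem: f(x) = x^2, where plain gradient descent with stepsize 100
# would multiply x by -199 at every step.
loss_fn = lambda x: x[0] ** 2
grad_fn = lambda x: [2.0 * x[0]]
x = [1.0]
for _ in range(3):
    x = ngn_step(x, loss_fn, grad_fn, sigma=100.0)
    print(x)
```

In this sketch the stepsize never exceeds `sigma` and shrinks automatically when the gradient is large relative to the loss, which is consistent with the abstract's description of an adaptive method that does not diverge.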

hyperparameter, ngn, stepsize, (17 more...)

arXiv.org Artificial Intelligence

2407.04358

Country:

Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.04)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > Georgia > Fulton County > Atlanta (0.04)
Europe > Spain > Andalusia > Granada Province > Granada (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)

On Differentially Private U Statistics

Chaudhuri, Kamalika, Loh, Po-Ling, Pandey, Shourya, Sarkar, Purnamrita

arXiv.org Artificial Intelligence, Jul-5-2024

We consider the problem of privately estimating a parameter $\mathbb{E}[h(X_1,\dots,X_k)]$, where $X_1$, $X_2$, $\dots$, $X_k$ are i.i.d. data from some distribution and $h$ is a permutation-invariant function. Without privacy constraints, standard estimators are U-statistics, which commonly arise in a wide range of problems, including nonparametric signed rank tests, symmetry testing, uniformity testing, and subgraph counts in random networks, and can be shown to be minimum variance unbiased estimators under mild conditions. Despite the recent outpouring of interest in private mean estimation, privatizing U-statistics has received little attention. While existing private mean estimation algorithms can be applied to obtain confidence intervals, we show that they can lead to suboptimal private error, e.g., constant-factor inflation in the leading term, or even $\Theta(1/n)$ rather than $O(1/n^2)$ in degenerate settings. To remedy this, we propose a new thresholding-based approach using local Hájek projections to reweight different subsets of the data. This leads to nearly optimal private error for non-degenerate U-statistics and a strong indication of near-optimality for degenerate U-statistics.
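
For readers unfamiliar with the non-private baseline, a U-statistic simply averages a symmetric kernel over all size-k subsets of the sample. The sketch below shows only that standard estimator, with no privacy mechanism and none of the paper's projection-based reweighting; the function names are illustrative.

```python
from itertools import combinations

def u_statistic(data, kernel, k):
    # Average of a permutation-invariant kernel over all k-subsets.
    subsets = list(combinations(data, k))
    return sum(kernel(*subset) for subset in subsets) / len(subsets)

def variance_kernel(x, y):
    # With this kernel the U-statistic is the unbiased sample variance.
    return 0.5 * (x - y) ** 2

sample = [1.0, 2.0, 4.0, 7.0]
print(u_statistic(sample, variance_kernel, k=2))
```

The variance kernel is a convenient check because the result must match the usual sample variance with the n - 1 denominator.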

algorithm, inequality, probability, (14 more...)

arXiv.org Artificial Intelligence

2407.04945

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
North America > United States > Texas > Travis County > Austin (0.04)
North America > United States > New York > New York County > New York City (0.04)
(3 more...)

Genre: Research Report (0.81)

Industry: Information Technology > Security & Privacy (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.67)

Langevin Dynamics: A Unified Perspective on Optimization via Lyapunov Potentials

Chen, August Y., Sekhari, Ayush, Sridharan, Karthik

arXiv.org Artificial Intelligence, Jul-5-2024

We study the problem of non-convex optimization using Stochastic Gradient Langevin Dynamics (SGLD). SGLD is a natural and popular variation of stochastic gradient descent where at each step, appropriately scaled Gaussian noise is added. To our knowledge, the only strategy for showing global convergence of SGLD on the loss function is to show that SGLD can sample from a stationary distribution which assigns larger mass when the function is small (the Gibbs measure), and then to convert these guarantees to optimization results. We employ a new strategy to analyze the convergence of SGLD to global minima, based on Lyapunov potentials and optimization. We convert the same mild conditions from previous works on SGLD into geometric properties based on Lyapunov potentials. This adapts well to the case with a stochastic gradient oracle, which is natural for machine learning applications where one wants to minimize population loss but only has access to stochastic gradients via minibatch training samples. Here we provide 1) improved rates in the setting of previous works studying SGLD for optimization, 2) the first finite gradient complexity guarantee for SGLD where the function is Lipschitz and the Gibbs measure defined by the function satisfies a Poincaré inequality, and 3) a proof that, if continuous-time Langevin Dynamics succeeds for optimization, then discrete-time SGLD succeeds under mild regularity assumptions.
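
The update described in the abstract (a gradient step plus scaled Gaussian noise) can be sketched in one dimension. The noise scale `sqrt(2 * step / inv_temp)` is the usual textbook choice and is an assumption here, as are the parameter names; the gradient oracle may itself be a noisy minibatch estimate.

```python
import math
import random

def sgld(grad, x0, step, inv_temp, n_steps, rng):
    # Gradient step plus Gaussian noise at every iteration.
    # Larger inv_temp means less noise (a colder Gibbs measure).
    noise_scale = math.sqrt(2.0 * step / inv_temp)
    x = x0
    for _ in range(n_steps):
        x = x - step * grad(x) + noise_scale * rng.gauss(0.0, 1.0)
    return x

# Toy objective f(x) = x^2 / 2 with exact gradient x.
rng = random.Random(0)
print(sgld(lambda x: x, x0=5.0, step=0.1, inv_temp=1e6, n_steps=500, rng=rng))
```

With a large inverse temperature the iterates concentrate near the minimizer, which is the sampling-to-optimization connection the abstract mentions.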

assumption 2, lemma 6, probability, (14 more...)

arXiv.org Artificial Intelligence

2407.04264

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Russia (0.04)
(3 more...)

Genre:

Research Report (0.50)
Workflow (0.48)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (1.00)

A fast neural hybrid Newton solver adapted to implicit methods for nonlinear dynamics

Jin, Tianyu, Maierhofer, Georg, Schratz, Katharina, Xiang, Yang

arXiv.org Artificial Intelligence, Jul-4-2024

The use of implicit time-stepping schemes for the numerical approximation of solutions to stiff nonlinear time-evolution equations brings well-known advantages including, typically, better stability behaviour and corresponding support of larger time steps, and better structure preservation properties. However, this comes at the price of having to solve a nonlinear equation at every time step of the numerical scheme. In this work, we propose a novel operator learning based hybrid Newton's method to accelerate this solution of the nonlinear time step system for stiff time-evolution nonlinear equations. We propose a targeted learning strategy which facilitates robust unsupervised learning in an offline phase and provides a highly efficient initialisation for the Newton iteration leading to consistent acceleration of Newton's method. A quantifiable rate of improvement in Newton's method achieved by improved initialisation is provided and we analyse the upper bound of the generalisation error of our unsupervised learning strategy. These theoretical results are supported by extensive numerical results, demonstrating the efficiency of our proposed neural hybrid solver both in one- and two-dimensional cases.
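
The role of the initial guess can be seen in a scalar sketch of one implicit Euler step solved by Newton's method. The learned operator from the abstract is not reproduced; a caller-supplied `guess` stands in for it, and the test equation u' = -100 u^3 and all names are illustrative assumptions.

```python
def newton(residual, jacobian, guess, tol=1e-12, max_iter=50):
    # Scalar Newton iteration; returns the solution and iterations used.
    u = guess
    for iteration in range(max_iter):
        r = residual(u)
        if abs(r) < tol:
            return u, iteration
        u -= r / jacobian(u)
    return u, max_iter

def implicit_euler_step(u_old, dt, f, df, guess=None):
    # Solve u - u_old - dt * f(u) = 0 for the next state u.
    # Without a better guess, Newton starts from the previous state.
    start = u_old if guess is None else guess
    residual = lambda u: u - u_old - dt * f(u)
    jacobian = lambda u: 1.0 - dt * df(u)
    return newton(residual, jacobian, start)

f = lambda u: -100.0 * u ** 3      # stiff right-hand side (illustrative)
df = lambda u: -300.0 * u ** 2

u_cold, it_cold = implicit_euler_step(1.0, 0.1, f, df)
u_warm, it_warm = implicit_euler_step(1.0, 0.1, f, df, guess=u_cold)
print(it_cold, it_warm)
```

Here the "warm" start reuses the converged solution, an idealized stand-in for a good predictor, so Newton needs no further iterations; a realistic initializer would fall somewhere between the two cases.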

hybrid solver, iteration, solver, (17 more...)

arXiv.org Artificial Intelligence

2407.03945

Country:

Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
Asia > China > Hong Kong (0.04)
Asia > China > Guangdong Province > Shenzhen (0.04)
(3 more...)

Genre: Research Report > New Finding (0.48)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.82)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
