AITopics | Mathematical & Statistical Methods

Collaborating Authors

Mathematical & Statistical Methods

News Overviews Instructional Materials AI-Alerts Classics

Unified Convergence Theory of Stochastic and Variance-Reduced Cubic Newton Methods

Chayti, El Mahdi, Doikov, Nikita, Jaggi, Martin

arXiv.org Artificial IntelligenceSep-6-2023

We study stochastic Cubic Newton methods for solving general possibly non-convex minimization problems. We propose a new framework, which we call the helper framework, that provides a unified view of the stochastic and variance-reduced second-order algorithms equipped with global complexity guarantees. It can also be applied to learning with auxiliary information. Our helper framework offers the algorithm designer high flexibility for constructing and analyzing the stochastic Cubic Newton methods, allowing arbitrary size batches, and the use of noisy and possibly biased estimates of the gradients and Hessians, incorporating both the variance reduction and the lazy Hessian updates. We recover the best-known complexities for the stochastic and variance-reduced Cubic Newton, under weak assumptions on the noise. A direct consequence of our theory is the new lazy stochastic second-order method, which significantly improves the arithmetic complexity for large dimension problems. We also establish complexity bounds for the classes of gradient-dominated objectives, that include convex and strongly convex problems. For Auxiliary Learning, we show that using a helper (auxiliary function) can outperform training alone if a given similarity measure is small.

hessian, newton method, optimization, (15 more...)

arXiv.org Artificial Intelligence

2302.11962

Country: North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.67)

Add feedback

First and zeroth-order implementations of the regularized Newton method with lazy approximated Hessians

Doikov, Nikita, Grapiglia, Geovani Nunes

arXiv.org Artificial IntelligenceSep-5-2023

In this work, we develop first-order (Hessian-free) and zero-order (derivative-free) implementations of the Cubically regularized Newton method for solving general non-convex optimization problems. For that, we employ finite difference approximations of the derivatives. We use a special adaptive search procedure in our algorithms, which simultaneously fits both the regularization constant and the parameters of the finite difference approximations. It makes our schemes free from the need to know the actual Lipschitz constants. Additionally, we equip our algorithms with the lazy Hessian update that reuse a previously computed Hessian approximation matrix for several iterations. Specifically, we prove the global complexity bound of $\mathcal{O}( n^{1/2} \epsilon^{-3/2})$ function and gradient evaluations for our new Hessian-free method, and a bound of $\mathcal{O}( n^{3/2} \epsilon^{-3/2} )$ function evaluations for the derivative-free method, where $n$ is the dimension of the problem and $\epsilon$ is the desired accuracy for the gradient norm. These complexity bounds significantly improve the previously known ones in terms of the joint dependence on $n$ and $\epsilon$, for the first-order and zeroth-order non-convex optimization.

approximation, complexity, implementation, (14 more...)

arXiv.org Artificial Intelligence

2309.02412

Country:

North America > United States (0.14)
Europe > Switzerland > Vaud > Lausanne (0.04)
Europe > Belgium (0.04)

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.71)

Add feedback

Model-agnostic network inference enhancement from noisy measurements via curriculum learning

Wu, Kai, Li, Yuanyuan, Liu, Jing

arXiv.org Artificial IntelligenceSep-5-2023

Noise is a pervasive element within real-world measurement data, significantly undermining the performance of network inference models. However, the quest for a comprehensive enhancement framework capable of bolstering noise resistance across a diverse array of network inference models has remained elusive. Here, we present an elegant and efficient framework tailored to amplify the capabilities of network inference models in the presence of noise. Leveraging curriculum learning, we mitigate the deleterious impact of noisy samples on network inference models. Our proposed framework is model-agnostic, seamlessly integrable into a plethora of model-based and model-free network inference methods. Notably, we utilize one model-based and three model-free network inference methods as the foundation. Extensive experimentation across various synthetic and real-world networks, encapsulating diverse nonlinear dynamic processes, showcases substantial performance augmentation under varied noise types, particularly thriving in scenarios enriched with clean samples. This framework's adeptness in fortifying both model-free and model-based network inference methodologies paves the avenue towards a comprehensive and unified enhancement framework, encompassing the entire spectrum of network inference models. Available Code: https://github.com/xiaoyuans/MANIE.

inference, manie framework, noise, (13 more...)

arXiv.org Artificial Intelligence

2309.0205

Country:

Asia > China > Guangdong Province > Guangzhou (0.04)
North America > United States (0.04)
Europe > Hungary > Hajdú-Bihar County > Debrecen (0.04)
Asia > China > Shaanxi Province > Xi'an (0.04)

Genre: Research Report (0.64)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.47)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.34)

Add feedback

Locally Stationary Graph Processes

Canbolat, Abdullah, Vural, Elif

arXiv.org Machine LearningSep-4-2023

Stationary graph process models are commonly used in the analysis and inference of data sets collected on irregular network topologies. While most of the existing methods represent graph signals with a single stationary process model that is globally valid on the entire graph, in many practical problems, the characteristics of the process may be subject to local variations in different regions of the graph. In this work, we propose a locally stationary graph process (LSGP) model that aims to extend the classical concept of local stationarity to irregular graph domains. We characterize local stationarity by expressing the overall process as the combination of a set of component processes such that the extent to which the process adheres to each component varies smoothly over the graph. We propose an algorithm for computing LSGP models from realizations of the process, and also study the approximation of LSGPs locally with WSS processes. Experiments on signal interpolation problems show that the proposed process model provides accurate signal representations competitive with the state of the art.

artificial intelligence, machine learning, optimization problem, (20 more...)

arXiv.org Machine Learning

2309.01657

Country:

North America > United States (0.14)
Europe > France (0.04)
Europe > Albania > Durrës County > Durrës (0.04)
Asia > Middle East > Republic of Türkiye > Ankara Province > Ankara (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Communications (0.87)
(2 more...)

Add feedback

Logarithmic Regret in Multisecretary and Online Linear Programs with Continuous Valuations

Bray, Robert L.

arXiv.org Artificial IntelligenceAug-28-2023

I study how the shadow prices of a linear program that allocates an endowment of $n\beta \in \mathbb{R}^{m}$ resources to $n$ customers behave as $n \rightarrow \infty$. I show the shadow prices (i) adhere to a concentration of measure, (ii) converge to a multivariate normal under central-limit-theorem scaling, and (iii) have a variance that decreases like $\Theta(1/n)$. I use these results to prove that the expected regret in \cites{Li2019b} online linear program is $\Theta(\log n)$, both when the customer variable distribution is known upfront and must be learned on the fly. I thus tighten \citeauthors{Li2019b} upper bound from $O(\log n \log \log n)$ to $O(\log n)$, and extend \cites{Lueker1995} $\Omega(\log n)$ lower bound to the multi-dimensional setting. I illustrate my new techniques with a simple analysis of \cites{Arlotto2019} multisecretary problem.

artificial intelligence, linear program, shadow price, (17 more...)

arXiv.org Artificial Intelligence

1912.08917

Country:

South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
North America > United States > New York (0.04)

Genre: Research Report (0.49)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.90)

Add feedback

MF-OMO: An Optimization Formulation of Mean-Field Games

Guo, Xin, Hu, Anran, Zhang, Junzi

arXiv.org Artificial IntelligenceAug-27-2023

This paper proposes a new mathematical paradigm to analyze discrete-time mean-field games. It is shown that finding Nash equilibrium solutions for a general class of discrete-time mean-field games is equivalent to solving an optimization problem with bounded variables and simple convex constraints, called MF-OMO. This equivalence framework enables finding multiple (and possibly all) Nash equilibrium solutions of mean-field games by standard algorithms. For instance, projected gradient descent is shown to be capable of retrieving all possible Nash equilibrium solutions when there are finitely many of them, with proper initializations. Moreover, analyzing mean-field games with linear rewards and mean-field independent dynamics is reduced to solving a finite number of linear programs, hence solvable in finite time. This framework does not rely on the contractive and the monotone assumptions and the uniqueness of the Nash equilibrium.

artificial intelligence, machine learning, nash equilibrium solution, (16 more...)

arXiv.org Artificial Intelligence

2206.09608

Country:

North America > United States > New York (0.04)
North America > United States > California > Alameda County > Berkeley (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
(2 more...)

Genre: Research Report (0.40)

Industry: Health & Medicine (0.67)

Technology:

Information Technology > Game Theory (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.67)

Add feedback

On the lifting and reconstruction of nonlinear systems with multiple attractors

Pan, Shaowu, Duraisamy, Karthik

arXiv.org Artificial IntelligenceAug-25-2023

The Koopman operator provides a linear perspective on non-linear dynamics by focusing on the evolution of observables in an invariant subspace. Observables of interest are typically linearly reconstructed from the Koopman eigenfunctions. Despite the broad use of Koopman operators over the past few years, there exist some misconceptions about the applicability of Koopman operators to dynamical systems with more than one fixed point. In this work, an explanation is provided for the mechanism of lifting for the Koopman operator of nonlinear systems with multiple attractors. Considering the example of the Duffing oscillator, we show that by exploiting the inherent symmetry between the basins of attraction, a linear reconstruction with three degrees of freedom in the Koopman observable space is sufficient to globally linearize the system.

artificial intelligence, machine learning, reconstruction, (16 more...)

arXiv.org Artificial Intelligence

2304.1186

Country: North America > United States > Michigan > Washtenaw County > Ann Arbor (0.14)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (0.94)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.34)

Add feedback

The SO(3) and SE(3) Lie Algebras of Rigid Body Rotations and Motions and their Application to Discrete Integration, Gradient Descent Optimization, and State Estimation

Gallo, Eduardo

arXiv.org Artificial IntelligenceAug-17-2023

Classical mathematical techniques such as discrete integration, gradient descent optimization, and state estimation (exemplified by the Runge-Kutta method, Gauss-Newton minimization, and extended Kalman filter or EKF, respectively), rely on linear algebra and hence are only applicable to state vectors belonging to Euclidean spaces when implemented as described in the literature. This document discusses how to modify these methods so they can be applied to non-Euclidean state vectors, such as those containing rotations and full motions of rigid bodies. To do so, this document provides an in-depth review of the concept of manifolds or Lie groups, together with their tangent spaces or Lie algebras, their exponential and logarithmic maps, the analysis of perturbations, the treatment of uncertainty and covariance, and in particular the definitions of the Jacobians required to employ the previously mentioned calculus methods. These concepts are particularized to the specific cases of the SO(3) and SE(3) Lie groups, known as the special orthogonal and special Euclidean groups of R3, which represent the rigid body rotations and motions, describing their various possible parameterizations as well as their advantages and disadvantages.

artificial intelligence, machine learning, rigid body rotation and motion, (12 more...)

arXiv.org Artificial Intelligence

2205.12572

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Spain > Galicia > Madrid (0.04)

Genre: Research Report (0.63)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.60)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.48)

Add feedback

Why Isaac Newton's laws still give physicists a lot to think about

New ScientistAug-16-2023, 18:00:00 GMT

THERE are two kinds of theoretical physicists: those who use the correct equation for calculating distances in space-time, and those who don't. Obviously, I'm being a bit tongue in cheek here, but I'm serious when I say there is a real bifurcation. The two formulations of this equation for distance – what we call the metric – are equivalent and when used correctly will give the same answers for all calculations. But each group has reasons for believing that one is more natural than the other. Particle physicists tend to use one; relativists (people trained in general relativity) tend to use the other.

equation, isaac newton, physicist

New Scientist

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.40)

Add feedback

Decision-Focused Learning: Foundations, State of the Art, Benchmark and Future Opportunities

Mandi, Jayanta, Kotary, James, Berden, Senne, Mulamba, Maxime, Bucarey, Victor, Guns, Tias, Fioretto, Ferdinando

arXiv.org Artificial IntelligenceAug-16-2023

Decision-focused learning (DFL) is an emerging paradigm in machine learning which trains a model to optimize decisions, integrating prediction and optimization in an end-to-end system. This paradigm holds the promise to revolutionize decision-making in many real-world applications which operate under uncertainty, where the estimation of unknown parameters within these decision models often becomes a substantial roadblock. This paper presents a comprehensive review of DFL. It provides an in-depth analysis of the various techniques devised to integrate machine learning and optimization models, introduces a taxonomy of DFL methods distinguished by their unique characteristics, and conducts an extensive empirical evaluation of these methods proposing suitable benchmark dataset and tasks for DFL. Finally, the study provides valuable insights into current and potential future avenues in DFL research.

data mining, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2307.13565

Country:

North America > Canada > Alberta (0.27)
Europe > Greece (0.14)
Europe > Belgium > Flanders (0.14)
(7 more...)

Genre:

Research Report (1.00)
Overview (1.00)

Industry:

Transportation (1.00)
Energy > Power Industry (1.00)
Energy > Oil & Gas > Upstream (0.46)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
(4 more...)

Add feedback