AITopics | Mathematical & Statistical Methods

Collaborating Authors

Mathematical & Statistical Methods

News Overviews Instructional Materials AI-Alerts Classics

Kernel Interpolation with Sparse Grids

Neural Information Processing SystemsJan-17-2025, 20:05:20 GMT

Structured kernel interpolation (SKI) accelerates Gaussian processes (GP) inference by interpolating the kernel covariance function using a dense grid of inducing points, whose corresponding kernel matrix is highly structured and thus amenable to fast linear algebra. Unfortunately, SKI scales poorly in the dimension of the input points, since the dense grid size grows exponentially with the dimension. To mitigate this issue, we propose the use of sparse grids within the SKI framework. These grids enable accurate interpolation, but with a number of points growing more slowly with dimension. We contribute a novel nearly linear time matrix-vector multiplication algorithm for the sparse grid kernel matrix.

dimension, kernel interpolation, sparse grid, (1 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.65)
Information Technology > Artificial Intelligence > Machine Learning (0.45)

Add feedback

Joint Feature and Differentiable k -NN Graph Learning using Dirichlet Energy

Neural Information Processing SystemsJan-17-2025, 18:42:42 GMT

Feature selection (FS) plays an important role in machine learning, which extracts important features and accelerates the learning process. In this paper, we propose a deep FS method that simultaneously conducts feature selection and differentiable k -NN graph learning based on the Dirichlet Energy. The Dirichlet Energy identifies important features by measuring their smoothness on the graph structure, and facilitates the learning of a new graph that reflects the inherent structure in new feature subspace. We employ Optimal Transport theory to address the non-differentiability issue of learning k -NN graphs in neural networks, which theoretically makes our method applicable to other graph neural networks for dynamic graph learning. Furthermore, the proposed framework is interpretable, since all modules are designed algorithmically.

dirichlet energy, graph learning, joint feature, (1 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.56)

Add feedback

Contributions to the Decision Theoretic Foundations of Machine Learning and Robust Statistics under Weakly Structured Information

Jansen, Christoph

arXiv.org Machine LearningJan-17-2025

This habilitation thesis is cumulative and, therefore, is collecting and connecting research that I (together with several co-authors) have conducted over the last few years. Thus, the absolute core of the work is formed by the ten publications listed on page 5 under the name Contributions 1 to 10. The references to the complete versions of these articles are also found in this list, making them as easily accessible as possible for readers wishing to dive deep into the different research projects. The chapters following this thesis, namely Parts A to C and the concluding remarks, serve to place the articles in a larger scientific context, to (briefly) explain their respective content on a less formal level, and to highlight some interesting perspectives for future research in their respective contexts. Naturally, therefore, the following presentation has neither the level of detail nor the formal rigor that can (hopefully) be found in the papers. The purpose of the following text is to provide the reader an easy and high-level access to this interesting and important research field as a whole, thereby, advertising it to a broader audience.

artificial intelligence, decision support system, machine learning, (18 more...)

arXiv.org Machine Learning

2501.10195

Country: Europe (0.46)

Genre: Research Report > New Finding (1.00)

Industry: Health & Medicine (0.92)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
(3 more...)

Add feedback

Convergence and Stability of Graph Convolutional Networks on Large Random Graphs

Neural Information Processing SystemsJan-15-2025, 21:09:34 GMT

We study properties of Graph Convolutional Networks (GCNs) by analyzing their behavior on standard models of random graphs, where nodes are represented by random latent variables and edges are drawn according to a similarity kernel. This allows us to overcome the difficulties of dealing with discrete notions such as isomorphisms on very large graphs, by considering instead more natural geometric aspects. We first study the convergence of GCNs to their continuous counterpart as the number of nodes grows. Our results are fully non-asymptotic and are valid for relatively sparse graphs with an average degree that grows logarithmically with the number of nodes. We then analyze the stability of GCNs to small deformations of the random graph model.

convergence and stability, graph convolutional network, random graph, (2 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.91)

Add feedback

Higher Order Kernel Mean Embeddings to Capture Filtrations of Stochastic Processes

Neural Information Processing SystemsJan-14-2025, 22:25:32 GMT

Stochastic processes are random variables with values in some space of paths. However, reducing a stochastic process to a path-valued random variable ignores its filtration, i.e. the flow of information carried by the process through time. By conditioning the process on its filtration, we introduce a family of higher order kernel mean embeddings (KMEs) that generalizes the notion of KME to capture additional information related to the filtration. We derive empirical estimators for the associated higher order maximum mean discrepancies (MMDs) and prove consistency. We then construct a filtration-sensitive kernel two-sample test able to capture information that gets missed by the standard MMD test.

capture filtration, higher order kernel mean embedding, stochastic process, (1 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.99)

Add feedback

A Combinatorial Algorithm for Approximating the Optimal Transport in the Parallel and MPC Settings

Neural Information Processing SystemsJan-14-2025, 15:58:15 GMT

Optimal Transport is a popular distance metric for measuring similarity between distributions. Exact and approximate combinatorial algorithms for computing the optimal transport distance are hard to parallelize. This has motivated the development of numerical solvers (e.g. Sinkhorn method) that can exploit GPU parallelism and produce approximate solutions. We introduce the first parallel combinatorial algorithm to find an additive \varepsilon -approximation of the OT distance.

algorithm, artificial intelligence, optimal transport, (5 more...)

Neural Information Processing Systems

Industry: Energy > Oil & Gas > Downstream (0.50)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.94)

Add feedback

Improved Convergence Rate of Stochastic Gradient Langevin Dynamics with Variance Reduction and its Application to Optimization

Neural Information Processing SystemsJan-13-2025, 20:18:29 GMT

The stochastic gradient Langevin Dynamics is one of the most fundamental algorithms to solve sampling problems and non-convex optimization appearing in several machine learning applications. Especially, its variance reduced versions have nowadays gained particular attention. In this paper, we study two variants of this kind, namely, the Stochastic Variance Reduced Gradient Langevin Dynamics and the Stochastic Recursive Gradient Langevin Dynamics. We prove their convergence to the objective distribution in terms of KL-divergence under the sole assumptions of smoothness and Log-Sobolev inequality which are weaker conditions than those used in prior works for these algorithms. With the batch size and the inner loop length set to \sqrt{n}, the gradient complexity to achieve an \epsilon -precision is \tilde{O}((n dn {1/2}\epsilon {-1})\gamma 2 L 2\alpha {-2}), which is an improvement from any previous analyses.

gradient langevin dynamic, langevin dynamic, stochastic gradient langevin dynamic, (6 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.66)

Add feedback

Improving Incremental Nonlinear Dynamic Inversion Robustness Using Robust Control in Aerial Robotics

Hachem, Mohamad, Roos, Clément, Miquel, Thierry, Bronz, Murat

arXiv.org Artificial IntelligenceJan-13-2025

Improving robustness to uncertainty and rejection of external disturbances represents a significant challenge in aerial robotics. Nonlinear controllers based on Incremental Nonlinear Dynamic Inversion (INDI), known for their ability in estimating disturbances through measured-filtered data, have been notably used in such applications. Typically, these controllers comprise two cascaded loops: an inner loop employing nonlinear dynamic inversion and an outer loop generating the virtual control inputs via linear controllers. In this paper, a novel methodology is introduced, that combines the advantages of INDI with the robustness of linear structured $\mathcal{H}_\infty$ controllers. A full cascaded architecture is proposed to control the dynamics of a multirotor drone, covering both stabilization and guidance. In particular, low-order $\mathcal{H}_\infty$ controllers are designed for the outer loop by properly structuring the problem and solving it through non-smooth optimization. A comparative analysis is conducted between an existing INDI/PD approach and the proposed INDI/$\mathcal{H}_\infty$ strategy, showing a notable enhancement in the rejection of external disturbances. It is carried out first using MATLAB simulations involving a nonlinear model of a Parrot Bebop quadcopter drone, and then experimentally using a customized quadcopter built by the ENAC team. The results show an improvement of more than 50\% in the rejection of disturbances such as gusts.

artificial intelligence, controller, disturbance, (17 more...)

arXiv.org Artificial Intelligence

2501.07223

Country:

Europe (0.68)
North America > United States (0.28)

Genre: Research Report (0.70)

Industry:

Aerospace & Defense (0.67)
Energy (0.47)
Transportation > Air (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.81)
Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles > Drones (0.46)

Add feedback

Stochastic Process Learning via Operator Flow Matching

Shi, Yaozhong, Ross, Zachary E., Asimaki, Domniki, Azizzadenesheli, Kamyar

arXiv.org Artificial IntelligenceJan-8-2025

Expanding on neural operators, we propose a novel framework for stochastic process learning across arbitrary domains. In particular, we develop operator flow matching (OFM) for learning stochastic process priors on function spaces. OFM provides the probability density of the values of any collection of points and enables mathematically tractable functional regression at new points with mean and density estimation. Our method outperforms state-of-the-art models in stochastic process learning, functional regression, and prior learning.

ofm, posterior sample, regression, (13 more...)

arXiv.org Artificial Intelligence

2501.04126

Country:

North America > United States > Wisconsin > Dane County > Madison (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > California (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report > Promising Solution (0.34)

Industry: Energy (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)

Add feedback

Matrix Calculus (for Machine Learning and Beyond)

Bright, Paige, Edelman, Alan, Johnson, Steven G.

arXiv.org Machine LearningJan-7-2025

This course, intended for undergraduates familiar with elementary calculus and linear algebra, introduces the extension of differential calculus to functions on more general vector spaces, such as functions that take as input a matrix and return a matrix inverse or factorization, derivatives of ODE solutions, and even stochastic derivatives of random functions. It emphasizes practical computational applications, such as large-scale optimization and machine learning, where derivatives must be re-imagined in order to be propagated through complicated calculations. The class also discusses efficiency concerns leading to "adjoint" or "reverse-mode" differentiation (a.k.a. "backpropagation"), and gives a gentle introduction to modern automatic differentiation (AD) techniques.

artificial intelligence, machine learning, optimization problem, (18 more...)

arXiv.org Machine Learning

2501.14787

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > Canada > Ontario > Toronto (0.04)

Genre:

Instructional Material > Course Syllabus & Notes (1.00)
Workflow (0.67)

Industry: Education (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.87)

Add feedback