AITopics | Mathematical & Statistical Methods

Collaborating Authors

Mathematical & Statistical Methods

News Overviews Instructional Materials AI-Alerts Classics

Generative Modelling with Tensor Train approximations of Hamilton--Jacobi--Bellman equations

Sommer, David, Gruhlke, Robert, Kirstein, Max, Eigel, Martin, Schillings, Claudia

arXiv.org Machine LearningFeb-23-2024

Sampling from probability densities is a common challenge in fields such as Uncertainty Quantification (UQ) and Generative Modelling (GM). In GM in particular, the use of reverse-time diffusion processes depending on the log-densities of Ornstein-Uhlenbeck forward processes are a popular sampling tool. In Berner et al. [2022] the authors point out that these log-densities can be obtained by solution of a \textit{Hamilton-Jacobi-Bellman} (HJB) equation known from stochastic optimal control. While this HJB equation is usually treated with indirect methods such as policy iteration and unsupervised training of black-box architectures like Neural Networks, we propose instead to solve the HJB equation by direct time integration, using compressed polynomials represented in the Tensor Train (TT) format for spatial discretization. Crucially, this method is sample-free, agnostic to normalization constants and can avoid the curse of dimensionality due to the TT compression. We provide a complete derivation of the HJB equation's action on Tensor Train polynomials and demonstrate the performance of the proposed time-step-, rank- and degree-adaptive integration method on a nonlinear sampling task in 20 dimensions.

approximation, hjb equation, tt approximation, (15 more...)

arXiv.org Machine Learning

2402.15285

Country: Europe > Germany > Berlin (0.04)

Genre: Research Report (0.63)

Industry: Transportation (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.82)

Add feedback

Robust Learning of Noisy Time Series Collections Using Stochastic Process Models with Motion Codes

Bajaj, Chandrajit, Nguyen, Minh

arXiv.org Machine LearningFeb-21-2024

While time series classification and forecasting problems have been extensively studied, the cases of noisy time series data with arbitrary time sequence lengths have remained challenging. Each time series instance can be thought of as a sample realization of a noisy dynamical model, which is characterized by a continuous stochastic process. For many applications, the data are mixed and consist of several types of noisy time series sequences modeled by multiple stochastic processes, making the forecasting and classification tasks even more challenging. Instead of regressing data naively and individually to each time series type, we take a latent variable model approach using a mixtured Gaussian processes with learned spectral kernels. More specifically, we auto-assign each type of noisy time series data a signature vector called its motion code. Then, conditioned on each assigned motion code, we infer a sparse approximation of the corresponding time series using the concept of the most informative timestamps. Our unmixing classification approach involves maximizing the likelihood across all the mixed noisy time series sequences of varying lengths. This stochastic approach allows us to learn not only within a single type of noisy time series data but also across many underlying stochastic processes, giving us a way to learn multiple dynamical models in an integrated and robust manner. The different learned latent stochastic models allow us to generate specific sub-type forecasting. We provide several quantitative comparisons demonstrating the performance of our approach.

motion code, time sery, time sery classification, (13 more...)

arXiv.org Machine Learning

2402.14081

Country:

North America > Trinidad and Tobago > Trinidad > Arima > Arima (0.05)
North America > United States > Texas > Travis County > Austin (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

ChatGPT in Linear Algebra: Strides Forward, Steps to Go

Bagno, Eli, Dana-Picard, Thierry, Reches, Shulamit

arXiv.org Artificial IntelligenceFeb-18-2024

As soon as a new technology emerges, the education community explores its affordances and the possibilities to apply it in education. In this paper, we analyze sessions with ChatGPT around topics in basic Linear Algebra. We reflect the process undertaken by the ChatGPT along the recent year in our area of interest, emphasising the vast improvement that has been done in grappling with Linear Algebra problems. In particular, the question whether this software can be a teaching assistant or even somehow replace the human teacher, is addressed. As of the time this paper is written, the answer is generally negative. For the small part where the answer can be positive, some reflections about an original instrumental genesis are given. Communication with the software gives the impression to talk to a human, and sometimes the question is whether the software understands the question or not. Therefore, the reader's attention is drawn to the fact that ChatGPT works on a statistical basis and not according to reflection and understanding.

chatgpt, linear algebra, linear transformation, (15 more...)

arXiv.org Artificial Intelligence

2403.15399

Country:

Asia > Middle East > Israel > Jerusalem District > Jerusalem (0.04)
Asia > Singapore (0.04)

Genre:

Research Report (0.82)
Instructional Material > Course Syllabus & Notes (0.68)

Industry:

Education > Curriculum > Subject-Specific Education (0.69)
Education > Educational Setting (0.66)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.93)

Add feedback

An Accelerated Distributed Stochastic Gradient Method with Momentum

Huang, Kun, Pu, Shi, Nedić, Angelia

arXiv.org Artificial IntelligenceFeb-18-2024

In this paper, we introduce an accelerated distributed stochastic gradient method with momentum for solving the distributed optimization problem, where a group of $n$ agents collaboratively minimize the average of the local objective functions over a connected network. The method, termed ``Distributed Stochastic Momentum Tracking (DSMT)'', is a single-loop algorithm that utilizes the momentum tracking technique as well as the Loopless Chebyshev Acceleration (LCA) method. We show that DSMT can asymptotically achieve comparable convergence rates as centralized stochastic gradient descent (SGD) method under a general variance condition regarding the stochastic gradients. Moreover, the number of iterations (transient times) required for DSMT to achieve such rates behaves as $\mathcal{O}(n^{5/3}/(1-\lambda))$ for minimizing general smooth objective functions, and $\mathcal{O}(\sqrt{n/(1-\lambda)})$ under the Polyak-{\L}ojasiewicz (PL) condition. Here, the term $1-\lambda$ denotes the spectral gap of the mixing matrix related to the underlying network topology. Notably, the obtained results do not rely on multiple inter-node communications or stochastic gradient accumulation per iteration, and the transient times are the shortest under the setting to the best of our knowledge.

accelerated, momentum, stochastic gradient method

arXiv.org Artificial Intelligence

2402.09714

Genre: Research Report (0.69)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (1.00)

Add feedback

Kernels and learning curves for Gaussian process regression on random graphs

Neural Information Processing SystemsFeb-16-2024, 11:36:08 GMT

We investigate how well Gaussian process regression can learn functions defined on graphs, using large regular random graphs as a paradigmatic example. Random-walk based kernels are shown to have some surprising properties: within the standard approximation of a locally tree-like graph structure, the kernel does not become constant, i.e.neighbouring function values do not become fully correlated, when the lengthscale \sigma of the kernel is made large. Instead the kernel attains a non-trivial limiting form, which we calculate. The fully correlated limit is reached only once loops become relevant, and we estimate where the crossover to this regime occurs. Our main subject are learning curves of Bayes error versus training set size.

gaussian process regression, kernel, random graph, (1 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.65)

Add feedback

k-NN Regression Adapts to Local Intrinsic Dimension

Neural Information Processing SystemsFeb-16-2024, 07:53:09 GMT

Many nonparametric regressors were recently shown to converge at rates that depend only on the intrinsic dimension of data. These regressors thus escape the curse of dimension when high-dimensional data has low intrinsic dimension (e.g. a manifold). We show that k -NN regression is also adaptive to intrinsic dimension. In particular our rates are local to a query x and depend only on the way masses of balls centered at x vary with radius. Furthermore, we show a simple way to choose k k(x) locally at any x so as to nearly achieve the minimax rate at x in terms of the unknown intrinsic dimension in the vicinity of x .

artificial intelligence, intrinsic dimension, machine learning, (2 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.66)

Add feedback

A Computationally Efficient Learning-Based Model Predictive Control for Multirotors under Aerodynamic Disturbances

Akbari, Babak, Greeff, Melissa

arXiv.org Artificial IntelligenceFeb-15-2024

Neglecting complex aerodynamic effects hinders high-speed yet high-precision multirotor autonomy. In this paper, we present a computationally efficient learning-based model predictive controller that simultaneously optimizes a trajectory that can be tracked within the physical limits (on thrust and orientation) of the multirotor system despite unknown aerodynamic forces and adapts the control input. To do this, we leverage the well-known differential flatness property of multirotors, which allows us to transform their nonlinear dynamics into a linear model. The main limitation of current flatness-based planning and control approaches is that they often neglect dynamic feasibility. This is because these constraints are nonlinear as a result of the mapping between the input, i.e., multirotor thrust, and the flat state. In our approach, we learn a novel representation of the drag forces by learning the mapping from the flat state to the multirotor thrust vector (in a world frame) as a Gaussian Process (GP). Our proposed approach leverages the properties of GPs to develop a convex optimal controller that can be iteratively solved as a second-order cone program (SOCP). In simulation experiments, our proposed approach outperforms related model predictive controllers that do not account for aerodynamic effects on trajectory feasibility, leading to a reduction of up to 55% in absolute tracking error.

artificial intelligence, constraint, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2402.10323

Country: North America > United States (0.14)

Genre: Research Report (0.50)

Industry: Energy > Oil & Gas > Upstream (0.41)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Constraint-Based Reasoning (0.46)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.34)

Add feedback

Top-$K$ ranking with a monotone adversary

Yang, Yuepeng, Chen, Antares, Orecchia, Lorenzo, Ma, Cong

arXiv.org Artificial IntelligenceFeb-12-2024

In this paper, we address the top-$K$ ranking problem with a monotone adversary. We consider the scenario where a comparison graph is randomly generated and the adversary is allowed to add arbitrary edges. The statistician's goal is then to accurately identify the top-$K$ preferred items based on pairwise comparisons derived from this semi-random comparison graph. The main contribution of this paper is to develop a weighted maximum likelihood estimator (MLE) that achieves near-optimal sample complexity, up to a $\log^2(n)$ factor, where n denotes the number of items under comparison. This is made possible through a combination of analytical and algorithmic innovations. On the analytical front, we provide a refined $\ell_\infty$ error analysis of the weighted MLE that is more explicit and tighter than existing analyses. It relates the $\ell_\infty$ error with the spectral properties of the weighted comparison graph. Motivated by this, our algorithmic innovation involves the development of an SDP-based approach to reweight the semi-random graph and meet specified spectral properties. Additionally, we propose a first-order method based on the Matrix Multiplicative Weight Update (MMWU) framework. This method efficiently solves the resulting SDP in nearly-linear time relative to the size of the semi-random comparison graph.

adversary, comparison graph, graph, (14 more...)

arXiv.org Artificial Intelligence

2402.07445

Country:

North America > United States > New York > New York County > New York City (0.04)
North America > United States > Illinois > Cook County > Chicago (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > France > Île-de-France > Paris > Paris (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.34)

Add feedback

Using Graph Theory for Improving Machine Learning-based Detection of Cyber Attacks

Zonneveld, Giacomo, Principi, Lorenzo, Baldi, Marco

arXiv.org Artificial IntelligenceFeb-12-2024

Early detection of network intrusions and cyber threats is one of the main pillars of cybersecurity. One of the most effective approaches for this purpose is to analyze network traffic with the help of artificial intelligence algorithms, with the aim of detecting the possible presence of an attacker by distinguishing it from a legitimate user. This is commonly done by collecting the traffic exchanged between terminals in a network and analyzing it on a per-packet or per-connection basis. In this paper, we propose instead to perform pre-processing of network traffic under analysis with the aim of extracting some new metrics on which we can perform more efficient detection and overcome some limitations of classical approaches. These new metrics are based on graph theory, and consider the network as a whole, rather than focusing on individual packets or connections. Our approach is validated through experiments performed on publicly available data sets, from which it results that it can not only overcome some of the limitations of classical approaches, but also achieve a better detection capability of cyber threats.

detection, graph, traffic, (15 more...)

arXiv.org Artificial Intelligence

2402.07878

Country: Europe > Italy > Marche > Ancona Province > Ancona (0.04)

Genre: Research Report (0.64)

Industry:

Information Technology > Security & Privacy (1.00)
Government > Military > Cyberwarfare (0.70)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Communications > Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
(2 more...)

Add feedback

Global optimality under amenable symmetry constraints

Orbanz, Peter

arXiv.org Artificial IntelligenceFeb-12-2024

We ask whether there exists a function or measure that (1) minimizes a given convex functional or risk and (2) satisfies a symmetry property specified by an amenable group of transformations. Examples of such symmetry properties are invariance, equivariance, or quasi-invariance. Our results draw on old ideas of Stein and Le Cam and on approximate group averages that appear in ergodic theorems for amenable groups. A class of convex sets known as orbitopes in convex analysis emerges as crucial, and we establish properties of such orbitopes in nonparametric settings. We also show how a simple device called a cocycle can be used to reduce different forms of symmetry to a single problem. As applications, we obtain results on invariant kernel mean embeddings and a Monge-Kantorovich theorem on optimality of transport plans under symmetry constraints. We also explain connections to the Hunt-Stein theorem on invariant tests.

g-invariant, orbitope, probability measure, (16 more...)

arXiv.org Artificial Intelligence

2402.07613

Country:

North America > United States > Illinois (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report > New Finding (0.48)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.67)

Add feedback