AITopics | Mathematical & Statistical Methods

Collaborating Authors

Mathematical & Statistical Methods

News Overviews Instructional Materials AI-Alerts Classics

Learning to Stop Cut Generation for Efficient Mixed-Integer Linear Programming

arXiv.org Artificial IntelligenceFeb-2-2024

Cutting planes (cuts) play an important role in solving mixed-integer linear programs (MILPs), as they significantly tighten the dual bounds and improve the solving performance. A key problem for cuts is when to stop cuts generation, which is important for the efficiency of solving MILPs. However, many modern MILP solvers employ hard-coded heuristics to tackle this problem, which tends to neglect underlying patterns among MILPs from certain applications. To address this challenge, we formulate the cuts generation stopping problem as a reinforcement learning problem and propose a novel hybrid graph representation model (HYGRO) to learn effective stopping strategies. An appealing feature of HYGRO is that it can effectively capture both the dynamic and static features of MILPs, enabling dynamic decision-making for the stopping strategies. To the best of our knowledge, HYGRO is the first data-driven method to tackle the cuts generation stopping problem. By integrating our approach with modern solvers, experiments demonstrate that HYGRO significantly improves the efficiency of solving MILPs compared to competitive baselines, achieving up to 31% improvement.

dataset, hygro, plane, (17 more...)

arXiv.org Artificial Intelligence

2401.17527

Country:

Asia > China (0.04)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > Indiana (0.04)
(2 more...)

Genre: Research Report > New Finding (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

Universal Imitation Games

Mahadevan, Sridhar

arXiv.org Artificial IntelligenceFeb-1-2024

Alan Turing proposed in 1950 a framework called an imitation game to decide if a machine could think. Using mathematics developed largely after Turing -- category theory -- we analyze a broader class of universal imitation games (UIGs), which includes static, dynamic, and evolutionary games. In static games, the participants are in a steady state. In dynamic UIGs, "learner" participants are trying to imitate "teacher" participants over the long run. In evolutionary UIGs, the participants are competing against each other in an evolutionary game, and participants can go extinct and be replaced by others with higher fitness. We use the framework of category theory -- in particular, two influential results by Yoneda -- to characterize each type of imitation game. Universal properties in categories are defined by initial and final objects. We characterize dynamic UIGs where participants are learning by inductive inference as initial algebras over well-founded sets, and contrast them with participants learning by conductive inference over the final coalgebra of non-well-founded sets. We briefly discuss the extension of our categorical framework for UIGs to imitation games on quantum computers.

arbitrary category, dinatural transformation, universal coalgebra jacob, (15 more...)

arXiv.org Artificial Intelligence

2405.0154

Country:

Europe > United Kingdom > England > Oxfordshire > Oxford (0.27)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.13)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(19 more...)

Genre:

Research Report (1.00)
Overview (1.00)

Industry:

Education (1.00)
Health & Medicine > Therapeutic Area > Oncology (0.92)
Information Technology (0.87)
(2 more...)

Technology:

Information Technology > Game Theory (1.00)
Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
(13 more...)

Add feedback

Data-Driven Projection for Reducing Dimensionality of Linear Programs: Generalization Bound and Learning Methods

Sakaue, Shinsaku, Oki, Taihei

arXiv.org Artificial IntelligenceJan-29-2024

How to solve high-dimensional linear programs (LPs) efficiently is a fundamental question. Recently, there has been a surge of interest in reducing LP sizes using \textit{random projections}, which can accelerate solving LPs independently of improving LP solvers. In this paper, we explore a new direction of \emph{data-driven projections}, which use projection matrices learned from data instead of random projection matrices. Given data of past $n$-dimensional LPs, we learn an $n\times k$ projection matrix such that $n > k$. When addressing a future LP instance, we reduce its dimensionality from $n$ to $k$ via the learned projection matrix, solve the resulting LP to obtain a $k$-dimensional solution, and apply the learned matrix to it to recover an $n$-dimensional solution. On the theoretical side, a natural question is: how much data is sufficient to ensure the quality of recovered solutions? We address this question based on the framework of \textit{data-driven algorithm design}, which connects the amount of data sufficient for establishing generalization bounds to the \textit{pseudo-dimension} of performance metrics. We obtain an $\tilde{\mathrm{O}}(nk^2)$ upper bound on the pseudo-dimension, where $\tilde{\mathrm{O}}$ compresses logarithmic factors. We also provide an $\Omega(nk)$ lower bound, implying our result is tight up to an $\tilde{\mathrm{O}}(k)$ factor. On the practical side, we explore two natural methods for learning projection matrices: PCA- and gradient-based methods. While the former is simple and efficient, the latter can sometimes lead to better solution quality. Our experiments confirm the practical benefit of learning projection matrices from data, achieving significantly higher solution quality than the existing random projection while greatly reducing the time for solving LPs.

constraint, optimal value, projection matrix, (13 more...)

arXiv.org Artificial Intelligence

2309.00203

Country:

Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
Asia > Middle East > Israel (0.05)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report > New Finding (0.88)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.48)

Add feedback

Finite Sample Confidence Regions for Linear Regression Parameters Using Arbitrary Predictors

Guille-Escuret, Charles, Ndiaye, Eugene

arXiv.org Artificial IntelligenceJan-26-2024

We explore a novel methodology for constructing confidence regions for parameters of linear models, using predictions from any arbitrary predictor. Our framework requires minimal assumptions on the noise and can be extended to functions deviating from strict linearity up to some adjustable threshold, thereby accommodating a comprehensive and pragmatically relevant set of functions. The derived confidence regions can be cast as constraints within a Mixed Integer Linear Programming framework, enabling optimisation of linear objectives. This representation enables robust optimization and the extraction of confidence intervals for specific parameter coordinates. Unlike previous methods, the confidence region can be empty, which can be used for hypothesis testing. Finally, we validate the empirical applicability of our method on synthetic data.

assumption, confidence region, noise, (14 more...)

arXiv.org Artificial Intelligence

2401.15254

Country: Asia > Middle East > Jordan (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.65)

Add feedback

Sparse identification of nonlinear dynamics in the presence of library and system uncertainty

O'Brien, Andrew

arXiv.org Artificial IntelligenceJan-23-2024

The SINDy algorithm has been successfully used to identify the governing equations of dynamical systems from time series data. However, SINDy assumes the user has prior knowledge of the variables in the system and of a function library that can act as a basis for the system. In this paper, we demonstrate on real world data how the Augmented SINDy algorithm outperforms SINDy in the presence of system variable uncertainty. We then show SINDy can be further augmented to perform robustly when both kinds of uncertainty are present.

augmented sindy, equation, sindy, (16 more...)

arXiv.org Artificial Intelligence

2401.13099

Country:

North America > United States > California > Orange County > Laguna Hills (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.42)

Add feedback

Model-Free $\delta$-Policy Iteration Based on Damped Newton Method for Nonlinear Continuous-Time H$\infty$ Tracking Control

Wang, Qi

arXiv.org Artificial IntelligenceJan-23-2024

This paper presents a {\delta}-PI algorithm which is based on damped Newton method for the H{\infty} tracking control problem of unknown continuous-time nonlinear system. A discounted performance function and an augmented system are used to get the tracking Hamilton-Jacobi-Isaac (HJI) equation. Tracking HJI equation is a nonlinear partial differential equation, traditional reinforcement learning methods for solving the tracking HJI equation are mostly based on the Newton method, which usually only satisfies local convergence and needs a good initial guess. Based upon the damped Newton iteration operator equation, a generalized tracking Bellman equation is derived firstly. The {\delta}-PI algorithm can seek the optimal solution of the tracking HJI equation by iteratively solving the generalized tracking Bellman equation. On-policy learning and off-policy learning {\delta}-PI reinforcement learning methods are provided, respectively. Off-policy version {\delta}-PI algorithm is a model-free algorithm which can be performed without making use of a priori knowledge of the system dynamics. NN-based implementation scheme for the off-policy {\delta}-PI algorithms is shown. The suitability of the model-free {\delta}-PI algorithm is illustrated with a nonlinear system simulation.

algorithm, bellman equation, equation, (14 more...)

arXiv.org Artificial Intelligence

2401.12882

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Asia > China > Beijing > Beijing (0.04)
Oceania > Australia > New South Wales > Sydney (0.04)
(5 more...)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Add feedback

PlasmoData.jl -- A Julia Framework for Modeling and Analyzing Complex Data as Graphs

Cole, David L, Zavala, Victor M

arXiv.org Artificial IntelligenceJan-21-2024

Datasets encountered in scientific and engineering applications appear in complex formats (e.g., images, multivariate time series, molecules, video, text strings, networks). Graph theory provides a unifying framework to model such datasets and enables the use of powerful tools that can help analyze, visualize, and extract value from data. In this work, we present PlasmoData.jl, an open-source, Julia framework that uses concepts of graph theory to facilitate the modeling and analysis of complex datasets. The core of our framework is a general data modeling abstraction, which we call a DataGraph. We show how the abstraction and software implementation can be used to represent diverse data objects as graphs and to enable the use of tools from topology, graph theory, and machine learning (e.g., graph neural networks) to conduct a variety of tasks. We illustrate the versatility of the framework by using real datasets: i) an image classification problem using topological data analysis to extract features from the graph model to train machine learning models; ii) a disease outbreak problem where we model multivariate time series as graphs to detect abnormal events; and iii) a technology pathway analysis problem where we highlight how we can use graphs to navigate connectivity. Our discussion also highlights how PlasmoData.jl leverages native Julia capabilities to enable compact syntax, scalable computations, and interfaces with diverse packages.

artificial intelligence, graph, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2401.11404

Country:

Asia (0.27)
South America > Brazil (0.14)
North America > United States > Wisconsin > Dane County > Madison (0.14)
Europe (0.14)

Genre: Research Report (1.00)

Industry:

Materials > Chemicals > Commodity Chemicals > Petrochemicals > Polymers & Plastics (1.00)
Health & Medicine > Therapeutic Area > Neurology (1.00)
Energy > Oil & Gas > Downstream (0.93)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.74)

Add feedback

Solution of the Probabilistic Lambert Problem: Connections with Optimal Mass Transport, Schr\"odinger Bridge and Reaction-Diffusion PDEs

Teter, Alexis M. H., Nodozi, Iman, Halder, Abhishek

arXiv.org Artificial IntelligenceJan-19-2024

Lambert's problem concerns with transferring a spacecraft from a given initial to a given terminal position within prescribed flight time via velocity control subject to a gravitational force field. We consider a probabilistic variant of the Lambert problem where the knowledge of the endpoint constraints in position vectors are replaced by the knowledge of their respective joint probability density functions. We show that the Lambert problem with endpoint joint probability density constraints is a generalized optimal mass transport (OMT) problem, thereby connecting this classical astrodynamics problem with a burgeoning area of research in modern stochastic control and stochastic machine learning. This newfound connection allows us to rigorously establish the existence and uniqueness of solution for the probabilistic Lambert problem. The same connection also helps to numerically solve the probabilistic Lambert problem via diffusion regularization, i.e., by leveraging further connection of the OMT with the Schr\"odinger bridge problem (SBP). This also shows that the probabilistic Lambert problem with additive dynamic process noise is in fact a generalized SBP, and can be solved numerically using the so-called Schr\"odinger factors, as we do in this work. We explain how the resulting analysis leads to solving a boundary-coupled system of reaction-diffusion PDEs where the nonlinear gravitational potential appears as the reaction rate. We propose novel algorithms for the same, and present illustrative numerical results. Our analysis and the algorithmic framework are nonparametric, i.e., we make neither statistical (e.g., Gaussian, first few moments, mixture or exponential family, finite dimensionality of the sufficient statistic) nor dynamical (e.g., Taylor series) approximations.

algorithm, lambert problem, probabilistic lambert problem, (15 more...)

arXiv.org Artificial Intelligence

2401.07961

Country:

North America > United States > Iowa (0.04)
North America > Mexico > Mexico City > Mexico City (0.04)

Genre: Research Report (0.64)

Industry: Energy (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Mathematics of Computing (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.68)

Add feedback

Wasserstein Distributionally Robust Policy Evaluation and Learning for Contextual Bandits

Shen, Yi, Xu, Pan, Zavlanos, Michael M.

arXiv.org Artificial IntelligenceJan-17-2024

Off-policy evaluation and learning are concerned with assessing a given policy and learning an optimal policy from offline data without direct interaction with the environment. Often, the environment in which the data are collected differs from the environment in which the learned policy is applied. To account for the effect of different environments during learning and execution, distributionally robust optimization (DRO) methods have been developed that compute worst-case bounds on the policy values assuming that the distribution of the new environment lies within an uncertainty set. Typically, this uncertainty set is defined based on the KL divergence around the empirical distribution computed from the logging dataset. However, the KL uncertainty set fails to encompass distributions with varying support and lacks awareness of the geometry of the distribution support. As a result, KL approaches fall short in addressing practical environment mismatches and lead to over-fitting to worst-case scenarios. To overcome these limitations, we propose a novel DRO approach that employs the Wasserstein distance instead. While Wasserstein DRO is generally computationally more expensive compared to KL DRO, we present a regularized method and a practical (biased) stochastic gradient descent method to optimize the policy efficiently. We also provide a theoretical analysis of the finite sample complexity and iteration complexity for our proposed method. We further validate our approach using a public dataset that was recorded in a randomized stoke trial.

machine learning research, transaction, wasserstein dro, (12 more...)

arXiv.org Artificial Intelligence

2309.08748

Country: Europe > France (0.04)

Genre:

Research Report > Strength High (1.00)
Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.55)

Add feedback

Information Flow Rate for Cross-Correlated Stochastic Processes

Hristopulos, Dionissios T.

arXiv.org Artificial IntelligenceJan-10-2024

Causal inference seeks to identify cause-and-effect interactions in coupled systems. A recently proposed method by Liang detects causal relations by quantifying the direction and magnitude of information flow between time series. The theoretical formulation of information flow for stochastic dynamical systems provides a general expression and a data-driven statistic for the rate of entropy transfer between different system units. To advance understanding of information flow rate in terms of intuitive concepts and physically meaningful parameters, we investigate statistical properties of the data-driven information flow rate between coupled stochastic processes. We derive relations between the expectation of the information flow rate statistic and properties of the auto- and cross-correlation functions. Thus, we elucidate the dependence of the information flow rate on the analytical properties and characteristic times of the correlation functions. Our analysis provides insight into the influence of the sampling step, the strength of cross-correlations, and the temporal delay of correlations on information flow rate. We support the theoretical results with numerical simulations of correlated Gaussian processes.

ifr, information flow, stochastic process, (15 more...)

arXiv.org Artificial Intelligence

2401.0495

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > New York > New York County > New York City (0.04)
(7 more...)

Genre: Research Report (1.00)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.62)

Add feedback