AITopics | Optimization

Collaborating Authors

Optimization

News Overviews Instructional Materials AI-Alerts Classics

Random Coordinate Descent on the Wasserstein Space of Probability Measures

arXiv.org Machine LearningApr-3-2026

Optimization over the space of probability measures endowed with the Wasserstein-2 geometry is central to modern machine learning and mean-field modeling. However, traditional methods relying on full Wasserstein gradients often suffer from high computational overhead in high-dimensional or ill-conditioned settings. We propose a randomized coordinate descent framework specifically designed for the Wasserstein manifold, introducing both Random Wasserstein Coordinate Descent (RWCD) and Random Wasserstein Coordinate Proximal{-Gradient} (RWCP) for composite objectives. By exploiting coordinate-wise structures, our methods adapt to anisotropic objective landscapes where full-gradient approaches typically struggle. We provide a rigorous convergence analysis across various landscape geometries, establishing guarantees under non-convex, Polyak-Łojasiewicz, and geodesically convex conditions. Our theoretical results mirror the classic convergence properties found in Euclidean space, revealing a compelling symmetry between coordinate descent on vectors and on probability measures. The developed techniques are inherently adaptive to the Wasserstein geometry and offer a robust analytical template that can be extended to other optimization solvers within the space of measures. Numerical experiments on ill-conditioned energies demonstrate that our framework offers significant speedups over conventional full-gradient methods.

artificial intelligence, machine learning, probability measure, (15 more...)

arXiv.org Machine Learning

2604.01606

Country:

North America > United States > Wisconsin > Dane County > Madison (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.93)

Add feedback

Scenario theory for multi-criteria data-driven decision making

Garatti, Simone, Manieri, Lucrezia, Falsone, Alessandro, Carè, Algo, Campi, Marco C., Prandini, Maria

arXiv.org Machine LearningApr-2-2026

The scenario approach provides a powerful data-driven framework for designing solutions under uncertainty with rigorous probabilistic robustness guarantees. Existing theory, however, primarily addresses assessing robustness with respect to a single appropriateness criterion for the solution based on a dataset, whereas many practical applications - including multi-agent decision problems - require the simultaneous consideration of multiple criteria and the assessment of their robustness based on multiple datasets, one per criterion. This paper develops a general scenario theory for multi-criteria data-driven decision making. A central innovation lies in the collective treatment of the risks associated with violations of individual criteria, which yields substantially more accurate robustness certificates than those derived from a naive application of standard results. In turn, this approach enables a sharper quantification of the robustness level with which all criteria are simultaneously satisfied. The proposed framework applies broadly to multi-criteria data-driven decision problems, providing a principled, scalable, and theoretically grounded methodology for design under uncertainty.

artificial intelligence, ktot, optimization problem, (17 more...)

arXiv.org Machine Learning

2604.00553

Country: Europe > Italy > Lombardy > Milan (0.04)

Genre: Research Report (0.40)

Industry: Information Technology > Security & Privacy (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.68)

Add feedback

Convergence of projected stochastic natural gradient variational inference for various step size and sample or batch size schedules

Guilmeau, Thomas, Hendrikx, Hadrien, Forbes, Florence

arXiv.org Machine LearningApr-2-2026

Stochastic natural gradient variational inference (NGVI) is a popular and efficient algorithm for Bayesian inference. Despite empirical success, the convergence of this method is still not fully understood. In this work, we define and study a projected stochastic NGVI when variational distributions form an exponential family. Stochasticity arises when either gradients are intractable expectations or large sums. We prove new non-asymptotic convergence results for combinations of constant or decreasing step sizes and constant or increasing sample/batch sizes. When all hyperparameters are fixed, NGVI is shown to converge geometrically to a neighborhood of the optimum, while we establish convergence to the optimum with rates of the form $\mathcal{O}\left(\frac{1}{T^ρ} \right)$, possibly with $ρ\geq 1$, for all other combinations of step size and sample/batch size schedules. These rates apply when the target posterior distribution is close in some sense to the considered exponential family. Our theoretical results extend existing NGVI and stochastic optimization results and provide more flexibility to adjust, in a principled way, step sizes and sample/batch sizes in order to meet speed, resources, or accuracy constraints.

artificial intelligence, intdoma, machine learning, (16 more...)

arXiv.org Machine Learning

2604.00683

Country:

Europe > France > Auvergne-Rhône-Alpes > Isère > Grenoble (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Africa > Middle East > Morocco > Tanger-Tetouan-Al Hoceima Region > Tangier (0.04)

Genre: Research Report (0.66)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.90)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.67)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.34)

Add feedback

Optimistic Actor-Critic with Parametric Policies for Linear Markov Decision Processes

Lin, Max Qiushi, Asad, Reza, Tan, Kevin, Ishfaq, Haque, Szepesvari, Csaba, Vaswani, Sharan

arXiv.org Machine LearningApr-2-2026

Although actor-critic methods have been successful in practice, their theoretical analyses have several limitations. Specifically, existing theoretical work either sidesteps the exploration problem by making strong assumptions or analyzes impractical methods with complicated algorithmic modifications. Moreover, the actor-critic methods analyzed for linear MDPs often employ natural policy gradient and construct "implicit" policies without explicit parameterization. Such policies are computationally expensive to sample from, making the environment interactions inefficient. To that end, we focus on the finite-horizon linear MDPs and propose an optimistic actor-critic framework that uses parametric log-linear policies. In particular, we introduce a tractable $\textit{logit-matching}$ regression objective for the actor. For the critic, we use approximate Thompson sampling via Langevin Monte Carlo to obtain optimistic value estimates. We prove that the resulting algorithm achieves $\widetilde{\mathcal{O}}(ε^{-4})$ and $\widetilde{\mathcal{O}}(ε^{-2})$ sample complexity in the on-policy and off-policy setting, respectively. Our results match prior theoretical work in achieving the state-of-the-art sample complexity, while our algorithm is more aligned with practice.

bqth, machine learning, reinforcement learning, (19 more...)

arXiv.org Machine Learning

2603.28595

Country:

Europe > United Kingdom (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.70)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.46)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.45)

Add feedback

mlr3mbo: Bayesian Optimization in R

Becker, Marc, Schneider, Lennart, Binder, Martin, Kotthoff, Lars, Bischl, Bernd

arXiv.org Machine LearningApr-1-2026

We present mlr3mbo, a comprehensive and modular toolbox for Bayesian optimization in R. mlr3mbo supports single- and multi-objective optimization, multi-point proposals, batch and asynchronous parallelization, input and output transformations, and robust error handling. While it can be used for many standard Bayesian optimization variants in applied settings, researchers can also construct custom BO algorithms from its flexible building blocks. In addition to an introduction to the software, its design principles, and its building blocks, the paper presents two extensive empirical evaluations of the software on the surrogate-based benchmark suite YAHPO Gym. To identify robust default configurations for both numeric and mixed-hierarchical optimization regimes, and to gain further insights into the respective impacts of individual settings, we run a coordinate descent search over the mlr3mbo configuration space and analyze its results. Furthermore, we demonstrate that mlr3mbo achieves state-of-the-art performance by benchmarking it against a wide range of optimizers, including HEBO, SMAC3, Ax, and Optuna.

artificial intelligence, data mining, machine learning, (20 more...)

arXiv.org Machine Learning

2603.2973

Country:

North America > United States > New York > New York County > New York City (0.04)
Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
North America > United States > Wyoming (0.04)
(9 more...)

Genre: Research Report > New Finding (0.46)

Industry: Health & Medicine (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Data Science > Data Mining (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)

Add feedback

Transfer Learning in Bayesian Optimization for Aircraft Design

Tfaily, Ali, Diouane, Youssef, Bartoli, Nathalie, Kokkolaras, Michael

arXiv.org Machine LearningApr-1-2026

The use of transfer learning within Bayesian optimization addresses the disadvantages of the so-called \textit{cold start} problem by using source data to aid in the optimization of a target problem. We present a method that leverages an ensemble of surrogate models using transfer learning and integrates it in a constrained Bayesian optimization framework. We identify challenges particular to aircraft design optimization related to heterogeneous design variables and constraints. We propose the use of a partial-least-squares dimension reduction algorithm to address design space heterogeneity, and a \textit{meta} data surrogate selection method to address constraint heterogeneity. Numerical benchmark problems and an aircraft conceptual design optimization problem are used to demonstrate the proposed methods. Results show significant improvement in convergence in early optimization iterations compared to standard Bayesian optimization, with improved prediction accuracy for both objective and constraint surrogate models.

artificial intelligence, machine learning, optimization, (19 more...)

arXiv.org Machine Learning

2603.28999

Country:

Europe > Italy (0.04)
North America > United States > New Jersey (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
(2 more...)

Genre: Research Report (1.00)

Industry: Aerospace & Defense > Aircraft (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Multi-fidelity approaches for general constrained Bayesian optimization with application to aircraft design

Cordelier, Oihan, Diouane, Youssef, Bartoli, Nathalie, Laurendeau, Eric

arXiv.org Machine LearningApr-1-2026

Aircraft design relies heavily on solving challenging and computationally expensive Multidisciplinary Design Optimization problems. In this context, there has been growing interest in multi-fidelity models for Bayesian optimization to improve the MDO process by balancing computational cost and accuracy through the combination of high- and low-fidelity simulation models, enabling efficient exploration of the design process at a minimal computational effort. In the existing literature, fidelity selection focuses only on the objective function to decide how to integrate multiple fidelity levels, balancing precision and computational cost using variance reduction criteria. In this work, we propose novel multi-fidelity selection strategies. Specifically, we demonstrate how incorporating information from both the objective and the constraints can further reduce computational costs without compromising the optimality of the solution. We validate the proposed multi-fidelity optimization strategy by applying it to four analytical test cases, showcasing its effectiveness. The proposed method is used to efficiently solve a challenging aircraft wing aero-structural design problem. The proposed setting uses a linear vortex lattice method and a finite element method for the aerodynamic and structural analysis respectively. We show that employing our proposed multi-fidelity approach leads to $86\%$ to $200\%$ more constraint compliant solutions given a limited budget compared to the state-of-the-art approach.

artificial intelligence, modeling & simulation, optimization, (17 more...)

arXiv.org Machine Learning

2603.28987

Country:

North America > United States > Virginia (0.04)
North America > Canada (0.04)
Europe > Germany (0.04)
(2 more...)

Genre: Research Report (0.69)

Industry: Aerospace & Defense > Aircraft (0.34)

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.90)

Add feedback

Parameter Estimation in Stochastic Differential Equations via Wiener Chaos Expansion and Stochastic Gradient Descent

Delgado-Vences, Francisco, Pavón-Español, José Julián, Ornelas, Arelly

arXiv.org Machine LearningMar-31-2026

This study addresses the inverse problem of parameter estimation for Stochastic Differential Equations (SDEs) by minimizing a regularized discrepancy functional via Stochastic Gradient Descent (SGD). To achieve computational efficiency, we leverage the Wiener Chaos Expansion (WCE), a spectral decomposition technique that projects the stochastic solution onto an orthogonal basis of Hermite polynomials. This transformation effectively maps the stochastic dynamics into a hierarchical system of deterministic functions, termed the \textit{propagator}. By reducing the stochastic inference task to a deterministic optimization problem, our framework circumvents the heavy computational burden and sampling requirements of traditional simulation-based methods like MCMC or MLE. The robustness and scalability of the proposed approach are demonstrated through numerical experiments on various non-linear SDEs, including models for individual biological growth. Results show that the WCE-SGD framework provides accurate parameter recovery even from discrete, noisy observations, offering a significant paradigm shift in the efficient modeling of complex stochastic systems.

artificial intelligence, machine learning, optimization problem, (13 more...)

arXiv.org Machine Learning

2603.27019

Country:

North America > Mexico (0.04)
North America > United States (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(2 more...)

Genre: Research Report (0.84)

Industry: Banking & Finance (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.70)

Add feedback

Mixture-Model Preference Learning for Many-Objective Bayesian Optimization

Dubey, Manisha, De Peuter, Sebastiaan, Wang, Wanrong, Kaski, Samuel

arXiv.org Machine LearningMar-31-2026

Preference-based many-objective optimization faces two obstacles: an expanding space of trade-offs and heterogeneous, context-dependent human value structures. Towards this, we propose a Bayesian framework that learns a small set of latent preference archetypes rather than assuming a single fixed utility function, modelling them as components of a Dirichlet-process mixture with uncertainty over both archetypes and their weights. To query efficiently, we designing hybrid queries that target information about (i) mode identity and (ii) within-mode trade-offs. Under mild assumptions, we provide a simple regret guarantee for the resulting mixture-aware Bayesian optimization procedure. Empirically, our method outperforms standard baselines on synthetic and real-world many-objective benchmarks, and mixture-aware diagnostics reveal structure that regret alone fails to capture.

artificial intelligence, machine learning, optimization, (20 more...)

arXiv.org Machine Learning

2603.2841

Country:

Europe > United Kingdom > Scotland (0.04)
Europe > United Kingdom > England > Greater Manchester > Manchester (0.04)
Europe > Netherlands > North Holland > Amsterdam (0.04)
(2 more...)

Genre: Research Report (0.50)

Industry: Energy (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.34)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.34)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.34)

Add feedback

Machine Learning-Assisted High-Dimensional Matrix Estimation

Tian, Wan, Yang, Hui, Lian, Zhouhui, Zhang, Lingyue, Peng, Yijie

arXiv.org Machine LearningMar-31-2026

Efficient estimation of high-dimensional matrices--including covariance and precision matrices--is a cornerstone of modern multivariate statistics. Most existing studies have focused primarily on the theoretical properties of the estimators (e.g., consistency and sparsity), while largely overlooking the computational challenges inherent in high-dimensional settings. Theoretically, we first prove the convergence of LADMM, and then establish the convergence, convergence rate, and monotonicity of its reparameterized counterpart; importantly, we show that the reparameterized LADMM enjoys a faster convergence rate. Notably, the proposed reparameterization theory and methodology are applicable to the estimation of both high-dimensional covariance and precision matrices. Keywords: ADMM; High-dimensional; Learning-based optimization; Matrix estimation. 1. Introduction High-dimensional matrix estimation--covering both covariance and precision matrix estimation--constitutes a cornerstone of modern statistics and data science [1, 2, 3]. Accurate covariance estimation enables the characterization of dependence structures among a large number of variables [4, 5, 6], which is indispensable in diverse domains such as genomics [7, 8], neuroscience [9], finance [10, 11, 12], and climate science [13, 14]. Over the past two decades, substantial progress has been made in the statistical theory of high-dimensional matrix estimation, particularly with respect to the accuracy of estimators, including properties such as sparsistency and consistency [5, 15, 16]. However, in empirical studies, the dimensionality is often only on the order of tens to hundreds, and in many cases is comparable to the sample size [21, 22, 23, 24]. This observation highlights a notable gap between the statistical theory of estimators and the practical challenges of their computational implementation.

artificial intelligence, machine learning, optimization problem, (16 more...)

arXiv.org Machine Learning

2603.28346

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > China > Liaoning Province > Dalian (0.04)
Asia > China > Hubei Province > Wuhan (0.04)
Asia > China > Beijing > Beijing (0.04)

Genre: Research Report (0.81)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (0.34)
Health & Medicine > Therapeutic Area > Neurology (0.34)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.66)

Add feedback