cbo
Mean-Field Limits for Two-Layer Neural Networks Trained with Consensus-Based Optimization
De Deyn, William, Herty, Michael, Samaey, Giovanni
Artificial Intelligence has witnessed remarkable progress over the past decades, both in its capabilities and its range of applications. Today, neural networks are present in a wide variety of fields. One classical application is function approximation, which is supported by the universal approximation theorem [34]. In computer vision, convolutional neural networks form the backbone of most modern architectures [39, 38], while the framework of neural ordinary differential equations has contributed significantly to optimal control problems [17, 10]. In natural language processing and speech recognition, recurrent neural networks and their long short-term memory variants have yielded significant performance improvements [33, 51]. More recently, diffusion models have proven to be powerful generative models, with applications ranging from image denoising to video generation [56]. Neural networks have even found their way into scientific computing; the most notable example is physics-informed neural networks, which are capable of solving both forward and inverse problems governed by partial differential equations [50]. A neural network can be viewed, in general, as a function parametrized by a set of weights and biases, which we collectively refer to as parameters.
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
- North America > United States > New York (0.04)
- Europe > Belgium > Flanders > Flemish Brabant > Leuven (0.04)
- (7 more...)
- Research Report (0.50)
- Overview (0.46)
Maxitive Donsker-Varadhan Formulation for Possibilistic Variational Inference
Singh, Jasraj, Wongso, Shelvia, Houssineau, Jeremie, Chérief-Abdellatif, Badr-Eddine
Variational inference (VI) is a cornerstone of modern Bayesian learning, enabling approximate inference in complex models that would otherwise be intractable. However, its formulation depends on expectations and divergences defined through high-dimensional integrals, often rendering analytical treatment impossible and necessitating heavy reliance on approximate learning and inference techniques. Possibility theory, an imprecise-probability framework, allows one to model epistemic uncertainty directly instead of leveraging subjective probabilities. While this framework provides robustness and interpretability under sparse or imprecise information, adapting VI to the possibilistic setting requires rethinking core concepts such as entropy and divergence, which presuppose additivity. In this work, we develop a principled formulation of possibilistic variational inference and apply it to a special class of exponential-family functions, highlighting parallels with their probabilistic counterparts and revealing the distinctive mathematical structures of possibility theory.
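For context, and not taken from the paper itself, the classical probabilistic Donsker-Varadhan variational representation that the title's maxitive formulation parallels reads
\[
\log \mathbb{E}_{P}\big[e^{f}\big] \;=\; \sup_{Q \ll P} \Big\{ \mathbb{E}_{Q}[f] - \mathrm{KL}(Q \,\|\, P) \Big\},
\qquad
\mathrm{KL}(Q \,\|\, P) \;=\; \sup_{f} \Big\{ \mathbb{E}_{Q}[f] - \log \mathbb{E}_{P}\big[e^{f}\big] \Big\}.
\]
The expectations and the KL divergence appearing here are the additive objects that, per the abstract, need rethinking in the possibilistic (maxitive) setting.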
- Europe > France > Hauts-de-France > Nord > Lille (0.04)
- North America > United States > Illinois > Champaign County > Urbana (0.04)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.88)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.48)
Consensus-based optimization for closed-box adversarial attacks and a connection to evolution strategies
Roith, Tim, Bungert, Leon, Wacker, Philipp
Consensus-based optimization (CBO) has established itself as an efficient gradient-free optimization scheme, with attractive mathematical properties, such as mean-field convergence results for non-convex loss functions. In this work, we study CBO in the context of closed-box adversarial attacks, which are imperceptible input perturbations that aim to fool a classifier without accessing its gradient. Our contribution is to establish a connection between so-called consensus hopping, as introduced by Riedl et al., and the natural evolution strategies (NES) commonly applied in the context of adversarial attacks, and to rigorously relate both methods to gradient-based optimization schemes. Beyond that, we provide a comprehensive experimental study showing that, despite the conceptual similarities, CBO can outperform NES and other evolution strategies in certain scenarios.
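To make the two ingredients concrete, here is a minimal NumPy sketch contrasting a basic NES gradient estimate with a single anisotropic CBO step on a toy objective; the function `loss`, the step sizes, and all parameter values are illustrative assumptions, not the authors' experimental setup.

```python
import numpy as np

rng = np.random.default_rng(0)

def loss(x):
    # Toy non-convex objective standing in for the attack loss (illustrative only).
    return np.sum(x**2, axis=-1) + np.sin(3.0 * np.linalg.norm(x, axis=-1))

d, n = 10, 64            # dimension and number of samples/particles
x = rng.normal(size=d)   # current iterate (e.g., an input perturbation)

# NES-style estimate: smooth the loss with Gaussian noise and estimate the
# gradient of the smoothed loss from function evaluations only.
sigma_nes = 0.1
eps = rng.normal(size=(n, d))
grad_est = (loss(x + sigma_nes * eps)[:, None] * eps).mean(axis=0) / sigma_nes
x_nes = x - 0.05 * grad_est

# CBO-style step: an ensemble of particles is pulled towards a weighted
# consensus point, with weights exp(-alpha * loss) favouring good particles.
alpha, lam, sigma_cbo, dt = 10.0, 1.0, 0.5, 0.1
particles = x + 0.1 * rng.normal(size=(n, d))
w = np.exp(-alpha * loss(particles))
consensus = (w[:, None] * particles).sum(axis=0) / w.sum()
particles = (particles
             - lam * (particles - consensus) * dt
             + sigma_cbo * np.sqrt(dt) * rng.normal(size=(n, d)) * (particles - consensus))

print("loss after NES step:", loss(x_nes))
print("best particle loss after CBO step:", loss(particles).min())
```

Both updates use only function evaluations, which is what makes the closed-box (gradient-free) comparison between the two families meaningful.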
- North America > Canada > Ontario > Toronto (0.14)
- Europe > Germany > Hamburg (0.04)
- North America > United States > Virginia (0.04)
- (5 more...)
- Information Technology > Security & Privacy (1.00)
- Government > Military (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)
Republicans challenge 'irrelevant' budget office as it critiques Trump's 'beautiful bill'
Will Cain tries to make sense of the divide over the 'One Big Beautiful Bill.' Plus, Kennedy joins Will to discuss some of the most salacious stories in pop culture and politics. Both Republicans and Democrats have used analysis from the nonpartisan Congressional Budget Office as a political cudgel when it suits them, but with unfavorable reviews of President Donald Trump's "one big, beautiful bill" coming out, some in the GOP are questioning the relevance of the agency. The CBO's latest analysis of the gargantuan tax cut and spending package found that the House Republican-authored super bill would add $2.4 trillion to the national deficit over the next decade and boot millions off of health insurance. Senate Majority Leader John Thune is signaling that changes are likely to the House's version of President Trump's "big, beautiful bill." Senate Republicans will now get their chance to tweak and change the legislation, and have vowed to do so, despite warnings from Trump to reshape the bill as little as possible.
- North America > United States > Texas (0.07)
- North America > United States > Alaska (0.05)
Automated Computational Energy Minimization of ML Algorithms using Constrained Bayesian Optimization
Mitra, Pallavi, Biessmann, Felix
Bayesian optimization (BO) is an efficient framework for the optimization of black-box objectives when function evaluations are costly and gradient information is not easily accessible. BO has been successfully applied to automate the task of hyperparameter optimization (HPO) in machine learning (ML) models with the primary objective of optimizing predictive performance on held-out data. In recent years, however, with ever-growing model sizes, the energy cost associated with model training has become an important factor for ML applications. Here we evaluate Constrained Bayesian Optimization (CBO) with the primary objective of minimizing energy consumption, subject to the constraint that generalization performance stays above a given threshold. We evaluate our approach on regression and classification tasks and demonstrate that CBO achieves lower energy consumption without compromising the predictive performance of ML models.
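As a rough illustration of one common way to realise such a constrained search (not necessarily the acquisition the authors use), the sketch below computes a feasibility-weighted expected improvement from two independent Gaussian processes, one for the energy objective and one for the accuracy constraint; the data, kernels, and threshold are invented for the example.

```python
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

rng = np.random.default_rng(1)

# Hypothetical observations: one hyperparameter (e.g., log learning rate),
# measured training energy (to be minimized) and validation accuracy (constraint).
X = rng.uniform(-4, 0, size=(8, 1))
energy = 5.0 + 3.0 * np.sin(X[:, 0]) + 0.1 * rng.normal(size=8)
accuracy = 0.9 - 0.05 * (X[:, 0] + 2.0) ** 2 + 0.01 * rng.normal(size=8)
acc_threshold = 0.85

gp_energy = GaussianProcessRegressor(Matern(nu=2.5), normalize_y=True).fit(X, energy)
gp_acc = GaussianProcessRegressor(Matern(nu=2.5), normalize_y=True).fit(X, accuracy)

def constrained_ei(x_cand):
    """Expected improvement on energy, weighted by P(accuracy >= threshold)."""
    mu_e, sd_e = gp_energy.predict(x_cand, return_std=True)
    mu_a, sd_a = gp_acc.predict(x_cand, return_std=True)
    feasible = accuracy >= acc_threshold
    best = energy[feasible].min() if feasible.any() else energy.max()
    z = (best - mu_e) / np.maximum(sd_e, 1e-9)
    ei = (best - mu_e) * norm.cdf(z) + sd_e * norm.pdf(z)
    p_feasible = 1.0 - norm.cdf((acc_threshold - mu_a) / np.maximum(sd_a, 1e-9))
    return ei * p_feasible

candidates = np.linspace(-4, 0, 200).reshape(-1, 1)
x_next = candidates[np.argmax(constrained_ei(candidates))]
print("next configuration to evaluate:", x_next)
```

The feasibility-weighted acquisition is a standard construction for constrained BO; in the energy-minimization setting the objective GP would be fit to measured joules or CO2-equivalent per training run.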
- North America > United States > California (0.04)
- Europe > Germany > Berlin (0.04)
Transfer learning for day-ahead load forecasting: a case study on European national electricity demand time series
Tzortzis, Alexandros-Menelaos, Pelekis, Sotiris, Spiliotis, Evangelos, Mouzakitis, Spiros, Psarras, John, Askounis, Dimitris
Short-term load forecasting (STLF) is crucial for the daily operation of power grids. However, the non-linearity, non-stationarity, and randomness characterizing electricity demand time series render STLF a challenging task. Various forecasting approaches have been proposed for improving STLF, including neural network (NN) models which are trained using data from multiple electricity demand series that may not necessarily include the target series. In the present study, we investigate the performance of this special case of STLF, called transfer learning (TL), by considering a set of 27 time series that represent the national day-ahead electricity demand of indicative European countries. We employ a popular and easy-to-implement NN model and perform a clustering analysis to identify similar patterns among the series and assist TL. In this context, two different TL approaches, with and without the clustering step, are implemented and compared against each other as well as against a typical NN training setup. Our results demonstrate that TL can outperform the conventional approach, especially when clustering techniques are considered.
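A minimal sketch of the general recipe (cluster the series, pre-train on the cluster, fine-tune on the target) is given below; the synthetic series, the KMeans/MLP choices, and all hyperparameters are assumptions for illustration and do not reproduce the paper's setup.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(2)

# Synthetic stand-ins for 27 national hourly demand series (one row per country).
n_countries, T = 27, 24 * 200
t = np.arange(T)
series = np.stack([
    50 + 10 * np.sin(2 * np.pi * t / 24 + rng.uniform(0, np.pi))
       + 5 * np.sin(2 * np.pi * t / (24 * 7)) + rng.normal(0, 1, T)
    for _ in range(n_countries)
])

def to_supervised(y, lags=24):
    # Predict the next hourly value from the previous `lags` values.
    X = np.stack([y[i:i + lags] for i in range(len(y) - lags)])
    return X, y[lags:]

# 1. Cluster countries by their average daily profile to find similar patterns.
profiles = series.reshape(n_countries, -1, 24).mean(axis=1)
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(profiles)

# 2. Pre-train one NN on the pooled data of the target country's cluster.
target = 0
pool = [i for i in range(n_countries) if labels[i] == labels[target]]
X_pre = np.concatenate([to_supervised(series[i])[0] for i in pool])
y_pre = np.concatenate([to_supervised(series[i])[1] for i in pool])
model = MLPRegressor(hidden_layer_sizes=(64,), max_iter=50, random_state=0)
model.fit(X_pre, y_pre)

# 3. Fine-tune on the target series only (warm start from the pre-trained weights).
X_tgt, y_tgt = to_supervised(series[target])
model.set_params(warm_start=True, max_iter=10)
model.fit(X_tgt[:-24], y_tgt[:-24])
print("MAE on the last day:", np.mean(np.abs(model.predict(X_tgt[-24:]) - y_tgt[-24:])))
```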
Optimal Observation-Intervention Trade-Off in Optimisation Problems with Causal Structure
We consider the problem of optimising an expensive-to-evaluate grey-box objective function, within a finite budget, where known side-information exists in the form of the causal structure between the design variables. Standard black-box optimisation ignores the causal structure, often making it inefficient and expensive. The few existing methods that consider the causal structure are myopic and do not fully accommodate the observation-intervention trade-off that emerges when estimating causal effects. In this paper, we show that the observation-intervention trade-off can be formulated as a non-myopic optimal stopping problem which permits an efficient solution. We give theoretical results detailing the structure of the optimal stopping times and demonstrate the generality of our approach by showing that it can be integrated with existing causal Bayesian optimisation algorithms. Experimental results show that our formulation can enhance existing algorithms on real and synthetic benchmarks.
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- North America > United States > Virginia > Arlington County > Arlington (0.04)
- North America > United States > New York > New York County > New York City (0.04)
- (7 more...)
- Health & Medicine > Diagnostic Medicine (0.68)
- Health & Medicine > Therapeutic Area > Oncology (0.46)
Gradient is All You Need?
Riedl, Konstantin, Klock, Timo, Geldhauser, Carina, Fornasier, Massimo
In this paper we provide a novel analytical perspective on the theoretical understanding of gradient-based learning algorithms by interpreting consensus-based optimization (CBO), a recently proposed multi-particle derivative-free optimization method, as a stochastic relaxation of gradient descent. Remarkably, we observe that through communication of the particles, CBO exhibits a stochastic gradient descent (SGD)-like behavior despite relying solely on evaluations of the objective function. The fundamental value of such a link between CBO and SGD lies in the fact that CBO is provably globally convergent to global minimizers for ample classes of nonsmooth and nonconvex objective functions, hence, on the one hand, offering a novel explanation for the success of stochastic relaxations of gradient descent. On the other hand, contrary to the conventional wisdom that zero-order methods ought to be inefficient or lack generalization abilities, our results unveil an intrinsic gradient descent nature of such heuristics. This viewpoint furthermore complements previous insights into the working principles of CBO, which describe the dynamics in the mean-field limit through a nonlinear nonlocal partial differential equation that alleviates the complexities of the nonconvex function landscape. Our proofs leverage a completely nonsmooth analysis, which combines a novel quantitative version of the Laplace principle (log-sum-exp trick) and the minimizing movement scheme (proximal iteration). In doing so, we furnish useful and precise insights that explain how stochastic perturbations of gradient descent overcome energy barriers and reach deep levels of nonconvex functions. Instructive numerical illustrations support the provided theoretical insights.
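To illustrate the role of the Laplace principle mentioned above, the sketch below runs a plain anisotropic CBO loop in which the consensus weights are computed as a softmax of minus alpha times the objective (the log-sum-exp trick in numerically stable form); the objective and all constants are illustrative assumptions, not the paper's setup.

```python
import numpy as np
from scipy.special import softmax

rng = np.random.default_rng(3)

def f(x):
    # Nonsmooth, nonconvex toy objective with global minimizer at the origin.
    r = np.abs(x).sum(axis=-1)
    return r + 2.0 * np.sin(r)

d, n = 20, 200                       # dimension, number of particles
alpha, lam, sigma, dt = 50.0, 1.0, 0.8, 0.02
X = rng.uniform(-3, 3, size=(n, d))  # initial particle ensemble

for _ in range(500):
    # Consensus point: softmax(-alpha * f) weights; the softmax subtracts the
    # maximum internally, i.e. it is the stabilized log-sum-exp computation.
    w = softmax(-alpha * f(X))
    m = w @ X
    # Drift towards the consensus point plus anisotropic exploration noise.
    X = X - lam * (X - m) * dt + sigma * np.sqrt(dt) * rng.normal(size=(n, d)) * (X - m)

print("consensus point norm:", np.linalg.norm(m))
print("objective at consensus:", f(m))
```

For large alpha the consensus point concentrates near the currently best particle, which is the kind of quantitative Laplace-principle behaviour the abstract refers to.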
- Europe > Germany (0.28)
- North America > United States > Michigan (0.14)
- Europe > Switzerland (0.14)
- (2 more...)
- Energy > Oil & Gas > Upstream (0.45)
- Information Technology > Security & Privacy (0.45)
Functional Causal Bayesian Optimization
Gultchin, Limor, Aglietti, Virginia, Bellot, Alexis, Chiappa, Silvia
We propose functional causal Bayesian optimization (fCBO), a method for finding interventions that optimize a target variable in a known causal graph. fCBO extends the CBO family of methods to enable functional interventions, which set a variable to be a deterministic function of other variables in the graph. fCBO models the unknown objectives with Gaussian processes whose inputs are defined in a reproducing kernel Hilbert space, thus allowing distances among vector-valued functions to be computed. In turn, this makes it possible to sequentially select functions to explore by maximizing an expected-improvement acquisition functional while keeping the typical computational tractability of standard BO settings. We introduce graphical criteria that establish when considering functional interventions allows better target effects to be attained, and conditions under which the selected interventions are also optimal for conditional target effects. We demonstrate the benefits of the method on a synthetic and on a real-world causal graph.
- North America > United States > Virginia (0.04)
- Europe > United Kingdom > England > Greater London > London (0.04)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Constrained Causal Bayesian Optimization
Aglietti, Virginia, Malek, Alan, Ktena, Ira, Chiappa, Silvia
We propose constrained causal Bayesian optimization (cCBO), an approach for finding interventions in a known causal graph that optimize a target variable under some constraints. cCBO first reduces the search space by exploiting the graph structure and, if available, an observational dataset, and then solves the restricted optimization problem by modelling target and constraint quantities using Gaussian processes and by sequentially selecting interventions via a constrained expected improvement acquisition function. We propose surrogate models of increasing sophistication that make it possible to integrate observational and interventional data while capturing correlation among effects. We evaluate cCBO on artificial and real-world causal graphs, showing a successful trade-off between fast convergence and the percentage of feasible interventions.
- North America > United States > Virginia (0.04)
- North America > United States > Hawaii > Honolulu County > Honolulu (0.04)
- Europe > United Kingdom > England > Greater London > London (0.04)