performance function
Proposal-Guided Greedy Surrogate Refinement for PDE-Driven High-Dimensional Rare-Event Estimation
Gao, Zhiwei, Karniadakis, George
Accurate surrogate construction for PDE-driven high-dimensional rare-event simulation is challenging when performance evaluations are expensive. Since a globally accurate surrogate may require many high-fidelity evaluations, adaptive importance sampling provides a natural localization tool: its evolving proposal distribution progressively identifies the failure-relevant region. Motivated by this observation, we propose a surrogate-assisted adaptive importance sampling framework that refines the surrogate locally along the evolving proposal, rather than over the entire input space. The surrogate combines an encoder with a neural network, providing a low-dimensional latent representation for both prediction and sample selection. At each adaptive iteration, candidates drawn from the current proposal are selected by a greedy latent-space rule balancing proximity to the estimated failure boundary and sample diversity. The selected samples are evaluated by the high-fidelity model and used to refine the surrogate, which then guides the subsequent cross-entropy-type adaptive proposal update. We establish one-step proposal stability bounds under local surrogate errors, together with surrogate-induced misclassification and finite-sample estimation error bounds. Numerical experiments on multimodal benchmarks and PDE-driven rare-event problems up to 100 dimensions show that the proposed method achieves accuracy comparable to true-model adaptive importance sampling while requiring substantially fewer high-fidelity evaluations.
Independent policy gradient-based reinforcement learning for economic and reliable energy management of multi-microgrid systems
Efficiency and reliability are both crucial for energy management, especially in multi-microgrid systems (MMSs) integrating intermittent and distributed renewable energy sources. This study investigates an economic and reliable energy management problem in MMSs under a distributed scheme, where each microgrid independently updates its energy management policy in a decentralized manner to optimize the long-term system performance collaboratively. We introduce the mean and variance of the exchange power between the MMS and the main grid as indicators for the economic performance and reliability of the system. Accordingly, we formulate the energy management problem as a mean-variance team stochastic game (MV-TSG), where conventional methods based on the maximization of expected cumulative rewards are unsuitable for variance metrics. To solve MV-TSGs, we propose a fully distributed independent policy gradient algorithm, with rigorous convergence analysis, for scenarios with known model parameters. For large-scale scenarios with unknown model parameters, we further develop a deep reinforcement learning algorithm based on independent policy gradients, enabling data-driven policy optimization. Numerical experiments in two scenarios validate the effectiveness of the proposed methods. Our approaches fully leverage the distributed computational capabilities of MMSs and achieve a well-balanced trade-off between economic performance and operational reliability.
Quasi-Newton Compatible Actor-Critic for Deterministic Policies
Kordabad, Arash Bahari, Brandner, Dean, Gros, Sebastien, Lucia, Sergio, Soudjani, Sadegh
In this paper, we propose a second-order deterministic actor-critic framework in reinforcement learning that extends the classical deterministic policy gradient method to exploit curvature information of the performance function. Building on the concept of compatible function approximation for the critic, we introduce a quadratic critic that simultaneously preserves the true policy gradient and an approximation of the performance Hessian. A least-squares temporal difference learning scheme is then developed to estimate the quadratic critic parameters efficiently. This construction enables a quasi-Newton actor update using information learned by the critic, yielding faster convergence compared to first-order methods. The proposed approach is general and applicable to any differentiable policy class. Numerical examples demonstrate that the method achieves improved convergence and performance over standard deterministic actor-critic baselines.
Do Data Valuations Make Good Data Prices?
Fan, Dongyang, Rotello, Tyler J., Karimireddy, Sai Praneeth
As large language models increasingly rely on external data sources, compensating data contributors has become a central concern. But how should these payments be devised? We revisit data valuations from a $\textit{market-design perspective}$ where payments serve to compensate data owners for the $\textit{private}$ heterogeneous costs they incur for collecting and sharing data. We show that popular valuation methods-such as Leave-One-Out and Data Shapley-make for poor payments. They fail to ensure truthful reporting of the costs, leading to $\textit{inefficient market}$ outcomes. To address this, we adapt well-established payment rules from mechanism design, namely Myerson and Vickrey-Clarke-Groves (VCG), to the data market setting. We show that Myerson payment is the minimal truthful mechanism, optimal from the buyer's perspective. Additionally, we identify a condition under which both data buyers and sellers are utility-satisfied, and the market achieves efficiency. Our findings highlight the importance of incorporating incentive compatibility into data valuation design, paving the way for more robust and efficient data markets. Our data market framework is readily applicable to real-world scenarios. We illustrate this with simulations of contributor compensation in an LLM based retrieval-augmented generation (RAG) marketplace tasked with challenging medical question answering.
Risk-Based Prognostics and Health Management
Introduction As engineering fields mature, new technologies are emerging that are beginning to serve as the foundation of many societal improvements. For example, modern medical diagnostic equipment provides valuable information that gives medical professionals a better understanding of a patient's needs and ultimately improves quality of life [1]. Improvements to vehicle designs make transportation in cars or aircraft safer and more environmentally friendly [2]. Military equipment continues to be developed that better supports and protects personnel in the field [3]. Manufacturing practices and robotic equipment improve work safety conditions and reduce a product's price point, making amenities available to a wider range of consumers [4]. One approach to maximizing system availability is to incorporate some means of health assessment into the system itself. Doing so is often referred to as "integrated system health management" (ISHM) or "prognostics and health management" (PHM), which has been applied successfully to many complex systems [5]. By integrating health assessment into the very functioning of a system, more information can be obtained that provides a better understanding of the system as a whole, thus allowing system owners to become proactive in how they deal with system degradation. ISHM and PHM promise to focus on system conditions, thus supporting initiatives in what has become known as condition-based maintenance (CBM). This, in turn, enables maintenance events to be initiated based on specific system conditions rather than waiting until a failure occurs [6]. One of the key ingredients of ISHM/PHM is diagnostics, which corresponds to the process of determining the health state of the system based on sets of observations (or tests). Such tests are designed specifically to track system behavior and determine whether or not a failure has occurred. In many cases it is impossible to identify a single fault that explains the observations with certainty. Instead, candidate sets of faults are often indicated, and when using applicable models, probabilities or confidence values are associated with the faults to provide additional information. One historic approach to using test observations for diagnosis is to apply a decision tree - sometimes referred to as a fault tree1 [7].
Multi-Armed Sampling Problem and the End of Exploration
Pedramfar, Mohammad, Ravanbakhsh, Siamak
This paper introduces the framework of multi-armed sampling, as the sampling counterpart to the optimization problem of multi-arm bandits. Our primary motivation is to rigorously examine the exploration-exploitation trade-off in the context of sampling. We systematically define plausible notions of regret for this framework and establish corresponding lower bounds. We then propose a simple algorithm that achieves these optimal regret bounds. Our theoretical results demonstrate that in contrast to optimization, sampling does not require exploration. To further connect our findings with those of multi-armed bandits, we define a continuous family of problems and associated regret measures that smoothly interpolates and unifies multi-armed sampling and multi-armed bandit problems using a temperature parameter. We believe the multi-armed sampling framework, and our findings in this setting can have a foundational role in the study of sampling including recent neural samplers, akin to the role of multi-armed bandits in reinforcement learning. In particular, our work sheds light on the need for exploration and the convergence properties of algorithm for entropy-regularized reinforcement learning, fine-tuning of pretrained models and reinforcement learning with human feedback (RLHF).
Performance-driven Constrained Optimal Auto-Tuner for MPC
Puigjaner, Albert Gassol, Prajapat, Manish, Carron, Andrea, Krause, Andreas, Zeilinger, Melanie N.
A key challenge in tuning Model Predictive Control (MPC) cost function parameters is to ensure that the system performance stays consistently above a certain threshold. To address this challenge, we propose a novel method, COAT-MPC, Constrained Optimal Auto-Tuner for MPC. With every tuning iteration, COAT-MPC gathers performance data and learns by updating its posterior belief. It explores the tuning parameters' domain towards optimistic parameters in a goal-directed fashion, which is key to its sample efficiency. We theoretically analyze COAT-MPC, showing that it satisfies performance constraints with arbitrarily high probability at all times and provably converges to the optimum performance within finite time. Through comprehensive simulations and comparative analyses with a hardware platform, we demonstrate the effectiveness of COAT-MPC in comparison to classical Bayesian Optimization (BO) and other state-of-the-art methods. When applied to autonomous racing, our approach outperforms baselines in terms of constraint violations and cumulative regret over time.
On the Implementation of a Bayesian Optimization Framework for Interconnected Systems
Gonzรกlez, Leonardo D., Zavala, Victor M.
Bayesian optimization (BO) is an effective paradigm for the optimization of expensive-to-sample systems. Standard BO learns the performance of a system $f(x)$ by using a Gaussian Process (GP) model; this treats the system as a black-box and limits its ability to exploit available structural knowledge (e.g., physics and sparse interconnections in a complex system). Grey-box modeling, wherein the performance function is treated as a composition of known and unknown intermediate functions $f(x, y(x))$ (where $y(x)$ is a GP model) offers a solution to this limitation; however, generating an analytical probability density for $f$ from the Gaussian density of $y(x)$ is often an intractable problem (e.g., when $f$ is nonlinear). Previous work has handled this issue by using sampling techniques or by solving an auxiliary problem over an augmented space where the values of $y(x)$ are constrained by confidence intervals derived from the GP models; such solutions are computationally intensive. In this work, we provide a detailed implementation of a recently proposed grey-box BO paradigm, BOIS, that uses adaptive linearizations of $f$ to obtain analytical expressions for the statistical moments of the composite function. We show that the BOIS approach enables the exploitation of structural knowledge, such as that arising in interconnected systems as well as systems that embed multiple GP models and combinations of physics and GP models. We benchmark the effectiveness of BOIS against standard BO and existing grey-box BO algorithms using a pair of case studies focused on chemical process optimization and design. Our results indicate that BOIS performs as well as or better than existing grey-box methods, while also being less computationally intensive.
Risk-Averse Certification of Bayesian Neural Networks
Zhang, Xiyue, Wang, Zifan, Gao, Yulong, Romao, Licio, Abate, Alessandro, Kwiatkowska, Marta
In light of the inherently complex and dynamic nature of real-world environments, incorporating risk measures is crucial for the robustness evaluation of deep learning models. In this work, we propose a Risk-Averse Certification framework for Bayesian neural networks called RAC-BNN. Our method leverages sampling and optimisation to compute a sound approximation of the output set of a BNN, represented using a set of template polytopes. To enhance robustness evaluation, we integrate a coherent distortion risk measure--Conditional Value at Risk (CVaR)--into the certification framework, providing probabilistic guarantees based on empirical distributions obtained through sampling. We validate RAC-BNN on a range of regression and classification benchmarks and compare its performance with a state-of-the-art method. The results show that RAC-BNN effectively quantifies robustness under worst-performing risky scenarios, and achieves tighter certified bounds and higher efficiency in complex tasks.
Safe Bayesian Optimization for High-Dimensional Control Systems via Additive Gaussian Processes
Wang, Hongxuan, Li, Xiaocong, Bhaumik, Adrish, Vadakkepat, Prahlad
Controller tuning and optimization have been among the most fundamental problems in robotics and mechatronic systems. The traditional methodology is usually model-based, but its performance heavily relies on an accurate mathematical model of the system. In control applications with complex dynamics, obtaining a precise model is often challenging, leading us towards a data-driven approach. While optimizing a single controller has been explored by various researchers, it remains a challenge to obtain the optimal controller parameters safely and efficiently when multiple controllers are involved. In this paper, we propose a high-dimensional safe Bayesian optimization method based on additive Gaussian processes to optimize multiple controllers simultaneously and safely. Additive Gaussian kernels replace the traditional squared-exponential kernels or Mat\'ern kernels, enhancing the efficiency with which Gaussian processes update information on unknown functions. Experimental results on a permanent magnet synchronous motor (PMSM) demonstrate that compared to existing safe Bayesian optimization algorithms, our method can obtain optimal parameters more efficiently while ensuring safety.