Optimization
Efficient dataset construction using active learning and uncertainty-aware neural networks for plasma turbulent transport surrogate models
Ho, Aaron, Zanisi, Lorenzo, de Leeuw, Bram, Galvan, Vincent, Rodriguez-Fernandez, Pablo, Howard, Nathaniel T.
This work demonstrates a proof-of-principle for using uncertainty-aware architectures, in combination with active learning techniques and an in-the-loop physics simulation code as a data labeller, to construct efficient datasets for data-driven surrogate model generation. Building off of a previous proof-of-principle successfully demonstrating training set reduction on static pre-labelled datasets, using the ADEPT framework, this strategy was applied again to the plasma turbulent transport problem within tokamak fusion plasmas, specifically the QuaLiKiz quasilinear electrostatic gyrokinetic turbulent transport code. While QuaLiKiz provides relatively fast evaluations, this study specifically targeted small datasets to serve as a proxy for more expensive codes, such as CGYRO or GENE. The newly implemented algorithm uses the SNGP architecture for the classification component of the problem and the BNN-NCP architecture for the regression component, training models for all turbulent modes (ITG, TEM, ETG) and all transport fluxes ($Q_e$, $Q_i$, $ฮ_e$, $ฮ_i$, and $ฮ _i$) described by the general QuaLiKiz output. With 45 active learning iterations, moving from a small initial training set of $10^{2}$ to a final set of $10^{4}$, the resulting models reached a $F_1$ classification performance of ~0.8 and a $R^2$ regression performance of ~0.75 on an independent test set across all outputs. This extrapolates to reaching the same performance and efficiency as the previous ADEPT pipeline, although on a problem with 1 extra input dimension. While the improvement rate achieved in this implementation diminishes faster than expected, the overall technique is formulated with components that can be upgraded and generalized to many surrogate modeling applications beyond plasma turbulent transport predictions.
Covering a Few Submodular Constraints and Applications
Bajpai, Tanvi, Chekuri, Chandra, Kulkarni, Pooja
We consider the problem of covering multiple submodular constraints. Given a finite ground set $N$, a cost function $c: N \rightarrow \mathbb{R}_+$, $r$ monotone submodular functions $f_1,f_2,\ldots,f_r$ over $N$ and requirements $b_1,b_2,\ldots,b_r$ the goal is to find a minimum cost subset $S \subseteq N$ such that $f_i(S) \ge b_i$ for $1 \le i \le r$. When $r=1$ this is the well-known Submodular Set Cover problem. Previous work \cite{chekuri2022covering} considered the setting when $r$ is large and developed bi-criteria approximation algorithms, and approximation algorithms for the important special case when each $f_i$ is a weighted coverage function. These are fairly general models and capture several concrete and interesting problems as special cases. The approximation ratios for these problem are at least $ฮฉ(\log r)$ which is unavoidable when $r$ is part of the input. In this paper, motivated by some recent applications, we consider the problem when $r$ is a \emph{fixed constant} and obtain two main results. For covering multiple submodular constraints we obtain a randomized bi-criteria approximation algorithm that for any given integer $ฮฑ\ge 1$ outputs a set $S$ such that $f_i(S) \ge$ $(1-1/e^ฮฑ-ฮต)b_i$ for each $i \in [r]$ and $\mathbb{E}[c(S)] \le (1+ฮต)ฮฑ\cdot \sf{OPT}$. Second, when the $f_i$ are weighted coverage functions from a deletion-closed set system we obtain a $(1+ฮต)$ $(\frac{e}{e-1})$ $(1+ฮฒ)$-approximation where $ฮฒ$ is the approximation ratio for the underlying set cover instances via the natural LP. These results show that one can obtain nearly as good an approximation for any fixed $r$ as what one would achieve for $r=1$. We mention some applications that follow easily from these general results and anticipate more in the future.
Unsupervised Learning of Local Updates for Maximum Independent Set in Dynamic Graphs
Parkar, Devendra, Chaturvedi, Anya, Daymude, Joshua J.
We present the first unsupervised learning model for finding Maximum Independent Sets (MaxIS) in dynamic graphs where edges change over time. Our method combines structural learning from graph neural networks (GNNs) with a learned distributed update mechanism that, given an edge addition or deletion event, modifies nodes' internal memories and infers their MaxIS membership in a single, parallel step. We parameterize our model by the update mechanism's radius and investigate the resulting performance-runtime tradeoffs for various dynamic graph topologies. We evaluate our model against a mixed integer programming solver and the state-of-the-art learning-based methods for MaxIS on static graphs (ICML 2020; NeurIPS 2020, 2023). Across synthetic and empirical dynamic graphs of 50-1,000 nodes, our model achieves competitive approximation ratios with excellent scalability; on large graphs, it significantly outperforms the state-of-the-art learning methods in solution quality, runtime, and memory usage. When generalizing to graphs of 10,000 nodes (100x larger than the ones used for training), our model produces MaxIS solutions 1.05-1.18x larger than any other learning method, even while maintaining competitive runtimes.
FoMEMO: Towards Foundation Models for Expensive Multi-objective Optimization
Yao, Yiming, Liu, Fei, Zhao, Liang, Lin, Xi, Zhang, Qingfu
Expensive multi-objective optimization is a prevalent and crucial concern in many real-world scenarios, where sample-efficiency is vital due to the limited evaluations to recover the true Pareto front for decision making. Existing works either involve rebuilding Gaussian process surrogates from scratch for each objective in each new problem encountered, or rely on extensive past domain experiments for pre-training deep learning models, making them hard to generalize and impractical to cope with various emerging applications in the real world. To address this issue, we propose a new paradigm named FoMEMO (Foundation Models for Expensive Multi-objective Optimization), which enables the establishment of a foundation model conditioned on any domain trajectory and user preference, and facilitates fast in-context optimization based on the predicted preference-wise aggregation posteriors. Rather than accessing extensive domain experiments in the real world, we demonstrate that pre-training the foundation model with a diverse set of hundreds of millions of synthetic data can lead to superior adaptability to unknown problems, without necessitating any subsequent model training or updates in the optimization process. We evaluate our method across a variety of synthetic benchmarks and real-word applications, and demonstrate its superior generality and competitive performance compared to existing methods.
IL-SLAM: Intelligent Line-assisted SLAM Based on Feature Awareness for Dynamic Environments
Zhang, Haolan, Canh, Thanh Nguyen, Li, Chenghao, Yang, Ruidong, Ji, Yonghoon, Chong, Nak Young
Visual Simultaneous Localization and Mapping (SLAM) plays a crucial role in autonomous systems. Traditional SLAM methods, based on static environment assumptions, struggle to handle complex dynamic environments. Recent dynamic SLAM systems employ geometric constraints and deep learning to remove dynamic features, yet this creates a new challenge: insufficient remaining point features for subsequent SLAM processes. Existing solutions address this by continuously introducing additional line and plane features to supplement point features, achieving robust tracking and pose estimation. However, current methods continuously introduce additional features regardless of necessity, causing two problems: unnecessary computational overhead and potential performance degradation from accumulated low-quality additional features and noise. To address these issues, this paper proposes a feature-aware mechanism that evaluates whether current features are adequate to determine if line feature support should be activated. This decision mechanism enables the system to introduce line features only when necessary, significantly reducing computational complexity of additional features while minimizing the introduction of low-quality features and noise. In subsequent processing, the introduced line features assist in obtaining better initial camera poses through tracking, local mapping, and loop closure, but are excluded from global optimization to avoid potential negative impacts from low-quality additional features in long-term process. Extensive experiments on TUM datasets demonstrate substantial improvements in both ATE and RPE metrics compared to ORB-SLAM3 baseline and superior performance over other dynamic SLAM and multi-feature methods.
Approximate constrained stochastic optimal control via parameterized input inference
Syed, Shahbaz P Qadri, Bai, He
Approximate methods to solve stochastic optimal control (SOC) problems have received significant interest from researchers in the past decade. Probabilistic inference approaches to SOC have been developed to solve nonlinear quadratic Gaussian problems. In this work, we propose an Expectation-Maximization (EM) based inference procedure to generate state-feedback controls for constrained SOC problems. We consider the inequality constraints for the state and controls and also the structural constraints for the controls. We employ barrier functions to address state and control constraints. We show that the expectation step leads to smoothing of the state-control pair while the the maximization step on the non-zero subsets of the control parameters allows inference of structured stochastic optimal controllers. We demonstrate the effectiveness of the algorithm on unicycle obstacle avoidance, four-unicycle formation control, and quadcopter navigation in windy environment examples. In these examples, we perform an empirical study on the parametric effect of barrier functions on the state constraint satisfaction. We also present a comparative study of smoothing algorithms on the performance of the proposed approach.
Asynchronous and Stochastic Distributed Resource Allocation
Li, Qiang, Yemini, Michal, Wai, Hoi-To
This work proposes and studies the distributed resource allocation problem in asynchronous and stochastic settings. We consider a distributed system with multiple workers and a coordinating server with heterogeneous computation and communication times. We explore an approximate stochastic primal-dual approach with the aim of 1) adhering to the resource budget constraints, 2) allowing for the asynchronicity between the workers and the server, and 3) relying on the locally available stochastic gradients. We analyze our Asynchronous stochastic Primal-Dual (Asyn-PD) algorithm and prove its convergence in the second moment to the saddle point solution of the approximate problem at the rate of $O(1/t)$, where $t$ is the iteration number. Furthermore, we verify our algorithm numerically to validate the analytically derived convergence results, and demonstrate the advantages of utilizing our asynchronous algorithm rather than deploying a synchronous algorithm where the server must wait until it gets update from all workers.
Gaming and Cooperation in Federated Learning: What Can Happen and How to Monitor It
Kim, Dongseok, Jeong, Wonjun, Oh, Gisung
The success of Federated Learning depends on the actions that participants take out of sight. We model Federated Learning not as a mere optimization task but as a strategic system entangled with rules and incentives. From this perspective, we present an analytical framework that makes it possible to clearly identify where behaviors that genuinely improve performance diverge from those that merely target metrics. We introduce two indices that respectively quantify behavioral incentives and collective performance loss, and we use them as the basis for consistently interpreting the impact of operational choices such as rule design, the level of information disclosure, evaluation methods, and aggregator switching. We further summarize thresholds, auto-switch rules, and early warning signals into a checklist that can be applied directly in practice, and we provide both a practical algorithm for allocating limited audit resources and a performance guarantee. Simulations conducted across diverse environments consistently validate the patterns predicted by our framework, and we release all procedures for full reproducibility. While our approach operates most strongly under several assumptions, combining periodic recalibration, randomization, and connectivity-based alarms enables robust application under the variability of real-world operations. We present both design principles and operational guidelines that lower the incentives for metric gaming while sustaining and expanding stable cooperation.
A Rollout-Based Algorithm and Reward Function for Resource Allocation in Business Processes
Middelhuis, Jeroen, Bukhsh, Zaharah, Adan, Ivo, Dijkman, Remco
Resource allocation plays a critical role in minimizing cycle time and improving the efficiency of business processes. Recently, Deep Reinforcement Learning (DRL) has emerged as a powerful technique to optimize resource allocation policies in business processes. In the DRL framework, an agent learns a policy through interaction with the environment, guided solely by reward signals that indicate the quality of its decisions. However, existing algorithms are not suitable for dynamic environments such as business processes. Furthermore, existing DRL-based methods rely on engineered reward functions that approximate the desired objective, but a misalignment between reward and objective can lead to undesired decisions or suboptimal policies. To address these issues, we propose a rollout-based DRL algorithm and a reward function to optimize the objective directly. Our algorithm iteratively improves the policy by evaluating execution trajectories following different actions. Our reward function directly decomposes the objective function of minimizing the cycle time, such that trial-and-error reward engineering becomes unnecessary. We evaluated our method in six scenarios, for which the optimal policy can be computed, and on a set of increasingly complex, realistically sized process models. The results show that our algorithm can learn the optimal policy for the scenarios and outperform or match the best heuristics on the realistically sized business processes.
Surrogate Benchmarks for Model Merging Optimization
Akizuki, Rio, Kudo, Yuya, Yoshinari, Nozomu, Hirose, Yoichi, Nishimoto, Toshiyuki, Uchida, Kento, Shirakawa, Shinichi
Model merging techniques aim to integrate the abilities of multiple models into a single model. Most model merging techniques have hyperparameters, and their setting affects the performance of the merged model. Because several existing works show that tuning hyperparameters in model merging can enhance the merging outcome, developing hyperparameter optimization algorithms for model merging is a promising direction. However, its optimization process is computationally expensive, particularly in merging LLMs. In this work, we develop surrogate benchmarks for optimization of the merging hyperparameters to realize algorithm development and performance comparison at low cost. We define two search spaces and collect data samples to construct surrogate models to predict the performance of a merged model from a hyperparameter. We demonstrate that our benchmarks can predict the performance of merged models well and simulate optimization algorithm behaviors.