Optimization
A Review of Causal Decision Making
Ge, Lin, Cai, Hengrui, Wan, Runzhe, Xu, Yang, Song, Rui
To make effective decisions, it is important to have a thorough understanding of the causal relationships among actions, environments, and outcomes. This review aims to surface three crucial aspects of decision-making through a causal lens: 1) the discovery of causal relationships through causal structure learning, 2) understanding the impacts of these relationships through causal effect learning, and 3) applying the knowledge gained from the first two aspects to support decision making via causal policy learning. Moreover, we identify challenges that hinder the broader utilization of causal decision-making and discuss recent advances in overcoming these challenges. Finally, we provide future research directions to address these challenges and to further enhance the implementation of causal decision-making in practice, with real-world applications illustrated based on the proposed causal decision-making. We aim to offer a comprehensive methodology and practical implementation framework by consolidating various methods in this area into a Python-based collection. URL: https://causaldm.github.io/Causal-Decision-Making.
Auto-ADMET: An Effective and Interpretable AutoML Method for Chemical ADMET Property Prediction
de Sรก, Alex G. C., Ascher, David B.
Machine learning (ML) has been playing important roles in drug discovery in the past years by providing (pre-)screening tools for prioritising chemical compounds to pass through wet lab experiments. One of the main ML tasks in drug discovery is to build quantitative structure-activity relationship (QSAR) models, associating the molecular structure of chemical compounds with an activity or property. These properties -- including absorption, distribution, metabolism, excretion and toxicity (ADMET) -- are essential to model compound behaviour, activity and interactions in the organism. Although several methods exist, the majority of them do not provide an appropriate model's personalisation, yielding to bias and lack of generalisation to new data since the chemical space usually shifts from application to application. This fact leads to low predictive performance when completely new data is being tested by the model. The area of Automated Machine Learning (AutoML) emerged aiming to solve this issue, outputting tailored ML algorithms to the data at hand. Although an important task, AutoML has not been practically used to assist cheminformatics and computational chemistry researchers often, with just a few works related to the field. To address these challenges, this work introduces Auto-ADMET, an interpretable evolutionary-based AutoML method for chemical ADMET property prediction. Auto-ADMET employs a Grammar-based Genetic Programming (GGP) method with a Bayesian Network Model to achieve comparable or better predictive performance against three alternative methods -- standard GGP method, pkCSM and XGBOOST model -- on 12 benchmark chemical ADMET property prediction datasets. The use of a Bayesian Network model on Auto-ADMET's evolutionary process assisted in both shaping the search procedure and interpreting the causes of its AutoML performance.
Understanding Fixed Predictions via Confined Regions
Lawless, Connor, Weng, Tsui-Wei, Ustun, Berk, Udell, Madeleine
Machine learning models are designed to predict outcomes using features about an individual, but fail to take into account how individuals can change them. Consequently, models can assign fixed predictions that deny individuals recourse to change their outcome. This work develops a new paradigm to identify fixed predictions by finding confined regions in which all individuals receive fixed predictions. We introduce the first method, ReVer, for this task, using tools from mixed-integer quadratically constrained programming. Our approach certifies recourse for out-of-sample data, provides interpretable descriptions of confined regions, and runs in seconds on real world datasets. We conduct a comprehensive empirical study of confined regions across diverse applications. Our results highlight that existing point-wise verification methods fail to discover confined regions, while ReVer provably succeeds.
Multi-objective Cat Swarm Optimization Algorithm based on a Grid System
Ahmed, Aram M., Hassan, Bryar A., Rashid, Tarik A., Noori, Kaniaw A., Saeed, Soran Ab. M., Ahmed, Omed H., Umar, Shahla U.
This paper presents a multi-objective version of the Cat Swarm Optimization Algorithm called the Grid-based Multi-objective Cat Swarm Optimization Algorithm (GMOCSO). Convergence and diversity preservation are the two main goals pursued by modern multi-objective algorithms to yield robust results. To achieve these goals, we first replace the roulette wheel method of the original CSO algorithm with a greedy method. Then, two key concepts from Pareto Archived Evolution Strategy Algorithm (PAES) are adopted: the grid system and double archive strategy. Several test functions and a real-world scenario called the Pressure vessel design problem are used to evaluate the proposed algorithm's performance. In the experiment, the proposed algorithm is compared with other well-known algorithms using different metrics such as Reversed Generational Distance, Spacing metric, and Spread metric. The optimization results show the robustness of the proposed algorithm, and the results are further confirmed using statistical methods and graphs. Finally, conclusions and future directions were presented..
Attention-based UAV Trajectory Optimization for Wireless Power Transfer-assisted IoT Systems
Dong, Li, Jiang, Feibo, Peng, Yubo
--Unmanned Aerial V ehicles (UA Vs) in Wireless Power Transfer (WPT)-assisted Internet of Things (IoT) systems face the following challenges: limited resources and suboptimal trajectory planning. Reinforcement learning-based trajectory planning schemes face issues of low search efficiency and learning instability when optimizing large-scale systems. T o address these issues, we present an Attention-based UA V Trajectory Optimization (AUTO) framework based on the graph transformer, which consists of an Attention Trajectory Optimization Model (A TOM) and a Trajectory lEarNing Method based on Actor-critic (TENMA). In A TOM, a graph encoder is used to calculate the self-attention characteristics of all IoTDs, and a trajectory decoder is developed to optimize the number and trajectories of UA Vs. TENMA then trains the A TOM using an improved Actor-Critic method, in which the real reward of the system is applied as the baseline to reduce variances in the critic network. This method is suitable for high-quality and large-scale multi-UA V trajectory planning. Finally, we develop numerous experiments, including a hardware experiment in the field case, to verify the feasibility and efficiency of the AUTO framework. I NTRODUCTION With the advancement of 5G, the Internet of Things (IoT) has become widely used in a variety of fields, including environmental monitoring, healthcare, and industry 4.0, among others. However, due to limited transmitting power and battery capacity, Internet of Things Devices (IoTDs) perform poorly in long-distance communication.
From Target Tracking to Targeting Track -- Part II: Regularized Polynomial Trajectory Optimization
Li, Tiancheng, Song, Yan, Li, Guchong, Li, Hao
Target tracking entails the estimation of the evolution of the target state over time, namely the target trajectory. Different from the classical state space model, our series of studies, including this paper, model the collection of the target state as a stochastic process (SP) that is further decomposed into a deterministic part which represents the trend of the trajectory and a residual SP representing the residual fitting error. Subsequently, the tracking problem is formulated as a learning problem regarding the trajectory SP for which a key part is to estimate a trajectory FoT (T-FoT) best fitting the measurements in time series. For this purpose, we consider the polynomial T-FoT and address the regularized polynomial T-FoT optimization employing two distinct regularization strategies seeking trade-off between the accuracy and simplicity. One limits the order of the polynomial and then the best choice is determined by grid searching in a narrow, bounded range while the other adopts $\ell_0$ norm regularization for which the hybrid Newton solver is employed. Simulation results obtained in both single and multiple maneuvering target scenarios demonstrate the effectiveness of our approaches.
A Fenchel-Young Loss Approach to Data-Driven Inverse Optimization
Li, Zhehao, Wu, Yanchen, Mao, Xiaojie
Data-driven inverse optimization seeks to estimate unknown parameters in an optimization model from observations of optimization solutions. Many existing methods are ineffective in handling noisy and suboptimal solution observations and also suffer from computational challenges. In this paper, we build a connection between inverse optimization and the Fenchel-Young (FY) loss originally designed for structured prediction, proposing a FY loss approach to data-driven inverse optimization. This new approach is amenable to efficient gradient-based optimization, hence much more efficient than existing methods. We provide theoretical guarantees for the proposed method and use extensive simulation and real-data experiments to demonstrate its significant advantage in parameter estimation accuracy, decision error and computational speed.
New Spectral Algorithms for Refuting Smoothed k-SAT
Despite being a quintessential example of a hard problem, the quest for finding fast algorithms for deciding satisfiability of propositional formulas has occupied computer scientists both in theory and in practice. In this article, we survey recent progress on designing algorithms with strong refutation guarantees for smoothed instances of the k-SAT problem. Smoothed instances are formed by slight random perturbations of arbitrary instances, and their study is a way to bridge the gap between worst-case and average-case models of problem instances. Our methods yield new algorithms for smoothed k-SAT instances with guarantees that match those for the significantly simpler and well-studied model of random formulas. Additionally, they have led to a novel and unexpected line of attack on some longstanding extremal combinatorial problems in graph theory and coding theory. As an example, we will discuss the resolution of a 2008 conjecture of Feige on the existence of short cycles in hypergraphs. The famous SAT problem asks to determine if a given propositional formula is satisfiable. That is, can we set the formula's variables to 0 (False) or 1 (True) in a way so that the formula evaluates to 1 (True). In this article, we will focus on the k-SAT problem where the propositional formula is further restricted to be in the k-CNF form, that is, logical AND of a collection of k-clauses, each of which is a logical OR of at most k literals (variables or their logical negations). For example, (x1 x2 x3) (x2 x4 x5) is a 3-CNF formula in variables x1,x2,โฆ,x5 where,, and denote the logical AND, OR and NOT operations, respectively. Given a k-CNF formula, we are interested in either finding a satisfying truth assignment, if it exists, or a "refutation"--a short, easily-checkable proof that the formula is unsatisfiable. Despite its simplicity, k-SAT is phenomenally expressive and models a long list of important discrete optimization problems. A decades-long quest has thus focused on designing algorithms for k-SAT in both theory and practice.
Multi-Objective Optimization of Water Resource Allocation for Groundwater Recharge and Surface Runoff Management in Watershed Systems
Sharifi, Abbas, Naeini, Hajar Kazemi, Ahmadi, Mohsen, Asadi, Saeed, Varmaghani, Abbas
Land degradation and air pollution are primarily caused by the salinization of soil and desertification that occurs from the drying of salinity lakes and the release of dust into the atmosphere because of their dried bottom. The complete drying up of a lake has caused a community environmental catastrophe. In this study, we presented an optimization problem to determine the total surface runoff to maintain the level of salinity lake (Urmia Lake). The proposed process has two key stages: identifying the influential factors in determining the lake water level using sensitivity analysis approaches based upon historical data and optimizing the effective variable to stabilize the lake water level under changing design variables. Based upon the Sobol'-Jansen and Morris techniques, the groundwater level and total surface runoff flow are highly effective with nonlinear and interacting impacts of the lake water level. As a result of the sensitivity analysis, we found that it may be possible to effectively manage lake levels by adjusting total surface runoff. We used genetic algorithms, non-linear optimization, and pattern search techniques to solve the optimization problem. Furthermore, the lake level constraint is established based on a pattern as a constant number every month. In order to maintain a consistent pattern of lake levels, it is necessary to increase surface runoff by approximately 8.7 times during filling season. It is necessary to increase this quantity by 33.5 times during the draining season. In the future, the results may serve as a guide for the rehabilitation of the lake.
Orthogonal Calibration for Asynchronous Federated Learning
Zhang, Jiayun, Li, Shuheng, Huang, Haiyu, Yu, Xiaofan, Gupta, Rajesh K., Shang, Jingbo
Asynchronous federated learning mitigates the inefficiency of conventional synchronous aggregation by integrating updates as they arrive and adjusting their influence based on staleness. Due to asynchrony and data heterogeneity, learning objectives at the global and local levels are inherently inconsistent -- global optimization trajectories may conflict with ongoing local updates. Existing asynchronous methods simply distribute the latest global weights to clients, which can overwrite local progress and cause model drift. In this paper, we propose OrthoFL, an orthogonal calibration framework that decouples global and local learning progress and adjusts global shifts to minimize interference before merging them into local models. In OrthoFL, clients and the server maintain separate model weights. Upon receiving an update, the server aggregates it into the global weights via a moving average. For client weights, the server computes the global weight shift accumulated during the client's delay and removes the components aligned with the direction of the received update. The resulting parameters lie in a subspace orthogonal to the client update and preserve the maximal information from the global progress. The calibrated global shift is then merged into the client weights for further training. Extensive experiments show that OrthoFL improves accuracy by 9.6% and achieves a 12$\times$ speedup compared to synchronous methods. Moreover, it consistently outperforms state-of-the-art asynchronous baselines under various delay patterns and heterogeneity scenarios.