Optimization
Causal Bayesian Optimization
Aglietti, Virginia, Lu, Xiaoyu, Paleyes, Andrei, González, Javier
This paper studies the problem of globally optimizing a variable of interest that is part of a causal model in which a sequence of interventions can be performed. This problem arises in biology, operational research, communications and, more generally, in all fields where the goal is to optimize an output metric of a system of interconnected nodes. Our approach combines ideas from causal inference, uncertainty quantification and sequential decision making. In particular, it generalizes Bayesian optimization, which treats the input variables of the objective function as independent, to scenarios where causal information is available. We show how knowing the causal graph significantly improves the ability to reason about optimal decision making strategies decreasing the optimization cost while avoiding suboptimal solutions. We propose a new algorithm called Causal Bayesian Optimization (CBO). CBO automatically balances two trade-offs: the classical exploration-exploitation and the new observation-intervention, which emerges when combining real interventional data with the estimated intervention effects computed via do-calculus. We demonstrate the practical benefits of this method in a synthetic setting and in two real-world applications.
Robust Matrix Completion with Mixed Data Types
We consider the matrix completion problem of recovering a structured low rank matrix with partially observed entries with mixed data types. Vast majority of the solutions have proposed computationally feasible estimators with strong statistical guarantees for the case where the underlying distribution of data in the matrix is continuous. A few recent approaches have extended using similar ideas these estimators to the case where the underlying distributions belongs to the exponential family. Most of these approaches assume that there is only one underlying distribution and the low rank constraint is regularized by the matrix Schatten Norm. We propose a computationally feasible statistical approach with strong recovery guarantees along with an algorithmic framework suited for parallelization to recover a low rank matrix with partially observed entries for mixed data types in one step. We also provide extensive simulation evidence that corroborate our theoretical results.
Fair Policy Targeting
Viviano, Davide, Bradic, Jelena
One of the major concerns of targeting interventions on individuals in social welfare programs is discrimination: individualized treatments may induce disparities on sensitive attributes such as age, gender, or race. This paper addresses the question of the design of fair and efficient treatment allocation rules. We adopt the non-maleficence perspective of "first do no harm": we propose to select the fairest allocation within the Pareto frontier. We provide envy-freeness justifications to novel counterfactual notions of fairness. We discuss easy-to-implement estimators of the policy function, by casting the optimization into a mixed-integer linear program formulation. We derive regret bounds on the unfairness of the estimated policy function, and small sample guarantees on the Pareto frontier. Finally, we illustrate our method using an application from education economics.
Reactive Sample Size for Heuristic Search in Simulation-based Optimization
Dalcastagné, Manuel, Mariello, Andrea, Battiti, Roberto
In simulation-based optimization, the optimal setting of the input parameters of the objective function can be determined by heuristic optimization techniques. However, when simulators model the stochasticity of real-world problems, their output is a random variable and multiple evaluations of the objective function are necessary to properly compare the expected performance of different parameter settings. This paper presents a novel reactive sample size algorithm based on parametric tests and indifference-zone selection, which can be used for improving the efficiency and robustness of heuristic optimization methods. The algorithm reactively decides, in an online manner, the sample size to be used for each comparison during the optimization according to observed statistical evidence. Tests employ benchmark functions extended with artificial levels of noise and a simulation-based optimization tool for hotel revenue management. Experimental results show that the reactive method can improve the efficiency and robustness of simulation-based optimization techniques.
Applying Evolutionary Metaheuristics for Parameter Estimation of Individual-Based Models
García, Antonio Prestes, Rodríguez-Patón, Alfonso
Modeling and simulation is certainly a vast discipline with a broad and complex body of knowledge having, beyond the surface, a large technical and theoretical background (Minsky, 1965) (Banks et al., 2009) (Zeigler et al., 2000) (Boccara, 2003) which consequently, is hard of being completely mastered from modelers coming from disperse domains like biology, ecology or even computer science. Among the existing formalisms, the agent-based or individual-based is increasing gradually the number of adepts in the recent years. The Individual-based modeling is a powerful methodology which is having more and more acceptance between researchers and practitioners of distinct branches from social to biological sciences, including specifically the modeling of ecological processes and microbial consortia studies. Certainly, one of the main reasons for the success of this approach is the relative simplicity for capturing micro-level properties, stochasticity and spatially complex phenomena without the requirement of a high level of mathematical background (Grimm and Railsback, 2005). But the counterpart of the ease for building complex and feature rich models, is the lack of a closed formal mathematical form of the model which implies that the study of these models cannot be attacked analytically.
Predicting Online Item-choice Behavior: A Shape-restricted Regression Perspective
Nishimura, Naoki, Sukegawa, Noriyoshi, Takano, Yuichi, Iwanaga, Jiro
This paper examines the relationship between user pageview (PV) histories and their item-choice behavior on an e-commerce website. We focus on PV sequences, which represent time series of the number of PVs for each user--item pair. We propose a shape-restricted optimization model that accurately estimates item-choice probabilities for all possible PV sequences. This model imposes monotonicity constraints on item-choice probabilities by exploiting partial orders for PV sequences, according to the recency and frequency of a user's previous PVs. To improve the computational efficiency of our optimization model, we devise efficient algorithms for eliminating all redundant constraints according to the transitivity of the partial orders. Experimental results using real-world clickstream data demonstrate that our method achieves higher prediction performance than that of a state-of-the-art optimization model and common machine learning methods.
Adaptive First-and Zeroth-order Methods for Weakly Convex Stochastic Optimization Problems
Nazari, Parvin, Tarzanagh, Davoud Ataee, Michailidis, George
In this paper, we design and analyze a new family of adaptive subgradient methods for solving an important class of weakly convex (possibly nonsmooth) stochastic optimization problems. Adaptive methods that use exponential moving averages of past gradients to update search directions and learning rates have recently attracted a lot of attention for solving optimization problems that arise in machine learning. Nevertheless, their convergence analysis almost exclusively requires smoothness and/or convexity of the objective function. In contrast, we establish non-asymptotic rates of convergence of first and zeroth-order adaptive methods and their proximal variants for a reasonably broad class of nonsmooth \& nonconvex optimization problems. Experimental results indicate how the proposed algorithms empirically outperform stochastic gradient descent and its zeroth-order variant for solving such optimization problems.
Fast differentiable DNA and protein sequence optimization for molecular design
Linder, Johannes, Seelig, Georg
Designing DNA and protein sequences with improved or novel function has the potential to greatly accelerate synthetic biology. Machine learning models that accurately predict biological fitness from sequence are becoming a powerful tool for molecular design. Activation maximization offers a simple design strategy for differentiable models: one-hot coded sequences are first approximated by a continuous representation which is then iteratively optimized with respect to the predictor oracle by gradient ascent. While elegant, this method is limited by technical challenges, as it suffers from vanishing gradients and may cause predictor pathologies leading to poor convergence. Here, we build on a previously proposed straight-through approximation method to optimize through discrete sequence samples. By normalizing nucleotide logits across positions and introducing an adaptive entropy variable, we remove bottlenecks arising from overly large or skewed sampling parameters. This results in a markedly improved algorithm with up to 100-fold faster convergence. Moreover, our method finds improved fitness optima compared to existing methods, including the original algorithm without normalization and global optimization heuristics such as Simulated Annealing. We demonstrate our improved method by designing DNA and enzyme sequences for six deep learning predictors, including a protein structure predictor (trRosetta).
MANGO: A Python Library for Parallel Hyperparameter Tuning
Sandha, Sandeep Singh, Aggarwal, Mohit, Fedorov, Igor, Srivastava, Mani
Tuning hyperparameters for machine learning algorithms is a tedious task, one that is typically done manually. To enable automated hyperparameter tuning, recent works have started to use techniques based on Bayesian optimization. However, to practically enable automated tuning for large scale machine learning training pipelines, significant gaps remain in existing libraries, including lack of abstractions, fault tolerance, and flexibility to support scheduling on any distributed computing framework. To address these challenges, we present Mango, a Python library for parallel hyperparameter tuning. Mango enables the use of any distributed scheduling framework, implements intelligent parallel search strategies, and provides rich abstractions for defining complex hyperparameter search spaces that are compatible with scikit-learn. Mango is comparable in performance to Hyperopt, another widely used library. Mango is available open-source and is currently used in production at Arm Research to provide state-of-art hyperparameter tuning capabilities.
Online Non-convex Learning for River Pollution Source Identification
Huang, Wenjie, Jiang, Jing, Liu, Xiao
In this paper, novel gradient based online learning algorithms are developed to investigate an important environmental application: real-time river pollution source identification, which aims at estimating the released mass, the location and the released time of a river pollution source based on downstream sensor data monitoring the pollution concentration. The problem can be formulated as a non-convex loss minimization problem in statistical learning, and our online algorithms have vectorized and adaptive step-sizes to ensure high estimation accuracy on dimensions having different magnitudes. In order to avoid gradient-based method sticking into the saddle points of non-convex loss, the "escaping from saddle points" module and multi-start version of algorithms are derived to further improve the estimation accuracy by searching for the global minimimals of the loss functions. It can be shown theoretically and experimentally $O(N)$ local regret of the algorithms, and the high probability cumulative regret bound $O(N)$ under particular error bound condition on loss functions. A real-life river pollution source identification example shows superior performance of our algorithms than the existing methods in terms of estimating accuracy. The managerial insights for decision maker to use the algorithm in reality are also provided.