Optimization
Using Tabu Search Algorithm for Map Generation in the Terra Mystica Tabletop Game
Grichshenko, Alexandr, de Araujo, Luiz Jonata Pires, Gimaeva, Susanna, Brown, Joseph Alexander
Tabu Search (TS) metaheuristic improves simple local search algorithms (e.g. steepest ascend hill-climbing) by enabling the algorithm to escape local optima points. It has shown to be useful for addressing several combinatorial optimization problems. This paper investigates the performance of TS and considers the effects of the size of the Tabu list and the size of the neighbourhood for a procedural content generation, specifically the generation of maps for a popular tabletop game called Terra Mystica. The results validate the feasibility of the proposed method and how it can be used to generate maps that improve existing maps for the game.
Sample Efficient Graph-Based Optimization with Noisy Observations
Nguyen, Tan, Shameli, Ali, Abbasi-Yadkori, Yasin, Rao, Anup, Kveton, Branislav
We study sample complexity of optimizing "hill-climbing friendly" functions defined on a graph under noisy observations. We define a notion of convexity, and we show that a variant of best-arm identification can find a near-optimal solution after a small number of queries that is independent of the size of the graph. For functions that have local minima and are nearly convex, we show a sample complexity for the classical simulated annealing under noisy observations. We show effectiveness of the greedy algorithm with restarts and the simulated annealing on problems of graph-based nearest neighbor classification as well as a web document re-ranking application.
Autonomous Materials Discovery Driven by Gaussian Process Regression with Inhomogeneous Measurement Noise and Anisotropic Kernels
Noack, Marcus M., Doerk, Gregory S., Li, Ruipeng, Streit, Jason K., Vaia, Richard A., Yager, Kevin G., Fukuto, Masafumi
A majority of experimental disciplines face the challenge of exploring large and high-dimensional parameter spaces in search of new scientific discoveries. Materials science is no exception; the wide variety of synthesis, processing, and environmental conditions that influence material properties gives rise to particularly vast parameter spaces. Recent advances have led to an increase in efficiency of materials discovery by increasingly automating the exploration processes. Methods for autonomous experimentation have become more sophisticated recently, allowing for multi-dimensional parameter spaces to be explored efficiently and with minimal human intervention, thereby liberating the scientists to focus on interpretations and big-picture decisions. Gaussian process regression (GPR) techniques have emerged as the method of choice for steering many classes of experiments. We have recently demonstrated the positive impact of GPR-driven decision-making algorithms on autonomously steering experiments at a synchrotron beamline. However, due to the complexity of the experiments, GPR often cannot be used in its most basic form, but rather has to be tuned to account for the special requirements of the experiments. Two requirements seem to be of particular importance, namely inhomogeneous measurement noise (input dependent or non-i.i.d.) and anisotropic kernel functions, which are the two concepts that we tackle in this paper. Our synthetic and experimental tests demonstrate the importance of both concepts for experiments in materials science and the benefits that result from including them in the autonomous decision-making process.
An optimizable scalar objective value cannot be objective and should not be the sole objective
Kloumann, Isabel, Tygert, Mark
The morality of algorithms and their potential for bias and discrimination are important concerns. A popular approach to machine learning and artificial intelligence is via the numerical optimization of objective functions, and adapting such an approach to handle ethics could seem natural: with a hammer in hand, everything looks like a nail. The hammer of much artificial intelligence is the optimization of objective values, so some might like to treat morality solely through such objective functions. However, relying solely on the optimization of scalar objective values is fraught with unavoidable flaws when dealing with real people.
A quest for a fair schedule: The Young Physicists' Tournament
Cechlárová, Katarína, Cseh, Ágnes, Jankó, Zsuzsanna, Kireš, Marián, Miňo, Lukáš
The Young Physicists Tournament is an established team-oriented scientific competition between high school students from 37 countries on 5 continents. The competition consists of scientific discussions called Fights. Three or four teams participate in each Fight, each of whom presents a problem while rotating the roles of Presenter, Opponent, Reviewer, and Observer among them. The rules of a few countries require that each team announce in advance 3 problems they will present at the national tournament. The task of the organizers is to choose the composition of Fights in such a way that each team presents each of its chosen problems exactly once and within a single Fight no problem is presented more than once. Besides formalizing these feasibility conditions, in this paper we formulate several additional fairness conditions for tournament schedules. We show that the fulfillment of some of them can be ensured by constructing suitable edge colorings in bipartite graphs. To find fair schedules, we propose integer linear programs and test them on real as well as randomly generated data.
Predicting molecular dipole moments by combining atomic partial charges and atomic dipoles
Veit, Max, Wilkins, David M., Yang, Yang, DiStasio, Robert A. Jr., Ceriotti, Michele
The molecular dipole moment ($\boldsymbol{\mu}$) is a central quantity in chemistry. It is essential in predicting infrared and sum-frequency generation spectra, as well as induction and long-range electrostatic interactions. Furthermore, it can be extracted directly from high-level quantum mechanical calculations, making it an ideal target for machine learning (ML). In this work, we choose to represent this quantity with a physically inspired ML model that captures two distinct physical effects: local atomic polarization is captured within the symmetry-adapted Gaussian process regression (SA-GPR) framework, which assigns a (vector) dipole moment to each atom, while movement of charge across the entire molecule is captured by assigning a partial (scalar) charge to each atom. The resulting "MuML" models are fitted together to reproduce molecular $\boldsymbol{\mu}$ computed using high-level coupled-cluster theory (CCSD) and density functional theory (DFT) on the QM7b dataset. The combined model shows excellent transferability when applied to a showcase dataset of larger and more complex molecules, approaching the accuracy of DFT at a small fraction of the computational cost. We also demonstrate that the uncertainty in the predictions can be estimated reliably using a calibrated committee model. The ultimate performance of the models depends, however, on the details of the system at hand, with the scalar model being clearly superior when describing large molecules whose dipole is almost entirely generated by charge separation. These observations point to the importance of simultaneously accounting for the local and non-local effects that contribute to $\boldsymbol{\mu}$; further, they define a challenging task to benchmark future models, particularly those aimed at the description of condensed phases.
Bayesian optimization for modular black-box systems with switching costs
Lin, Chi-Heng, Miano, Joseph D., Dyer, Eva L.
Most existing black-box optimization methods assume that all variables in the system being optimized have equal cost and can change freely at each iteration. However, in many real world systems, inputs are passed through a sequence of different operations or modules, making variables in earlier stages of processing more costly to update. Such structure imposes a cost on switching variables in early parts of a data processing pipeline. In this work, we propose a new algorithm for switch cost-aware optimization called Lazy Modular Bayesian Optimization (LaMBO). This method efficiently identifies the global optimum while minimizing cost through a passive change of variables in early modules. The method is theoretical grounded and achieves vanishing regret when augmented with switching cost. We apply LaMBO to multiple synthetic functions and a three-stage image segmentation pipeline used in a neuroscience application, where we obtain promising improvements over prevailing cost-aware Bayesian optimization algorithms. Our results demonstrate that LaMBO is an effective strategy for black-box optimization that is capable of minimizing switching costs in modular systems.
An Efficient Shared-memory Parallel Sinkhorn-Knopp Algorithm to Compute the Word Mover's Distance
Tithi, Jesmin Jahan, Petrini, Fabrizio
The Word Mover's Distance (WMD) is a metric that measures the semantic dissimilarity between two text documents by computing the cost of moving all words of a source/query document to the most similar words of a target document optimally. Computing WMD between two documents is costly because it requires solving an optimization problem that costs \(O(V^3log(V))\) where \(V\) is the number of unique words in the document. Fortunately, the WMD can be framed as the Earth Mover's Distance (EMD) (also known as the Optimal Transportation Distance) for which it has been shown that the algorithmic complexity can be reduced to \(O(V^2)\) by adding an entropy penalty to the optimization problem and a similar idea can be adapted to compute WMD efficiently. Additionally, the computation can be made highly parallel by computing WMD of a single query document against multiple target documents at once (e.g., finding whether a given tweet is similar to any other tweets happened in a day). In this paper, we present a shared-memory parallel Sinkhorn-Knopp Algorithm to compute the WMD of one document against many other documents by adopting the \(O(V^2)\) EMD algorithm. We used algorithmic transformations to change the original dense compute-heavy kernel to a sparse compute kernel and obtained \(67\times\) speedup using \(96\) cores on the state-of-the-art of Intel\textregistered{} 4-sockets Cascade Lake machine w.r.t. its sequential run. Our parallel algorithm is over \(700\times\) faster than the naive parallel python code that internally uses optimized matrix library calls.
Objective-Sensitive Principal Component Analysis for High-Dimensional Inverse Problems
Elizarev, Maksim, Mukhin, Andrei, Khlyupin, Aleksey
We present a novel approach for adaptive, differentiable parameterization of large-scale random fields. If the approach is coupled with any gradient-based optimization algorithm, it can be applied to a variety of optimization problems, including history matching. The developed technique is based on principal component analysis (PCA) but modifies a purely data-driven basis of principal components considering objective function behavior. To define an efficient encoding, Gradient-Sensitive PCA uses an objective function gradient with respect to model parameters. We propose computationally efficient implementations of the technique, and two of them are based on stationary perturbation theory (SPT). Optimality, correctness, and low computational costs of the new encoding approach are tested, verified, and discussed. Three algorithms for optimal parameter decomposition are presented and applied to an objective of 2D synthetic history matching. The results demonstrate improvements in encoding quality regarding objective function minimization and distributional patterns of the desired field. Possible applications and extensions are proposed.
Combining Reinforcement Learning and Constraint Programming for Combinatorial Optimization
Cappart, Quentin, Moisan, Thierry, Rousseau, Louis-Martin, Prémont-Schwarz, Isabeau, Cire, Andre
Combinatorial optimization has found applications in numerous fields, from aerospace to transportation planning and economics. The goal is to find an optimal solution among a finite set of possibilities. The well-known challenge one faces with combinatorial optimization is the state-space explosion problem: the number of possibilities grows exponentially with the problem size, which makes solving intractable for large problems. In the last years, deep reinforcement learning (DRL) has shown its promise for designing good heuristics dedicated to solve NP-hard combinatorial optimization problems. However, current approaches have two shortcomings: (1) they mainly focus on the standard travelling salesman problem and they cannot be easily extended to other problems, and (2) they only provide an approximate solution with no systematic ways to improve it or to prove optimality. In another context, constraint programming (CP) is a generic tool to solve combinatorial optimization problems. Based on a complete search procedure, it will always find the optimal solution if we allow an execution time large enough. A critical design choice, that makes CP non-trivial to use in practice, is the branching decision, directing how the search space is explored. In this work, we propose a general and hybrid approach, based on DRL and CP, for solving combinatorial optimization problems. The core of our approach is based on a dynamic programming formulation, that acts as a bridge between both techniques. We experimentally show that our solver is efficient to solve two challenging problems: the traveling salesman problem with time windows, and the 4-moments portfolio optimization problem. Results obtained show that the framework introduced outperforms the stand-alone RL and CP solutions, while being competitive with industrial solvers.