Optimization
Department of Energy Announces $15 Million for Development of Artificial Intelligence and Machine Learning Tools
WASHINGTON, D.C. – Today, the U.S. Department of Energy's (DOE's) Advanced Research Projects Agency-Energy (ARPA-E) announced $15 million in funding for 23 projects to accelerate the incorporation of machine learning and artificial intelligence into the energy technology and product design processes as part of the Design Intelligence Fostering Formidable Energy Reduction (and) Enabling Novel Totally Impactful Advanced Technology Enhancements (DIFFERENTIATE) program. Launched in April of this year, the DIFFERENTIATE program aims to develop streamlined solutions to next-generation energy challenges. The program identified three general mathematical optimization problems that are common to many design processes. The selected projects then conceptualized machine learning and artificial intelligence-based solutions to help engineers execute and solve these problems in a manner that dramatically accelerates the pace of energy innovation. "The incorporation of AI and Machine Learning into our energy technology design and engineering processes has great potential to increase the productivity of our nation's engineers and scientists," said Secretary of Energy Rick Perry.
Gradient-based Optimization for Bayesian Preference Elicitation
Vendrov, Ivan, Lu, Tyler, Huang, Qingqing, Boutilier, Craig
Effective techniques for eliciting user preferences have taken on added importance as recommender systems (RSs) become increasingly interactive and conversational. A common and conceptually appealing Bayesian criterion for selecting queries is expected value of information (EVOI) . Unfortunately, it is computationally prohibitive to construct queries with maximum EVOI in RSs with large item spaces. We tackle this issue by introducing a continuous formulation of EVOI as a differentiable network that can be optimized using gradient methods available in modern machine learning (ML) computational frameworks (e.g., TensorFlow, PyTorch). We exploit this to develop a novel, scalable Monte Carlo method for EVOI optimization, which is more scalable for large item spaces than methods requiring explicit enumeration of items. While we emphasize the use of this approach for pairwise (or k -wise) comparisons of items, we also demonstrate how our method can be adapted to queries involving subsets of item attributes or "partial items," which are often more cognitively manageable for users. Experiments show that our gradient-based EVOI technique achieves state-of-the-art performance across several domains while scaling to large item spaces.
Bayesian optimization with local search
Gao, Yuzhou, Yu, Tengchao, Li, Jinglai
Global optimization finds applications in a wide range of real world problems. The multi-start methods are a popular class of global optimization techniques, which are based on the ideas of conducting local searches at multiple starting points, and then sequentially determine the starting points according to some prescribed rules. In this work we propose a new multi-start algorithm where the starting points are determined in a Bayesian optimization framework. Specifically, the method can be understood as to construct a new function by conducting local searches of the original objective function, where the new function attains the same global optima as the original one. Bayesian optimization is then applied to find the global optima of the new local search based function.
Iterative Peptide Modeling With Active Learning And Meta-Learning
Barrett, Rainier, White, Andrew D.
Often the development of novel materials is not amenable to high-throughput or purely computational screening methods. Instead, materials must be synthesized one at a time in a process that does not generate significant amounts of data. One way this method can be improved is by ensuring that each experiment provides the best improvement in both material properties and predictive modeling accuracy. In this work, we study the effectiveness of active learning, which optimizes the order of experiments, and meta learning, which transfers knowledge from one context to another, to reduce the number of experiments necessary to build a predictive model. We present a novel multi-task benchmark database of peptides designed to advance active, few-shot, and meta-learning methods for experimental design. Each task is binary classification of peptides represented as a sequence string. We show results of standard active learning and meta-learning methods across these datasets to assess their ability to improve predictive models with the fewest number of experiments. We find the ensemble query by committee active learning method to be effective. The meta-learning method Reptile was found to improve accuracy. The robustness of these conclusions were tested across multiple model choices.
Black-box Combinatorial Optimization using Models with Integer-valued Minima
Bliek, Laurens, Verwer, Sicco, de Weerdt, Mathijs
When a black-box optimization objective can only be evaluated with costly or noisy measurements, most standard optimization algorithms are unsuited to find the optimal solution. Specialized algorithms that deal with exactly this situation make use of surrogate models. These models are usually continuous and smooth, which is beneficial for continuous optimization problems, but not necessarily for combinatorial problems. However, by choosing the basis functions of the surrogate model in a certain way, we show that it can be guaranteed that the optimal solution of the surrogate model is integer. This approach outperforms random search, simulated annealing and one Bayesian optimization algorithm on the problem of finding robust routes for a noise-perturbed traveling salesman benchmark problem, with similar performance as another Bayesian optimization algorithm, and outperforms all compared algorithms on a convex binary optimization problem with a large number of variables.
Solving Online Threat Screening Games using Constrained Action Space Reinforcement Learning
Shah, Sanket, Sinha, Arunesh, Varakantham, Pradeep, Perrault, Andrew, Tambe, Milind
Large-scale screening for potential threats with limited resources and capacity for screening is a problem of interest at airports, seaports, and other ports of entry. Adversaries can observe screening procedures and arrive at a time when there will be gaps in screening due to limited resource capacities. To capture this game between ports and adversaries, this problem has been previously represented as a Stackelberg game, referred to as a Threat Screening Game (TSG). Given the significant complexity associated with solving TSGs and uncertainty in arrivals of customers, existing work has assumed that screenees arrive and are allocated security resources at the beginning of the time window. In practice, screenees such as airport passengers arrive in bursts correlated with flight time and are not bound by fixed time windows. To address this, we propose an online threat screening model in which screening strategy is determined adaptively as a passenger arrives while satisfying a hard bound on acceptable risk of not screening a threat. To solve the online problem with a hard bound on risk, we formulate it as a Reinforcement Learning (RL) problem with constraints on the action space (hard bound on risk). We provide a novel way to efficiently enforce linear inequality constraints on the action output in Deep Reinforcement Learning. We show that our solution allows us to significantly reduce screenee wait time while guaranteeing a bound on risk.
A Generalized Markov Chain Model to Capture Dynamic Preferences and Choice Overload
Goutam, Kumar, Goyal, Vineet, Soret, Agathe
Assortment optimization is an important problem that arises in many practical applications such as retailing and online advertising where the goal is to find a subset of products from a universe of substitutable products that maximize a seller's expected revenue. The demand and the revenue depend on the substitution behavior of the customers that is captured by a choice model. One of the key challenges is to find the right model for the customer substitution behavior. Many parametric random utility based models have been considered in the literature to capture substitution. However, in all these models, the probability of purchase increases as we add more options to the assortment. This is not true in general and in many settings, the probability of purchase may decrease if we add more products to the assortment, referred to as the choice overload. In this paper we attempt to address these serious limitations and propose a generalization of the Markov chain based choice model considered in Blanchet et al. In particular, we handle dynamic preferences and the choice overload phenomenon using a Markovian comparison model that is a generalization of the Markovian substitution framework of Blanchet et al. The Markovian comparison framework allows us to implicitly model the search cost in the choice process and thereby, modeling both dynamic preferences as well as the choice overload phenomenon. We consider the assortment optimization problem for the special case of our generalized Markov chain model where the underlying Markov chain is rank-1 (this is a generalization of the Multinomial Logit model). We show that the assortment optimization problem under this model is NP-hard and present a fully polynomial-time approximation scheme (FPTAS) for this problem.
Safe squeezing for antisparse coding
Elvira, Clément, Herzet, Cédric
Spreading the information over all coefficients of a representation is a desirable property in many applications such as digital communication or machine learning. This so-called antisparse representation can be obtained by solving a convex program involving an $\ell_\infty$-norm penalty combined with a quadratic discrepancy. In this paper, we propose a new methodology, dubbed safe squeezing, to accelerate the computation of antisparse representation. We describe a test that allows to detect saturated entries in the solution of the optimization problem. The contribution of these entries is compacted into a single vector, thus operating a form of dimensionality reduction. We propose two algorithms to solve the resulting lower dimensional problem. Numerical experiments show the effectiveness of the proposed method to detect the saturated components of the solution and illustrates the induced computational gains in the resolution of the antisparse problem.
Gradientless Descent: High-Dimensional Zeroth-Order Optimization
Golovin, Daniel, Karro, John, Kochanski, Greg, Lee, Chansoo, Song, Xingyou, Zhang, Qiuyi
Zeroth-order optimization is the process of minimizing an objective $f(x)$, given oracle access to evaluations at adaptively chosen inputs $x$. In this paper, we present two simple yet powerful GradientLess Descent (GLD) algorithms that do not rely on an underlying gradient estimate and are numerically stable. We analyze our algorithm from a novel geometric perspective and present a novel analysis that shows convergence within an $\epsilon$-ball of the optimum in $O(kQ\log(n)\log(R/\epsilon))$ evaluations, for any monotone transform of a smooth and strongly convex objective with latent dimension $k < n$, where the input dimension is $n$, $R$ is the diameter of the input space and $Q$ is the condition number. Our rates are the first of its kind to be both 1) poly-logarithmically dependent on dimensionality and 2) invariant under monotone transformations. We further leverage our geometric perspective to show that our analysis is optimal. Both monotone invariance and its ability to utilize a low latent dimensionality are key to the empirical success of our algorithms, as demonstrated on BBOB and MuJoCo benchmarks.
Factor Group-Sparse Regularization for Efficient Low-Rank Matrix Recovery
Fan, Jicong, Ding, Lijun, Chen, Yudong, Udell, Madeleine
This paper develops a new class of nonconvex regularizers for low-rank matrix recovery. Many regularizers are motivated as convex relaxations of the matrix rank function. Our new factor group-sparse regularizers are motivated as a relaxation of the number of nonzero columns in a factorization of the matrix. These nonconvex regularizers are sharper than the nuclear norm; indeed, we show they are related to Schatten-$p$ norms with arbitrarily small $0 < p \leq 1$. Moreover, these factor group-sparse regularizers can be written in a factored form that enables efficient and effective nonconvex optimization; notably, the method does not use singular value decomposition. We provide generalization error bounds for low-rank matrix completion which show improved upper bounds for Schatten-$p$ norm reglarization as $p$ decreases. Compared to the max norm and the factored formulation of the nuclear norm, factor group-sparse regularizers are more efficient, accurate, and robust to the initial guess of rank. Experiments show promising performance of factor group-sparse regularization for low-rank matrix completion and robust principal component analysis.