Optimization
Concave Utility Reinforcement Learning with Zero-Constraint Violations
Agarwal, Mridul, Bai, Qinbo, Aggarwal, Vaneet
We consider the problem of tabular infinite horizon concave utility reinforcement learning (CURL) with convex constraints. Various learning applications with constraints, such as robotics, do not allow for policies that can violate constraints. To this end, we propose a model-based learning algorithm that achieves zero constraint violations. To obtain this result, we assume that the concave objective and the convex constraints have a solution interior to the set of feasible occupation measures. We then solve a tighter optimization problem to ensure that the constraints are never violated despite the imprecise model knowledge and model stochasticity. We also propose a novel Bellman error based analysis for tabular infinite-horizon setups which allows to analyse stochastic policies. Combining the Bellman error based analysis and tighter optimization equation, for $T$ interactions with the environment, we obtain a regret guarantee for objective which grows as $\Tilde{O}(1/\sqrt{T})$, excluding other factors.
Why Silicon Valley's Optimization Mindset Sets Us Up for Failure
In 2013 a Silicon Valley software engineer decided that food is an inconvenience--a pain point in a busy life. Buying food, preparing it, and cleaning up afterwards struck him as an inefficient way to feed himself. And so was born the idea of Soylent, Rob Rhinehart's meal replacement powder, described on its website as an International Complete Nutrition Platform. Soylent is the logical result of an engineer's approach to the "problem" of feeding oneself with food: there must be a more optimal solution. It's not hard to sense the trouble with this crushingly instrumental approach to nutrition.
Risk-Averse Decision Making Under Uncertainty
Ahmadi, Mohamadreza, Rosolia, Ugo, Ingham, Michel D., Murray, Richard M., Ames, Aaron D.
A large class of decision making under uncertainty problems can be described via Markov decision processes (MDPs) or partially observable MDPs (POMDPs), with application to artificial intelligence and operations research, among others. Traditionally, policy synthesis techniques are proposed such that a total expected cost or reward is minimized or maximized. However, optimality in the total expected cost sense is only reasonable if system behavior in the large number of runs is of interest, which has limited the use of such policies in practical mission-critical scenarios, wherein large deviations from the expected behavior may lead to mission failure. In this paper, we consider the problem of designing policies for MDPs and POMDPs with objectives and constraints in terms of dynamic coherent risk measures, which we refer to as the constrained risk-averse problem. For MDPs, we reformulate the problem into a infsup problem via the Lagrangian framework and propose an optimization-based method to synthesize Markovian policies. For MDPs, we demonstrate that the formulated optimization problems are in the form of difference convex programs (DCPs) and can be solved by the disciplined convex-concave programming (DCCP) framework. We show that these results generalize linear programs for constrained MDPs with total discounted expected costs and constraints. For POMDPs, we show that, if the coherent risk measures can be defined as a Markov risk transition mapping, an infinite-dimensional optimization can be used to design Markovian belief-based policies. For stochastic finite-state controllers (FSCs), we show that the latter optimization simplifies to a (finite-dimensional) DCP and can be solved by the DCCP framework. We incorporate these DCPs in a policy iteration algorithm to design risk-averse FSCs for POMDPs.
NTS-NOTEARS: Learning Nonparametric Temporal DAGs With Time-Series Data and Prior Knowledge
Sun, Xiangyu, Liu, Guiliang, Poupart, Pascal, Schulte, Oliver
We propose a score-based DAG structure learning method for time-series data that captures linear, nonlinear, lagged and instantaneous relations among variables while ensuring acyclicity throughout the entire graph. The proposed method extends nonparametric NOTEARS, a recent continuous optimization approach for learning nonparametric instantaneous DAGs. The proposed method is faster than constraint-based methods using nonlinear conditional independence tests. We also promote the use of optimization constraints to incorporate prior knowledge into the structure learning process. A broad set of experiments with simulated data demonstrates that the proposed method discovers better DAG structures than several recent comparison methods. We also evaluate the proposed method on complex real-world data acquired from NHL ice hockey games containing a mixture of continuous and discrete variables. The code is available at https://github.com/xiangyu-sun-789/NTS-NOTEARS/.
A Black-box Adversarial Attack Strategy with Adjustable Sparsity and Generalizability for Deep Image Classifiers
Ghosh, Arka, Mullick, Sankha Subhra, Datta, Shounak, Das, Swagatam, Mallipeddi, Rammohan, Das, Asit Kr.
Constructing adversarial perturbations for deep neural networks is an important direction of research. Crafting image-dependent adversarial perturbations using white-box feedback has hitherto been the norm for such adversarial attacks. However, black-box attacks are much more practical for real-world applications. Universal perturbations applicable across multiple images are gaining popularity due to their innate generalizability. There have also been efforts to restrict the perturbations to a few pixels in the image. This helps to retain visual similarity with the original images making such attacks hard to detect. This paper marks an important step which combines all these directions of research. We propose the DEceit algorithm for constructing effective universal pixel-restricted perturbations using only black-box feedback from the target network. We conduct empirical investigations using the ImageNet validation set on the state-of-the-art deep neural classifiers by varying the number of pixels to be perturbed from a meagre 10 pixels to as high as all pixels in the image. We find that perturbing only about 10% of the pixels in an image using DEceit achieves a commendable and highly transferable Fooling Rate while retaining the visual quality. We further demonstrate that DEceit can be successfully applied to image dependent attacks as well. In both sets of experiments, we outperformed several state-of-the-art methods.
Robust Optimal Classification Trees Against Adversarial Examples
Decision trees are a popular choice of explainable model, but just like neural networks, they suffer from adversarial examples. Existing algorithms for fitting decision trees robust against adversarial examples are greedy heuristics and lack approximation guarantees. In this paper we propose ROCT, a collection of methods to train decision trees that are optimally robust against user-specified attack models. We show that the min-max optimization problem that arises in adversarial learning can be solved using a single minimization formulation for decision trees with 0-1 loss. We propose such formulations in Mixed-Integer Linear Programming and Maximum Satisfiability, which widely available solvers can optimize. We also present a method that determines the upper bound on adversarial accuracy for any model using bipartite matching. Our experimental results demonstrate that the existing heuristics achieve close to optimal scores while ROCT achieves state-of-the-art scores.
Initialization for Nonnegative Matrix Factorization: a Comprehensive Review
Hafshejani, Sajad Fathi, Moaberfard, Zahra
Non-negative matrix factorization (NMF) has become a popular method for representing meaningful data by extracting a non-negative basis feature from an observed non-negative data matrix. Some of the unique features of this method in identifying hidden data put this method amongst the powerful methods in the machine learning area. The NMF is a known non-convex optimization problem and the initial point has a significant effect on finding an efficient local solution. In this paper, we investigate the most popular initialization procedures proposed for NMF so far. We describe each method and present some of their advantages and disadvantages. Finally, some numerical results to illustrate the performance of each algorithm are presented.
tvopt: A Python Framework for Time-Varying Optimization
This paper introduces tvopt, a Python framework for prototyping and benchmarking time-varying (or online) optimization algorithms. The paper first describes the theoretical approach that informed the development of tvopt. Then it discusses the different components of the framework and their use for modeling and solving time-varying optimization problems. In particular, tvopt provides functionalities for defining both centralized and distributed online problems, and a collection of built-in algorithms to solve them, for example gradient-based methods, ADMM and other splitting methods. Moreover, the framework implements prediction strategies to improve the accuracy of the online solvers. The paper then proposes some numerical results on a benchmark problem and discusses their implementation using tvopt. The code for tvopt is available at https://github.com/nicola-bastianello/tvopt.
Fixed Support Tree-Sliced Wasserstein Barycenter
Takezawa, Yuki, Sato, Ryoma, Kozareva, Zornitsa, Ravi, Sujith, Yamada, Makoto
The Wasserstein barycenter has been widely studied in various fields, including natural language processing, and computer vision. However, it requires a high computational cost to solve the Wasserstein barycenter problem because the computation of the Wasserstein distance requires a quadratic time with respect to the number of supports. By contrast, the Wasserstein distance on a tree, called the tree-Wasserstein distance, can be computed in linear time and allows for the fast comparison of a large number of distributions. In this study, we propose a barycenter under the tree-Wasserstein distance, called the fixed support tree-Wasserstein barycenter (FS-TWB) and its extension, called the fixed support tree-sliced Wasserstein barycenter (FS-TSWB). More specifically, we first show that the FS-TWB and FS-TSWB problems are convex optimization problems and can be solved by using the projected subgradient descent. Moreover, we propose a more efficient algorithm to compute the subgradient and objective function value by using the properties of tree-Wasserstein barycenter problems. Through real-world experiments, we show that, by using the proposed algorithm, the FS-TWB and FS-TSWB can be solved two orders of magnitude faster than the original Wasserstein barycenter.
YAHPO Gym -- Design Criteria and a new Multifidelity Benchmark for Hyperparameter Optimization
Pfisterer, Florian, Schneider, Lennart, Moosbauer, Julia, Binder, Martin, Bischl, Bernd
When developing and analyzing new hyperparameter optimization (HPO) methods, it is vital to empirically evaluate and compare them on well-curated benchmark suites. In this work, we list desirable properties and requirements for such benchmarks and propose a new set of challenging and relevant multifidelity HPO benchmark problems motivated by these requirements. For this, we revisit the concept of surrogate-based benchmarks and empirically compare them to more widely-used tabular benchmarks, showing that the latter ones may induce bias in performance estimation and ranking of HPO methods. We present a new surrogate-based benchmark suite for multifidelity HPO methods consisting of 9 benchmark collections that constitute over 700 multifidelity HPO problems in total. All our benchmarks also allow for querying of multiple optimization targets, enabling the benchmarking of multi-objective HPO. We examine and compare our benchmark suite with respect to the defined requirements and show that our benchmarks provide viable additions to existing suites.