Optimization
Improving the Efficiency of Gradient Descent Algorithms Applied to Optimization Problems with Dynamical Constraints
Matei, Ion, Zhenirovskyy, Maksym, de Kleer, Johan, Maxwell, John
We introduce two block coordinate descent algorithms for solving optimization problems with ordinary differential equations (ODEs) as dynamical constraints. The algorithms do not need to implement direct or adjoint sensitivity analysis methods to evaluate loss function gradients. They result from a reformulation of the original problem as an equivalent optimization problem with equality constraints. The algorithms naturally follow from steps aimed at recovering the gradient-descent algorithm based on ODE solvers that explicitly account for the sensitivity of the ODE solution. In our first proposed algorithm, we avoid explicitly solving the ODE by integrating the ODE solver as a sequence of implicit constraints. In our second algorithm, we use an ODE solver to reset the ODE solution, but no direct or adjoint sensitivity analysis methods are used. Both algorithms accept mini-batch implementations and show significant efficiency benefits from GPU-based parallelization. We demonstrate the performance of the algorithms when applied to learning the parameters of the Cucker-Smale model. The algorithms are compared with gradient descent algorithms based on ODE solvers endowed with sensitivity analysis capabilities, for various state sizes, using PyTorch and JAX implementations. The experimental results demonstrate that the proposed algorithms are at least 4x faster than the PyTorch implementations, and at least 16x faster than the JAX implementations. For large versions of the Cucker-Smale model, the JAX implementation is thousands of times faster than the sensitivity analysis-based implementation. In addition, our algorithms generate more accurate results on both training and test data. Such gains in computational efficiency are paramount for algorithms that implement real-time parameter estimation, such as diagnosis algorithms.
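To make the idea of treating ODE solver steps as equality constraints concrete, here is a minimal sketch. It assumes an explicit Euler discretization and a hypothetical scalar ODE dx/dt = -theta * x; neither the solver choice nor the dynamics are taken from the paper, and the function names are illustrative only. Each Euler step becomes one constraint x[k+1] - x[k] - h*f(x[k], theta) = 0, so the trajectory can be treated as an optimization variable instead of being recomputed by a solver.

```python
import numpy as np

def f(x, theta):
    # Hypothetical scalar dynamics dx/dt = -theta * x (not the Cucker-Smale model).
    return -theta * x

def euler_residuals(xs, theta, h):
    # One residual per discretized constraint:
    # x[k+1] - x[k] - h * f(x[k], theta) = 0 for every step k.
    return xs[1:] - xs[:-1] - h * f(xs[:-1], theta)

def euler_rollout(x0, theta, h, n_steps):
    # Plain explicit Euler integration, used here only to build a
    # trajectory that satisfies the constraints by construction.
    xs = np.empty(n_steps + 1)
    xs[0] = x0
    for k in range(n_steps):
        xs[k + 1] = xs[k] + h * f(xs[k], theta)
    return xs

xs = euler_rollout(1.0, theta=0.5, h=0.1, n_steps=20)
res_true = euler_residuals(xs, 0.5, 0.1)   # constraints hold for the true parameter
res_wrong = euler_residuals(xs, 0.8, 0.1)  # constraints are violated for a wrong one
```

In a constrained formulation of this kind, a loss on `xs` would be minimized jointly over `xs` and `theta` subject to the residuals being zero; how the blocks are updated is specific to the paper and is not reproduced here.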
Sparse Polynomial Optimization: Theory and Practice
The problem of minimizing a polynomial over a set of polynomial inequalities is an NP-hard non-convex problem. Thanks to powerful results from real algebraic geometry, one can convert this problem into a nested sequence of finite-dimensional convex problems. At each step of the associated hierarchy, one needs to solve a fixed-size semidefinite program, which can in turn be solved with efficient numerical tools. On the practical side, however, there is no free lunch, and such optimization methods usually encounter severe scalability issues. Fortunately, for many applications, we can look the problem in the eye and exploit the inherent data structure arising from the cost and constraints describing the problem, for instance sparsity or symmetries. This book presents several research efforts to tackle this scientific challenge with important computational implications, and develops alternative optimization schemes that scale well in terms of computational complexity, at least for some identified classes of problems. The algorithmic framework presented in this book mainly exploits the sparsity structure of the input data to solve large-scale polynomial optimization problems. We present sparsity-exploiting hierarchies of relaxations, for either unconstrained or constrained problems. In contrast with the dense hierarchies, they provide faster approximations of the solution in practice while retaining the same theoretical convergence guarantees. Our framework is not restricted to static polynomial optimization, and we describe hierarchies of approximations for values of interest arising from the analysis of dynamical systems. We also present various extensions to problems involving noncommuting variables, e.g., matrices of arbitrary size or quantum physics operators.
Approximate Nash Equilibrium Learning for n-Player Markov Games in Dynamic Pricing
We investigate Nash equilibrium learning in a competitive Markov Game (MG) environment, where multiple agents compete, and multiple Nash equilibria can exist. In particular, for an oligopolistic dynamic pricing environment, exact Nash equilibria are difficult to obtain due to the curse of dimensionality. We develop a new model-free method to find approximate Nash equilibria. Gradient-free black box optimization is then applied to estimate $\epsilon$, the maximum reward advantage of an agent unilaterally deviating from any joint policy, and to also estimate the $\epsilon$-minimizing policy for any given state. The policy-$\epsilon$ correspondence and the state to $\epsilon$-minimizing policy are represented by neural networks, the latter being the Nash Policy Net. During batch update, we perform Nash Q-learning on the system, by adjusting the action probabilities using the Nash Policy Net. We demonstrate that an approximate Nash equilibrium can be learned, particularly in the dynamic pricing domain where exact solutions are often intractable.
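The quantity $\epsilon$ described above can be illustrated on a much simpler setting. The sketch below assumes a one-shot two-player matrix game with known payoffs, not the Markov game or the black-box estimator of the paper; the function name `nash_epsilon` and the payoff values are hypothetical. It computes the largest gain either player can obtain by unilaterally deviating from a joint mixed strategy, so a value of zero indicates an exact Nash equilibrium.

```python
import numpy as np

def nash_epsilon(A, B, p, q):
    # A[i, j]: payoff to player 1, B[i, j]: payoff to player 2,
    # when player 1 plays row i and player 2 plays column j.
    v1 = p @ A @ q  # expected payoff of player 1 under (p, q)
    v2 = p @ B @ q  # expected payoff of player 2 under (p, q)
    # Expected payoff is linear in a player's own mixed strategy,
    # so checking pure-strategy deviations is sufficient.
    gain1 = np.max(A @ q) - v1
    gain2 = np.max(p @ B) - v2
    return max(gain1, gain2)

# Illustrative prisoner's dilemma payoffs (row/column 0 = cooperate, 1 = defect).
A = np.array([[3.0, 0.0],
              [5.0, 1.0]])
B = A.T

cooperate = np.array([1.0, 0.0])
defect = np.array([0.0, 1.0])

eps_cc = nash_epsilon(A, B, cooperate, cooperate)  # 2.0: each player gains by defecting
eps_dd = nash_epsilon(A, B, defect, defect)        # 0.0: mutual defection is a Nash equilibrium
```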
SONAR: Joint Architecture and System Optimization Search
Jääsaari, Elias, Ma, Michelle, Talwalkar, Ameet, Chen, Tianqi
There is a growing need to deploy machine learning for different tasks on a wide array of new hardware platforms. Such deployment scenarios require tackling multiple challenges, including identifying a model architecture that can achieve a suitable predictive accuracy (architecture search), and finding an efficient implementation of the model to satisfy underlying hardware-specific systems constraints such as latency (system optimization search). Existing works treat architecture search and system optimization search as separate problems and solve them sequentially. In this paper, we instead propose to solve these problems jointly, and introduce a simple but effective baseline method called SONAR that interleaves these two search problems. SONAR aims to efficiently optimize for predictive accuracy and inference latency by applying early stopping to both search processes. Our experiments on multiple different hardware back-ends show that SONAR identifies nearly optimal architectures 30 times faster than a brute force approach.
Mango: a new way to make Bayesian optimisation in Python
Now, let's dive into Mango! In recent years, the amount of data has grown considerably. This represents a challenge for data scientists who need their machine learning pipelines to be scalable. Distributed computing might solve this issue. Distributed computing refers to a set of computers that work on a common task while communicating with each other.
Multi-objective optimization of actuation waveform for high-precision drop-on-demand inkjet printing
Wang, Hanzhi, Hasegawa, Yosuke
Drop-on-demand (DOD) inkjet printing has been considered one of the promising technologies for the fabrication of advanced functional materials. For a DOD printer, high-precision dispensing techniques for achieving satellite-free smaller droplets have long been desired for patterning thin-film structures. The present study considers the inlet velocity of a liquid chamber located upstream of a dispensing nozzle as a control variable and aims to optimize its waveform using a sample-efficient Bayesian optimization algorithm. First, the droplet dispensing dynamics are numerically reproduced using an open-source OpenFOAM solver, interFoam, and the results are passed on to another code based on pyFoam. Then, the parameters characterizing the actuation waveform driving a DOD printer are determined by the Bayesian optimization (BO) algorithm so as to maximize a prescribed multi-objective function expressed as the sum of two factors, i.e., the size of a primary droplet and the presence of satellite droplets. The results show that the present BO algorithm can successfully find high-precision dispensing waveforms within 150 simulations. Specifically, satellite droplets can be effectively eliminated and the droplet diameter can be significantly reduced to 24.9% of the nozzle diameter by applying the optimal waveform.
A Consistency Constraint-Based Approach to Coupled State Constraints in Distributed Model Predictive Control
Wiltz, Adrian, Chen, Fei, Dimarogonas, Dimos V.
In this paper, we present a distributed model predictive control (DMPC) scheme for dynamically decoupled systems which are subject to state constraints, coupling state constraints and input constraints. In the proposed control scheme, neighbor-to-neighbor communication suffices and all subsystems solve their local optimization problem in parallel. The approach relies on consistency constraints, which define a neighborhood around each subsystem's reference trajectory in which the state of the respective subsystem is guaranteed to stay. Reference trajectories and consistency constraints are known to neighboring subsystems. Contrary to other relevant approaches, the reference trajectories are improved iteratively. In addition, the presented approach allows the formulation of convex optimization problems even in the presence of non-convex state constraints. The algorithm's effectiveness is demonstrated with a simulation.
A Survey of Open Source Automation Tools for Data Science Predictions
We present an expository overview of technical and cultural challenges to the development and adoption of automation at various stages in the data science prediction lifecycle, restricting focus to supervised learning with structured datasets. In addition, we review popular open source Python tools implementing common solution patterns for the automation challenges and highlight gaps where we feel progress still needs to be made.
On Differential Privacy for Federated Learning in Wireless Systems with Multiple Base Stations
Tavangaran, Nima, Chen, Mingzhe, Yang, Zhaohui, Silva, José Mairton B. Da Jr., Poor, H. Vincent
In this work, we consider a federated learning model in a wireless system with multiple base stations and inter-cell interference. We apply a differentially private scheme to transmit information from users to their corresponding base station during the learning phase. We show the convergence behavior of the learning process by deriving an upper bound on its optimality gap. Furthermore, we define an optimization problem to reduce this upper bound and the total privacy leakage. To find the locally optimal solutions of this problem, we first propose an algorithm that schedules the resource blocks and users. We then extend this scheme to reduce the total privacy leakage by optimizing the differential privacy artificial noise. We apply the solutions of these two procedures as parameters of a federated learning system. In this setting, we assume that each user is equipped with a classifier. Moreover, the communication cells are assumed to mostly have fewer resource blocks than users. The simulation results show that our proposed scheduler improves the average accuracy of the predictions compared with a random scheduler. Furthermore, its extended version with a noise optimizer significantly reduces the amount of privacy leakage.
Pushing the limits of fairness impossibility: Who's the fairest of them all?
Hsu, Brian, Mazumder, Rahul, Nandy, Preetam, Basu, Kinjal
The impossibility theorem of fairness is a foundational result in the algorithmic fairness literature. It states that outside of special cases, one cannot exactly and simultaneously satisfy all three common and intuitive definitions of fairness - demographic parity, equalized odds, and predictive rate parity. This result has driven most works to focus on solutions for one or two of the metrics. Rather than follow suit, in this paper we present a framework that pushes the limits of the impossibility theorem in order to satisfy all three metrics to the best extent possible. We develop an integer-programming based approach that can yield a certifiably optimal post-processing method for simultaneously satisfying multiple fairness criteria under small violations. We show experiments demonstrating that our post-processor can improve fairness across the different definitions simultaneously with minimal model performance reduction. We also discuss applications of our framework for model selection and fairness explainability, thereby attempting to answer the question: who's the fairest of them all?
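The three fairness definitions named above can be measured as gaps between groups. The sketch below is an illustration under simplifying assumptions, not the integer-programming post-processor of the paper: it assumes binary labels, binary predictions, and two groups, and it reports predictive rate parity via positive predictive value only. The function name `fairness_gaps` and the toy data are hypothetical.

```python
import numpy as np

def fairness_gaps(y_true, y_pred, group):
    # Per-group statistics for a binary classifier and two groups (0 and 1).
    stats = []
    for g in (0, 1):
        m = group == g
        yt, yp = y_true[m], y_pred[m]
        stats.append({
            "pos_rate": yp.mean(),       # P(pred = 1 | group)
            "tpr": yp[yt == 1].mean(),   # P(pred = 1 | label = 1, group)
            "fpr": yp[yt == 0].mean(),   # P(pred = 1 | label = 0, group)
            "ppv": yt[yp == 1].mean(),   # P(label = 1 | pred = 1, group)
        })
    s0, s1 = stats
    return {
        # Demographic parity: equal positive prediction rates.
        "demographic_parity": abs(s0["pos_rate"] - s1["pos_rate"]),
        # Equalized odds: equal true positive and false positive rates.
        "equalized_odds": max(abs(s0["tpr"] - s1["tpr"]),
                              abs(s0["fpr"] - s1["fpr"])),
        # Predictive rate parity, measured here via positive predictive value.
        "predictive_parity": abs(s0["ppv"] - s1["ppv"]),
    }

# Toy data: both groups have the same base rate and the same positive rate,
# but the classifier is perfect on group 1 and at chance on group 0.
group  = np.array([0, 0, 0, 0, 1, 1, 1, 1])
y_true = np.array([1, 1, 0, 0, 1, 1, 0, 0])
y_pred = np.array([1, 0, 1, 0, 1, 1, 0, 0])

gaps = fairness_gaps(y_true, y_pred, group)
```

On this toy data, demographic parity holds exactly (gap 0.0) while the equalized odds and predictive parity gaps are both 0.5, showing that satisfying one criterion says little about the others.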