Goto

Collaborating Authors

 Optimization


Polynomial-time algorithms for Multimarginal Optimal Transport problems with structure

arXiv.org Artificial Intelligence

Multimarginal Optimal Transport (MOT) has attracted significant interest due to applications in machine learning, statistics, and the sciences. However, in most applications, the success of MOT is severely limited by a lack of efficient algorithms. Indeed, MOT in general requires exponential time in the number of marginals k and their support sizes n. This paper develops a general theory about what "structure" makes MOT solvable in poly(n,k) time. We develop a unified algorithmic framework for solving MOT in poly(n,k) time by characterizing the "structure" that different algorithms require in terms of simple variants of the dual feasibility oracle. This framework has several benefits. First, it enables us to show that the Sinkhorn algorithm, which is currently the most popular MOT algorithm, requires strictly more structure than other algorithms do to solve MOT in poly(n,k) time. Second, our framework makes it much simpler to develop poly(n,k) time algorithms for a given MOT problem. In particular, it is necessary and sufficient to (approximately) solve the dual feasibility oracle -- which is much more amenable to standard algorithmic techniques. We illustrate this ease-of-use by developing poly(n,k) time algorithms for three general classes of MOT cost structures: (1) graphical structure; (2) set-optimization structure; and (3) low-rank plus sparse structure. For structure (1), we recover the known result that Sinkhorn has poly(n,k) runtime; moreover, we provide the first poly(n,k) time algorithms for computing solutions that are exact and sparse. For structures (2)-(3), we give the first poly(n,k) time algorithms, even for approximate computation. Together, these three structures encompass many -- if not most -- current applications of MOT.


Demystifying the Adversarial Robustness of Random Transformation Defenses

arXiv.org Artificial Intelligence

Neural networks' lack of robustness against attacks raises concerns in security-sensitive settings such as autonomous vehicles. While many countermeasures may look promising, only a few withstand rigorous evaluation. Defenses using random transformations (RT) have shown impressive results, particularly BaRT (Raff et al., 2019) on ImageNet. However, this type of defense has not been rigorously evaluated, leaving its robustness properties poorly understood. Their stochastic properties make evaluation more challenging and render many proposed attacks on deterministic models inapplicable. First, we show that the BPDA attack (Athalye et al., 2018a) used in BaRT's evaluation is ineffective and likely overestimates its robustness. We then attempt to construct the strongest possible RT defense through the informed selection of transformations and Bayesian optimization for tuning their parameters. Furthermore, we create the strongest possible attack to evaluate our RT defense. Our new attack vastly outperforms the baseline, reducing the accuracy by 83% compared to the 19% reduction by the commonly used EoT attack ($4.3\times$ improvement). Our result indicates that the RT defense on the Imagenette dataset (a ten-class subset of ImageNet) is not robust against adversarial examples. Extending the study further, we use our new attack to adversarially train RT defense (called AdvRT), resulting in a large robustness gain. Code is available at https://github.com/wagner-group/demystify-random-transform.


Selection of the Most Probable Best

arXiv.org Artificial Intelligence

We consider an expected-value ranking and selection problem where all k solutions' simulation outputs depend on a common uncertain input model. Given that the uncertainty of the input model is captured by a probability simplex on a finite support, we define the most probable best (MPB) to be the solution whose probability of being optimal is the largest. To devise an efficient sampling algorithm to find the MPB, we first derive a lower bound to the large deviation rate of the probability of falsely selecting the MPB, then formulate an optimal computing budget allocation (OCBA) problem to find the optimal static sampling ratios for all solution-input model pairs that maximize the lower bound. We devise a series of sequential algorithms that apply interpretable and computationally efficient sampling rules and prove their sampling ratios achieve the optimality conditions for the OCBA problem as the simulation budget increases. The algorithms are benchmarked against a state-of-the-art sequential sampling algorithm designed for contextual ranking and selection problems and demonstrated to have superior empirical performances at finding the MPB.


Teaching Networks to Solve Optimization Problems

arXiv.org Artificial Intelligence

Leveraging machine learning to facilitate the optimization process is an emerging field that holds the promise to bypass the fundamental computational bottleneck caused by classic iterative solvers in critical applications requiring near-real-time optimization. The majority of existing approaches focus on learning data-driven optimizers that lead to fewer iterations in solving an optimization. In this paper, we take a different approach and propose to replace the iterative solvers altogether with a trainable parametric set function, that outputs the optimal arguments/parameters of an optimization problem in a single feed forward. We denote our method as Learning to Optimize the Optimization Process (LOOP). We show the feasibility of learning such parametric (set) functions to solve various classic optimization problems including linear/nonlinear regression, principal component analysis, transport-based coreset, and quadratic programming in supply management applications. In addition, we propose two alternative approaches for learning such parametric functions, with and without a solver in the LOOP. Finally, through various numerical experiments, we show that the trained solvers could be orders of magnitude faster than the classic iterative solvers while providing near optimal solutions.


A Versatile Co-Design Approach For Dynamic Legged Robots

arXiv.org Artificial Intelligence

We present a versatile framework for the computational co-design of legged robots and dynamic maneuvers. Current state-of-the-art approaches are typically based on random sampling or concurrent optimization. We propose a novel bilevel optimization approach that exploits the derivatives of the motion planning sub-problem (i.e., the lower level). These motion-planning derivatives allow us to incorporate arbitrary design constraints and costs in an general-purpose nonlinear program (i.e., the upper level). Our approach allows for the use of any differentiable motion planner in the lower level and also allows for an upper level that captures arbitrary design constraints and costs. It efficiently optimizes the robot's morphology, payload distribution and actuator parameters while considering its full dynamics, joint limits and physical constraints such as friction cones. We demonstrate these capabilities by designing quadruped robots that jump and trot. We show that our method is able to design a more energy-efficient Solo robot for these tasks.


Riemannian Natural Gradient Methods

arXiv.org Artificial Intelligence

This paper studies large-scale optimization problems on Riemannian manifolds whose objective function is a finite sum of negative log-probability losses. Such problems arise in various machine learning and signal processing applications. By introducing the notion of Fisher information matrix in the manifold setting, we propose a novel Riemannian natural gradient method, which can be viewed as a natural extension of the natural gradient method from the Euclidean setting to the manifold setting. We establish the almost-sure global convergence of our proposed method under standard assumptions. Moreover, we show that if the loss function satisfies certain convexity and smoothness conditions and the input-output map satisfies a Riemannian Jacobian stability condition, then our proposed method enjoys a local linear -- or, under the Lipschitz continuity of the Riemannian Jacobian of the input-output map, even quadratic -- rate of convergence. We then prove that the Riemannian Jacobian stability condition will be satisfied by a two-layer fully connected neural network with batch normalization with high probability, provided that the width of the network is sufficiently large. This demonstrates the practical relevance of our convergence rate result. Numerical experiments on applications arising from machine learning demonstrate the advantages of the proposed method over state-of-the-art ones.


Distributed Learning of Neural Lyapunov Functions for Large-Scale Networked Dissipative Systems

arXiv.org Artificial Intelligence

This paper considers the problem of characterizing the stability region of a large-scale networked system comprised of dissipative nonlinear subsystems, in a distributed and computationally tractable way. One standard approach to estimate the stability region of a general nonlinear system is to first find a Lyapunov function for the system and characterize its region of attraction as the stability region. However, classical approaches, such as sum-of-squares methods and quadratic approximation, for finding a Lyapunov function either do not scale to large systems or give very conservative estimates for the stability region. In this context, we propose a new distributed learning based approach by exploiting the dissipativity structure of the subsystems. Our approach has two parts: the first part is a distributed approach to learn the storage functions (similar to the Lyapunov functions) for all the subsystems, and the second part is a distributed optimization approach to find the Lyapunov function for the networked system using the learned storage functions of the subsystems. We demonstrate the superior performance of our proposed approach through extensive case studies in microgrid networks.


Asset Allocation: From Markowitz to Deep Reinforcement Learning

arXiv.org Artificial Intelligence

Asset allocation is an investment strategy that aims to balance risk and reward by constantly redistributing the portfolio's assets according to certain goals, risk tolerance, and investment horizon. Unfortunately, there is no simple formula that can find the right allocation for every individual. As a result, investors may use different asset allocations' strategy to try to fulfil their financial objectives. In this work, we conduct an extensive benchmark study to determine the efficacy and reliability of a number of optimization techniques. In particular, we focus on traditional approaches based on Modern Portfolio Theory, and on machine-learning approaches based on deep reinforcement learning. We assess the model's performance under different market tendency, i.e., both bullish and bearish markets. For reproducibility, we provide the code implementation code in this repository.


Closing the Loop: A Framework for Trustworthy Machine Learning in Power Systems

arXiv.org Artificial Intelligence

Deep decarbonization of the energy sector will require massive penetration of stochastic renewable energy resources and an enormous amount of grid asset coordination; this represents a challenging paradigm for the power system operators who are tasked with maintaining grid stability and security in the face of such changes. With its ability to learn from complex datasets and provide predictive solutions on fast timescales, machine learning (ML) is well-posed to help overcome these challenges as power systems transform in the coming decades. In this work, we outline five key challenges (dataset generation, data pre-processing, model training, model assessment, and model embedding) associated with building trustworthy ML models which learn from physics-based simulation data. We then demonstrate how linking together individual modules, each of which overcomes a respective challenge, at sequential stages in the machine learning pipeline can help enhance the overall performance of the training process. In particular, we implement methods that connect different elements of the learning pipeline through feedback, thus "closing the loop" between model training, performance assessments, and re-training. We demonstrate the effectiveness of this framework, its constituent modules, and its feedback connections by learning the N-1 small-signal stability margin associated with a detailed model of a proposed North Sea Wind Power Hub system.


Neural Networks for Encoding Dynamic Security-Constrained Optimal Power Flow

arXiv.org Artificial Intelligence

This paper introduces a framework to capture previously intractable optimization constraints and transform them to a mixed-integer linear program, through the use of neural networks. We encode the feasible space of optimization problems characterized by both tractable and intractable constraints, e.g. differential equations, to a neural network. Leveraging an exact mixed-integer reformulation of neural networks, we solve mixed-integer linear programs that accurately approximate solutions to the originally intractable non-linear optimization problem. We apply our methods to the AC optimal power flow problem (AC-OPF), where directly including dynamic security constraints renders the AC-OPF intractable. Our proposed approach has the potential to be significantly more scalable than traditional approaches. We demonstrate our approach for power system operation considering N-1 security and small-signal stability, showing how it can efficiently obtain cost-optimal solutions which at the same time satisfy both static and dynamic security constraints.