Goto

Collaborating Authors

 Optimization


Robust Action Governor for Uncertain Piecewise Affine Systems with Non-convex Constraints and Safe Reinforcement Learning

arXiv.org Artificial Intelligence

The action governor is an add-on scheme to a nominal control loop that monitors and adjusts the control actions to enforce safety specifications expressed as pointwise-in-time state and control constraints. In this paper, we introduce the Robust Action Governor (RAG) for systems the dynamics of which can be represented using discrete-time Piecewise Affine (PWA) models with both parametric and additive uncertainties and subject to non-convex constraints. We develop the theoretical properties and computational approaches for the RAG. After that, we introduce the use of the RAG for realizing safe Reinforcement Learning (RL), i.e., ensuring all-time constraint satisfaction during online RL exploration-and-exploitation process. This development enables safe real-time evolution of the control policy and adaptation to changes in the operating environment and system parameters (due to aging, damage, etc.). We illustrate the effectiveness of the RAG in constraint enforcement and safe RL using the RAG by considering their applications to a soft-landing problem of a mass-spring-damper system.


DiffCloth: Differentiable Cloth Simulation with Dry Frictional Contact

arXiv.org Artificial Intelligence

Cloth simulation has wide applications in computer animation, garment design, and robot-assisted dressing. This work presents a differentiable cloth simulator whose additional gradient information facilitates cloth-related applications. Our differentiable simulator extends a state-of-the-art cloth simulator based on Projective Dynamics (PD) and with dry frictional contact. We draw inspiration from previous work to propose a fast and novel method for deriving gradients in PD-based cloth simulation with dry frictional contact. Furthermore, we conduct a comprehensive analysis and evaluation of the usefulness of gradients in contact-rich cloth simulation. Finally, we demonstrate the efficacy of our simulator in a number of downstream applications, including system identification, trajectory optimization for assisted dressing, closed-loop control, inverse design, and real-to-sim transfer. We observe a substantial speedup obtained from using our gradient information in solving most of these applications.


Pareto Frontier Approximation Network (PA-Net) to Solve Bi-objective TSP

arXiv.org Artificial Intelligence

The travelling salesperson problem (TSP) is a classic resource allocation problem used to find an optimal order of doing a set of tasks while minimizing (or maximizing) an associated objective function. It is widely used in robotics for applications such as planning and scheduling. In this work, we solve TSP for two objectives using reinforcement learning (RL). Often in multi-objective optimization problems, the associated objective functions can be conflicting in nature. In such cases, the optimality is defined in terms of Pareto optimality. A set of these Pareto optimal solutions in the objective space form a Pareto front (or frontier). Each solution has its trade-off. We present the Pareto frontier approximation network (PA-Net), a network that generates good approximations of the Pareto front for the bi-objective travelling salesperson problem (BTSP). Firstly, BTSP is converted into a constrained optimization problem. We then train our network to solve this constrained problem using the Lagrangian relaxation and policy gradient. With PA-Net we improve the performance over an existing deep RL-based method. The average improvement in the hypervolume metric, which is used to measure the optimality of the Pareto front, is 2.3%. At the same time, PA-Net has 4.5x faster inference time. Finally, we present the application of PA-Net to find optimal visiting order in a robotic navigation task/coverage planning. Our code is available on the project website.


Understanding the Generalization Performance of Spectral Clustering Algorithms

arXiv.org Artificial Intelligence

The theoretical analysis of spectral clustering mainly focuses on consistency, while there is relatively little research on its generalization performance. In this paper, we study the excess risk bounds of the popular spectral clustering algorithms: \emph{relaxed} RatioCut and \emph{relaxed} NCut. Firstly, we show that their excess risk bounds between the empirical continuous optimal solution and the population-level continuous optimal solution have a $\mathcal{O}(1/\sqrt{n})$ convergence rate, where $n$ is the sample size. Secondly, we show the fundamental quantity in influencing the excess risk between the empirical discrete optimal solution and the population-level discrete optimal solution. At the empirical level, algorithms can be designed to reduce this quantity. Based on our theoretical analysis, we propose two novel algorithms that can not only penalize this quantity, but also cluster the out-of-sample data without re-eigendecomposition on the overall sample. Experiments verify the effectiveness of the proposed algorithms.


Uniform Stability for First-Order Empirical Risk Minimization

arXiv.org Artificial Intelligence

We consider the problem of designing uniformly stable first-order optimization algorithms for empirical risk minimization. Uniform stability is often used to obtain generalization error bounds for optimization algorithms, and we are interested in a general approach to achieve it. For Euclidean geometry, we suggest a black-box conversion which given a smooth optimization algorithm, produces a uniformly stable version of the algorithm while maintaining its convergence rate up to logarithmic factors. Using this reduction we obtain a (nearly) optimal algorithm for smooth optimization with convergence rate $\widetilde{O}(1/T^2)$ and uniform stability $O(T^2/n)$, resolving an open problem of Chen et al. (2018); Attia and Koren (2021). For more general geometries, we develop a variant of Mirror Descent for smooth optimization with convergence rate $\widetilde{O}(1/T)$ and uniform stability $O(T/n)$, leaving open the question of devising a general conversion method as in the Euclidean case.


SP2: A Second Order Stochastic Polyak Method

arXiv.org Artificial Intelligence

Recently the "SP" (Stochastic Polyak step size) method has emerged as a competitive adaptive method for setting the step sizes of SGD. SP can be interpreted as a method specialized to interpolated models, since it solves the interpolation equations. SP solves these equation by using local linearizations of the model. We take a step further and develop a method for solving the interpolation equations that uses the local second-order approximation of the model. Our resulting method SP2 uses Hessian-vector products to speed-up the convergence of SP. Furthermore, and rather uniquely among second-order methods, the design of SP2 in no way relies on positive definite Hessian matrices or convexity of the objective function. We show SP2 is very competitive on matrix completion, non-convex test problems and logistic regression. We also provide a convergence theory on sums-of-quadratics.


Personalized PCA: Decoupling Shared and Unique Features

arXiv.org Artificial Intelligence

In this paper, we tackle a significant challenge in PCA: heterogeneity. When data are collected from different sources with heterogeneous trends while still sharing some congruency, it is critical to extract shared knowledge while retaining unique features of each source. To this end, we propose personalized PCA (PerPCA), which uses mutually orthogonal global and local principal components to encode both unique and shared features. We show that, under mild conditions, both unique and shared features can be identified and recovered by a constrained optimization problem, even if the covariance matrices are immensely different. Also, we design a fully federated algorithm inspired by distributed Stiefel gradient descent to solve the problem. The algorithm introduces a new group of operations called generalized retractions to handle orthogonality constraints, and only requires global PCs to be shared across sources. We prove the linear convergence of the algorithm under suitable assumptions. Comprehensive numerical experiments highlight PerPCA's superior performance in feature extraction and prediction from heterogeneous datasets. As a systematic approach to decouple shared and unique features from heterogeneous datasets, PerPCA finds applications in several tasks including video segmentation, topic extraction, and distributed clustering.


A Survey of Decision Making in Adversarial Games

arXiv.org Artificial Intelligence

Game theory has by now found numerous applications in various fields, including economics, industry, jurisprudence, and artificial intelligence, where each player only cares about its own interest in a noncooperative or cooperative manner, but without obvious malice to other players. However, in many practical applications, such as poker, chess, evader pursuing, drug interdiction, coast guard, cyber-security, and national defense, players often have apparently adversarial stances, that is, selfish actions of each player inevitably or intentionally inflict loss or wreak havoc on other players. Along this line, this paper provides a systematic survey on three main game models widely employed in adversarial games, i.e., zero-sum normal-form and extensive-form games, Stackelberg (security) games, zero-sum differential games, from an array of perspectives, including basic knowledge of game models, (approximate) equilibrium concepts, problem classifications, research frontiers, (approximate) optimal strategy seeking techniques, prevailing algorithms, and practical applications. Finally, promising future research directions are also discussed for relevant adversarial games.


Risk-averse Stochastic Optimization for Farm Management Practices and Cultivar Selection Under Uncertainty

arXiv.org Artificial Intelligence

Optimizing management practices and selecting the best cultivar for planting play a significant role in increasing agricultural food production and decreasing environmental footprint. In this study, we develop optimization frameworks under uncertainty using conditional value-at-risk in the stochastic programming objective function. We integrate the crop model, APSIM, and a parallel Bayesian optimization algorithm to optimize the management practices and select the best cultivar at different levels of risk aversion. This approach integrates the power of optimization in determining the best decisions and crop model in simulating nature's output corresponding to various decisions. As a case study, we set up the crop model for 25 locations across the US Corn Belt. We optimized the management options (planting date, N fertilizer amount, fertilizing date, and plant density in the farm) and cultivar options (cultivars with different maturity days) three times: a) before, b) at planting and c) after a growing season with known weather. Results indicated that the proposed model produced meaningful connections between weather and optima decisions. Also, we found risk-tolerance farmers get more expected yield than risk-averse ones in wet and non-wet weathers.


Practical tradeoffs between memory, compute, and performance in learned optimizers

arXiv.org Artificial Intelligence

Optimization plays a costly and crucial role in developing machine learning systems. In learned optimizers, the few hyperparameters of commonly used hand-designed optimizers, e.g. Adam or SGD, are replaced with flexible parametric functions. The parameters of these functions are then optimized so that the resulting learned optimizer minimizes a target loss on a chosen class of models. Learned optimizers can both reduce the number of required training steps and improve the final test loss. However, they can be expensive to train, and once trained can be expensive to use due to computational and memory overhead for the optimizer itself. In this work, we identify and quantify the design features governing the memory, compute, and performance trade-offs for many learned and hand-designed optimizers. We further leverage our analysis to construct a learned optimizer that is both faster and more memory efficient than previous work. Despite the huge computational costs associated with training large neural models, the set of optimization algorithms used to train them has largely been restricted to simple update functions mapping from gradients to parameter updates (e.g. These algorithms typically depend on a small number of hand-designed features and parameters. However, the last decade in machine learning research has repeatedly seen small, hand-designed models outperformed by parameterized models (such as neural networks) trained to purpose on large amounts of data (LeCun et al., 2015). Thus, a promising direction to improve training performance and reduce costs is to replace hand-designed optimizers with more expressive learned optimizers, trained on problems similar to those encountered in practice. Learned optimizers specify parameter update rules using a flexible parametric form and learn the parameters of this function from a "dataset" of optimization tasks--a procedure typically referred to as meta-training or meta-learning (Andrychowicz et al., 2016; Finn et al., 2017; Hochreiter et al., 2001). Learned optimizers represent a path towards improved optimizer performance, and possess the ability to target different objectives (e.g. Despite being an active area of research (Andrychowicz et al., 2016; Wichrowska et al., 2017; Chen et al., 2020; Metz et al., 2020b; 2021; Almeida et al., 2021; Zheng et al., 2022), they are not yet commonly used in practice. Several challenges have limited the widespread application of learned optimizers: they are typically difficult to meta-train on a task family of interest, they can require significant memory and compute overhead when applied, and they often generalize less well to novel tasks than hand-designed optimizers.