Optimization
SPDY: Accurate Pruning with Speedup Guarantees
The recent focus on the efficiency of deep neural networks (DNNs) has led to significant work on model compression approaches, of which weight pruning is one of the most popular. At the same time, there is rapidly-growing computational support for efficiently executing the unstructured-sparse models obtained via pruning. Yet, most existing pruning methods minimize just the number of remaining weights, i.e. the size of the model, rather than optimizing for inference time. We address this gap by introducing SPDY, a new compression method which automatically determines layer-wise sparsity targets achieving a desired inference speedup on a given system, while minimizing accuracy loss. SPDY is composed of two new techniques: the first is an efficient dynamic programming algorithm for solving the speedup-constrained layer-wise compression problem assuming a set of given layer-wise sensitivity scores; the second is a local search procedure for determining accurate layer-wise sensitivity scores. Experiments across popular vision and language models show that SPDY guarantees speedups while recovering higher accuracy relative to existing strategies, both for one-shot and gradual pruning scenarios, and is compatible with most existing pruning approaches. We also extend our approach to the recently-proposed task of pruning with very little data, where we achieve the best known accuracy recovery when pruning to the GPU-supported 2:4 sparsity pattern.
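The speedup-constrained layer-wise selection described above can be illustrated with a generic knapsack-style dynamic program. This is a minimal sketch under stated assumptions, not the authors' algorithm: the function name `choose_sparsities`, the integer time units, and the assumption that per-layer errors simply add up are all illustrative choices.

```python
def choose_sparsities(times, errors, budget):
    """Pick one sparsity option per layer, minimising summed error.

    times[l][k]  : integer runtime of layer l under option k (assumed discretised)
    errors[l][k] : error score of layer l under option k (assumed additive)
    budget       : maximum total runtime, in the same integer units
    Returns (total_error, choices) or None if no selection fits the budget.
    """
    INF = float("inf")
    # best[t] = minimal error over the layers processed so far with total time <= t
    best = [0.0] * (budget + 1)
    picks = []
    for layer_times, layer_errors in zip(times, errors):
        new = [INF] * (budget + 1)
        pick = [-1] * (budget + 1)
        for t in range(budget + 1):
            for k, (c, e) in enumerate(zip(layer_times, layer_errors)):
                if c <= t and best[t - c] + e < new[t]:
                    new[t] = best[t - c] + e
                    pick[t] = k
        best = new
        picks.append(pick)
    if best[budget] == INF:
        return None
    # Walk backwards through the stored picks to recover the chosen options.
    choices, t = [], budget
    for l in reversed(range(len(times))):
        k = picks[l][t]
        choices.append(k)
        t -= times[l][k]
    choices.reverse()
    return best[budget], choices


# Hypothetical two-layer example: options are dense, medium, and high sparsity.
times = [[4, 2, 1], [6, 3, 2]]
errors = [[0.0, 0.1, 0.5], [0.0, 0.2, 0.9]]
result = choose_sparsities(times, errors, budget=5)
```

With a dense runtime of 10 units, a budget of 5 corresponds to a 2x speedup target; the table has one row per layer and one column per time unit, so the cost grows with the number of layers, the budget resolution, and the number of options per layer.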
Contact-timing and Trajectory Optimization for 3D Jumping on Quadruped Robots
Performing highly agile acrobatic motions with a long flight phase requires perfect timing, high accuracy, and coordination of the full-body motion. To address these challenges, we present a novel contact-timing and trajectory optimization framework for legged robots performing aggressive 3D jumping. In our method, we first use an optimization framework based on simplified rigid body dynamics to solve for contact timings and a reference trajectory of the robot body. The solution of this module is then used to formulate a full-body trajectory optimization based on the full nonlinear dynamics of the robot. This combination allows us to optimize for contact timings while ensuring that the jumping trajectory can be realized on the robot hardware. We first validate the efficiency of the proposed framework on the A1 robot model for various 3D jumping tasks such as double-backflips off a height of 2m. Experimental validation was then successfully conducted for various aggressive 3D jumping motions such as diagonal jumps, barrel roll, and double barrel roll from a box of heights 0.4m and 0.9m, respectively.
Entropy Regularization for Population Estimation
Chugg, Ben, Henderson, Peter, Goldin, Jacob, Ho, Daniel E.
While most frameworks for online sequential decision-making focus on the objective of maximizing reward, in practice this is rarely the sole objective. Other considerations may involve budget constraints, ensuring fair treatment, or estimating various population characteristics. There has been growing recognition that these other objectives must be formally integrated into sequential decision-making frameworks, especially if such algorithms are to be used in sensitive application areas [21]. In this work, we focus on the problem of maximizing reward while simultaneously estimating the population total (equivalently, mean) in a structured bandit setting. The most natural approach to this problem from a machine learning perspective is to use a model to predict the mean. However, adaptively collected data are subject to bias, which in turn biases the model estimates [29].
Seven Killer Memory Optimization Techniques Every Pandas User Should Know
Once we load a DataFrame into the Python environment, we typically perform a wide range of modifications on the DataFrame, don't we? These include adding new columns, renaming headers, deleting columns, altering row values, replacing NaN values, and many more. Standard Assignment intends to create a new copy of the DataFrame after transformation, leaving the original DataFrame untouched. As a result of the standard assignment, two distinct Pandas DataFrames (original and transformed) co-exist in the environment (df and df_copy above), doubling the memory utilization. In contrast to the standard assignment operations, inplace assignment operations intend to modify the original DataFrame itself without creating a new Pandas DataFrame object.
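The difference between the two styles can be sketched as follows. This is a minimal illustration with a made-up three-row DataFrame; the column names and the choice of `fillna` are assumptions made for the example, and the actual memory saved depends on the operation and the pandas version, since pandas may still allocate temporary data internally.

```python
import pandas as pd

df = pd.DataFrame({"a": [1.0, None, 3.0], "b": [4, 5, 6]})

# Standard assignment: the result is a separate object, so `df` and
# `df_copy` both live in memory and `df` still contains the NaN.
df_copy = df.fillna(0.0)
had_nan_before = bool(df["a"].isna().any())

# Inplace assignment: `df` itself is modified and no second
# user-visible DataFrame is created.
df.fillna(0.0, inplace=True)
```

If the original is no longer needed, rebinding the same name (`df = df.fillna(0.0)`) also lets the old object be garbage-collected once nothing else references it.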
A Unified and Modular Model Predictive Control Framework for Soft Continuum Manipulators under Internal and External Constraints
Spinelli, Filippo A., Katzschmann, Robert K.
Fluidically actuated soft robots have promising capabilities such as inherent compliance and user safety. The control of soft robots needs to properly handle nonlinear actuation dynamics, motion constraints, workspace limitations, and variable shape stiffness, so having a single algorithm for all these issues would be extremely beneficial. In this work, we adapt Model Predictive Control (MPC), popular for rigid robots, to a soft robotic arm called SoPrA. We address the challenges that current control methods face by proposing a framework that handles them in a modular manner. While previous work focused on Joint-Space formulations, we show through simulation and experimental results that Task-Space MPC can be successfully implemented for dynamic soft robotic control. We provide a way to couple the Piece-wise Constant Curvature and Augmented Rigid Body Model assumptions with internal and external constraints and actuation dynamics, delivering an algorithm that unites these aspects and optimizes over them. We believe that an MPC implementation based on our approach could be the way to address most model-based soft robotics control issues within a unified and modular framework, while allowing the inclusion of improvements that usually belong to other control domains, such as machine learning techniques.
Optimization of Mobile Robotic Relay Operation for Minimal Average Wait Time
Hurst, Winston, Mostofi, Yasamin
This paper considers trajectory planning for a mobile robot which persistently relays data between pairs of far-away communication nodes. Data accumulates stochastically at each source, and the robot must move to appropriate positions to enable data offload to the corresponding destination. The robot needs to minimize the average time that data waits at a source before being serviced. We are interested in finding optimal robotic routing policies consisting of 1) locations where the robot stops to relay (relay positions) and 2) conditional transition probabilities that determine the sequence in which the pairs are serviced. We first pose this problem as a non-convex problem that optimizes over both relay positions and transition probabilities. To find approximate solutions, we propose a novel algorithm which alternately optimizes relay positions and transition probabilities. For the former, we find efficient convex partitions of the non-convex relay regions, then formulate a mixed-integer second-order cone problem. For the latter, we find optimal transition probabilities via sequential least squares programming. We extensively analyze the proposed approach and mathematically characterize important system properties related to the robot's long-term energy consumption and service rate. Finally, through extensive simulation with real channel parameters, we verify the efficacy of our approach. Significant advances in robotics over the past several years have created new possibilities in the design of communication systems.
FedSSO: A Federated Server-Side Second-Order Optimization Algorithm
Ma, Xin, Bao, Renyi, Jiang, Jinpeng, Liu, Yang, Jiang, Arthur, Yan, Jun, Liu, Xin, Pan, Zhisong
In this work, we propose FedSSO, a server-side second-order optimization method for federated learning (FL). In contrast to previous works in this direction, we employ a server-side approximation for the Quasi-Newton method without requiring any training data from the clients. In this way, we not only shift the computation burden from clients to server, but also eliminate the additional communication for second-order updates between clients and server entirely. We provide theoretical guarantees for the convergence of our method, and empirically demonstrate fast convergence and communication savings in both convex and non-convex settings.
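For readers unfamiliar with Quasi-Newton methods, the textbook BFGS update of an inverse-Hessian estimate is sketched below. This is not FedSSO's server-side approximation, which the abstract does not specify; the function name `bfgs_inverse_update` and the toy quadratic are assumptions made for illustration only.

```python
import numpy as np


def bfgs_inverse_update(H, s, y):
    """Standard BFGS update of an inverse-Hessian estimate H.

    s : change in parameters between two iterates
    y : change in gradients between the same two iterates
    The update is skipped when the curvature condition y.s > 0 fails.
    """
    ys = float(y @ s)
    if ys <= 0.0:
        return H
    rho = 1.0 / ys
    V = np.eye(len(s)) - rho * np.outer(s, y)
    return V @ H @ V.T + rho * np.outer(s, s)


# Toy quadratic f(w) = 0.5 * w^T A w, whose gradient difference is y = A s.
A = np.array([[3.0, 1.0], [1.0, 2.0]])
s = np.array([1.0, -0.5])
y = A @ s
H1 = bfgs_inverse_update(np.eye(2), s, y)
```

The updated estimate satisfies the secant condition `H1 @ y == s` up to rounding, which is the property that lets such updates be built from parameter and gradient differences alone, without access to raw training data.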
The Computational Complexity of ReLU Network Training Parameterized by Data Dimensionality
Froese, Vincent | Hertrich, Christoph (TU Berlin) | Niedermeier, Rolf (TU Berlin)
Understanding the computational complexity of training simple neural networks with rectified linear units (ReLUs) has recently been a subject of intensive research. Closing gaps and complementing results from the literature, we present several results on the parameterized complexity of training two-layer ReLU networks with respect to various loss functions. After a brief discussion of other parameters, we focus on analyzing the influence of the dimension d of the training data on the computational complexity. We provide running time lower bounds in terms of W[1]-hardness for parameter d and prove that known brute-force strategies are essentially optimal (assuming the Exponential Time Hypothesis). In comparison with previous work, our results hold for a broad(er) range of loss functions, including lp-loss for all p ∈ [0, ∞]. In particular, we improve a known polynomial-time algorithm for constant d and convex loss functions to a more general class of loss functions, matching our running time lower bounds also in these cases.
One Model, Any CSP: Graph Neural Networks as Fast Global Search Heuristics for Constraint Satisfaction
Tönshoff, Jan, Kisin, Berke, Lindner, Jakob, Grohe, Martin
We propose a universal Graph Neural Network architecture which can be trained as an end-to-end search heuristic for any Constraint Satisfaction Problem (CSP). Our architecture can be trained unsupervised with policy gradient descent to generate problem-specific heuristics for any CSP in a purely data-driven manner. The approach is based on a novel graph representation for CSPs that is both generic and compact and enables us to process every possible CSP instance with one GNN, regardless of constraint arity, relations or domain size. Unlike previous RL-based methods, we operate on a global search action space and allow our GNN to modify any number of variables in every step of the stochastic search. This enables our method to properly leverage the inherent parallelism of GNNs. We perform a thorough empirical evaluation where we learn heuristics for well-known and important CSPs from random data, including graph coloring, MaxCut, 3-SAT and MAX-k-SAT. Our approach outperforms prior approaches for neural combinatorial optimization by a substantial margin. It can compete with, and even improve upon, conventional search heuristics on test instances that are several orders of magnitude larger and structurally more complex than those seen during training.
Development of a CAV-based Intersection Control System and Corridor Level Impact Assessment
Mirbakhsh, Ardeshir, Lee, Joyoung, Besenski, Dejan
This paper presents a signal-free intersection control system for CAVs that combines a pixel reservation algorithm with a Deep Reinforcement Learning (DRL) decision-making logic, followed by a corridor-level impact assessment of the proposed model. The pixel reservation algorithm detects potential colliding maneuvers and the DRL logic optimizes vehicles' movements to avoid collision and minimize the overall delay at the intersection. The proposed control system is called Decentralized Sparse Coordination System (DSCLS) since each vehicle has its own control logic and interacts with other vehicles in coordinated states only. Due to the chain impact of taking random actions in the DRL's training course, the trained model can deal with unprecedented volume conditions, which pose the main challenge in intersection management. The performance of the developed model is compared with conventional and CAV-based control systems, including fixed traffic lights, actuated traffic lights, and the Longest Queue First (LQF) control system under three volume regimes in a corridor of four intersections in VISSIM software. The simulation results reveal that the proposed model reduces delay by 50%, 29%, and 23% in moderate, high, and extreme volume regimes, respectively, compared to the other CAV-based control system. Improvements in travel time, fuel consumption, emission, and Surrogate Safety Measures (SSM) are also noticeable.