admm iteration
Learning to accelerate distributed ADMM using graph neural networks
Doerks, Henri, Häusner, Paul, Escobar, Daniel Hernández, Sjölund, Jens
Distributed optimization is fundamental in large-scale machine learning and control applications. Among existing methods, the Alternating Direction Method of Multipliers (ADMM) has gained popularity due to its strong convergence guarantees and suitability for decentralized computation. However, ADMM often suffers from slow convergence and sensitivity to hyperparameter choices. In this work, we show that distributed ADMM iterations can be naturally represented within the message-passing framework of graph neural networks (GNNs). Building on this connection, we propose to learn adaptive step sizes and communication weights by a graph neural network that predicts the hyperparameters based on the iterates. By unrolling ADMM for a fixed number of iterations, we train the network parameters end-to-end to minimize the final iterates error for a given problem class, while preserving the algorithm's convergence properties. Numerical experiments demonstrate that our learned variant consistently improves convergence speed and solution quality compared to standard ADMM. The code is available at https://github.com/paulhausner/learning-distributed-admm.
Alternating Direction Method of Multipliers-Based Parallel Optimization for Multi-Agent Collision-Free Model Predictive Control
Cheng, Zilong, Ma, Jun, Wang, Wenxin, Zhu, Zicheng, de Silva, Clarence W., Lee, Tong Heng
This paper investigates the collision-free control problem for multi-agent systems. For such multi-agent systems, it is the typical situation where conventional methods using either the usual centralized model predictive control (MPC), or even the distributed counterpart, would suffer from substantial difficulty in balancing optimality and computational efficiency. Additionally, the non-convex characteristics that invariably arise in such collision-free control and optimization problems render it difficult to effectively derive a reliable solution (and also to thoroughly analyze the associated convergence properties). To overcome these challenging issues, this work establishes a suitably novel parallel computation framework through an innovative mathematical problem formulation; and then with this framework and formulation, a parallel algorithm based on alternating direction method of multipliers (ADMM) is presented to solve the sub-problems arising from the resulting parallel structure. Furthermore, an efficient and intuitive initialization procedure is developed to accelerate the optimization process, and the optimum is thus determined with significantly improved computational efficiency. As supported by rigorous proofs, the convergence of the proposed ADMM iterations for this non-convex optimization problem is analyzed and discussed in detail. Finally, a simulation with a group of unmanned aerial vehicles (UAVs) serves as an illustrative example here to demonstrate the effectiveness and efficiency of the proposed approach. Also, the simulation results verify significant improvements in accuracy and computational efficiency compared to other baselines, including primal quadratic mixed integer programming (PQ-MIP), non-convex quadratic mixed integer programming (NC-MIP), and non-convex quadratically constrained quadratic programming (NC-QCQP).
Privacy Amplification by Iteration for ADMM with (Strongly) Convex Objective Functions
Chan, T-H. Hubert, Xie, Hao, Zhao, Mengshi
We examine a private ADMM variant for (strongly) convex objectives which is a primal-dual iterative method. Each iteration has a user with a private function used to update the primal variable, masked by Gaussian noise for local privacy, without directly adding noise to the dual variable. Privacy amplification by iteration explores if noises from later iterations can enhance the privacy guarantee when releasing final variables after the last iteration. Cyffers et al. [ICML 2023] explored privacy amplification by iteration for the proximal ADMM variant, where a user's entire private function is accessed and noise is added to the primal variable. In contrast, we examine a private ADMM variant requiring just one gradient access to a user's function, but both primal and dual variables must be passed between successive iterations. To apply Balle et al.'s [NeurIPS 2019] coupling framework to the gradient ADMM variant, we tackle technical challenges with novel ideas. First, we address the non-expansive mapping issue in ADMM iterations by using a customized norm. Second, because the dual variables are not masked with any noise directly, their privacy guarantees are achieved by treating two consecutive noisy ADMM iterations as a Markov operator. Our main result is that the privacy guarantee for the gradient ADMM variant can be amplified proportionally to the number of iterations. For strongly convex objective functions, this amplification exponentially increases with the number of iterations. These amplification results align with the previously studied special case of stochastic gradient descent.
Learning context-aware adaptive solvers to accelerate quadratic programming
Jung, Haewon, Park, Junyoung, Park, Jinkyoo
Convex quadratic programming (QP) is an important sub-field of mathematical optimization. The alternating direction method of multipliers (ADMM) is a successful method to solve QP. Even though ADMM shows promising results in solving various types of QP, its convergence speed is known to be highly dependent on the step-size parameter $\rho$. Due to the absence of a general rule for setting $\rho$, it is often tuned manually or heuristically. In this paper, we propose CA-ADMM (Context-aware Adaptive ADMM)) which learns to adaptively adjust $\rho$ to accelerate ADMM. CA-ADMM extracts the spatio-temporal context, which captures the dependency of the primal and dual variables of QP and their temporal evolution during the ADMM iterations. CA-ADMM chooses $\rho$ based on the extracted context. Through extensive numerical experiments, we validated that CA-ADMM effectively generalizes to unseen QP problems with different sizes and classes (i.e., having different QP parameter structures). Furthermore, we verified that CA-ADMM could dynamically adjust $\rho$ considering the stage of the optimization process to accelerate the convergence speed further.
Training Recurrent Neural Networks by Sequential Least Squares and the Alternating Direction Method of Multipliers
This paper proposes a novel algorithm for training recurrent neural network models of nonlinear dynamical systems from an input/output training dataset. Arbitrary convex and twice-differentiable loss functions and regularization terms are handled by sequential least squares and either a line-search (LS) or a trust-region method of Levenberg-Marquardt (LM) type for ensuring convergence. In addition, to handle non-smooth regularization terms such as $\ell_1$, $\ell_0$, and group-Lasso regularizers, as well as to impose possibly non-convex constraints such as integer and mixed-integer constraints, we combine sequential least squares with the alternating direction method of multipliers (ADMM). We call the resulting algorithm NAILS (nonconvex ADMM iterations and least squares) in the case line search (LS) is used, or NAILM if a trust-region method (LM) is employed instead. The training method, which is also applicable to feedforward neural networks as a special case, is tested in three nonlinear system identification problems.
Alternating Direction Method of Multipliers for Constrained Iterative LQR in Autonomous Driving
Ma, Jun, Cheng, Zilong, Zhang, Xiaoxue, Tomizuka, Masayoshi, Lee, Tong Heng
In the context of autonomous driving, the iterative linear quadratic regulator (iLQR) is known to be an efficient approach to deal with the nonlinear vehicle model in motion planning problems. Particularly, the constrained iLQR algorithm has shown noteworthy advantageous outcomes of computation efficiency in achieving motion planning tasks under general constraints of different types. However, the constrained iLQR methodology requires a feasible trajectory at the first iteration as a prerequisite when the logarithmic barrier function is used. Also, the methodology leaves open the possibility for incorporation of fast, efficient, and effective optimization methods to further speed up the optimization process such that the requirements of real-time implementation can be successfully fulfilled. In this paper, a well-defined motion planning problem is formulated under nonlinear vehicle dynamics and various constraints, and an alternating direction method of multipliers (ADMM) is utilized to determine the optimal control actions leveraging the iLQR. The approach is able to circumvent the feasibility requirement of the trajectory at the first iteration. An illustrative example of motion planning for autonomous vehicles is then investigated. A noteworthy achievement of high computation efficiency is attained with the proposed development; comparing with the constrained iLQR algorithm based on the logarithmic barrier function, our proposed method reduces the average computation time by 31.93%, 38.52%, and 44.57% in the three driving scenarios; compared with the optimization solver IPOPT, our proposed method reduces the average computation time by 46.02%, 53.26%, and 88.43% in the three driving scenarios. As a result, real-time computation and implementation can be realized through our proposed framework, and thus it provides additional safety to the on-road driving tasks.
Solving Constrained CASH Problems with ADMM
Ram, Parikshit, Liu, Sijia, Vijaykeerthi, Deepak, Wang, Dakuo, Bouneffouf, Djallel, Bramble, Greg, Samulowitz, Horst, Gray, Alexander G.
The CASH problem has been widely studied in the context of automated configurations of machine learning (ML) pipelines and various solvers and toolkits are available. However, CASH solvers do not directly handle black-box constraints such as fairness, robustness or other domain-specific custom constraints. We present our recent approach [Liu, et al., 2020] that leverages the ADMM optimization framework to decompose CASH into multiple small problems and demonstrate how ADMM facilitates incorporation of black-box constraints.
Harnessing the Power of Serverless Runtimes for Large-Scale Optimization
Aytekin, Arda, Johansson, Mikael
The event-driven and elastic nature of serverless runtimes makes them a very efficient and cost-effective alternative for scaling up computations. So far, they have mostly been used for stateless, data parallel and ephemeral computations. In this work, we propose using serverless runtimes to solve generic, large-scale optimization problems. Specifically, we build a master-worker setup using AWS Lambda as the source of our workers, implement a parallel optimization algorithm to solve a regularized logistic regression problem, and show that relative speedups up to 256 workers and efficiencies above 70% up to 64 workers can be expected. We also identify possible algorithmic and system-level bottlenecks, propose improvements, and discuss the limitations and challenges in realizing these improvements.
A Flexible and Efficient Algorithmic Framework for Constrained Matrix and Tensor Factorization
Huang, Kejun, Sidiropoulos, Nicholas D., Liavas, Athanasios P.
We propose a general algorithmic framework for constrained matrix and tensor factorization, which is widely used in signal processing and machine learning. The new framework is a hybrid between alternating optimization (AO) and the alternating direction method of multipliers (ADMM): each matrix factor is updated in turn, using ADMM, hence the name AO-ADMM. This combination can naturally accommodate a great variety of constraints on the factor matrices, and almost all possible loss measures for the fitting. Computation caching and warm start strategies are used to ensure that each update is evaluated efficiently, while the outer AO framework exploits recent developments in block coordinate descent (BCD)-type methods which help ensure that every limit point is a stationary point, as well as faster and more robust convergence in practice. Three special cases are studied in detail: non-negative matrix/tensor factorization, constrained matrix/tensor completion, and dictionary learning. Extensive simulations and experiments with real data are used to showcase the effectiveness and broad applicability of the proposed framework.