LPGD: A General Framework for Backpropagation through Embedded Optimization Layers

arXiv.org Artificial Intelligence

Embedding parameterized optimization problems as layers into machine learning architectures serves as a powerful inductive bias. Training such architectures with stochastic gradient descent requires care, as degenerate derivatives of the embedded optimization problem often render the gradients uninformative. We propose Lagrangian Proximal Gradient Descent (LPGD), a flexible framework for training architectures with embedded optimization layers that seamlessly integrates into automatic differentiation libraries. LPGD efficiently computes meaningful replacements of the degenerate optimization layer derivatives by re-running the forward solver oracle on a perturbed input. LPGD captures various previously proposed methods as special ...

Training such a parameterized optimization model is an instance of bi-level optimization (Gould et al., 2016), which is generally challenging. Whenever it is possible to propagate gradients through the optimization problem via an informative derivative of the solution mapping, the task is typically approached with standard stochastic gradient descent (GD) (Amos & Kolter, 2017a; Agrawal et al., 2019b). However, when the optimization problem has discrete solutions, the derivatives are typically degenerate, as small perturbations of the input do not affect the optimal solution. Previous works have proposed several methods to overcome this challenge, ranging from differentiable relaxations (Wang et al., 2019; Wilder et al., 2019a; Mandi & Guns, 2020; Djolonga & Krause, 2017) and stochastic smoothing (Berthet et al., 2020; Dalle et al., 2022), over proxy losses (Paulus et al., 2021), to finite-difference based techniques (Vlastelica et al., 2020).
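To make the idea of re-running the solver on a perturbed input concrete, the following is a minimal sketch of a finite-difference gradient replacement for a discrete solver. It is an illustration in the spirit of the finite-difference family mentioned above, not the LPGD update itself. The toy solver `solve`, the function `perturbed_gradient`, and the perturbation strength `tau` are hypothetical names introduced here, and the solver is assumed to minimize a linear cost over one-hot vectors.

```python
def solve(costs):
    """Hypothetical solver oracle: one-hot indicator of the cheapest item."""
    best = min(range(len(costs)), key=lambda i: costs[i])
    return [1.0 if i == best else 0.0 for i in range(len(costs))]


def perturbed_gradient(costs, grad_output, tau=1.0):
    """Finite-difference replacement for d(loss)/d(costs).

    The true derivative of `solve` is zero almost everywhere, so the
    solver is called a second time on costs shifted along the incoming
    gradient, and the difference of the two solutions is returned.
    """
    y = solve(costs)
    perturbed = [c + tau * g for c, g in zip(costs, grad_output)]
    y_tau = solve(perturbed)
    return [(b - a) / tau for a, b in zip(y, y_tau)]


# Item 0 is currently selected, but the loss prefers item 1.
costs = [1.0, 2.0, 3.0]
grad_output = [1.0, -1.0, 0.0]  # assumed d(loss)/d(solution)
print(perturbed_gradient(costs, grad_output, tau=2.0))
print(perturbed_gradient(costs, grad_output, tau=0.1))
```

With `tau=2.0` the perturbed solver switches to item 1 and the result is `[-0.5, 0.5, 0.0]`, so a gradient descent step raises the cost of item 0 and lowers the cost of item 1. With `tau=0.1` the perturbation is too small to change the solution and the result is all zeros, which shows why the perturbation strength matters in this kind of scheme.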


Deep Unrolling Networks with Recurrent Momentum Acceleration for Nonlinear Inverse Problems

arXiv.org Artificial Intelligence

Combining the strengths of model-based iterative algorithms and data-driven deep learning solutions, deep unrolling networks (DuNets) have become a popular tool for solving inverse imaging problems. While DuNets have been applied successfully to many linear inverse problems, their performance tends to degrade on nonlinear problems. Inspired by the momentum acceleration techniques often used in optimization algorithms, we propose a recurrent momentum acceleration (RMA) framework that uses a long short-term memory recurrent neural network (LSTM-RNN) to simulate the momentum acceleration process. The RMA module leverages the ability of the LSTM-RNN to learn and retain knowledge from previous gradients. We apply RMA to two popular DuNets, the learned proximal gradient descent (LPGD) and the learned primal-dual (LPD) methods, resulting in LPGD-RMA and LPD-RMA, respectively. We provide experimental results on two nonlinear inverse problems: a nonlinear deconvolution problem and an electrical impedance tomography problem with limited boundary measurements. In the first experiment, we observe that the improvement due to RMA increases markedly with the nonlinearity of the problem. The results of the second experiment further demonstrate that the RMA schemes can significantly improve the performance of DuNets on strongly ill-posed problems.
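For orientation, the sketch below shows the classical, hand-set counterpart of the scheme described in the abstract above: proximal gradient descent with a fixed momentum coefficient. It is not the paper's network. RMA, as described, replaces the fixed momentum rule with an LSTM-RNN, and learned proximal gradient descent replaces the hand-written proximal step with a trained module. The function names, the linear toy operator `A`, the l1 regularizer, and all parameter values are assumptions chosen for illustration.

```python
def soft_threshold(v, t):
    """Proximal operator of t * |v| for a scalar v."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def pgd_with_momentum(A, y, lam=0.1, step=0.1, beta=0.5, iters=200):
    """Proximal gradient descent with fixed momentum (extrapolation form)
    for min_x 0.5 * ||A x - y||^2 + lam * ||x||_1.

    A is a list of rows; beta is the hand-set momentum coefficient.
    """
    n = len(A[0])
    x_prev = [0.0] * n
    x = [0.0] * n
    for _ in range(iters):
        # Momentum: extrapolate along the previous update direction.
        z = [xi + beta * (xi - xp) for xi, xp in zip(x, x_prev)]
        # Gradient of the data-fit term at the extrapolated point.
        residual = [sum(a * zj for a, zj in zip(row, z)) - yi
                    for row, yi in zip(A, y)]
        grad = [sum(row[j] * r for row, r in zip(A, residual))
                for j in range(n)]
        # Gradient step followed by the proximal step.
        x_prev = x
        x = [soft_threshold(zj - step * g, step * lam)
             for zj, g in zip(z, grad)]
    return x


identity = [[1.0, 0.0], [0.0, 1.0]]
print(pgd_with_momentum(identity, [1.0, 0.05]))
```

For the identity operator the minimizer is the soft-thresholded measurement, so the iterate approaches `[0.9, 0.0]`. Unrolling a fixed number of such iterations and training their components end to end is the general idea behind a deep unrolling network.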