LPGD: A General Framework for Backpropagation through Embedded Optimization Layers
Anselm Paulus, Georg Martius, Vít Musil
Embedding parameterized optimization problems as layers into machine learning architectures serves as a powerful inductive bias. Training such architectures with stochastic gradient descent requires care, as degenerate derivatives of the embedded optimization problem often render the gradients uninformative. We propose Lagrangian Proximal Gradient Descent (LPGD), a flexible framework for training architectures with embedded optimization layers that seamlessly integrates into automatic differentiation libraries. LPGD efficiently computes meaningful replacements of the degenerate optimization layer derivatives by re-running the forward solver oracle on a perturbed input. LPGD captures various previously proposed methods as special cases.

Training such a parameterized optimization model is an instance of bi-level optimization (Gould et al., 2016), which is generally challenging. Whenever it is possible to propagate gradients through the optimization problem via an informative derivative of the solution mapping, the task is typically approached with standard stochastic gradient descent (GD) (Amos & Kolter, 2017a; Agrawal et al., 2019b). However, when the optimization problem has discrete solutions, the derivatives are typically degenerate, as small perturbations of the input do not affect the optimal solution. Previous works have proposed several methods to overcome this challenge, ranging from differentiable relaxations (Wang et al., 2019; Wilder et al., 2019a; Mandi & Guns, 2020; Djolonga & Krause, 2017) and stochastic smoothing (Berthet et al., 2020; Dalle et al., 2022), over proxy losses (Paulus et al., 2021), to finite-difference based techniques (Vlastelica et al., 2020).
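
To make the backward mechanism concrete, below is a minimal PyTorch sketch of the general idea of replacing a degenerate optimization-layer derivative by re-running the forward solver oracle on a perturbed input. It is an illustration, not the authors' released implementation: the names PerturbedSolverLayer and linear_min_solver and the temperature tau are hypothetical placeholders, and the exact perturbation and sign conventions depend on the solver (minimization vs. maximization) and on the chosen LPGD variant.

import torch


class PerturbedSolverLayer(torch.autograd.Function):
    """Optimization layer whose backward pass re-queries the solver oracle."""

    @staticmethod
    def forward(ctx, theta, solve, tau):
        # `solve` maps cost parameters theta to an optimal (often discrete) solution.
        y = solve(theta)
        ctx.save_for_backward(theta, y)
        ctx.solve = solve
        ctx.tau = tau
        return y

    @staticmethod
    def backward(ctx, grad_output):
        theta, y = ctx.saved_tensors
        # Re-run the solver on an input perturbed in the direction of the incoming
        # gradient; the difference of the two solutions replaces the (degenerate)
        # vector-Jacobian product. It is zero if tau is too small to change the solution.
        y_tau = ctx.solve(theta + ctx.tau * grad_output)
        grad_theta = (y_tau - y) / ctx.tau
        # No gradients for the non-tensor arguments `solve` and `tau`.
        return grad_theta, None, None


def linear_min_solver(theta):
    """Toy oracle: argmin of <theta, y> over one-hot vectors (piecewise constant in theta)."""
    y = torch.zeros_like(theta)
    y[theta.argmin()] = 1.0
    return y


# Usage: the true derivative of the solver is zero almost everywhere, yet the
# surrogate gradient can move theta so that the solver's output approaches the target.
theta = torch.randn(5, requires_grad=True)
target = torch.tensor([0.0, 1.0, 0.0, 0.0, 0.0])
y = PerturbedSolverLayer.apply(theta, linear_min_solver, 1.0)
loss = ((y - target) ** 2).sum()
loss.backward()
print(theta.grad)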
arXiv.org Artificial Intelligence
Jul-8-2024