Nonsmooth Implicit Differentiation: Deterministic and Stochastic Convergence Rates
Riccardo Grazzi, Massimiliano Pontil, Saverio Salzo
Important examples include hyperparameter optimization and meta-learning (Franceschi et al., 2018; Lee et al., 2019), where (1) expresses the optimality conditions of a lower-level minimization problem. Further examples include learning a surrogate model for data poisoning attacks (Xiao et al., 2015; Muñoz-González et al., 2017), deep equilibrium models (Bai et al., 2019), and OptNet (Amos & Kolter, 2017). All of these problems may involve a nonsmooth mapping Φ. For instance, consider hyperparameter optimization or data poisoning attacks for SVMs, or meta-learning for image classification, where Φ is evaluated through the forward pass of a neural network with ReLU activations (Bertinetto et al., 2019; Lee et al., 2019; Rajeswaran et al., 2019). In addition, when such settings are applied to large datasets, evaluating the map Φ exactly can be too costly, but stochastic methods can usually be applied by exploiting the composite stochastic structure in (2), where only T involves a computation over the full training set (e.g., a gradient descent step).
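Since equations (1) and (2) are not reproduced in this excerpt, the JAX sketch below only illustrates the general recipe under standard assumptions: (1) is taken to be a fixed-point equation w(λ) = Φ(w(λ), λ), with Φ a gradient-descent step on a lower-level objective built from a ReLU feature map, and the hypergradient of an upper-level objective is approximated by iterating Φ and then solving the adjoint linear system with vector-Jacobian products. The toy data and all names (`inner_loss`, `outer_loss`, `alpha`, `K`, `J`) are illustrative assumptions, not the paper's setup.

```python
import jax
import jax.numpy as jnp

# Toy data (assumption: stands in for a training set).
X = jax.random.normal(jax.random.PRNGKey(0), (20, 5))
y = jax.random.normal(jax.random.PRNGKey(1), (20,))

def inner_loss(w, lam):
    # Lower-level objective; the ReLU makes Phi nonsmooth in w.
    feats = jax.nn.relu(X @ w)
    return jnp.mean((feats - y) ** 2) + lam * jnp.sum(w ** 2)

alpha = 0.1  # inner step size (illustrative)

def Phi(w, lam):
    # One gradient-descent step: the map whose fixed point is w(lam), as in (1).
    return w - alpha * jax.grad(inner_loss)(w, lam)

def outer_loss(w, lam):
    # Upper-level objective (illustrative validation-style loss).
    return jnp.sum((jax.nn.relu(X @ w) - y) ** 2)

def hypergrad(lam, K=100, J=50):
    # 1) Approximate the fixed point w(lam) by iterating Phi for K steps.
    w = jnp.zeros(X.shape[1])
    for _ in range(K):
        w = Phi(w, lam)
    # 2) Approximate implicit differentiation: solve v = (d_w Phi)^T v + d_w f
    #    by J fixed-point iterations using vector-Jacobian products.
    dw_f = jax.grad(outer_loss, argnums=0)(w, lam)
    dlam_f = jax.grad(outer_loss, argnums=1)(w, lam)
    _, vjp_w = jax.vjp(lambda u: Phi(u, lam), w)
    v = dw_f
    for _ in range(J):
        v = vjp_w(v)[0] + dw_f
    # 3) Assemble the hypergradient: d_lam f + (d_lam Phi)^T v.
    _, vjp_lam = jax.vjp(lambda l: Phi(w, l), lam)
    return dlam_f + vjp_lam(v)[0]

print(hypergrad(0.5))
```

The adjoint loop amounts to a Neumann-series approximation of (I − ∂_wΦ)^{-T} applied to ∂_w f, which converges when Φ is a contraction in w; for the nonsmooth ReLU map, the vector-Jacobian products returned by JAX correspond to one choice of generalized (Clarke-type) Jacobian.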
Mar-28-2024