Fast Convex Optimization for Two-Layer ReLU Networks: Equivalent Model Classes and Cone Decompositions

Aaron Mishkin, Arda Sahiner, Mert Pilanci

Aug-31-2022, arXiv.org Artificial Intelligence

We develop fast algorithms and robust software for convex optimization of two-layer neural networks with ReLU activation functions. Our work leverages a convex reformulation of the standard weight-decay penalized training problem as a set of group-$\ell_1$-regularized data-local models, where locality is enforced by polyhedral cone constraints. In the special case of zero regularization, we show that this problem is exactly equivalent to unconstrained optimization of a convex "gated ReLU" network. For problems with non-zero regularization, we show that convex gated ReLU models obtain data-dependent approximation bounds for the ReLU training problem. To optimize the convex reformulations, we develop an accelerated proximal gradient method and a practical augmented Lagrangian solver. We show that these approaches are faster than standard training heuristics for the non-convex problem, such as SGD, and that they outperform commercial interior-point solvers. Experimentally, we verify our theoretical results, explore the group-$\ell_1$ regularization path, and scale convex optimization for neural networks to image classification on MNIST and CIFAR-10.
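To make the abstract's main objects concrete, the following is a minimal sketch of a group-$\ell_1$-regularized gated ReLU model fitted with an accelerated proximal gradient method (plain FISTA with a fixed step size). It is not the authors' software. The function names (`gated_features`, `group_soft_threshold`, `objective`, `fit_gated_relu`), the squared loss with $1/n$ scaling, and the use of arbitrary user-supplied gate vectors are all assumptions made for illustration.

```python
import numpy as np

def gated_features(X, gates):
    """Stack the blocks D_i X side by side, where D_i = diag(1[X g_i >= 0]).

    X: (n, d) data, gates: (d, m) fixed gate vectors (an assumption here).
    Returns A of shape (n, m * d); the model is linear in the weights.
    """
    n, d = X.shape
    D = (X @ gates >= 0).astype(X.dtype)          # (n, m) activation patterns
    return (D[:, :, None] * X[:, None, :]).reshape(n, -1)

def group_soft_threshold(V, tau):
    """Prox of tau * sum_i ||V[i]||_2: shrink the norm of each row by tau."""
    norms = np.linalg.norm(V, axis=1, keepdims=True)
    scale = np.maximum(0.0, 1.0 - tau / np.maximum(norms, 1e-12))
    return V * scale

def objective(X, y, gates, V, lam):
    """(1 / 2n) * ||sum_i D_i X v_i - y||^2 + lam * sum_i ||v_i||_2."""
    resid = gated_features(X, gates) @ V.reshape(-1) - y
    return 0.5 * resid @ resid / len(y) + lam * np.linalg.norm(V, axis=1).sum()

def fit_gated_relu(X, y, gates, lam, n_iters=500):
    """FISTA on the convex gated ReLU objective; returns V of shape (m, d)."""
    n, d = X.shape
    m = gates.shape[1]
    A = gated_features(X, gates)
    step = n / np.linalg.norm(A, 2) ** 2          # 1 / Lipschitz constant
    V = np.zeros((m, d))
    Z, t = V.copy(), 1.0
    for _ in range(n_iters):
        grad = (A.T @ (A @ Z.reshape(-1) - y) / n).reshape(m, d)
        V_new = group_soft_threshold(Z - step * grad, step * lam)
        t_new = 0.5 * (1.0 + np.sqrt(1.0 + 4.0 * t * t))
        Z = V_new + ((t - 1.0) / t_new) * (V_new - V)   # momentum step
        V, t = V_new, t_new
    return V

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.standard_normal((40, 3))
    y = rng.standard_normal(40)
    gates = rng.standard_normal((3, 5))           # randomly sampled gates
    V = fit_gated_relu(X, y, gates, lam=0.01)
    print("active groups:", int((np.linalg.norm(V, axis=1) > 0).sum()))
```

Because the gates are fixed, the objective is convex in the weights, and the group penalty can zero out entire rows of `V`, which corresponds to removing hidden units. Sweeping `lam` from large to small would trace a regularization path of the kind the abstract mentions.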

equivalent model class, fast convex optimization, two-layer ReLU network

Country:
- Asia > Russia (0.04)
- Oceania > Australia
  - New South Wales > Sydney (0.04)
- North America > United States
  - Maryland > Baltimore (0.04)
  - Louisiana > Orleans Parish
    - New Orleans (0.04)
  - Colorado > Denver County
    - Denver (0.04)
  - California
    - Los Angeles County > Los Angeles (0.14)
    - Santa Clara County > Palo Alto (0.04)
- Europe
  - Russia (0.04)
  - Spain > Basque Country
    - Biscay Province > Bilbao (0.04)
  - France > Île-de-France
    - Paris > Paris (0.04)
  - Austria > Styria
    - Graz (0.04)
- Africa > Ethiopia
  - Addis Ababa > Addis Ababa (0.04)

Genre:
- Research Report (0.63)

Industry:
- Health & Medicine > Therapeutic Area (0.68)
- Government > Regional Government
  - North America Government > United States Government (0.45)

Technology:
- Information Technology > Artificial Intelligence
  - Representation & Reasoning > Optimization (1.00)
  - Machine Learning
    - Statistical Learning (1.00)
    - Neural Networks (1.00)
