Questions for Flat-Minima Optimization of Modern Neural Networks
Jean Kaddour, Linqing Liu, Ricardo Silva, Matt J. Kusner
For training neural networks, flat-minima optimizers, which seek parameters in neighborhoods of uniformly low loss (flat minima), have been shown to improve upon stochastic and adaptive gradient-based methods. Two approaches to finding flat minima stand out: (1) averaging methods (e.g., Stochastic Weight Averaging, SWA) and (2) minimax methods (e.g., Sharpness-Aware Minimization, SAM). However, despite their similar motivations, there has been limited investigation into their properties and no comprehensive comparison between them. In this work, we investigate their loss surfaces through a systematic benchmark of both approaches across computer vision, natural language processing, and graph learning tasks. The results lead to a simple hypothesis: since the two approaches find different flat solutions, combining them should improve generalization even further. We verify that the combination improves over either flat-minima approach alone in 39 out of 42 cases. When it does not, we investigate potential reasons. We hope our results across image, graph, and text data will help researchers improve deep learning optimizers and help practitioners pinpoint the optimizer for the problem at hand.
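To make the two families concrete, the following is a minimal sketch of how a SAM-style update and an SWA-style running average could be combined on a toy linear regression problem. It is not the authors' implementation: the function names (`sam_step`, `train_sam_with_swa`), the hyperparameters (`lr`, `rho`, `swa_start`), and the choice to average every iterate after `swa_start` are all assumptions made for illustration. SWA as usually described averages periodically under a particular learning-rate schedule, which is omitted here.

```python
import numpy as np

def loss(w, X, y):
    """Mean squared error of a linear model (toy objective, assumed)."""
    r = X @ w - y
    return 0.5 * np.mean(r ** 2)

def grad(w, X, y):
    """Gradient of the mean squared error with respect to w."""
    return X.T @ (X @ w - y) / len(y)

def sam_step(w, X, y, lr=0.1, rho=0.05):
    """One SAM-style update (minimax idea): move to an approximate
    worst-case point within an L2 ball of radius rho, then descend
    using the gradient evaluated at that perturbed point."""
    g = grad(w, X, y)
    eps = rho * g / (np.linalg.norm(g) + 1e-12)
    return w - lr * grad(w + eps, X, y)

def train_sam_with_swa(X, y, steps=200, swa_start=100, lr=0.1, rho=0.05):
    """Run SAM-style updates and keep a running average of the iterates
    after swa_start (averaging idea). Hypothetical helper for illustration."""
    w = np.zeros(X.shape[1])
    w_swa, n_avg = np.zeros_like(w), 0
    for t in range(steps):
        w = sam_step(w, X, y, lr=lr, rho=rho)
        if t >= swa_start:
            n_avg += 1
            w_swa += (w - w_swa) / n_avg  # incremental mean of iterates
    return w, w_swa

# Synthetic data, assumed purely for demonstration.
rng = np.random.default_rng(0)
X = rng.normal(size=(64, 3))
w_true = np.array([1.0, -2.0, 0.5])
y = X @ w_true + 0.1 * rng.normal(size=64)

w_last, w_avg = train_sam_with_swa(X, y)
print("last iterate loss:", loss(w_last, X, y))
print("averaged weights loss:", loss(w_avg, X, y))
```

On a convex toy problem like this one there is a single minimum, so the sketch only shows the mechanics of the two updates; it says nothing about the generalization differences reported in the benchmark.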
Feb-2-2022
- Genre:
- Research Report > New Finding (0.48)
- Industry:
- Education (0.67)
- Technology: