The Implicit Bias of Minima Stability: A View from Function Space
Neural Information Processing Systems
The loss terrains of over-parameterized neural networks have multiple global minima. However, it is well known that stochastic gradient descent (SGD) can stably converge only to minima that are sufficiently flat w.r.t.
Dec-24-2025, 12:25:22 GMT