Reviews: Adding One Neuron Can Eliminate All Bad Local Minima

Neural Information Processing Systems 

The main contribution of this work is to prove that by adding a single exponential neuron connected directly from input to output, together with a mild l_2 regularizer on its weight, the slightly modified, still highly nonconvex loss function has no non-global local minima. Moreover, every local minimum of the modified loss corresponds to a global minimum of the original, unmodified nonconvex loss. This surprising result is, to the best of my knowledge, new and of genuine interest.

This phenomenon is a bit curious and perhaps deserves more elaboration. This, I am afraid, is likely what is going on here (if you drop the separable assumption).
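To make the construction under review concrete, here is a minimal sketch of the modified loss. The base model, data, and squared-hinge loss below are hypothetical choices for illustration only (the paper's result covers a broad class of losses); the structure shown — adding one exponential unit a*exp(w·x + b) from input to output plus an l_2 penalty on a — is the modification the review describes. Note that setting a = 0 recovers the original loss exactly.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy binary classification data (illustrative, not from the paper).
X = rng.normal(size=(20, 3))
y = np.sign(X[:, 0] + 0.1)  # labels in {-1, +1}

def original_loss(theta, X, y):
    """Squared-hinge loss of a simple linear model f(x) = theta . x
    (a stand-in for the nonconvex network loss discussed in the review)."""
    margins = y * (X @ theta)
    return np.mean(np.maximum(0.0, 1.0 - margins) ** 2)

def modified_loss(theta, a, w, b, lam, X, y):
    """Original loss with one extra exponential neuron a * exp(w . x + b)
    wired directly from input to output, plus an l_2 penalty on a."""
    outputs = X @ theta + a * np.exp(X @ w + b)
    margins = y * outputs
    data_term = np.mean(np.maximum(0.0, 1.0 - margins) ** 2)
    return data_term + lam * a ** 2

theta = rng.normal(size=3)
w = rng.normal(size=3)

# With a = 0 the added neuron and its penalty vanish, so the
# modified loss coincides with the original loss at that slice.
print(original_loss(theta, X, y))
print(modified_loss(theta, 0.0, w, 0.0, 0.1, X, y))
```

The point of the construction is that the extra coordinate a gives every spurious local minimum a descent direction, while the regularizer forces a = 0 at any local minimum, so the remaining parameters sit at a global minimum of the original loss.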