Review for NeurIPS paper: Agnostic Learning of a Single Neuron with Gradient Descent

Jan-23-2025, 13:11:16 GMT–Neural Information Processing Systems

Summary and Contributions: The paper considers the problem of agnostically learning a single neuron with respect to the squared loss via gradient descent (GD), and focuses specifically on the guarantees that GD itself obtains. Assuming only that the input distribution is bounded, the authors show that for strictly increasing activation functions (derivative bounded below by a positive constant), GD finds a point that achieves error O(\sqrt{opt}) + \eps, where opt is the loss of the best-fitting neuron. They extend the result to ReLU under standard anti-concentration assumptions (similar to [1,2]). For ReLU, it is known that one can in fact obtain O(opt) + \eps under essentially the same assumptions (see [1]) using an alternative algorithm.
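
To make the setting concrete, here is a minimal sketch of plain GD on the empirical squared loss of a single neuron with a strictly increasing activation. It is an illustration only, not the authors' exact procedure: the leaky ReLU activation, the zero initialization, the step size, the number of steps, and the synthetic data generator (`make_data`, a hypothetical helper) are all assumptions made for this example. In the fully agnostic setting the labels may be arbitrary; the sketch uses a noisy neuron as one special case.

```python
import numpy as np

def leaky_relu(z, alpha=0.1):
    # Strictly increasing activation: the derivative is at least alpha > 0.
    return np.where(z > 0, z, alpha * z)

def leaky_relu_grad(z, alpha=0.1):
    return np.where(z > 0, 1.0, alpha)

def empirical_risk(w, X, y, alpha=0.1):
    # Squared loss of the neuron x -> sigma(w . x), averaged over the sample.
    return 0.5 * np.mean((leaky_relu(X @ w, alpha) - y) ** 2)

def gradient_descent(X, y, steps=1000, lr=0.5, alpha=0.1):
    # Plain full-batch GD; initialization and step size are assumptions.
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(steps):
        z = X @ w
        residual = leaky_relu(z, alpha) - y
        grad = X.T @ (residual * leaky_relu_grad(z, alpha)) / n
        w -= lr * grad
    return w

def make_data(n=2000, d=5, noise=0.1, seed=0):
    # Hypothetical synthetic data: bounded inputs (norm at most 1) and
    # labels from a unit-norm neuron v plus Gaussian noise.
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n, d))
    X /= np.maximum(1.0, np.linalg.norm(X, axis=1, keepdims=True))
    v = rng.normal(size=d)
    v /= np.linalg.norm(v)
    y = leaky_relu(X @ v) + noise * rng.normal(size=n)
    return X, y, v

if __name__ == "__main__":
    X, y, v = make_data()
    w_hat = gradient_descent(X, y)
    print("risk at GD output:      ", empirical_risk(w_hat, X, y))
    print("risk at reference v:    ", empirical_risk(v, X, y))
    print("risk at zero (baseline):", empirical_risk(np.zeros(X.shape[1]), X, y))
```

Comparing the risk at the GD output with the risk at the reference neuron `v` gives a rough empirical analogue of comparing the achieved error with opt, though `v` is only a stand-in for the best-fitting neuron on this sample.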
