Understanding Square Loss in Training Overparametrized Neural Network Classifiers

Oct-11-2024, 10:57:52 GMT–Neural Information Processing Systems

Deep learning has achieved many breakthroughs in modern classification tasks. Numerous architectures have been proposed for different data structures, but when it comes to the loss function, cross-entropy is the predominant choice. Recently, several alternative losses have seen renewed interest for deep classifiers. In particular, empirical evidence seems to favor square loss, but a theoretical justification is still lacking. In this work, we contribute to the theoretical understanding of square loss in classification by systematically investigating how it performs for overparametrized neural networks in the neural tangent kernel (NTK) regime.
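To make the contrast between the two losses concrete, the following is a minimal sketch, not the paper's formulation. It assumes a one-hot label encoding and an unscaled sum of squared errors over the raw network outputs. The function names (`softmax`, `cross_entropy_loss`, `square_loss`, `predict`) are hypothetical and chosen for illustration only.

```python
import math


def softmax(logits):
    """Convert raw scores to probabilities (shifted by the max for stability)."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy_loss(logits, label):
    """Negative log-probability that softmax assigns to the true class."""
    return -math.log(softmax(logits)[label])


def square_loss(outputs, label):
    """Sum of squared errors between raw outputs and the one-hot target.

    Assumption: plain one-hot targets with no rescaling of the target or loss.
    """
    return sum(
        (o - (1.0 if k == label else 0.0)) ** 2
        for k, o in enumerate(outputs)
    )


def predict(outputs):
    """Classify by taking the index of the largest output under either loss."""
    return max(range(len(outputs)), key=lambda k: outputs[k])


if __name__ == "__main__":
    outputs = [0.2, 0.7, 0.1]  # illustrative network outputs for 3 classes
    print("predicted class:", predict(outputs))
    print("square loss:", square_loss(outputs, label=1))
    print("cross-entropy:", cross_entropy_loss(outputs, label=1))
```

Under this sketch, square loss treats classification as regression onto the one-hot target, while cross-entropy first passes the outputs through a softmax; in both cases the predicted class is the index of the largest output.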

generalization error, square loss, training overparametrized neural network classifier, (2 more...)

Technology:
- Information Technology > Artificial Intelligence > Machine Learning
  - Neural Networks (1.00)
  - Learning Graphical Models > Directed Networks
    - Bayesian Learning (0.40)