New insights into training dynamics of deep classifiers

Mar-9-2023, 12:51:35 GMT–#artificialintelligence

A new study from researchers at MIT and Brown University characterizes several properties that emerge during the training of deep classifiers, a type of artificial neural network commonly used for classification tasks such as image classification, speech recognition, and natural language processing. The paper, "Dynamics in Deep Classifiers trained with the Square Loss: Normalization, Low Rank, Neural Collapse and Generalization Bounds," published today in the journal Research, is the first of its kind to theoretically explore the dynamics of training deep classifiers with the square loss and how properties such as rank minimization, neural collapse, and dualities between the activation of neurons and the weights of the layers are intertwined. In the study, the authors focused on two types of deep classifiers: fully connected deep networks and convolutional neural networks (CNNs). A previous study examined the structural properties that develop in large neural networks at the final stages of training. That study focused on the last layer of the network and found that deep networks trained to fit a training dataset will eventually reach a state known as "neural collapse." When neural collapse occurs, the network maps multiple examples of a particular class (such as images of cats) to a single template of that class.
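To make the idea of neural collapse more concrete, here is a minimal sketch in Python. It is not the authors' method or code. It assumes a hypothetical set of three-dimensional last-layer features, one made-up template per class, and an illustrative measure (the average squared distance of each example from its class mean). As examples of a class move toward a single template, that measure shrinks toward zero.

```python
import random


def class_means(features, labels):
    """Return the mean feature vector of each class."""
    sums, counts = {}, {}
    for x, y in zip(features, labels):
        acc = sums.setdefault(y, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
        counts[y] = counts.get(y, 0) + 1
    return {y: [s / counts[y] for s in acc] for y, acc in sums.items()}


def within_class_scatter(features, labels):
    """Average squared distance of each example from its own class mean.

    A value near zero means every example of a class sits almost exactly
    on one shared point, which is the behavior described as neural collapse.
    """
    means = class_means(features, labels)
    total = 0.0
    for x, y in zip(features, labels):
        total += sum((v - m) ** 2 for v, m in zip(x, means[y]))
    return total / len(features)


# Hypothetical templates: one point per class (an assumption for illustration).
TEMPLATES = {0: [1.0, 0.0, 0.0], 1: [0.0, 1.0, 0.0], 2: [0.0, 0.0, 1.0]}


def make_features(labels, noise, seed=0):
    """Place each example at its class template plus Gaussian noise."""
    rng = random.Random(seed)
    return [[t + rng.gauss(0.0, noise) for t in TEMPLATES[y]] for y in labels]


labels = [c for c in (0, 1, 2) for _ in range(50)]
spread = make_features(labels, noise=0.5)       # examples scattered around templates
collapsed = make_features(labels, noise=0.01)   # examples nearly on the templates
exact = make_features(labels, noise=0.0)        # every example equals its template

print("scattered:", within_class_scatter(spread, labels))
print("nearly collapsed:", within_class_scatter(collapsed, labels))
print("fully collapsed:", within_class_scatter(exact, labels))
```

In a real network the features would come from the layer just before the classifier output, and the scatter would be tracked over training epochs rather than set by a noise parameter as it is here.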


