

What Do Neural Networks Learn When Trained With Random Labels?

Neural Information Processing Systems

We study deep neural networks (DNNs) trained on natural image data with entirely random labels. Despite its popularity in the literature, where it is often used to study memorization, generalization, and other phenomena, little is known about what DNNs learn in this setting. In this paper, we show analytically for convolutional and fully connected networks that an alignment between the principal components of network parameters and data takes place when training with random labels. We study this alignment effect by investigating neural networks pre-trained on randomly labelled image data and subsequently fine-tuned on disjoint datasets with random or real labels. We show how this alignment produces a positive transfer: networks pre-trained with random labels train faster downstream compared to training from scratch even after accounting for simple effects, such as weight scaling. We analyze how competing effects, such as specialization at later layers, may hide the positive transfer. These effects are studied in several network architectures, including VGG16 and ResNet18, on CIFAR10 and ImageNet.
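To make the abstract's notion of alignment concrete, the sketch below measures the overlap between the top principal components of synthetic "image patches" and of a set of "first-layer filters". Everything here is a hypothetical illustration (the data is synthetic and the alignment measure is one reasonable choice), not the paper's actual analysis.

```python
import numpy as np

def principal_components(X, k):
    # Rows of X are samples; return the top-k right singular vectors.
    Xc = X - X.mean(axis=0, keepdims=True)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Vt[:k]

def subspace_alignment(U, V):
    # Mean squared cosine of the principal angles between the row spaces
    # of U and V: 1.0 for identical subspaces, about k/d for random ones.
    s = np.linalg.svd(U @ V.T, compute_uv=False)
    return float(np.mean(s ** 2))

rng = np.random.default_rng(0)
base = rng.normal(size=(5, 27))                 # 5 dominant data directions
patches = rng.normal(size=(1000, 5)) @ base \
    + 0.05 * rng.normal(size=(1000, 27))        # stand-in 3x3x3 image patches
filters = rng.normal(size=(64, 5)) @ base \
    + 0.05 * rng.normal(size=(64, 27))          # "aligned" first-layer weights

pc_data = principal_components(patches, 5)
aligned = subspace_alignment(pc_data, principal_components(filters, 5))
baseline = subspace_alignment(
    pc_data, principal_components(rng.normal(size=(64, 27)), 5))
# aligned is close to 1; baseline stays near the chance level k/d.
```

Here the "filters" are built to share the data's dominant directions, so the alignment score is near 1, while randomly initialized weights score near the chance level.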




Review for NeurIPS paper: What Do Neural Networks Learn When Trained With Random Labels?

Neural Information Processing Systems

Weaknesses: Even though the authors dismiss experiments with image augmentations (L 50) on the grounds that they would introduce a supervisory signal, it could be beneficial to investigate them in the paper. Augmentations do add a prior on the expected data distribution, but the effect could still be worth studying; it is, of course, another step away from the i.i.d. assumption. Along the same lines, I would expect that with increasing kernel size of the convolutions, the correlation between patches increases, and with it potentially the misalignment score. If this understanding is correct, I would also expect the experiment in Figure 1 to look very different if only the last layers were transferred instead of the first layers.


Review for NeurIPS paper: What Do Neural Networks Learn When Trained With Random Labels?

Neural Information Processing Systems

Reviewers unanimously agree that the paper presents insights that explain when training with random labels can help in learning transferable features. The findings of this paper will be of broad interest to the ML community. I recommend that the authors revise their paper in accordance with their comments in the rebuttal.




Neural networks learn to magnify areas near decision boundaries

Zavatone-Veth, Jacob A., Yang, Sheng, Rubinfien, Julian A., Pehlevan, Cengiz

arXiv.org Machine Learning

In machine learning, there is a long history of trying to build neural networks that can learn from fewer examples by baking in strong geometric priors. However, it is not always clear a priori what geometric constraints are appropriate for a given task. Here, we consider the possibility that one can uncover useful geometric inductive biases by studying how training molds the Riemannian geometry induced by unconstrained neural network feature maps. We first show that at infinite width, neural networks with random parameters induce highly symmetric metrics on input space. This symmetry is broken by feature learning: networks trained to perform classification tasks learn to magnify local areas along decision boundaries. This holds in deep networks trained on high-dimensional image classification tasks, and even in self-supervised representation learning. These results begin to elucidate how training shapes the geometry induced by unconstrained neural network feature maps, laying the groundwork for an understanding of this richly nonlinear form of feature learning.
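The metric the abstract refers to is the pullback of the Euclidean metric through the feature map, g(x) = J(x)ᵀJ(x), whose volume element √det g measures local magnification. A minimal numerical sketch, with a tiny random network standing in for a trained one (all weights and shapes here are illustrative):

```python
import numpy as np

def feature_map(x, W1, W2):
    # Toy two-layer feature map from R^2 to R^3 (random weights stand in
    # for a trained network's representation).
    return np.tanh(W2 @ np.tanh(W1 @ x))

def induced_metric(x, W1, W2, eps=1e-5):
    # Pullback metric g = J^T J, with the Jacobian J estimated by
    # forward finite differences.
    d = x.shape[0]
    f0 = feature_map(x, W1, W2)
    J = np.stack([(feature_map(x + eps * np.eye(d)[i], W1, W2) - f0) / eps
                  for i in range(d)], axis=1)
    return J.T @ J

rng = np.random.default_rng(1)
W1, W2 = rng.normal(size=(8, 2)), rng.normal(size=(3, 8))
g = induced_metric(np.array([0.3, -0.7]), W1, W2)
magnification = np.sqrt(np.linalg.det(g))   # local volume element
```

Evaluating `magnification` on a grid of inputs for a trained classifier is one way to visualize the magnification near decision boundaries that the paper describes.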


What do neural networks learn in image classification? A frequency shortcut perspective

Wang, Shunxin, Veldhuis, Raymond, Brune, Christoph, Strisciuglio, Nicola

arXiv.org Artificial Intelligence

Frequency analysis is useful for understanding the mechanisms of representation learning in neural networks (NNs). Most research in this area focuses on the learning dynamics of NNs for regression tasks, while little attention has been paid to classification. This study empirically investigates the latter and expands the understanding of frequency shortcuts. First, we perform experiments on synthetic datasets designed to have a bias in different frequency bands. Our results demonstrate that NNs tend to find simple solutions for classification, and that what they learn first during training depends on the most distinctive frequency characteristics, which can be either low or high frequencies. Second, we confirm this phenomenon on natural images. We propose a metric to measure class-wise frequency characteristics and a method to identify frequency shortcuts. The results show that frequency shortcuts can be texture-based or shape-based, depending on what best simplifies the objective. Third, we validate the transferability of frequency shortcuts on out-of-distribution (OOD) test sets. Our results suggest that frequency shortcuts can transfer across datasets and cannot be fully avoided by larger model capacity or data augmentation. We recommend that future research focus on effective training schemes that mitigate frequency shortcut learning.
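A simple way to construct the kind of band-biased inputs the abstract describes is to split an image into complementary low- and high-frequency parts with a radial FFT mask. A minimal sketch (the cutoff and the random stand-in image are illustrative, not the paper's setup):

```python
import numpy as np

def band_filter(img, cutoff, keep="low"):
    # Keep only spatial frequencies below (or above) a radial cutoff,
    # measured in cycles per image, using an FFT mask.
    F = np.fft.fftshift(np.fft.fft2(img))
    h, w = img.shape
    yy, xx = np.mgrid[:h, :w]
    r = np.hypot(yy - h / 2, xx - w / 2)
    mask = r <= cutoff if keep == "low" else r > cutoff
    return np.real(np.fft.ifft2(np.fft.ifftshift(F * mask)))

rng = np.random.default_rng(2)
img = rng.normal(size=(32, 32))             # stand-in for a grayscale image
low = band_filter(img, cutoff=4, keep="low")
high = band_filter(img, cutoff=4, keep="high")
# The two bands are complementary and sum back to the original image.
```

Training on `low` versus `high` versions of a dataset is one way to probe which frequency band a classifier relies on.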


Understanding the patterns that neural networks learn from chemical spectra

#artificialintelligence

Analysing spectra from experimental characterization of materials is often time consuming, susceptible to distortions in the data, requires specific domain knowledge, and may be subject to biases introduced by general heuristics under human analysis. Recent work has shown the potential of using machine learning methods to solve this task and provide automated, unbiased spectral analysis on the fly. We present a simple 1D neural network model to classify infrared spectra from small organic molecules according to their functional groups. Our model is within range of state-of-the-art performance while being significantly less complex than previous works. A smaller network reduces the risk of overfitting and makes it possible to explore what the model has learned about the underlying physics behind the spectra.
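A 1D convolutional classifier of the kind described can be sketched as a forward pass: convolution over the spectrum, ReLU, global max pooling, and a linear multi-label head. All shapes below (spectrum length, filter count, number of functional groups) are illustrative assumptions, not the article's actual architecture.

```python
import numpy as np

def conv1d(x, kernels, stride=1):
    # Valid 1D convolution: x has length L, kernels has shape (n_filters, k).
    k = kernels.shape[1]
    windows = np.stack([x[i:i + k] for i in range(0, len(x) - k + 1, stride)])
    return windows @ kernels.T          # (n_positions, n_filters)

def spectrum_classifier(x, kernels, W_out):
    # Conv -> ReLU -> global max pool -> one logit per functional group.
    h = np.maximum(conv1d(x, kernels), 0)
    pooled = h.max(axis=0)              # one activation per learned filter
    return W_out @ pooled               # multi-label logits

rng = np.random.default_rng(3)
spectrum = rng.normal(size=(600,))      # stand-in for a 600-point IR spectrum
kernels = rng.normal(size=(16, 9))      # 16 filters of width 9
W_out = rng.normal(size=(5, 16))        # 5 hypothetical functional groups
logits = spectrum_classifier(spectrum, kernels, W_out)
```

Global max pooling makes the prediction depend on where each filter responds most strongly, which is what lets a small model of this kind be inspected for the spectral features it has learned.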


Neural Network Learns to Identify Chromatid Cohesion Defects

#artificialintelligence

Tokyo, Japan – Scientists from Tokyo Metropolitan University have used machine learning to automate the identification of defects in sister chromatid cohesion. They trained a convolutional neural network (CNN) with microscopy images of individual stained chromosomes, labelled by researchers as having or not having cohesion defects. After training, the network was able to correctly classify 73.1% of new images. Automation promises better statistics and more insight into the wide range of disorders that cause cohesion defects. Chromosomes consist of long DNA molecules that contain a portion of our genes.


Machine Learning: Learn By Building Web Apps in Python

#artificialintelligence

Machine Learning: Learn By Building Web Apps in Python - Learn basic to advanced machine learning algorithms by creating web applications using Flask! Machine learning is a branch of artificial intelligence (AI) focused on building applications that learn from data and improve their accuracy over time without being explicitly programmed to do so. In data science, an algorithm is a sequence of statistical processing steps. In machine learning, algorithms are 'trained' to find patterns and features in massive amounts of data in order to make decisions and predictions based on new data. The better the algorithm, the more accurate the decisions and predictions become as it processes more data. Machine learning has led to some amazing results, such as analyzing medical images and predicting diseases on par with human experts.