The brain prepares for learning even before interacting with the environment, refining and optimizing its structures through spontaneous neural activity that resembles random noise. However, the mechanism of this process is not yet understood, and it is unclear whether it can benefit machine learning algorithms.
- North America > Canada > Ontario > Toronto (0.04)
- Asia > South Korea > Daejeon > Daejeon (0.04)
GloVe [25] is a 300-dimensional word embedding space. It is a dimensionality-reduced representation of word-word co-occurrence statistics. The FlairNLP repository we used for the preceding representations can be found here: https://github.com/flairNLP/ Representations were built using a sliding window of 64 words as context.
- Europe > Finland > Uusimaa > Helsinki (0.08)
- North America > United States > California > Santa Clara County > Palo Alto (0.05)
- North America > Canada > Ontario > Toronto (0.14)
- Asia > South Korea > Daejeon > Daejeon (0.04)
- Asia > Middle East > Jordan (0.04)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
Pretraining with random noise for uncertainty calibration
Cheon, Jeonghwan, Paik, Se-Bum
Uncertainty calibration, the process of aligning confidence with accuracy, is a hallmark of human intelligence. However, most machine learning models struggle to achieve this alignment, particularly when the training dataset is small relative to the network's capacity. Here, we demonstrate that uncertainty calibration can be effectively achieved through a pretraining method inspired by developmental neuroscience. Specifically, training with random noise before data training allows neural networks to calibrate their uncertainty, ensuring that confidence levels are aligned with actual accuracy. We show that randomly initialized, untrained networks tend to exhibit erroneously high confidence, but pretraining with random noise effectively calibrates these networks, bringing their confidence down to chance levels across input spaces. As a result, networks pretrained with random noise exhibit optimal calibration, with confidence closely aligned with accuracy throughout subsequent data training. These pre-calibrated networks also perform better at identifying "unknown data" by exhibiting lower confidence for out-of-distribution samples. Our findings provide a fundamental solution for uncertainty calibration in both in-distribution and out-of-distribution contexts.
- North America > Canada > Ontario > Toronto (0.14)
- North America > United States > New York > New York County > New York City (0.04)
- North America > United States > California > Santa Clara County > Palo Alto (0.04)
- (2 more...)
Leveraging tropical reef, bird and unrelated sounds for superior transfer learning in marine bioacoustics
Williams, Ben, van Merriënboer, Bart, Dumoulin, Vincent, Hamer, Jenny, Triantafillou, Eleni, Fleishman, Abram B., McKown, Matthew, Munger, Jill E., Rice, Aaron N., Lillis, Ashlee, White, Clemency E., Hobbs, Catherine A. D., Razak, Tries B., Jones, Kate E., Denton, Tom
Machine learning has the potential to revolutionize passive acoustic monitoring (PAM) for ecological assessments. However, high annotation and compute costs limit the field's efficacy. Generalizable pretrained networks can overcome these costs, but high-quality pretraining requires vast annotated libraries, limiting its current applicability primarily to bird taxa. Here, we identify the optimum pretraining strategy for a data-deficient domain using coral reef bioacoustics. We assemble ReefSet, a large annotated library of reef sounds, though modest compared to bird libraries at 2% of the sample count. Through testing few-shot transfer learning performance, we observe that pretraining on bird audio provides notably superior generalizability compared to pretraining on ReefSet or unrelated audio alone. However, our key findings show that cross-domain mixing which leverages bird, reef and unrelated audio during pretraining maximizes reef generalizability. SurfPerch, our pretrained network, provides a strong foundation for automated analysis of marine PAM data with minimal annotation and compute costs.
- Asia > Thailand (0.05)
- North America > United States > Hawaii (0.04)
- North America > Belize (0.04)
- (14 more...)
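Few-shot transfer performance of a pretrained embedding network, as tested in studies like the one above, is commonly measured with a lightweight probe on frozen embeddings. The sketch below is a generic nearest-class-prototype probe on synthetic stand-in embeddings, not the paper's evaluation code; the embedding dimension, class count, and shot count are assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
dim, n_classes, k_shot, n_query = 32, 5, 3, 40

# Stand-in for frozen embeddings from a pretrained network:
# each class is a Gaussian blob in embedding space (synthetic data).
centers = rng.normal(0, 3.0, (n_classes, dim))
def embed(label, n):
    return centers[label] + rng.normal(0, 1.0, (n, dim))

# Build a k-shot support set and compute one prototype per class.
prototypes = np.stack([embed(c, k_shot).mean(axis=0) for c in range(n_classes)])

# Classify query embeddings by nearest prototype (Euclidean distance).
queries = np.concatenate([embed(c, n_query) for c in range(n_classes)])
labels = np.repeat(np.arange(n_classes), n_query)
dists = ((queries[:, None, :] - prototypes[None, :, :]) ** 2).sum(axis=2)
preds = dists.argmin(axis=1)
accuracy = (preds == labels).mean()
print(f"{k_shot}-shot prototype accuracy: {accuracy:.2f}")
```

The better the pretrained embedding separates classes, the higher this probe scores with only a handful of labelled examples, which is what makes it a cheap proxy for transfer quality.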
On Transfer in Classification: How Well do Subsets of Classes Generalize?
Baena, Raphael, Drumetz, Lucas, Gripon, Vincent
In classification, it is common to observe that models trained on a given set of classes can generalize to previously unseen ones, suggesting an ability to learn beyond the initial task. This ability is often leveraged in transfer learning, where a pretrained model can be used to process new classes, with or without fine-tuning. Surprisingly, few papers have examined the theoretical roots behind this phenomenon. In this work, we are interested in laying the foundations of such a theoretical framework for transferability between sets of classes. Namely, we establish a partially ordered set of subsets of classes. This tool allows us to represent which subsets of classes can generalize to others. In a more practical setting, we explore the ability of our framework to predict which subset of classes can lead to the best performance when testing on all of them. We also explore few-shot learning, where transfer is the gold standard. Our work contributes to a better understanding of transfer mechanics and model generalization.
- Information Technology > Artificial Intelligence > Vision (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Transfer Learning (0.68)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
Bespoke: A Block-Level Neural Network Optimization Framework for Low-Cost Deployment
Lee, Jong-Ryul, Moon, Yong-Hyuk
As deep learning models become popular, there is a growing need to deploy them across diverse device environments. Because it is costly to develop and optimize a neural network for every single environment, there is a line of research on efficiently searching for neural networks for multiple target environments. However, existing works for this setting still require many GPUs and incur high costs. Motivated by this, we propose a novel neural network optimization framework named Bespoke for low-cost deployment. Our framework searches for a lightweight model by replacing parts of an original model with randomly selected alternatives, each of which comes from a pretrained neural network or the original model itself. In practical terms, Bespoke has two significant merits. One is that it requires near-zero cost for designing the search space of neural networks. The other is that it exploits the sub-networks of public pretrained neural networks, so the total cost is minimal compared to existing works. We conduct experiments exploring Bespoke's merits, and the results show that it finds efficient models for multiple targets at meager cost.
- Asia > South Korea > Daejeon > Daejeon (0.04)
- North America > Canada > Ontario > Toronto (0.04)
Dual Representation Learning for Out-of-Distribution Detection
To classify in-distribution samples, deep neural networks, following the information bottleneck, extract strongly label-related information and discard weakly label-related information. Out-of-distribution samples, drawn from distributions differing from that of in-distribution samples, can receive unexpectedly high-confidence predictions because they may still carry minimal strongly label-related information. To distinguish in- and out-of-distribution samples, Dual Representation Learning (DRL) makes it harder for out-of-distribution samples to obtain high-confidence predictions by exploiting both strongly and weakly label-related information from in-distribution samples. Given a pretrained network that extracts strongly label-related information to learn label-discriminative representations, DRL trains an auxiliary network that extracts the remaining weakly label-related information to learn distribution-discriminative representations. Specifically, for each label-discriminative representation, DRL constructs a complementary distribution-discriminative representation by integrating diverse representations that are less similar to the label-discriminative one. DRL then combines label- and distribution-discriminative representations to detect out-of-distribution samples. Experiments show that DRL outperforms state-of-the-art methods for out-of-distribution detection.
- Oceania > Australia > New South Wales > Sydney (0.04)
- North America > United States > California (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
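At detection time, the abstract above says DRL combines label- and distribution-discriminative representations to flag out-of-distribution inputs. One plausible shape for such a combined score, purely as an illustration (the abstract does not specify the combination rule, and the weight `alpha` and both branch outputs are invented here):

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def combined_ood_score(label_logits, dist_score, alpha=0.5):
    """Higher score = more likely out-of-distribution.
    label_logits: logits from the label-discriminative branch.
    dist_score: scalar in [0, 1] from the distribution-discriminative branch
    (1 = looks unlike the training distribution). alpha is an assumed weight."""
    msp = softmax(label_logits).max(axis=-1)   # max softmax probability
    return alpha * (1.0 - msp) + (1.0 - alpha) * dist_score

# Same over-confident logits; the distribution branch breaks the tie.
in_dist = combined_ood_score(np.array([8.0, 0.0, 0.0]), dist_score=0.1)
ood = combined_ood_score(np.array([8.0, 0.0, 0.0]), dist_score=0.9)
print(in_dist, ood)
```

The point of the two-branch design is visible here: a confident classifier alone cannot separate the two inputs, but the distribution-discriminative score can.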
Out-of-distribution Detection by Cross-class Vicinity Distribution of In-distribution Data
Zhao, Zhilin, Cao, Longbing, Lin, Kun-Yu
Deep neural networks for image classification only learn to map in-distribution inputs to their corresponding ground truth labels in training, without differentiating out-of-distribution samples from in-distribution ones. This results from the assumption that all samples are independent and identically distributed (IID) without distributional distinction. Therefore, a pretrained network learned from in-distribution samples treats out-of-distribution samples as in-distribution and makes high-confidence predictions on them in the test phase. To address this issue, we draw out-of-distribution samples from the vicinity distribution of training in-distribution samples for learning to reject the prediction on out-of-distribution inputs. A "Cross-class Vicinity Distribution" is introduced by assuming that an out-of-distribution sample generated by mixing multiple in-distribution samples does not share the same classes as its constituents. We thus improve the discriminability of a pretrained network by finetuning it with out-of-distribution samples drawn from the cross-class vicinity distribution, where each out-of-distribution input corresponds to a complementary label. Experiments on various in-/out-of-distribution datasets show that the proposed method significantly outperforms the existing methods in improving the capacity of discriminating between in- and out-of-distribution samples.
- Asia > China (0.04)
- Oceania > Australia > New South Wales > Sydney (0.04)
- North America > United States > California (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Information Technology > Data Science > Data Mining (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.93)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.88)
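The cross-class vicinity idea can be illustrated with a toy generator: mix several in-distribution samples from distinct classes, and treat the mixture as out-of-distribution with a complementary label drawn from the classes not used in the mixture. This is a schematic reading of the abstract; the Dirichlet mixing weights, mix count, and data shapes below are assumptions, not the paper's construction.

```python
import numpy as np

rng = np.random.default_rng(2)

def cross_class_vicinity_sample(xs, ys, n_classes, n_mix=3):
    """Mix n_mix in-distribution samples from distinct classes into one
    synthetic out-of-distribution input, plus a complementary label drawn
    from the classes NOT present among the constituents."""
    classes = rng.choice(n_classes, size=n_mix, replace=False)
    idx = [rng.choice(np.flatnonzero(ys == c)) for c in classes]
    weights = rng.dirichlet(np.ones(n_mix))          # convex mixing weights
    x_ood = np.tensordot(weights, xs[idx], axes=1)   # weighted sample mixture
    complementary = np.setdiff1d(np.arange(n_classes), classes)
    y_comp = rng.choice(complementary)
    return x_ood, y_comp, classes

# Tiny synthetic in-distribution set: 10 classes, 8-dim inputs.
xs = rng.normal(size=(100, 8))
ys = rng.integers(0, 10, size=100)
x_ood, y_comp, used = cross_class_vicinity_sample(xs, ys, n_classes=10)
assert y_comp not in used  # complementary label never among constituents
```

Finetuning the pretrained network on such (mixture, complementary-label) pairs is what, per the abstract, teaches it to reject inputs that lie between class manifolds.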
Neural Style Transfer Tutorial
Neural Style Transfer is a technique that applies the style of one image to the content of another. It's a generative algorithm, meaning that it produces an image as its output. So how does it work? In this post, we'll explain how the vanilla Neural Style Transfer algorithm adds different styles to an image and what makes the algorithm unique and interesting. Both style transfer and traditional GANs share the ability to generate images as output.
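The heart of the vanilla algorithm is a pair of losses over CNN feature maps: a content loss comparing features directly, and a style loss comparing Gram matrices (channel-wise feature correlations). Here is a minimal numpy sketch of those two quantities; the feature-map shapes and the style weight are invented for illustration, and random arrays stand in for real VGG activations.

```python
import numpy as np

def gram_matrix(features):
    """Style representation: correlations between channels.
    features: (channels, height, width) feature map from a CNN layer."""
    c, h, w = features.shape
    f = features.reshape(c, h * w)
    return (f @ f.T) / (c * h * w)  # normalized C x C Gram matrix

def content_loss(gen, content):
    return 0.5 * np.sum((gen - content) ** 2)

def style_loss(gen, style):
    g1, g2 = gram_matrix(gen), gram_matrix(style)
    return np.sum((g1 - g2) ** 2)

rng = np.random.default_rng(3)
content_feats = rng.normal(size=(16, 8, 8))  # stand-in for content-image activations
style_feats = rng.normal(size=(16, 8, 8))    # stand-in for style-image activations
gen_feats = rng.normal(size=(16, 8, 8))      # stand-in for generated-image activations

total = content_loss(gen_feats, content_feats) + 100.0 * style_loss(gen_feats, style_feats)
print(f"total loss: {total:.3f}")
```

In the full algorithm, the generated image's pixels are optimized by gradient descent on this weighted sum, so the output keeps the content image's structure while matching the style image's channel correlations.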