The brain prepares for learning even before interacting with the environment, refining and optimizing its structures through spontaneous neural activity that resembles random noise. However, the mechanism of this process is not yet understood, and it is unclear whether it can benefit machine learning algorithms.
- North America > Canada > Ontario > Toronto (0.04)
- Asia > South Korea > Daejeon > Daejeon (0.04)
GloVe [25] is a 300-dimensional word embedding space. It is a dimensionality-reduced representation of word-word co-occurrence statistics. The FlairNLP repository we used for the preceding representations can be found here: https://github.com/flairNLP/ Representations were built using a sliding window of 64 words as context.
- Europe > Finland > Uusimaa > Helsinki (0.08)
- North America > United States > California > Santa Clara County > Palo Alto (0.05)
- North America > Canada > Ontario > Toronto (0.14)
- Asia > South Korea > Daejeon > Daejeon (0.04)
- Asia > Middle East > Jordan (0.04)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
Pretraining with random noise for uncertainty calibration
Cheon, Jeonghwan, Paik, Se-Bum
Uncertainty calibration, the process of aligning confidence with accuracy, is a hallmark of human intelligence. However, most machine learning models struggle to achieve this alignment, particularly when the training dataset is small relative to the network's capacity. Here, we demonstrate that uncertainty calibration can be effectively achieved through a pretraining method inspired by developmental neuroscience. Specifically, training with random noise before data training allows neural networks to calibrate their uncertainty, ensuring that confidence levels are aligned with actual accuracy. We show that randomly initialized, untrained networks tend to exhibit erroneously high confidence, but pretraining with random noise effectively calibrates these networks, bringing their confidence down to chance levels across input spaces. As a result, networks pretrained with random noise exhibit optimal calibration, with confidence closely aligned with accuracy throughout subsequent data training. These pre-calibrated networks also perform better at identifying "unknown data" by exhibiting lower confidence for out-of-distribution samples. Our findings provide a fundamental solution for uncertainty calibration in both in-distribution and out-of-distribution contexts.
- North America > Canada > Ontario > Toronto (0.14)
- North America > United States > New York > New York County > New York City (0.04)
- North America > United States > California > Santa Clara County > Palo Alto (0.04)
- (2 more...)
Leveraging tropical reef, bird and unrelated sounds for superior transfer learning in marine bioacoustics
Williams, Ben, van Merriënboer, Bart, Dumoulin, Vincent, Hamer, Jenny, Triantafillou, Eleni, Fleishman, Abram B., McKown, Matthew, Munger, Jill E., Rice, Aaron N., Lillis, Ashlee, White, Clemency E., Hobbs, Catherine A. D., Razak, Tries B., Jones, Kate E., Denton, Tom
Machine learning has the potential to revolutionize passive acoustic monitoring (PAM) for ecological assessments. However, high annotation and compute costs limit the field's efficacy. Generalizable pretrained networks can overcome these costs, but high-quality pretraining requires vast annotated libraries, limiting its current applicability primarily to bird taxa. Here, we identify the optimum pretraining strategy for a data-deficient domain using coral reef bioacoustics. We assemble ReefSet, a large annotated library of reef sounds, though modest compared to bird libraries at 2% of the sample count. Through testing few-shot transfer learning performance, we observe that pretraining on bird audio provides notably superior generalizability compared to pretraining on ReefSet or unrelated audio alone. However, our key findings show that cross-domain mixing which leverages bird, reef and unrelated audio during pretraining maximizes reef generalizability. SurfPerch, our pretrained network, provides a strong foundation for automated analysis of marine PAM data with minimal annotation and compute costs.
- Asia > Thailand (0.05)
- North America > United States > Hawaii (0.04)
- North America > Belize (0.04)
- (14 more...)
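Few-shot transfer performance of a pretrained embedding network, as tested in studies like the one above, is commonly measured with a lightweight probe on frozen embeddings. The sketch below is a generic nearest-class-prototype probe on synthetic stand-in embeddings, not the paper's evaluation code; the embedding dimension, class count, and shot count are assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
dim, n_classes, k_shot, n_query = 32, 5, 3, 40

# Stand-in for frozen embeddings from a pretrained network:
# each class is a Gaussian blob in embedding space (synthetic data).
centers = rng.normal(0, 3.0, (n_classes, dim))
def embed(label, n):
    return centers[label] + rng.normal(0, 1.0, (n, dim))

# Build a k-shot support set and compute one prototype per class.
prototypes = np.stack([embed(c, k_shot).mean(axis=0) for c in range(n_classes)])

# Classify query embeddings by nearest prototype (Euclidean distance).
queries = np.concatenate([embed(c, n_query) for c in range(n_classes)])
labels = np.repeat(np.arange(n_classes), n_query)
dists = ((queries[:, None, :] - prototypes[None, :, :]) ** 2).sum(axis=2)
preds = dists.argmin(axis=1)
accuracy = (preds == labels).mean()
print(f"{k_shot}-shot prototype accuracy: {accuracy:.2f}")
```

The better the pretrained embedding separates classes, the higher this probe scores with only a handful of labelled examples, which is what makes it a cheap proxy for transfer quality.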
On Transfer in Classification: How Well do Subsets of Classes Generalize?
Baena, Raphael, Drumetz, Lucas, Gripon, Vincent
In classification, it is common to observe that models trained on a given set of classes can generalize to previously unseen ones, suggesting an ability to learn beyond the initial task. This ability is often leveraged in transfer learning, where a pretrained model can be used to process new classes, with or without fine-tuning. Surprisingly, few papers have examined the theoretical roots behind this phenomenon. In this work, we are interested in laying the foundations of such a theoretical framework for transferability between sets of classes. Namely, we establish a partially ordered set of subsets of classes. This tool allows us to represent which subsets of classes can generalize to others. In a more practical setting, we explore the ability of our framework to predict which subset of classes can lead to the best performance when testing on all of them. We also explore few-shot learning, where transfer is the gold standard. Our work contributes to a better understanding of transfer mechanics and model generalization.
- Information Technology > Artificial Intelligence > Vision (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Transfer Learning (0.68)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
Bespoke: A Block-Level Neural Network Optimization Framework for Low-Cost Deployment
Lee, Jong-Ryul, Moon, Yong-Hyuk
As deep learning models become popular, there is a growing need to deploy them across diverse device environments. Because it is costly to develop and optimize a neural network for every single environment, there is a line of research on efficiently searching for neural networks for multiple target environments. However, existing works for this setting still require many GPUs and incur high costs. Motivated by this, we propose a novel neural network optimization framework named Bespoke for low-cost deployment. Our framework searches for a lightweight model by replacing parts of an original model with randomly selected alternatives, each of which comes from a pretrained neural network or the original model itself. In practical terms, Bespoke has two significant merits. One is that it requires near-zero cost for designing the search space of neural networks. The other is that it exploits the sub-networks of public pretrained neural networks, so the total cost is minimal compared to existing works. We conduct experiments exploring Bespoke's merits, and the results show that it finds efficient models for multiple targets at meager cost.
- Asia > South Korea > Daejeon > Daejeon (0.04)
- North America > Canada > Ontario > Toronto (0.04)
Dual Representation Learning for Out-of-Distribution Detection
To classify in-distribution samples, deep neural networks, following the information bottleneck, extract strongly label-related information and discard weakly label-related information. Out-of-distribution samples, drawn from distributions differing from that of in-distribution samples, can receive unexpectedly high-confidence predictions because they may still carry minimal strongly label-related information. To distinguish in- and out-of-distribution samples, Dual Representation Learning (DRL) makes it harder for out-of-distribution samples to obtain high-confidence predictions by exploiting both strongly and weakly label-related information from in-distribution samples. Given a pretrained network that extracts strongly label-related information to learn label-discriminative representations, DRL trains an auxiliary network that extracts the remaining weakly label-related information to learn distribution-discriminative representations. Specifically, for each label-discriminative representation, DRL constructs a complementary distribution-discriminative representation by integrating diverse representations that are less similar to the label-discriminative one. DRL then combines label- and distribution-discriminative representations to detect out-of-distribution samples. Experiments show that DRL outperforms state-of-the-art methods for out-of-distribution detection.
- Oceania > Australia > New South Wales > Sydney (0.04)
- North America > United States > California (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
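At detection time, the abstract above says DRL combines label- and distribution-discriminative representations to flag out-of-distribution inputs. One plausible shape for such a combined score, purely as an illustration (the abstract does not specify the combination rule, and the weight `alpha` and both branch outputs are invented here):

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def combined_ood_score(label_logits, dist_score, alpha=0.5):
    """Higher score = more likely out-of-distribution.
    label_logits: logits from the label-discriminative branch.
    dist_score: scalar in [0, 1] from the distribution-discriminative branch
    (1 = looks unlike the training distribution). alpha is an assumed weight."""
    msp = softmax(label_logits).max(axis=-1)   # max softmax probability
    return alpha * (1.0 - msp) + (1.0 - alpha) * dist_score

# Same over-confident logits; the distribution branch breaks the tie.
in_dist = combined_ood_score(np.array([8.0, 0.0, 0.0]), dist_score=0.1)
ood = combined_ood_score(np.array([8.0, 0.0, 0.0]), dist_score=0.9)
print(in_dist, ood)
```

The point of the two-branch design is visible here: a confident classifier alone cannot separate the two inputs, but the distribution-discriminative score can.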
Out-of-distribution Detection by Cross-class Vicinity Distribution of In-distribution Data
Zhao, Zhilin, Cao, Longbing, Lin, Kun-Yu
Deep neural networks for image classification only learn to map in-distribution inputs to their corresponding ground truth labels in training, without differentiating out-of-distribution samples from in-distribution ones. This results from the assumption that all samples are independent and identically distributed (IID) without distributional distinction. Therefore, a pretrained network learned from in-distribution samples treats out-of-distribution samples as in-distribution and makes high-confidence predictions on them in the test phase. To address this issue, we draw out-of-distribution samples from the vicinity distribution of training in-distribution samples for learning to reject the prediction on out-of-distribution inputs. A "Cross-class Vicinity Distribution" is introduced by assuming that an out-of-distribution sample generated by mixing multiple in-distribution samples does not share the same classes as its constituents. We thus improve the discriminability of a pretrained network by finetuning it with out-of-distribution samples drawn from the cross-class vicinity distribution, where each out-of-distribution input corresponds to a complementary label. Experiments on various in-/out-of-distribution datasets show that the proposed method significantly outperforms the existing methods in improving the capacity of discriminating between in- and out-of-distribution samples.
- Asia > China (0.04)
- Oceania > Australia > New South Wales > Sydney (0.04)
- North America > United States > California (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Information Technology > Data Science > Data Mining (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.93)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.88)
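The cross-class vicinity idea can be illustrated with a toy generator: mix several in-distribution samples from distinct classes, and treat the mixture as out-of-distribution with a complementary label drawn from the classes not used in the mixture. This is a schematic reading of the abstract; the Dirichlet mixing weights, mix count, and data shapes below are assumptions, not the paper's construction.

```python
import numpy as np

rng = np.random.default_rng(2)

def cross_class_vicinity_sample(xs, ys, n_classes, n_mix=3):
    """Mix n_mix in-distribution samples from distinct classes into one
    synthetic out-of-distribution input, plus a complementary label drawn
    from the classes NOT present among the constituents."""
    classes = rng.choice(n_classes, size=n_mix, replace=False)
    idx = [rng.choice(np.flatnonzero(ys == c)) for c in classes]
    weights = rng.dirichlet(np.ones(n_mix))          # convex mixing weights
    x_ood = np.tensordot(weights, xs[idx], axes=1)   # weighted sample mixture
    complementary = np.setdiff1d(np.arange(n_classes), classes)
    y_comp = rng.choice(complementary)
    return x_ood, y_comp, classes

# Tiny synthetic in-distribution set: 10 classes, 8-dim inputs.
xs = rng.normal(size=(100, 8))
ys = rng.integers(0, 10, size=100)
x_ood, y_comp, used = cross_class_vicinity_sample(xs, ys, n_classes=10)
assert y_comp not in used  # complementary label never among constituents
```

Finetuning the pretrained network on such (mixture, complementary-label) pairs is what, per the abstract, teaches it to reject inputs that lie between class manifolds.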
Neural Style Transfer Tutorial
Neural Style Transfer is a technique that applies the style of one image to the content of another. It's a generative algorithm, meaning that it produces an image as its output. So how does it work? In this post, we'll explain how the vanilla Neural Style Transfer algorithm adds different styles to an image and what makes the algorithm unique and interesting. Both style transfer and traditional GANs share the ability to generate images as output.
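The heart of the vanilla algorithm is a pair of losses over CNN feature maps: a content loss comparing features directly, and a style loss comparing Gram matrices (channel-wise feature correlations). Here is a minimal numpy sketch of those two quantities; the feature-map shapes and the style weight are invented for illustration, and random arrays stand in for real VGG activations.

```python
import numpy as np

def gram_matrix(features):
    """Style representation: correlations between channels.
    features: (channels, height, width) feature map from a CNN layer."""
    c, h, w = features.shape
    f = features.reshape(c, h * w)
    return (f @ f.T) / (c * h * w)  # normalized C x C Gram matrix

def content_loss(gen, content):
    return 0.5 * np.sum((gen - content) ** 2)

def style_loss(gen, style):
    g1, g2 = gram_matrix(gen), gram_matrix(style)
    return np.sum((g1 - g2) ** 2)

rng = np.random.default_rng(3)
content_feats = rng.normal(size=(16, 8, 8))  # stand-in for content-image activations
style_feats = rng.normal(size=(16, 8, 8))    # stand-in for style-image activations
gen_feats = rng.normal(size=(16, 8, 8))      # stand-in for generated-image activations

total = content_loss(gen_feats, content_feats) + 100.0 * style_loss(gen_feats, style_feats)
print(f"total loss: {total:.3f}")
```

In the full algorithm, the generated image's pixels are optimized by gradient descent on this weighted sum, so the output keeps the content image's structure while matching the style image's channel correlations.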