AITopics | noisy student

An Empirical Investigation of Pre-trained Model Selection for Out-of-Distribution Generalization and Calibration

Naganuma, Hiroki, Hataya, Ryuichiro, Mitliagkas, Ioannis

arXiv.org Artificial Intelligence, Nov-19-2023

Fine-tuning pre-trained models has become a prevalent strategy for out-of-distribution (OOD) generalization tasks. Unlike most prior work, which has focused on advancing learning algorithms, we systematically examined how pre-trained model size, pre-training data scale, and training strategies impact downstream generalization and uncertainty calibration. In extensive experiments totaling over 100,000 GPU hours, we evaluated 97 models spanning diverse pre-trained model sizes, five pre-training datasets, and five data augmentations on four distribution shift datasets. Our results demonstrate the significant impact of pre-trained model selection, with optimal choices substantially improving OOD accuracy over algorithm improvement alone. We find that larger models and bigger pre-training data improve OOD performance and calibration, in contrast to some prior studies that found modern deep networks to calibrate worse than classical shallow models. Our work underscores the overlooked importance of pre-trained model selection for out-of-distribution generalization and calibration.
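The abstract above does not say how calibration is measured. One commonly used metric is expected calibration error (ECE), which bins predictions by confidence and averages the gap between accuracy and mean confidence in each bin. The following is a minimal sketch under that assumption; the function name, the equal-width binning, and the default of 10 bins are illustrative choices, not details taken from the paper.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Sketch of ECE with equal-width confidence bins (illustrative, not the paper's code).

    confidences: predicted probability of the chosen class, each in [0, 1]
    correct:     booleans, True where the prediction matched the label
    """
    n = len(confidences)
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        # A confidence of exactly 1.0 is placed in the last bin.
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, ok))

    ece = 0.0
    for members in bins:
        if not members:
            continue
        avg_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(1 for _, ok in members if ok) / len(members)
        # Weight each bin's confidence/accuracy gap by its share of samples.
        ece += (len(members) / n) * abs(accuracy - avg_conf)
    return ece
```

A model that reports 0.9 confidence but is right only half the time has a large gap in that bin, while a model whose confidence matches its accuracy scores close to zero.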

augreg, resnet-50, vit-base, (15 more...)

arXiv.org Artificial Intelligence

2307.08187

Country:

North America > Canada > Quebec > Montreal (0.04)
Europe > Netherlands > North Holland > Amsterdam (0.04)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.91)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Intriguing properties of generative classifiers

Jaini, Priyank, Clark, Kevin, Geirhos, Robert

arXiv.org Machine Learning, Sep-28-2023

What is the best paradigm to recognize objects -- discriminative inference (fast but potentially prone to shortcut learning) or using a generative model (slow but potentially more robust)? We build on recent advances in generative modeling that turn text-to-image models into classifiers. This allows us to study their behavior and to compare them against discriminative models and human psychophysical data. We report four intriguing emergent properties of generative classifiers: they show a record-breaking human-like shape bias (99% for Imagen), near human-level out-of-distribution accuracy, state-of-the-art alignment with human classification errors, and they understand certain perceptual illusions. Our results indicate that while the current dominant paradigm for modeling human object recognition is discriminative inference, zero-shot generative models approximate human object recognition data surprisingly well.

bit-m, machine learning, natural language, (19 more...)

arXiv.org Machine Learning

2309.16779

Country: Asia > Middle East > Jordan (0.04)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)

Utilizing Deep Learning for Detecting Hotel Scenes

#artificialintelligence, Apr-25-2022, 08:30:05 GMT

You must have heard this line often, but it's not 100% correct. We often judge something by the first thing we see, and that includes choosing a place to stay. Tiket.com wants to provide the best experience so that users can easily see what a hotel looks like. That means we should use a building or bedroom image as the main image instead of a bathroom image. In the examples above, the hotel building is set as the main image on a particular hotel's detail page. Typically, these images are placed at the top of the hotel detail page on tiket.com.

architecture, dataset, shard, (14 more...)

#artificialintelligence

Industry: Consumer Products & Services > Hotels (0.76)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.52)

Are You Ready for Vision Transformer (ViT)?

#artificialintelligence, Oct-31-2020, 06:45:06 GMT

It applies not only to creatures but also to technologies. Data science technologies have been surrounded by hype and biased success stories. That said, some technologies have genuinely driven the growth of data science, and the Convolutional Neural Network (CNN) is one of them. Since AlexNet in 2012, different CNN architectures have made tremendous contributions to real business operations and academic research. Residual Networks (ResNet), introduced by Microsoft Research in 2015, were a real breakthrough for building "deep" CNNs; however, an honorable retirement of this technology may be approaching.

artificial intelligence, machine learning, transformer, (18 more...)

#artificialintelligence

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.36)

Self-training with Noisy Student improves ImageNet classification

Xie, Qizhe, Hovy, Eduard, Luong, Minh-Thang, Le, Quoc V.

arXiv.org Machine Learning, Nov-11-2019

We present a simple self-training method that achieves 87.4% top-1 accuracy on ImageNet, which is 1.0% better than the state-of-the-art model that requires 3.5B weakly labeled Instagram images. On robustness test sets, it improves ImageNet-A top-1 accuracy from 16.6% to 74.2%, reduces ImageNet-C mean corruption error from 45.7 to 31.2, and reduces ImageNet-P mean flip rate from 27.8 to 16.1. To achieve this result, we first train an EfficientNet model on labeled ImageNet images and use it as a teacher to generate pseudo labels on 300M unlabeled images. We then train a larger EfficientNet as a student model on the combination of labeled and pseudo-labeled images. We iterate this process by putting the student back as the teacher. During the generation of the pseudo labels, the teacher is not noised, so that the pseudo labels are as good as possible. During the learning of the student, however, we inject noise such as data augmentation, dropout, and stochastic depth into the student, so that the noised student is forced to learn harder from the pseudo labels.
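The teacher/student loop described in the abstract above can be sketched on a toy problem. This is an illustrative sketch only, not the authors' implementation: a one-dimensional nearest-centroid classifier stands in for EfficientNet, Gaussian input jitter stands in for data augmentation, dropout, and stochastic depth, and the student has the same capacity as the teacher rather than being larger. All function names and parameters (`train_centroids`, `predict`, `noisy_student`, `noise_std`, `rounds`) are hypothetical.

```python
import random


def train_centroids(points, labels):
    """Fit a toy nearest-centroid classifier: one mean per class."""
    sums, counts = {}, {}
    for x, y in zip(points, labels):
        sums[y] = sums.get(y, 0.0) + x
        counts[y] = counts.get(y, 0) + 1
    return {y: sums[y] / counts[y] for y in sums}


def predict(centroids, x):
    """Return the class whose centroid is closest to x."""
    return min(centroids, key=lambda y: abs(x - centroids[y]))


def noisy_student(labeled_x, labeled_y, unlabeled_x,
                  rounds=3, noise_std=0.3, seed=0):
    rng = random.Random(seed)
    # Step 1: train the teacher on labeled data only.
    teacher = train_centroids(labeled_x, labeled_y)
    for _ in range(rounds):
        # Step 2: the teacher pseudo-labels the unlabeled data WITHOUT noise.
        pseudo_y = [predict(teacher, x) for x in unlabeled_x]
        # Step 3: the student trains on labeled + pseudo-labeled data,
        # with noise injected into its inputs (a stand-in for the noise
        # sources named in the abstract).
        all_x = list(labeled_x) + list(unlabeled_x)
        all_y = list(labeled_y) + pseudo_y
        noised_x = [x + rng.gauss(0.0, noise_std) for x in all_x]
        student = train_centroids(noised_x, all_y)
        # Step 4: the student becomes the teacher for the next round.
        teacher = student
    return teacher


# Toy usage: four labeled points and 400 unlabeled points from two clusters.
data_rng = random.Random(1)
labeled_x = [-2.0, -1.5, 1.5, 2.0]
labeled_y = [0, 0, 1, 1]
unlabeled_x = ([data_rng.gauss(-2.0, 0.5) for _ in range(200)]
               + [data_rng.gauss(2.0, 0.5) for _ in range(200)])
model = noisy_student(labeled_x, labeled_y, unlabeled_x)
```

The key asymmetry is visible in steps 2 and 3: pseudo labels come from clean inputs, while the student only ever sees noised inputs.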

accuracy, noisy student, unlabeled image, (12 more...)

arXiv.org Machine Learning

1911.04252

Country:

North America > United States > Wisconsin > Dane County > Madison (0.04)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
Asia > China > Guangxi Province > Nanning (0.04)

Genre:

Research Report > New Finding (0.46)
Research Report > Promising Solution (0.34)

Industry: Education (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
