AITopics | Goyal, Priya

Plotting

Goyal, Priya

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

CLIP the Bias: How Useful is Balancing Data in Multimodal Learning?

Alabdulmohsin, Ibrahim, Wang, Xiao, Steiner, Andreas, Goyal, Priya, D'Amour, Alexander, Zhai, Xiaohua

arXiv.org Artificial IntelligenceMar-7-2024

We study the effectiveness of data-balancing for mitigating biases in contrastive language-image pretraining (CLIP), identifying areas of strength and limitation. First, we reaffirm prior conclusions that CLIP models can inadvertently absorb societal stereotypes. To counter this, we present a novel algorithm, called Multi-Modal Moment Matching (M4), designed to reduce both representation and association biases (i.e. in first- and second-order statistics) in multimodal data. We use M4 to conduct an in-depth analysis taking into account various factors, such as the model, representation, and data size. Our study also explores the dynamic nature of how CLIP learns and unlearns biases. In particular, we find that fine-tuning is effective in countering representation biases, though its impact diminishes for association biases. Also, data balancing has a mixed impact on quality: it tends to improve classification but can hurt retrieval. Interestingly, data and architectural improvements seem to mitigate the negative impact of data balancing on performance; e.g. applying M4 to SigLIP-B/16 with data quality filters improves COCO image-to-text retrieval @5 from 86% (without data balancing) to 87% and ImageNet 0-shot classification from 77% to 77.5%! Finally, we conclude with recommendations for improving the efficacy of data balancing in multimodal systems.

data quality, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2403.04547

Country:

Europe > Switzerland > Zürich > Zürich (0.14)
North America > United States > California (0.14)

Genre:

Research Report > New Finding (0.67)
Research Report > Experimental Study (0.45)

Industry:

Leisure & Entertainment > Sports (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (0.45)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Data Science > Data Quality (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
(5 more...)

Add feedback

Vision Models Are More Robust And Fair When Pretrained On Uncurated Images Without Supervision

Goyal, Priya, Duval, Quentin, Seessel, Isaac, Caron, Mathilde, Singh, Mannat, Misra, Ishan, Sagun, Levent, Joulin, Armand, Bojanowski, Piotr

arXiv.org Artificial IntelligenceFeb-16-2022

Discriminative self-supervised learning allows training models on any random group of internet images, and possibly recover salient information that helps differentiate between the images. Applied to ImageNet, this leads to object centric features that perform on par with supervised features on most object-centric downstream tasks. In this work, we question if using this ability, we can learn any salient and more representative information present in diverse unbounded set of images from across the globe. To do so, we train models on billions of random images without any data pre-processing or prior assumptions about what we want the model to learn. We scale our model size to dense 10 billion parameters to avoid underfitting on a large data size. We extensively study and validate our model performance on over 50 benchmarks including fairness, robustness to distribution shift, geographical diversity, fine grained recognition, image copy detection and many image classification datasets. The resulting model, not only captures well semantic information, it also captures information about artistic style and learns salient information such as geolocations and multilingual word embeddings based on visual content only. More importantly, we discover that such model is more robust, more fair, less harmful and less biased than supervised models or models trained on object centric datasets such as ImageNet.

artificial intelligence, machine learning, uncurated image, (3 more...)

arXiv.org Artificial Intelligence

2202.0836

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Vision (0.69)
Information Technology > Artificial Intelligence > Machine Learning (0.53)

Add feedback

Self-supervised Pretraining of Visual Features in the Wild

Goyal, Priya, Caron, Mathilde, Lefaudeux, Benjamin, Xu, Min, Wang, Pengchao, Pai, Vivek, Singh, Mannat, Liptchinsky, Vitaliy, Misra, Ishan, Joulin, Armand, Bojanowski, Piotr

arXiv.org Artificial IntelligenceMar-5-2021

Recently, self-supervised learning methods like MoCo, SimCLR, BYOL and SwAV have reduced the gap with supervised methods. These results have been achieved in a control environment, that is the highly curated ImageNet dataset. However, the premise of self-supervised learning is that it can learn from any random image and from any unbounded dataset. In this work, we explore if self-supervision lives to its expectation by training large models on random, uncurated images with no supervision. Our final SElf-supERvised (SEER) model, a RegNetY with 1.3B parameters trained on 1B random images with 512 GPUs achieves 84.2% top-1 accuracy, surpassing the best self-supervised pretrained model by 1% and confirming that self-supervised learning works in a real world setting. Interestingly, we also observe that self-supervised models are good few-shot learners achieving 77.9% top-1 with access to only 10% of ImageNet. Code: https://github.com/facebookresearch/vissl

architecture, deep learning, neural network, (19 more...)

arXiv.org Artificial Intelligence

2103.01988

Country: Europe > France (0.14)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
Information Technology > Artificial Intelligence > Vision > Image Understanding (0.65)

Add feedback

Scaling and Benchmarking Self-Supervised Visual Representation Learning

Goyal, Priya, Mahajan, Dhruv, Gupta, Abhinav, Misra, Ishan

arXiv.org Artificial IntelligenceJun-6-2019

Self-supervised learning aims to learn representations from the data itself without explicit manual supervision. Existing efforts ignore a crucial aspect of self-supervised learning - the ability to scale to large amount of data because self-supervision requires no manual labels. In this work, we revisit this principle and scale two popular self-supervised approaches to 100 million images. We show that by scaling on various axes (including data size and problem 'hardness'), one can largely match or even exceed the performance of supervised pre-training on a variety of tasks such as object detection, surface normal estimation (3D) and visual navigation using reinforcement learning. Scaling these methods also provides many interesting insights into the limitations of current self-supervised techniques and evaluations. We conclude that current self-supervised methods are not 'hard' enough to take full advantage of large scale data and do not seem to learn effective high level semantic representations. We also introduce an extensive benchmark across 9 different datasets and tasks. We believe that such a benchmark along with comparable evaluation settings is necessary to make meaningful progress. Code is at: https://github.com/facebookresearch/fair_self_supervision_benchmark.

dataset, deep learning, neural network, (18 more...)

arXiv.org Artificial Intelligence

1905.01235

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

Add feedback