Ishan Misra
An Experimental Comparison of Multi-view Self-supervised Methods for Music Tagging
Meseguer-Brocal, Gabriel, Desblancs, Dorian, Hennequin, Romain
Self-supervised learning has emerged as a powerful way to pre-train generalizable machine learning models on large amounts of unlabeled data. It is particularly compelling in the music domain, where labeled data is time-consuming to obtain, error-prone, and often ambiguous. During self-supervised pre-training, models are trained on pretext tasks, with the primary objective of acquiring robust and informative features that can later be fine-tuned for specific downstream tasks. The choice of pretext task is critical, as it guides the model to shape the feature space with meaningful constraints for information encoding. In the context of music, most works have relied on contrastive learning or masking techniques. In this study, we expand the scope of pretext tasks applied to music by investigating and comparing the performance of new self-supervised methods for music tagging. We open-source a simple ResNet model trained on a diverse catalog of millions of tracks. Our results demonstrate that, although most of these pre-training methods yield similar downstream results, contrastive learning consistently achieves better downstream performance than the other self-supervised methods. This holds true even in a limited-data downstream context.
- Media > Music (0.88)
- Leisure & Entertainment (0.88)
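The contrastive pretext task described in the abstract above can be made concrete with a small sketch. Below is a minimal InfoNCE-style loss in PyTorch, assuming two augmented "views" of each track embedded by a shared encoder; the function name, temperature value, and shapes are illustrative assumptions, not the paper's actual implementation.

```python
# Hedged sketch of an InfoNCE-style contrastive objective (illustrative,
# not the paper's released code).
import torch
import torch.nn.functional as F

def info_nce_loss(z_a: torch.Tensor, z_b: torch.Tensor, temperature: float = 0.1) -> torch.Tensor:
    """Contrastive loss over two embedding views of the same batch of tracks.

    z_a, z_b: (batch, dim) embeddings of two augmented views; row i of z_a and
    row i of z_b come from the same track (the positive pair), while every
    other row in the batch serves as a negative.
    """
    z_a = F.normalize(z_a, dim=-1)
    z_b = F.normalize(z_b, dim=-1)
    logits = z_a @ z_b.t() / temperature              # (batch, batch) cosine similarities
    targets = torch.arange(z_a.size(0), device=z_a.device)
    # Diagonal entries are the positives; symmetrize over both view directions.
    return 0.5 * (F.cross_entropy(logits, targets) +
                  F.cross_entropy(logits.t(), targets))
```

Each row of the similarity matrix is treated as a classification problem whose correct class is the matching view, which pulls positive pairs together and pushes the rest of the batch apart in the feature space.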
OmniMAE: Single Model Masked Pretraining on Images and Videos
Girdhar, Rohit, El-Nouby, Alaaeldin, Singh, Mannat, Alwala, Kalyan Vasudev, Joulin, Armand, Misra, Ishan
Transformer-based architectures have become competitive across a variety of visual domains, most notably images and videos. While prior work studies these modalities in isolation, having a common architecture suggests that one can train a single unified model for multiple visual modalities. Prior attempts at unified modeling typically use architectures tailored for vision tasks, or obtain worse performance compared to single modality models. In this work, we show that masked autoencoding can be used to train a simple Vision Transformer on images and videos, without requiring any labeled data. This single model learns visual representations that are comparable to or better than single-modality representations on both image and video benchmarks, while using a much simpler architecture. Furthermore, this model can be learned by dropping 90% of the image and 95% of the video patches, enabling extremely fast training of huge model architectures. In particular, we show that our single ViT-Huge model can be finetuned to achieve 86.6% on ImageNet and 75.5% on the challenging Something Something-v2 video benchmark, setting a new state-of-the-art.
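The high mask ratios quoted in the abstract (90% of image patches, 95% of video patches) are what make pretraining cheap: the encoder only ever processes the small visible subset of tokens. Below is a hedged sketch of such random patch masking in PyTorch; the helper name and shapes are assumptions for illustration, not OmniMAE's released code.

```python
# Hedged sketch of random patch masking for masked autoencoding
# (illustrative, not OmniMAE's implementation).
import torch

def random_masking(patches: torch.Tensor, mask_ratio: float = 0.9):
    """patches: (batch, num_patches, dim). Returns the visible patches and the
    indices needed to restore the original ordering for the decoder."""
    b, n, d = patches.shape
    num_keep = int(n * (1 - mask_ratio))
    noise = torch.rand(b, n, device=patches.device)   # random score per patch
    shuffle = noise.argsort(dim=1)                    # random permutation of patches
    restore = shuffle.argsort(dim=1)                  # inverse permutation
    keep_idx = shuffle[:, :num_keep]                  # indices of visible patches
    visible = torch.gather(patches, 1, keep_idx.unsqueeze(-1).expand(-1, -1, d))
    return visible, restore
```

For a 224x224 image split into 196 patches, a 0.9 ratio leaves only 19 tokens for the encoder, which is how huge architectures stay fast to train despite the reconstruction objective covering the full input.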
#206 - Ishan Misra: Self-Supervised Deep Learning in Computer Vision
Ishan Misra is a research scientist at FAIR working on self-supervised visual learning.