

GitHub - jason718/awesome-self-supervised-learning: A curated list of awesome self-supervised methods


Self-Supervised Learning has become an exciting direction in the AI community. Predicting What You Already Know Helps: Provable Self-Supervised Learning. For self-supervised learning, Rationality implies generalization, provably. Can Pretext-Based Self-Supervised Learning Be Boosted by Downstream Data? FAIR Self-Supervision Benchmark [pdf] [repo]: various benchmark (and legacy) tasks for evaluating the quality of visual representations learned by various self-supervision approaches.

Self-Supervised Learning and Its Applications


In the past decade, research and development in AI have skyrocketed, especially after the results of the ImageNet competition in 2012. The focus was largely on supervised learning methods that require huge amounts of labeled data to train systems for specific use cases. In this article, we will explore Self-Supervised Learning (SSL), a hot research topic in the machine learning community. Self-supervised learning is an evolving machine learning technique poised to solve the challenges posed by the over-dependence on labeled data. For many years, building intelligent systems using machine learning methods has been largely dependent on good-quality labeled data. Consequently, the cost of high-quality annotated data is a major bottleneck in the overall training process.

Self-supervised learning tutorial: Implementing SimCLR with PyTorch Lightning


In this hands-on tutorial, we will provide you with a reimplementation of the SimCLR self-supervised learning method for pretraining robust feature extractors. This method is fairly general and can be applied to any vision dataset, as well as to different downstream tasks. In a previous tutorial, I wrote some background on the self-supervised learning arena. Time to get into your first project by running SimCLR on STL10, a small dataset with 100K unlabelled images. Code is available on GitHub.
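As a rough illustration of the objective SimCLR optimises, here is a minimal sketch of the NT-Xent (normalized temperature-scaled cross-entropy) contrastive loss in plain Python. It is not the tutorial's PyTorch Lightning code: the function names `cosine` and `nt_xent_loss`, the toy embeddings, and the temperature value are all assumptions made for this example.

```python
import math


def cosine(u, v):
    """Cosine similarity between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def nt_xent_loss(z_a, z_b, temperature=0.5):
    """NT-Xent loss; z_a[i] and z_b[i] embed two augmented views of image i."""
    z = z_a + z_b                      # all 2N views in one list
    n = len(z_a)
    total = 0.0
    for i in range(2 * n):
        pos = (i + n) % (2 * n)        # index of the other view of the same image
        # Similarities to every other view (the positive plus 2N - 2 negatives).
        logits = [cosine(z[i], z[k]) / temperature
                  for k in range(2 * n) if k != i]
        pos_logit = cosine(z[i], z[pos]) / temperature
        # -log( exp(pos) / sum_k exp(logit_k) )
        total += math.log(sum(math.exp(l) for l in logits)) - pos_logit
    return total / (2 * n)


# Toy 2-D embeddings (made up): views of the same image point the same way.
views_a = [[1.0, 0.0], [0.0, 1.0]]
views_b = [[1.0, 0.1], [0.1, 1.0]]
print(nt_xent_loss(views_a, views_b))
```

The loss is low when the two views of each image are more similar to each other than to the views of other images, which is what pushes the encoder towards augmentation-invariant features. A real implementation would compute this in batched tensor operations rather than Python loops.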

Meet ZEROGEN: An Extreme Method for Dataset Generation via PLMs for Zero-Shot Learning


The impressive generative capacity of large-scale pretrained language models (PLMs) has inspired machine learning researchers to explore methods for generating model training examples via PLMs and data augmentation procedures, i.e. dataset generation. A novel contribution in this research direction is proposed in the new paper ZeroGen: Efficient Zero-shot Learning via Dataset Generation, from researchers at the University of Hong Kong, Shanghai AI Lab, Huawei Noah's Ark Lab and the University of Washington. The team describes their proposed ZEROGEN as an "extreme instance" of dataset generation via PLMs for zero-shot learning. ZEROGEN is a framework for prompt-based zero-shot learning (PROMPTING). Unlike existing approaches that rely on gigantic PLMs during inference, ZEROGEN introduces a more flexible and efficient approach for conducting zero-shot learning with PLMs.

Enhancing Cross-lingual Prompting with Mask Token Augmentation

Prompting shows promising results in few-shot scenarios. However, its strength for multilingual/cross-lingual problems has not been fully exploited. Zhao and Schütze (2021) made initial explorations in this direction by showing that cross-lingual prompting outperforms cross-lingual finetuning. In this paper, we conduct an empirical analysis of the effect of each component in cross-lingual prompting and derive Universal Prompting across languages, which helps alleviate the discrepancies between source-language training and target-language inference. Based on this, we propose a mask token augmentation framework to further improve the performance of prompt-based cross-lingual transfer. Notably, for XNLI, our method achieves 46.54% with only 16 English training examples per class, significantly better than the 34.99% of finetuning.

What Makes Good Contrastive Learning on Small-Scale Wearable-based Tasks?

Self-supervised learning establishes a new paradigm of learning representations with far fewer or even no label annotations. Recently there has been remarkable progress on large-scale contrastive learning models, which require substantial computing resources, yet such models are not practically optimal for small-scale tasks. To fill the gap, we aim to study contrastive learning on the wearable-based activity recognition task. Specifically, we conduct an in-depth study of contrastive learning from both algorithmic-level and task-level perspectives. For algorithmic-level analysis, we decompose contrastive models into several key components and conduct rigorous experimental evaluations to better understand the efficacy and rationale behind contrastive learning. More importantly, for task-level analysis, we show that wearable-based signals bring unique challenges and opportunities to existing contrastive models, which cannot be readily solved by existing algorithms. Our thorough empirical studies suggest important practices and shed light on future research challenges. In the meantime, this paper presents an open-source PyTorch library, CL-HAR, which can serve as a practical tool for researchers. The library is highly modularized and easy to use, which opens up avenues for exploring novel contrastive models quickly in the future.


AAAI Conferences

We describe a novel weakly supervised deep learning framework that combines both discriminative and generative models to learn meaningful representations in the multiple instance learning (MIL) setting. MIL is a weakly supervised learning problem where labels are associated with groups of instances (referred to as bags) instead of individual instances. To address the essential challenge in MIL problems arising from the uncertainty of positive instance labels, we use a discriminative model regularized by variational autoencoders (VAEs) to maximize the differences between latent representations of all instances and negative instances. As a result, the hidden layer of the variational autoencoder learns a meaningful representation. This representation can effectively be used for MIL problems, as illustrated by better performance on the standard benchmark datasets compared to the state-of-the-art approaches. More importantly, unlike most related studies, the proposed framework can be easily scaled to large dataset problems, as illustrated by the audio event detection and segmentation task. Visualization also confirms the effectiveness of the latent representation in discriminating positive and negative classes.
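To make the bag-level labeling concrete, the sketch below assumes the standard MIL convention, which the abstract does not spell out: a bag is positive if at least one of its instances is positive. It does not reproduce the VAE-regularized model described above. The helper names `bag_label` and `bag_score` are hypothetical.

```python
def bag_label(instance_labels):
    """Bag is positive (1) if any instance is positive (standard MIL assumption)."""
    return int(any(instance_labels))


def bag_score(instance_scores):
    """Max-pooling: the bag score is that of its most positive-looking instance."""
    return max(instance_scores)


# Instance labels are hidden during training; only the bag label is observed.
hidden_instance_labels = [0, 0, 1]
print(bag_label(hidden_instance_labels))   # prints 1

# Per-instance scores from some (hypothetical) instance classifier.
print(bag_score([0.1, 0.9, 0.3]))          # prints 0.9
```

Because only the bag label is observed, a positive bag does not reveal which of its instances are positive. This is the label uncertainty the abstract refers to.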

Measuring and Reducing Model Update Regression in Structured Prediction for NLP

Recent advances in deep learning have led to the rapid adoption of machine-learning-based NLP models in a wide range of applications. Despite the continuous gain in accuracy, backward compatibility is also an important aspect for industrial applications, yet it has received little research attention. Backward compatibility requires that the new model does not regress on cases that were correctly handled by its predecessor. This work studies model update regression in structured prediction tasks. We choose syntactic dependency parsing and conversational semantic parsing as representative examples of structured prediction tasks in NLP. First, we measure and analyze model update regression in different model update settings. Next, we explore and benchmark existing techniques for reducing model update regression, including model ensemble and knowledge distillation. We further propose a simple and effective method, Backward-Congruent Re-ranking (BCR), which takes into account the characteristics of structured output. Experiments show that BCR can better mitigate model update regression than model ensemble and knowledge distillation approaches.
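The notion of regression described above can be sketched as a simple count: the share of test cases the old model got right and the new model gets wrong. The function name `regression_rate` and the toy predictions are assumptions for illustration. The paper's exact metric definition is not given in this abstract, and structured outputs are compared here by whole-output exact match.

```python
def regression_rate(gold, old_pred, new_pred):
    """Fraction of examples where the old model was right and the new one is wrong."""
    if not (len(gold) == len(old_pred) == len(new_pred)):
        raise ValueError("gold, old_pred and new_pred must have the same length")
    if not gold:
        return 0.0
    flips = sum(1 for g, o, n in zip(gold, old_pred, new_pred)
                if o == g and n != g)
    return flips / len(gold)


def accuracy(gold, pred):
    """Exact-match accuracy over whole outputs."""
    return sum(1 for g, p in zip(gold, pred) if g == p) / len(gold)


# Toy outputs (made up); each string stands for one whole structured prediction.
gold = ["A", "B", "C", "D"]
old = ["A", "B", "X", "D"]
new = ["A", "X", "C", "D"]

print(accuracy(gold, old), accuracy(gold, new))  # prints 0.75 0.75
print(regression_rate(gold, old, new))           # prints 0.25
```

In this toy case both models have the same accuracy, yet the update still breaks a quarter of the test cases. This is why regression has to be measured separately from accuracy.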

Supervised, Semi-Supervised, Unsupervised, and Self-Supervised Learning


The rapid growth of research and publications has introduced many terms and concepts in machine learning, yet many have degenerated into mere buzzwords whose differences few people fully understand. The most common type, and perhaps the one we usually mean when talking about machine learning, is supervised learning. In simple words, supervised learning provides a set of input-output pairs from which we can learn an intermediate system that maps inputs to correct outputs. A simple example of supervised learning is determining the class (e.g., dog or cat) of an image based on a dataset of images and their corresponding classes, which we will refer to as their labels. Given the input-label pairs, the currently popular approach is to directly train a deep neural network (e.g., a convolutional neural network) to output a label prediction from the given image, compute a differentiable loss between the prediction and the correct answer, and backpropagate through the network to update the weights and optimise the predictions.
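The predict, compute loss, update weights cycle described above can be sketched without a deep network. The example below swaps the convolutional network for a one-feature logistic regression trained by gradient descent on a made-up dataset. The function names, learning rate, and epoch count are all assumptions chosen for illustration.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def train_logistic(inputs, labels, lr=0.5, epochs=200):
    """Fit p(y=1|x) = sigmoid(w*x + b) by gradient descent on cross-entropy."""
    w, b = 0.0, 0.0
    n = len(inputs)
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in zip(inputs, labels):
            p = sigmoid(w * x + b)      # forward pass: prediction
            # Gradient of the cross-entropy loss w.r.t. the logit is (p - y).
            grad_w += (p - y) * x
            grad_b += (p - y)
        w -= lr * grad_w / n            # update the weights against the gradient
        b -= lr * grad_b / n
    return w, b


def predict(w, b, x):
    return 1 if sigmoid(w * x + b) >= 0.5 else 0


# Toy labeled dataset (made up): negative inputs are class 0, positive are class 1.
inputs = [-2.0, -1.0, 1.0, 2.0]
labels = [0, 0, 1, 1]
w, b = train_logistic(inputs, labels)
print([predict(w, b, x) for x in inputs])  # prints [0, 0, 1, 1]
```

A deep network follows the same loop. The difference is that the gradient for every layer is obtained by backpropagation instead of the closed-form expression used here.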

ColloSSL: Collaborative Self-Supervised Learning for Human Activity Recognition

A major bottleneck in training robust Human-Activity Recognition (HAR) models is the need for large-scale labeled sensor datasets. Because labeling large amounts of sensor data is an expensive task, unsupervised and semi-supervised learning techniques have emerged that can learn good features from the data without requiring any labels. In this paper, we extend this line of research and present a novel technique called Collaborative Self-Supervised Learning (ColloSSL), which leverages unlabeled data collected from multiple devices worn by a user to learn high-quality features of the data. A key insight that underpins the design of ColloSSL is that unlabeled sensor datasets simultaneously captured by multiple devices can be viewed as natural transformations of each other, and leveraged to generate a supervisory signal for representation learning. We present three technical innovations to extend conventional self-supervised learning algorithms to a multi-device setting: a Device Selection approach which selects positive and negative devices to enable contrastive learning, a Contrastive Sampling algorithm which samples positive and negative examples in a multi-device setting, and a loss function called Multi-view Contrastive Loss which extends standard contrastive loss to a multi-device setting. Our experimental results on three multi-device datasets show that ColloSSL outperforms both fully-supervised and semi-supervised learning techniques in the majority of the experiment settings, resulting in an absolute increase of up to 7.9% in F1 score compared to the best performing baselines. We also show that ColloSSL outperforms the fully-supervised methods in a low-data regime, by just using one-tenth of the available labeled data in the best case.