Unsupervised or Indirectly Supervised Learning
USB: A Unified Semi-supervised Learning Benchmark for Classification
Wang, Yidong, Chen, Hao, Fan, Yue, Sun, Wang, Tao, Ran, Hou, Wenxin, Wang, Renjie, Yang, Linyi, Zhou, Zhi, Guo, Lan-Zhe, Qi, Heli, Wu, Zhen, Li, Yu-Feng, Nakamura, Satoshi, Ye, Wei, Savvides, Marios, Raj, Bhiksha, Shinozaki, Takahiro, Schiele, Bernt, Wang, Jindong, Xie, Xing, Zhang, Yue
Semi-supervised learning (SSL) improves model generalization by leveraging massive unlabeled data to augment limited labeled samples. However, currently, popular SSL evaluation protocols are often constrained to computer vision (CV) tasks. In addition, previous work typically trains deep neural networks from scratch, which is time-consuming and environmentally unfriendly. To address the above issues, we construct a Unified SSL Benchmark (USB) for classification by selecting 15 diverse, challenging, and comprehensive tasks from CV, natural language processing (NLP), and audio processing (Audio), on which we systematically evaluate the dominant SSL methods, and also open-source a modular and extensible codebase for fair evaluation of these SSL methods. We further provide the pre-trained versions of the state-of-the-art neural models for CV tasks to make the cost affordable for further tuning. USB enables the evaluation of a single SSL algorithm on more tasks from multiple domains but with less cost. Specifically, on a single NVIDIA V100, only 39 GPU days are required to evaluate FixMatch on 15 tasks in USB while 335 GPU days (279 GPU days on 4 CV datasets except for ImageNet) are needed on 5 CV tasks with TorchSSL.
Exploiting Mixed Unlabeled Data for Detecting Samples of Seen and Unseen Out-of-Distribution Classes
Out-of-Distribution (OOD) detection is essential in real-world applications, which has attracted increasing attention in recent years. However, most existing OOD detection methods require many labeled In-Distribution (ID) data, causing a heavy labeling cost. In this paper, we focus on the more realistic scenario, where limited labeled data and abundant unlabeled data are available, and these unlabeled data are mixed with ID and OOD samples. We propose the Adaptive In-Out-aware Learning (AIOL) method, in which we employ the appropriate temperature to adaptively select potential ID and OOD samples from the mixed unlabeled data and consider the entropy over them for OOD detection. Moreover, since the test data in realistic applications may contain OOD samples whose classes are not in the mixed unlabeled data (we call them unseen OOD classes), data augmentation techniques are brought into the method to further improve the performance. The experiments are conducted on various benchmark datasets, which demonstrate the superiority of our method.
Understanding Cross-Domain Few-Shot Learning Based on Domain Similarity and Few-Shot Difficulty
Oh, Jaehoon, Kim, Sungnyun, Ho, Namgyu, Kim, Jin-Hwa, Song, Hwanjun, Yun, Se-Young
Cross-domain few-shot learning (CD-FSL) has drawn increasing attention for handling large differences between the source and target domains--an important concern in real-world scenarios. To overcome these large differences, recent works have considered exploiting small-scale unlabeled data from the target domain during the pre-training stage. This data enables self-supervised pre-training on the target domain, in addition to supervised pre-training on the source domain. In this paper, we empirically investigate which pre-training is preferred based on domain similarity and few-shot difficulty of the target domain. We discover that the performance gain of self-supervised pre-training over supervised pre-training becomes large when the target domain is dissimilar to the source domain, or the target domain itself has low few-shot difficulty. We further design two pre-training schemes, mixed-supervised and two-stage learning, that improve performance. In this light, we present six findings for CD-FSL, which are supported by extensive experiments and analyses on three source and eight target benchmark datasets with varying levels of domain similarity and few-shot difficulty. Our code is available at https://github.com/sungnyun/understanding-cdfsl.
Unsupervised Learning Algorithms: Celebi, M. Emre, Aydin, Kemal: 9783319242095: Amazon.com: Books
The contributors discuss how with the proliferation of massive amounts of unlabeled data, unsupervised learning algorithms, which can automatically discover interesting and useful patterns in such data, have gained popularity among researchers and practitioners. The authors outline how these algorithms have found numerous applications including pattern recognition, market basket analysis, web mining, social network analysis, information retrieval, recommender systems, market research, intrusion detection, and fraud detection. They present how the difficulty of developing theoretically sound approaches that are amenable to objective evaluation have resulted in the proposal of numerous unsupervised learning algorithms over the past half-century. The intended audience includes researchers and practitioners who are increasingly using unsupervised learning algorithms to analyze their data. Topics of interest include anomaly detection, clustering, feature extraction, and applications of unsupervised learning.
Semi-supervised Learning with Deterministic Labeling and Large Margin Projection
Xu, Ji, Ren, Gang, Xiao, Yao, Li, Shaobo, Wang, Guoyin
The centrality and diversity of the labeled data are very influential to the performance of semi-supervised learning (SSL), but most SSL models select the labeled data randomly. This study first construct a leading forest that forms a partially ordered topological space in an unsupervised way, and select a group of most representative samples to label with one shot (differs from active learning essentially) using property of homeomorphism. Then a kernelized large margin metric is efficiently learned for the selected data to classify the remaining unlabeled sample. Optimal leading forest (OLF) has been observed to have the advantage of revealing the difference evolution along a path within a subtree. Therefore, we formulate an optimization problem based on OLF to select the samples. Also with OLF, the multiple local metrics learning is facilitated to address multi-modal and mix-modal problem in SSL, especially when the number of class is large. Attribute to this novel design, stableness and accuracy of the performance is significantly improved when compared with the state-of-the-art graph SSL methods. The extensive experimental studies have shown that the proposed method achieved encouraging accuracy and efficiency. Code has been made available at https://github.com/alanxuji/DeLaLA.
On the Importance of Calibration in Semi-supervised Learning
Loh, Charlotte, Dangovski, Rumen, Sudalairaj, Shivchander, Han, Seungwook, Han, Ligong, Karlinsky, Leonid, Soljacic, Marin, Srivastava, Akash
State-of-the-art (SOTA) semi-supervised learning (SSL) methods have been highly successful in leveraging a mix of labeled and unlabeled data by combining techniques of consistency regularization and pseudo-labeling. During pseudo-labeling, the model's predictions on unlabeled data are used for training and thus, model calibration is important in mitigating confirmation bias. Yet, many SOTA methods are optimized for model performance, with little focus directed to improve model calibration. In this work, we empirically demonstrate that model calibration is strongly correlated with model performance and propose to improve calibration via approximate Bayesian techniques. We introduce a family of new SSL models that optimizes for calibration and demonstrate their effectiveness across standard vision benchmarks of CIFAR-10, CIFAR-100 and ImageNet, giving up to 15.9% improvement in test accuracy. Furthermore, we also demonstrate their effectiveness in additional realistic and challenging problems, such as class-imbalanced datasets and in photonics science.
Grow and Merge: A Unified Framework for Continuous Categories Discovery
Zhang, Xinwei, Jiang, Jianwen, Feng, Yutong, Wu, Zhi-Fan, Zhao, Xibin, Wan, Hai, Tang, Mingqian, Jin, Rong, Gao, Yue
Although a number of studies are devoted to novel category discovery, most of them assume a static setting where both labeled and unlabeled data are given at once for finding new categories. In this work, we focus on the application scenarios where unlabeled data are continuously fed into the category discovery system. We refer to it as the {\bf Continuous Category Discovery} ({\bf CCD}) problem, which is significantly more challenging than the static setting. A common challenge faced by novel category discovery is that different sets of features are needed for classification and category discovery: class discriminative features are preferred for classification, while rich and diverse features are more suitable for new category mining. This challenge becomes more severe for dynamic setting as the system is asked to deliver good performance for known classes over time, and at the same time continuously discover new classes from unlabeled data. To address this challenge, we develop a framework of {\bf Grow and Merge} ({\bf GM}) that works by alternating between a growing phase and a merging phase: in the growing phase, it increases the diversity of features through a continuous self-supervised learning for effective category mining, and in the merging phase, it merges the grown model with a static one to ensure satisfying performance for known classes. Our extensive studies verify that the proposed GM framework is significantly more effective than the state-of-the-art approaches for continuous category discovery.
Exemplar Learning for Medical Image Segmentation
Medical image annotation typically requires expert knowledge and hence incurs time-consuming and expensive data annotation costs. To alleviate this burden, we propose a novel learning scenario, Exemplar Learning (EL), to explore automated learning processes for medical image segmentation with a single annotated image example. This innovative learning task is particularly suitable for medical image segmentation, where all categories of organs can be presented in one single image and annotated all at once. To address this challenging EL task, we propose an Exemplar Learning-based Synthesis Net (ELSNet) framework for medical image segmentation that enables innovative exemplar-based data synthesis, pixel-prototype based contrastive embedding learning, and pseudo-label based exploitation of the unlabeled data. Specifically, ELSNet introduces two new modules for image segmentation: an exemplar-guided synthesis module, which enriches and diversifies the training set by synthesizing annotated samples from the given exemplar, and a pixel-prototype based contrastive embedding module, which enhances the discriminative capacity of the base segmentation model via contrastive representation learning. Moreover, we deploy a two-stage process for segmentation model training, which exploits the unlabeled data with predicted pseudo segmentation labels. To evaluate this new learning framework, we conduct extensive experiments on several organ segmentation datasets and present an in-depth analysis. The empirical results show that the proposed exemplar learning framework produces effective segmentation results.
Supervised and Unsupervised Learning of Audio Representations for Music Understanding
McCallum, Matthew C., Korzeniowski, Filip, Oramas, Sergio, Gouyon, Fabien, Ehmann, Andreas F.
In this work, we provide a broad comparative analysis of strategies for pre-training audio understanding models for several tasks in the music domain, including labelling of genre, era, origin, mood, instrumentation, key, pitch, vocal characteristics, tempo and sonority. Specifically, we explore how the domain of pre-training datasets (music or generic audio) and the pre-training methodology (supervised or unsupervised) affects the adequacy of the resulting audio embeddings for downstream tasks. We show that models trained via supervised learning on large-scale expert-annotated music datasets achieve state-of-the-art performance in a wide range of music labelling tasks, each with novel content and vocabularies. This can be done in an efficient manner with models containing less than 100 million parameters that require no fine-tuning or reparameterization for downstream tasks, making this approach practical for industry-scale audio catalogs. Within the class of unsupervised learning strategies, we show that the domain of the training dataset can significantly impact the performance of representations learned by the model. We find that restricting the domain of the pre-training dataset to music allows for training with smaller batch sizes while achieving state-of-the-art in unsupervised learning -- and in some cases, supervised learning -- for music understanding. We also corroborate that, while achieving state-of-the-art performance on many tasks, supervised learning can cause models to specialize to the supervised information provided, somewhat compromising a model's generality.
Unsupervised Few-shot Learning via Deep Laplacian Eigenmaps
Few-shot learning (Fei-Fei et al., 2006) aims to learn a new classification or regression model on a novel task that is not seen during training, given only a few examples in the novel task. Existing few-shot learning methods either rely on episodic meta-learning (Finn et al., 2017, Snell et al., 2017) or standard pretraining (Chen et al., 2019, Tian et al., 2020b) in a supervised manner to extract transferrable knowledge to a new few-shot task. Unfortunately, these methods require many labeled meta-training samples. Acquiring a lot of labeled data is costly or even impossible in practice. Recently, several unsupervised meta-learning approaches have attempted to address this problem by constructing synthetic tasks on unlabeled meta-training data (Hsu et al., 2019, Khodadadeh et al., 2019, 2021) or meta-training on self-supervised pretrained features (Lee et al., 2021a). However, the performance of unsupervised meta-learning approaches is still far from their supervised counterparts. Empirical studies in supervised pretraining show that representation learning via grouping similar samples together (Chen et al., 2019, Dhillon et al., 2020, Laenen and Bertinetto, 2021, Tian et al., 2020b) outperforms a wide range of episodic meta-learning methods, where the definition of similar samples is given by class labels. The motivation of this study is to develop an unsupervised representation learning method by grouping unlabeled meta-training data without episodic training and close the performance gap between supervised and unsupervised few-shot learning. Contrastive self-supervised learning has shown remarkable success in learning representation from unlabeled data, which is competitive with supervised learning on multiple visual tasks (Hénaff et al., 2020, Tian et al., 2020a).