Unsupervised or Indirectly Supervised Learning
Deep Metric Transfer for Label Propagation with Limited Annotated Data
Liu, Bin, Wu, Zhirong, Hu, Han, Lin, Stephen
We study object recognition under the constraint that each object class is only represented by very few observations. In such cases, naive supervised learning would lead to severe over-fitting in deep neural networks due to limited training data. We tackle this problem by creating much more training data through label propagation from the few labeled examples to a vast collection of unannotated images. Our main insight is that such a label propagation scheme can be highly effective when the similarity metric used for propagation is learned and transferred from other related domains with lots of data. We test our approach on semi-supervised learning, transfer learning and few-shot recognition, where we learn our similarity metric using various supervised/unsupervised pretraining methods, and transfer it to unlabeled data across different data distributions. By taking advantage of unlabeled data in this way, we achieve significant improvements on all three tasks. Notably, our approach outperforms current state-of-the-art techniques by an absolute $20\%$ for semi-supervised learning on CIFAR10, $10\%$ for transfer learning from ImageNet to CIFAR10, and $6\%$ for few-shot recognition on mini-ImageNet, when labeled examples are limited.
Semi-supervised mp-MRI Data Synthesis with StitchLayer and Auxiliary Distance Maximization
Wang, Zhiwei, Lin, Yi, Cheng, Kwang-Ting, Yang, Xin
In this paper, we address the problem of synthesizing multi-parameter magnetic resonance imaging (mp-MRI) data, i.e. Apparent Diffusion Coefficients (ADC) and T2-weighted (T2w), containing clinically significant (CS) prostate cancer (PCa) via semi-supervised adversarial learning. Specifically, our synthesizer generates mp-MRI data in a sequential manner: first generating ADC maps from 128-d latent vectors, followed by translating them to the T2w images. The synthesizer is trained in a semisupervised manner. In the supervised training process, a limited amount of paired ADC-T2w images and the corresponding ADC encodings are provided and the synthesizer learns the paired relationship by explicitly minimizing the reconstruction losses between synthetic and real images. To avoid overfitting limited ADC encodings, an unlimited amount of random latent vectors and unpaired ADC-T2w Images are utilized in the unsupervised training process for learning the marginal image distributions of real images. To improve the robustness of synthesizing, we decompose the difficult task of generating full-size images into several simpler tasks which generate sub-images only. A StitchLayer is then employed to fuse sub-images together in an interlaced manner into a full-size image. To enforce the synthetic images to indeed contain distinguishable CS PCa lesions, we propose to also maximize an auxiliary distance of Jensen-Shannon divergence (JSD) between CS and nonCS images. Experimental results show that our method can effectively synthesize a large variety of mpMRI images which contain meaningful CS PCa lesions, display a good visual quality and have the correct paired relationship. Compared to the state-of-the-art synthesis methods, our method achieves a significant improvement in terms of both visual and quantitative evaluation metrics.
Bayesian CycleGAN via Marginalizing Latent Sampling
You, Haoran, Cheng, Yu, Cheng, Tianheng, Li, Chunliang, Zhou, Pan
Recent techniques built on Generative Adversarial Networks (GANs) like CycleGAN are able to learn mappings between domains from unpaired datasets through min-max optimization games between generators and discriminators. However, it remains challenging to stabilize training process and diversify generated results. To address these problems, we present a Bayesian extension of cyclic model and an integrated cyclic framework for inter-domain mappings. The proposed method stimulated by Bayesian GAN explores the full posteriors of Bayesian cyclic model (with latent sampling) and optimizes the model with maximum a posteriori (MAP) estimation. Hence, we name it {\tt Bayesian CycleGAN}. We perform the proposed Bayesian CycleGAN on multiple benchmark datasets, including Cityscapes, Maps, and Monet2photo. The quantitative and qualitative evaluations demonstrate the proposed method can achieve more stable training, superior performance and diversified images generating.
A Style-Based Generator Architecture for Generative Adversarial Networks
Authors: Tero Karras (NVIDIA) Samuli Laine (NVIDIA) Timo Aila (NVIDIA) Abstract: We propose an alternative generator architecture for generative adversarial networks, borrowing from style transfer literature. The new architecture leads to an automatically learned, unsupervised separation of high-level attributes (e.g., pose and identity when trained on human faces) and stochastic variation in the generated images (e.g., freckles, hair), and it enables intuitive, scale-specific control of the synthesis. The new generator improves the state-of-the-art in terms of traditional distribution quality metrics, leads to demonstrably better interpolation properties, and also better disentangles the latent factors of variation. To quantify interpolation quality and disentanglement, we propose two new, automated methods that are applicable to any generator architecture. Finally, we introduce a new, highly varied and high-quality dataset of human faces.
Contrastive Training for Models of Information Cascades
This paper proposes a model of information cascades as directed spanning trees (DSTs) over observed documents. In addition, we propose a contrastive training procedure that exploits partial temporal ordering of node infections in lieu of labeled training links. This combination of model and unsupervised training makes it possible to improve on models that use infection times alone and to exploit arbitrary features of the nodes and of the text content of messages in information cascades. With only basic node and time lag features similar to previous models, the DST model achieves performance with unsupervised training comparable to strong baselines on a blog network inference task. Unsupervised training with additional content features achieves significantly better results, reaching half the accuracy of a fully supervised model.
Adversarial Transfer Learning
Wilson, Garrett, Cook, Diane J.
There is a recent large and growing interest in generative adversarial networks (GANs), which offer powerful features for generative modeling, density estimation, and energy function learning. GANs are difficult to train and evaluate but are capable of creating amazingly realistic, though synthetic, image data. Ideas stemming from GANs such as adversarial losses are creating research opportunities for other challenges such as domain adaptation. In this paper, we look at the field of GANs with emphasis on these areas of emerging research. To provide background for adversarial techniques, we survey the field of GANs, looking at the original formulation, training variants, evaluation methods, and extensions. Then we survey recent work on transfer learning, focusing on comparing different adversarial domain adaptation methods. Finally, we take a look forward to identify open research directions for GANs and domain adaptation, including some promising applications such as sensor-based human behavior modeling.
Ladder Networks for Semi-Supervised Hyperspectral Image Classification
We used the Ladder Network [Rasmus et al. (2015)] to perform Hyperspectral Image Classification in a semi-supervised setting. The Ladder Network distinguishes itself from other semi-supervised methods by jointly optimizing a supervised and unsupervised cost. In many settings this has proven to be more successful than other semi-supervised techniques, such as pretraining using unlabeled data. We furthermore show that the convolutional Ladder Network outperforms most of the current techniques used in hyperspectral image classification and achieves new state-of-the-art performance on the Pavia University dataset given only 5 labeled data points per class.
Semi-supervised Rare Disease Detection Using Generative Adversarial Network
Li, Wenyuan, Wang, Yunlong, Cai, Yong, Arnold, Corey, Zhao, Emily, Yuan, Yilian
Rare diseases affect a relatively small number of people, which limits investment in research for treatments and cures. Developing an efficient method for rare disease detection is a crucial first step towards subsequent clinical research. In this paper, we present a semi-supervised learning framework for rare disease detection using generative adversarial networks. Our method takes advantage of the large amount of unlabeled data for disease detection and achieves the best results in terms of precision-recall score compared to baseline techniques.
Challenging Common Assumptions in the Unsupervised Learning of Disentangled Representations
Locatello, Francesco, Bauer, Stefan, Lucic, Mario, Gelly, Sylvain, Schölkopf, Bernhard, Bachem, Olivier
In recent years, the interest in unsupervised learning of disentangled representations has significantly increased. The key assumption is that real-world data is generated by a few explanatory factors of variation and that these factors can be recovered by unsupervised learning algorithms. A large number of unsupervised learning approaches based on auto-encoding and quantitative evaluation metrics of disentanglement have been proposed; yet, the efficacy of the proposed approaches and utility of proposed notions of disentanglement has not been challenged in prior work. In this paper, we provide a sober look on recent progress in the field and challenge some common assumptions. We first theoretically show that the unsupervised learning of disentangled representations is fundamentally impossible without inductive biases on both the models and the data. Then, we train more than 12000 models covering the six most prominent methods, and evaluate them across six disentanglement metrics in a reproducible large-scale experimental study on seven different data sets. On the positive side, we observe that different methods successfully enforce properties "encouraged" by the corresponding losses. On the negative side, we observe in our study that well-disentangled models seemingly cannot be identified without access to ground-truth labels even if we are allowed to transfer hyperparameters across data sets. Furthermore, increased disentanglement does not seem to lead to a decreased sample complexity of learning for downstream tasks. These results suggest that future work on disentanglement learning should be explicit about the role of inductive biases and (implicit) supervision, investigate concrete benefits of enforcing disentanglement of the learned representations, and consider a reproducible experimental setup covering several data sets.
Disentangled Variational Auto-Encoder for Semi-supervised Learning
Li, Yang, Pan, Quan, Wang, Suhang, Peng, Haiyun, Yang, Tao, Cambria, Erik
Semi-supervised learning is attracting increasing attention due to the fact that datasets of many domains lack enough labeled data. Variational Auto-Encoder (VAE), in particular, has demonstrated the benefits of semi-supervised learning. The majority of existing semi-supervised VAEs utilize a classifier to exploit label information, where the parameters of the classifier are introduced to the VAE. Given the limited labeled data, learning the parameters for the classifiers may not be an optimal solution for exploiting label information. Therefore, in this paper, we develop a novel approach for semi-supervised VAE without classifier. Specifically, we propose a new model called Semi-supervised Disentangled VAE (SDVAE), which encodes the input data into disentangled representation and non-interpretable representation, then the category information is directly utilized to regularize the disentangled representation via the equality constraint. To further enhance the feature learning ability of the proposed VAE, we incorporate reinforcement learning to relieve the lack of data. The dynamic framework is capable of dealing with both image and text data with its corresponding encoder and decoder networks. Extensive experiments on image and text datasets demonstrate the effectiveness of the proposed framework.