Unsupervised or Indirectly Supervised Learning
What Are The Most Exciting Areas Of Artificial Intelligence Research In 2018?
A few years ago, we could barely generate handwritten digits. Then the convolutional network came along, and images suddenly became far easier to work with. In recent years, the generative adversarial network has brought the most magic with it. We can now generate celebrity faces that look almost perfectly realistic, which in my opinion is amazing and something I had not even thought was possible. Something else that has caught my attention is CycleGAN, which is a way to translate images (for example, cows → horses or daytime → nighttime) without the need for paired training examples!
HybridNet: Classification and Reconstruction Cooperation for Semi-Supervised Learning
Robert, Thomas, Thome, Nicolas, Cord, Matthieu
In this paper, we introduce a new model for leveraging unlabeled data to improve the generalization performance of image classifiers: a two-branch encoder-decoder architecture called HybridNet. The first branch receives a supervision signal and is dedicated to the extraction of invariant class-related representations. The second branch is fully unsupervised and dedicated to modeling the information discarded by the first branch in order to reconstruct the input data. To further support the expected behavior of our model, we propose an original training objective. It favors stability in the discriminative branch and complementarity between the learned representations in the two branches. HybridNet is able to outperform state-of-the-art results on CIFAR-10, SVHN and STL-10 in various semi-supervised settings.
Deep Encoder-Decoder Models for Unsupervised Learning of Controllable Speech Synthesis
Henter, Gustav Eje, Wang, Xin, Yamagishi, Junichi
Generating versatile and appropriate synthetic speech requires control over the output expression separate from the spoken text. Important non-textual speech variation is seldom annotated, in which case output control must be learned in an unsupervised fashion. In this paper, we perform an in-depth study of methods for unsupervised learning of control in statistical speech synthesis. For example, we show that popular unsupervised training heuristics can be interpreted as variational inference in certain autoencoder models. We additionally connect these models to VQ-VAEs, another, recently-proposed class of deep variational autoencoders, which we show can be derived from a very similar mathematical argument. The implications of these new probabilistic interpretations are discussed. We illustrate the utility of the various approaches with an application to emotional speech synthesis, where the unsupervised methods for learning expression control (without access to emotional labels) are found to give results that in many aspects match or surpass the previous best supervised approach.
A Survey on Multi-Task Learning
Multi-Task Learning (MTL) is a learning paradigm in machine learning whose aim is to leverage useful information contained in multiple related tasks to help improve the generalization performance of all the tasks. In this paper, we give a survey of MTL. First, we classify different MTL algorithms into several categories, including the feature learning approach, low-rank approach, task clustering approach, task relation learning approach, and decomposition approach, and then discuss the characteristics of each approach. In order to improve the performance of learning tasks further, MTL can be combined with other learning paradigms including semi-supervised learning, active learning, unsupervised learning, reinforcement learning, multi-view learning and graphical models. When the number of tasks is large or the data dimensionality is high, batch MTL models have difficulty handling the situation, so online, parallel and distributed MTL models as well as dimensionality reduction and feature hashing are reviewed to reveal their computational and storage advantages. Many real-world applications use MTL to boost their performance and we review representative works. Finally, we present theoretical analyses and discuss several future directions for MTL.
Feature Selection For Unsupervised Learning
After reviewing popular techniques used in supervised, unsupervised and semi-supervised machine learning, we focus on feature selection methods in these different contexts, especially the metrics used to assess the value of a feature or set of features, whether the variables are binary, continuous or categorical. We go into greater detail and review modern feature selection techniques for unsupervised learning, which typically rely on entropy-like criteria. While these criteria are usually model-dependent or scale-dependent, we introduce a new model-free, data-driven methodology in this context, with an application to an interesting number theory problem (simulated data set) in which each feature has a known theoretical entropy. We also briefly discuss high precision computing as it is relevant to this peculiar data set, as well as units of information smaller than the bit.
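As a rough illustration of what an entropy-like criterion looks like, the sketch below computes the empirical Shannon entropy of discrete features and ranks them by it. This is the textbook plug-in estimate, not the model-free methodology the article introduces; the function names `shannon_entropy` and `rank_features_by_entropy` and the toy columns are hypothetical.

```python
import math
from collections import Counter


def shannon_entropy(values, base=2.0):
    """Empirical (plug-in) Shannon entropy of a discrete feature.

    The unit is set by `base`: base 2 gives bits.
    """
    n = len(values)
    if n == 0:
        raise ValueError("values must be non-empty")
    counts = Counter(values)
    # H = -sum_k p_k * log(p_k), with p_k estimated by relative frequency.
    return -sum((c / n) * math.log(c / n, base) for c in counts.values())


def rank_features_by_entropy(columns):
    """Return feature names sorted from highest to lowest empirical entropy.

    `columns` maps a feature name to its list of observed values.
    Ranking by raw entropy is an illustrative assumption, not a
    recommendation from the article.
    """
    return sorted(columns, key=lambda name: shannon_entropy(columns[name]),
                  reverse=True)


if __name__ == "__main__":
    toy = {
        "const": [1, 1, 1, 1],   # never varies
        "coin": [0, 1, 0, 1],    # two equally likely values
        "four": [0, 1, 2, 3],    # four equally likely values
    }
    for name in rank_features_by_entropy(toy):
        print(name, shannon_entropy(toy[name]))
```

Note that this estimate depends on how continuous variables are binned, which is one form of the scale dependence the abstract mentions.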
Unsupervised learning demystified - Hacker Noon
Unsupervised learning may sound like a fancy way to say "let the kids learn on their own not to touch the hot oven" but it's actually a pattern-finding technique for mining inspiration from your data. It has nothing to do with machines running around without adult supervision, forming their own opinions about things. This post is beginner-friendly, but assumes you're familiar with the story so far: Check out the six instances above. These photographs are not accompanied by labels. No worries, your brain is pretty good at unsupervised learning.
Manifold regularization with GANs for semi-supervised learning
Lecouat, Bruno, Foo, Chuan-Sheng, Zenati, Houssam, Chandrasekhar, Vijay
Generative Adversarial Networks are powerful generative models that are able to model the manifold of natural images. We leverage this property to perform manifold regularization by approximating a variant of the Laplacian norm using a Monte Carlo approximation that is easily computed with the GAN. When incorporated into the semi-supervised feature-matching GAN we achieve state-of-the-art results for GAN-based semi-supervised learning on CIFAR-10 and SVHN benchmarks, with a method that is significantly easier to implement than competing methods. We also find that manifold regularization improves the quality of generated images, and is affected by the quality of the GAN used to approximate the regularizer.
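The idea of estimating a smoothness penalty by sampling from a generator can be sketched as follows. This is a minimal toy under stated assumptions, not the paper's exact regularizer: it uses a finite-difference form, averaging the squared change in a classifier's output when a latent code is nudged by a small step `eps` in a random unit direction. The names `manifold_penalty`, `toy_generator` and the three toy classifiers are hypothetical stand-ins for real networks.

```python
import math
import random


def manifold_penalty(f, g, latent_dim, n_samples=1000, eps=1e-2, seed=0):
    """Monte Carlo estimate of E_z ||f(g(z)) - f(g(z + eps * r))||^2.

    f: classifier, maps a data point (list of floats) to a list of outputs.
    g: generator, maps a latent code (list of floats) to a data point.
    r: a random unit-norm direction in latent space (an assumed choice).
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        z = [rng.gauss(0.0, 1.0) for _ in range(latent_dim)]
        d = [rng.gauss(0.0, 1.0) for _ in range(latent_dim)]
        norm = math.sqrt(sum(x * x for x in d)) or 1.0
        z_shift = [zi + eps * di / norm for zi, di in zip(z, d)]
        out_a = f(g(z))
        out_b = f(g(z_shift))
        total += sum((a - b) ** 2 for a, b in zip(out_a, out_b))
    return total / n_samples


def toy_generator(z):
    """Toy 'generator': maps a 1-D latent code onto the unit circle."""
    return [math.cos(z[0]), math.sin(z[0])]


def constant_clf(x):
    return [0.5]            # ignores its input, so it is perfectly smooth


def smooth_clf(x):
    return [x[0]]           # varies gently along the circle


def steep_clf(x):
    return [10.0 * x[0]]    # varies ten times faster along the circle


if __name__ == "__main__":
    for clf in (constant_clf, smooth_clf, steep_clf):
        print(clf.__name__, manifold_penalty(clf, toy_generator, latent_dim=1))
```

In this toy, a classifier that changes faster along the generated manifold receives a larger penalty, which is the behavior a manifold regularizer is meant to discourage. The estimate is only as good as the generator that defines the manifold, consistent with the abstract's remark about GAN quality.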
Seq2Seq2Sentiment: Multimodal Sequence to Sequence Models for Sentiment Analysis
Pham, Hai, Manzini, Thomas, Liang, Paul Pu, Poczos, Barnabas
Multimodal machine learning is a core research area spanning the language, visual and acoustic modalities. The central challenge in multimodal learning involves learning representations that can process and relate information from multiple modalities. In this paper, we propose two methods for unsupervised learning of joint multimodal representations using sequence to sequence (Seq2Seq) methods: a Seq2Seq Modality Translation Model and a Hierarchical Seq2Seq Modality Translation Model. We also explore multiple different variations on the multimodal inputs and outputs of these seq2seq models. Our experiments on multimodal sentiment analysis using the CMU-MOSI dataset indicate that our methods learn informative multimodal representations that outperform the baselines and achieve improved performance on multimodal sentiment analysis, specifically in the Bimodal case where our model is able to improve F1 Score by twelve points. We also discuss future directions for multimodal Seq2Seq methods.
Augmented Cyclic Adversarial Learning for Domain Adaptation
Hosseini-Asl, Ehsan, Zhou, Yingbo, Xiong, Caiming, Socher, Richard
Training a model to perform a task typically requires a large amount of data from the domains in which the task will be applied. However, it is often the case that data are abundant in some domains but scarce in others. Domain adaptation deals with the challenge of adapting a model trained from a data-rich source domain to perform well in a data-poor target domain. In general, this requires learning plausible mappings between domains. CycleGAN is a powerful framework that efficiently learns to map inputs from one domain to another using adversarial training and a cycle-consistency constraint. However, the conventional approach of enforcing cycle-consistency via reconstruction may be overly restrictive in cases where one or more domains have limited training data. In this paper, we propose an augmented cyclic adversarial learning model that enforces the cycle-consistency constraint through an external task specific model, which encourages the preservation of task-relevant content as opposed to exact reconstruction. This task specific model both relaxes the cycle-consistency constraint and complements the role of the discriminator during training, serving as an augmented information source for learning the mapping. In the experiment, we adopt a speech recognition model from each domain as the task specific model. Our approach improves absolute performance of speech recognition by 2% for female speakers in the TIMIT dataset, where the majority of training samples are from male voices. We also explore digit classification with MNIST and SVHN in a low-resource setting and show that our approach improves absolute performance by 14% and 4% when adapting SVHN to MNIST and vice versa, respectively. Our approach also outperforms unsupervised domain adaptation methods, which require a high-resource unlabeled target domain.
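The difference between reconstruction-based and task-based cycle-consistency can be illustrated with a small toy. The sketch below is not the paper's model: the mappings are plain scalar functions rather than networks, the task loss is a 0/1 error rate rather than a trained recognizer's loss, and every name (`reconstruction_cycle_loss`, `task_cycle_loss`, `g_xy`, `f_yx`, `sign_classifier`) is hypothetical.

```python
def reconstruction_cycle_loss(xs, g_xy, f_yx):
    """Mean absolute error between x and its round trip F(G(x))."""
    return sum(abs(f_yx(g_xy(x)) - x) for x in xs) / len(xs)


def task_cycle_loss(xs, labels, g_xy, f_yx, task_model):
    """Fraction of round-tripped samples the task model gets wrong.

    The round trip only has to preserve what the task model needs,
    not the input itself. The 0/1 error is an assumed stand-in for
    a real task loss.
    """
    wrong = sum(1 for x, y in zip(xs, labels)
                if task_model(f_yx(g_xy(x))) != y)
    return wrong / len(xs)


def g_xy(x):
    """Toy mapping from domain X to domain Y."""
    return 2.0 * x


def f_yx(y):
    """Toy mapping back to X; deliberately not the exact inverse of g_xy."""
    return 0.25 * y


def sign_classifier(x):
    """Toy task model on domain X: predicts whether x is positive."""
    return 1 if x > 0 else 0


if __name__ == "__main__":
    xs = [-2.0, -1.0, 1.0, 2.0]
    labels = [sign_classifier(x) for x in xs]
    print(reconstruction_cycle_loss(xs, g_xy, f_yx))                  # 0.75
    print(task_cycle_loss(xs, labels, g_xy, f_yx, sign_classifier))   # 0.0
```

Here the round trip halves every input, so exact reconstruction is penalized, yet the sign (the task-relevant content) survives and the task-based loss is zero. That is the sense in which a task-based constraint is looser than exact reconstruction.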
GLoMo: Unsupervisedly Learned Relational Graphs as Transferable Representations
Yang, Zhilin, Zhao, Jake, Dhingra, Bhuwan, He, Kaiming, Cohen, William W., Salakhutdinov, Ruslan, LeCun, Yann
Modern deep transfer learning approaches have mainly focused on learning generic feature vectors from one task that are transferable to other tasks, such as word embeddings in language and pretrained convolutional features in vision. However, these approaches usually transfer unary features and largely ignore more structured graphical representations. This work explores the possibility of learning generic latent relational graphs that capture dependencies between pairs of data units (e.g., words or pixels) from large-scale unlabeled data and transferring the graphs to downstream tasks. Our proposed transfer learning framework improves performance on various tasks including question answering, natural language inference, sentiment analysis, and image classification. We also show that the learned graphs are generic enough to be transferred to different embeddings on which the graphs have not been trained (including GloVe embeddings, ELMo embeddings, and task-specific RNN hidden units), or embedding-free units such as image pixels.