Unsupervised or Indirectly Supervised Learning
Semi-Supervised Learning with Self-Supervised Networks
Recent advances in semi-supervised learning have shown tremendous potential in overcoming a major barrier to the success of modern machine learning algorithms: access to vast amounts of human-labeled training data. Algorithms based on self-ensemble learning and virtual adversarial training can harness the abundance of unlabeled data to produce impressive state-of-the-art results on a number of semi-supervised benchmarks, approaching the performance of strong supervised baselines using only a fraction of the available labeled data. However, these methods often require careful tuning of many hyper-parameters and are usually not easy to implement in practice. In this work, we present a conceptually simple yet effective semi-supervised algorithm based on self-supervised learning to combine semantic feature representations from unlabeled data. Our models are efficiently trained end-to-end for the joint, multi-task learning of labeled and unlabeled data in a single stage. Striving for simplicity and practicality, our approach requires no additional hyper-parameters to tune for optimal performance beyond the standard set for training convolutional neural networks. We conduct a comprehensive empirical evaluation of our models for semi-supervised image classification on SVHN, CIFAR-10 and CIFAR-100, and demonstrate results competitive with, and in some cases exceeding, prior state of the art. Reference code and data are available at https://github.com/vuptran/sesemi.
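The joint, single-stage objective described above can be sketched as a sum of two cross-entropy terms, one on labeled images and one on a self-supervised proxy task built from unlabeled images. This is a minimal sketch, not the authors' implementation: the abstract does not name the proxy task, so rotation prediction is an assumption here, as are the equal loss weighting and the hypothetical helper names `rot90`, `proxy_batch`, and `joint_loss`.

```python
import math

def rot90(img, k):
    """Rotate an image (list of rows) counter-clockwise by k * 90 degrees."""
    for _ in range(k % 4):
        img = [list(row) for row in zip(*img)][::-1]
    return img

def proxy_batch(images):
    """Assumed proxy task: pair each unlabeled image with its 4 rotations;
    the rotation index k serves as a free label."""
    return [(rot90(img, k), k) for img in images for k in range(4)]

def cross_entropy(logits, target):
    """Softmax cross-entropy for one example, computed stably."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(v - m) for v in logits))
    return log_z - logits[target]

def joint_loss(sup_logits, sup_labels, proxy_logits, proxy_labels):
    """Supervised loss plus proxy loss with equal weight (an assumption),
    so no extra loss-weight hyper-parameter is introduced."""
    sup = sum(cross_entropy(l, y) for l, y in zip(sup_logits, sup_labels))
    proxy = sum(cross_entropy(l, y) for l, y in zip(proxy_logits, proxy_labels))
    return sup / len(sup_labels) + proxy / len(proxy_labels)
```

In a real model both sets of logits would come from one shared convolutional backbone with two output heads; the sketch only shows how the two losses are formed and combined.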
Adversarial Computation of Optimal Transport Maps
Leygonie, Jacob, She, Jennifer, Almahairi, Amjad, Rajeswar, Sai, Courville, Aaron
Computing optimal transport maps between high-dimensional and continuous distributions is a challenging problem in optimal transport (OT). Generative adversarial networks (GANs) are powerful generative models which have been successfully applied to learn maps across high-dimensional domains. However, little is known about the nature of the map learned with a GAN objective. To address this problem, we propose a generative adversarial model in which the discriminator's objective is the $2$-Wasserstein metric. We show that during training, our generator follows the $W_2$-geodesic between the initial and the target distributions. As a consequence, it reproduces an optimal map at the end of training. We validate our approach empirically in both low-dimensional and high-dimensional continuous settings, and show that it outperforms prior methods on image data.
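To make the notions of an optimal map and a $W_2$-geodesic concrete, here is a small worked case that has a closed form: one-dimensional Gaussians under quadratic cost. It is an illustration under those assumptions, not the adversarial method of the paper, and the function names are hypothetical. In one dimension the optimal map is monotone, so for Gaussians it is affine, and points moving in straight lines from $x$ to $T(x)$ trace the geodesic between the two distributions.

```python
def gaussian_ot_map(m1, s1, m2, s2):
    """Optimal W2 map from N(m1, s1^2) to N(m2, s2^2) in 1D (s1, s2 > 0)."""
    def T(x):
        return m2 + (s2 / s1) * (x - m1)
    return T

def w2_squared_gaussian(m1, s1, m2, s2):
    """Closed-form squared 2-Wasserstein distance between 1D Gaussians."""
    return (m1 - m2) ** 2 + (s1 - s2) ** 2

def geodesic_point(x, T, t):
    """Displacement interpolation: position at time t in [0, 1] of a
    particle starting at x and ending at T(x)."""
    return (1 - t) * x + t * T(x)

def empirical_w2_squared(xs, ys):
    """Squared W2 between two equal-size 1D samples: in 1D the optimal
    coupling matches sorted samples in order."""
    xs, ys = sorted(xs), sorted(ys)
    return sum((x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)
```

For example, with source $N(0, 1)$ and target $N(3, 4)$, `gaussian_ot_map(0, 1, 3, 2)` sends `x` to `3 + 2 * x`, and the squared distance is $3^2 + 1^2 = 10$.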
Evaluating Protein Transfer Learning with TAPE
Rao, Roshan, Bhattacharya, Nicholas, Thomas, Neil, Duan, Yan, Chen, Xi, Canny, John, Abbeel, Pieter, Song, Yun S.
Protein modeling is an increasingly popular area of machine learning research. Semi-supervised learning has emerged as an important paradigm in protein modeling due to the high cost of acquiring supervised protein labels, but the current literature is fragmented when it comes to datasets and standardized evaluation techniques. To facilitate progress in this field, we introduce the Tasks Assessing Protein Embeddings (TAPE), a set of five biologically relevant semi-supervised learning tasks spread across different domains of protein biology. We curate tasks into specific training, validation, and test splits to ensure that each task tests biologically relevant generalization that transfers to real-life scenarios. We benchmark a range of approaches to semi-supervised protein representation learning, which span recent work as well as canonical sequence learning techniques. We find that self-supervised pretraining is helpful for almost all models on all tasks, more than doubling performance in some cases. Despite this increase, in several cases features learned by self-supervised pretraining still lag behind features extracted by state-of-the-art non-neural techniques. This gap in performance suggests a huge opportunity for innovative architecture design and improved modeling paradigms that better capture the signal in biological sequences. TAPE will help the machine learning community focus effort on scientifically relevant problems.
Artificial Intelligence Explained: What Are Generative Adversarial Networks (GANs)?
Jun 12, 2019
The field of artificial intelligence (AI) is fast-moving, and new breakthroughs are regularly made. One of the more recent terms rising to prominence is Generative Adversarial Network (GAN), but what does it mean? The principle behind the GAN was first proposed in 2014, and at its most basic level, it describes a system that pits two AI systems (neural networks) against each other to improve the quality of their results. To understand how they work, imagine a blind forger trying to create copies of paintings by great masters.
Are Labels Required for Improving Adversarial Robustness?
Uesato, Jonathan, Alayrac, Jean-Baptiste, Huang, Po-Sen, Stanforth, Robert, Fawzi, Alhussein, Kohli, Pushmeet
Recent work has uncovered the interesting (and somewhat surprising) finding that training models to be invariant to adversarial perturbations requires substantially larger datasets than those required for standard classification. This result is a key hurdle in the deployment of robust machine learning models in many real world applications where labeled data is expensive. Our main insight is that unlabeled data can be a competitive alternative to labeled data for training adversarially robust models. Theoretically, we show that in a simple statistical setting, the sample complexity for learning an adversarially robust model from unlabeled data matches the fully supervised case up to constant factors. On standard datasets like CIFAR-10, a simple Unsupervised Adversarial Training (UAT) approach using unlabeled data improves robust accuracy by 21.7% over using 4K supervised examples alone, and captures over 95% of the improvement from the same number of labeled examples. Finally, we report an improvement of 4% over the previous state-of-the-art on CIFAR-10 against the strongest known attack by using additional unlabeled data from the uncurated 80 Million Tiny Images dataset. This demonstrates that our finding extends as well to the more realistic case where unlabeled data is also uncurated, therefore opening a new avenue for improving adversarial training.
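One way to picture how unlabeled data can enter adversarial training is the toy sketch below. It rests on several assumptions that the abstract does not state: a linear classifier, an L-infinity attack of radius `eps`, a logistic loss, and pseudo-labels produced by a fixed base classifier (`teacher`). For a linear model the worst-case attack has a closed form, since it lowers the margin by `eps` times the L1 norm of the weights, so no inner optimization is needed. All names are hypothetical.

```python
import math

def robust_margin(w, b, x, y, eps):
    """Worst-case margin y * (w.x + b) over L-infinity perturbations of radius eps."""
    clean = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
    return clean - eps * sum(abs(wi) for wi in w)

def logistic_loss(margin):
    """Numerically stable log(1 + exp(-margin))."""
    if margin > 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))

def pseudo_label(w, b, x):
    """Label in {-1, +1} predicted by a fixed base classifier."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b >= 0 else -1

def robust_loss_with_unlabeled(w, b, labeled, unlabeled, teacher, eps):
    """Mean robust loss over labeled pairs (x, y) and pseudo-labeled inputs.
    teacher is a (weights, bias) tuple for the assumed base classifier."""
    t_w, t_b = teacher
    pairs = list(labeled) + [(x, pseudo_label(t_w, t_b, x)) for x in unlabeled]
    losses = [logistic_loss(robust_margin(w, b, x, y, eps)) for x, y in pairs]
    return sum(losses) / len(losses)
```

Setting `eps` to zero recovers ordinary pseudo-label training, which makes the role of the robustness term easy to isolate.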
Consistency of semi-supervised learning algorithms on graphs: Probit and one-hot methods
Hoffmann, Franca, Hosseini, Bamdad, Ren, Zhi, Stuart, Andrew M.
Graph-based semi-supervised learning is the problem of propagating labels from a small number of labelled data points to a larger set of unlabelled data. This paper is concerned with the consistency of optimization-based techniques for such problems, in the limit where the labels have small noise and the underlying unlabelled data is well clustered. We study graph-based probit for binary classification, and a natural generalization of this method to multi-class classification using one-hot encoding. The resulting objective function to be optimized comprises the sum of a quadratic form defined through a rational function of the graph Laplacian, involving only the unlabelled data, and a fidelity term involving only the labelled data.
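The structure of such an objective, a Laplacian quadratic form plus a fidelity term on the labelled points, can be illustrated with a simplified variant. The sketch below is an assumption-laden stand-in, not the probit or one-hot method of the paper: it uses the plain graph Laplacian $L$ rather than a rational function of it, and a squared-error fidelity term, minimizing $\frac{1}{2} u^\top L u + \frac{1}{2\gamma^2} \sum_{j \in \text{labelled}} (u_j - y_j)^2$ by Gauss-Seidel sweeps. The function names and the example graph are hypothetical.

```python
def propagate_labels(weights, labels, gamma=0.1, sweeps=500):
    """Minimize the regularized objective by coordinate-wise updates.

    weights: symmetric matrix of non-negative edge weights (list of lists).
    labels:  dict mapping labelled node index to +1.0 or -1.0.
    Assumes every node has at least one neighbour and every connected
    component contains a labelled node.
    """
    n = len(weights)
    u = [0.0] * n
    for _ in range(sweeps):
        for i in range(n):
            num = sum(weights[i][j] * u[j] for j in range(n) if j != i)
            den = sum(weights[i][j] for j in range(n) if j != i)
            if i in labels:
                # Fidelity term pulls u[i] towards its observed label.
                num += labels[i] / gamma ** 2
                den += 1.0 / gamma ** 2
            u[i] = num / den
    return u

def two_cluster_graph(weak=0.05):
    """Two triangles {0,1,2} and {3,4,5} joined by one weak edge 2-3."""
    w = [[0.0] * 6 for _ in range(6)]
    for a, b in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5)]:
        w[a][b] = w[b][a] = 1.0
    w[2][3] = w[3][2] = weak
    return w
```

With one label per cluster, for instance `propagate_labels(two_cluster_graph(), {0: 1.0, 5: -1.0})`, the sign of each entry of the result assigns the unlabelled nodes to the class of the labelled node in their own cluster, which is the well-clustered, small-noise regime the abstract refers to.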
What you need is a more professional teacher
Lin, Liwei, Wang, Xiangdong, Liu, Hong, Qian, Yueliang
We propose a simple and efficient method to combine semi-supervised learning with weakly-supervised learning for deep neural networks. Designing deep neural networks for weakly-supervised learning always involves a tradeoff between capturing fine-level information and achieving coarse-level classification accuracy. When using unlabeled data for semi-supervised learning, instead of seeking this tradeoff, we design two very different models for different targets. One pursues finer information for the final target. The other specializes in coarse-level classification accuracy, so it serves as a more professional teacher that teaches the former model using unlabeled data. We present an end-to-end semi-supervised learning process for these two models, termed guided learning, to improve training efficiency. Our approach improves the $1^{st}$ place result on Task4 of the DCASE2018 challenge from $32.4\%$ to $38.3\%$, achieving state-of-the-art performance.
Stacked Capsule Autoencoders
Kosiorek, Adam R., Sabour, Sara, Teh, Yee Whye, Hinton, Geoffrey E.
An object can be seen as a geometrically organized set of interrelated parts. A system that makes explicit use of these geometric relationships to recognize objects should be naturally robust to changes in viewpoint, because the intrinsic geometric relationships are viewpoint-invariant. We describe an unsupervised version of capsule networks, in which a neural encoder, which looks at all of the parts, is used to infer the presence and poses of object capsules. The encoder is trained by backpropagating through a decoder, which predicts the pose of each already discovered part using a mixture of pose predictions. The parts are discovered directly from an image, in a similar manner, by using a neural encoder, which infers parts and their affine transformations. The corresponding decoder models each image pixel as a mixture of predictions made by affine-transformed parts. We learn object- and their part-capsules on unlabeled data, and then cluster the vectors of presences of object capsules. When told the names of these clusters, we achieve state-of-the-art results for unsupervised classification on SVHN (55%) and near state-of-the-art on MNIST (98.5%).
Gaussian Mixture Models Explained
In the world of machine learning, we can distinguish two main areas: supervised and unsupervised learning. The main difference between the two lies in the nature of the data as well as the approaches used to deal with it. Clustering is an unsupervised learning problem where we intend to find clusters of points in our dataset that share some common characteristics. Let's suppose we have a dataset that looks like this: Our job is to find sets of points that appear close together. Please note that we are now introducing some additional notation.
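Finding sets of points that appear close together with a Gaussian mixture can be sketched with the expectation-maximization (EM) algorithm. The example below is a minimal one-dimensional version using only the standard library; the initialization scheme, the variance floor, and the names `normal_pdf` and `fit_gmm_1d` are assumptions made for illustration, and it expects well-behaved data rather than handling every edge case.

```python
import math

def normal_pdf(x, mean, var):
    """Density of a 1D Gaussian with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def fit_gmm_1d(xs, k=2, iters=50, var_floor=1e-6):
    """Fit a k-component 1D Gaussian mixture with EM.
    Returns (weights, means, variances)."""
    lo, hi = min(xs), max(xs)
    # Spread the initial means across the data range (an assumed heuristic).
    means = [lo + (hi - lo) * (j + 0.5) / k for j in range(k)]
    centre = sum(xs) / len(xs)
    variances = [sum((x - centre) ** 2 for x in xs) / len(xs)] * k
    weights = [1.0 / k] * k
    for _ in range(iters):
        # E-step: responsibility of each component for each point.
        resp = []
        for x in xs:
            p = [w * normal_pdf(x, m, v)
                 for w, m, v in zip(weights, means, variances)]
            total = sum(p)
            resp.append([pj / total for pj in p])
        # M-step: re-estimate parameters from the soft assignments.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            weights[j] = nj / len(xs)
            means[j] = sum(r[j] * x for r, x in zip(resp, xs)) / nj
            spread = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, xs)) / nj
            variances[j] = max(var_floor, spread)
    return weights, means, variances
```

Calling `fit_gmm_1d([-2.2, -2.0, -1.8, 2.8, 3.0, 3.2])` should recover one component centred near -2 and another near 3, each with roughly half of the mixture weight. Unlike hard clustering, each point keeps a soft responsibility for every component.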