Unsupervised or Indirectly Supervised Learning
Semi-Supervised Learning with the Deep Rendering Mixture Model
Nguyen, Tan, Liu, Wanjia, Perez, Ethan, Baraniuk, Richard G., Patel, Ankit B.
Semi-supervised learning algorithms reduce the high cost of acquiring labeled training data by using both labeled and unlabeled data during learning. Deep Convolutional Networks (DCNs) have achieved great success in supervised tasks and as such have been widely employed in semi-supervised learning. In this paper we leverage the recently developed Deep Rendering Mixture Model (DRMM), a probabilistic generative model that models latent nuisance variation and whose inference algorithm yields DCNs. We develop an EM algorithm for the DRMM to learn from both labeled and unlabeled data. Guided by the theory of the DRMM, we introduce a novel non-negativity constraint and a variational inference term. We report state-of-the-art performance on MNIST and SVHN and competitive results on CIFAR10. We also probe deeper into how a DRMM trained in a semi-supervised setting represents latent nuisance variation using synthetically rendered images. Taken together, our work provides a unified framework for supervised, unsupervised, and semi-supervised learning.
Max-Margin Deep Generative Models for (Semi-)Supervised Learning
Li, Chongxuan, Zhu, Jun, Zhang, Bo
Deep generative models (DGMs) are effective at learning multilayered representations of complex data and at performing inference on input data by exploiting their generative ability. However, relatively little has been done to strengthen the discriminative ability of DGMs for making accurate predictions. This paper presents max-margin deep generative models (mmDGMs) and a class-conditional variant (mmDCGMs), which explore the strongly discriminative principle of max-margin learning to improve the predictive performance of DGMs in both supervised and semi-supervised learning, while retaining the generative capability. In semi-supervised learning, we use the predictions of a max-margin classifier as the missing labels instead of performing full posterior inference, for efficiency; we also introduce additional max-margin and label-balance regularization terms on unlabeled data, for effectiveness. We develop an efficient doubly stochastic subgradient algorithm for the piecewise linear objectives in different settings. Empirical results on various datasets demonstrate that: (1) max-margin learning can significantly improve the prediction performance of DGMs while retaining the generative ability; (2) in supervised learning, mmDGMs are competitive with the best fully discriminative networks when employing convolutional neural networks as the generative and recognition models; and (3) in semi-supervised learning, mmDCGMs can perform efficient inference and achieve state-of-the-art classification results on several benchmarks.
Spatial contrasting for deep unsupervised learning
Hoffer, Elad, Hubara, Itay, Ailon, Nir
Convolutional networks have marked their place over the last few years as the best performing model for various visual tasks. They are, however, most suited for supervised learning from large amounts of labeled data. Previous attempts have been made to use unlabeled data to improve model performance by applying unsupervised techniques. These attempts require different architectures and training methods. In this work we present a novel approach for unsupervised training of convolutional networks that is based on contrasting between spatial regions within images. This criterion can be employed within conventional neural networks and trained using standard techniques such as SGD and back-propagation, thus complementing supervised methods.
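As a rough illustration of what contrasting spatial regions could look like, the sketch below scores three patch feature vectors: an anchor patch, a second patch from the same image, and a patch from a different image. The loss is small when the two same-image patches are closer to each other than the anchor is to the other-image patch. The function names, the Euclidean distance, and the softmax over negated distances are assumptions made for illustration; the abstract does not specify the exact form of the criterion.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def spatial_contrast_loss(anchor, same_image, other_image):
    """Contrastive loss over patch features (illustrative form, assumed here).

    Computes -log( exp(-d_pos) / (exp(-d_pos) + exp(-d_neg)) ), which equals
    log(1 + exp(d_pos - d_neg)), evaluated in a numerically stable way.
    """
    d_pos = euclidean(anchor, same_image)    # distance to a patch of the same image
    d_neg = euclidean(anchor, other_image)   # distance to a patch of another image
    z = d_pos - d_neg
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))


# Hypothetical 2-D features: the same-image patch is near the anchor.
good = spatial_contrast_loss([0.0, 0.0], [0.1, 0.0], [3.0, 4.0])
# Roles swapped: the same-image patch is far away, so the loss is large.
bad = spatial_contrast_loss([0.0, 0.0], [3.0, 4.0], [0.1, 0.0])
```

Because the loss is a smooth function of the features, it could in principle be minimized with SGD and back-propagation like any supervised objective, which matches the claim in the abstract.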
Machine Learning: Supervision Optional
Machine learning is defined as a subfield of computer science and artificial intelligence which "gives computers the ability to learn without being explicitly programmed". Although the statistical techniques which underpin machine learning have existed for decades, recent developments in technology, such as the availability and affordability of cloud computing and the ability to store and manipulate big data, have accelerated its adoption. This essay is meant to introduce the most popular methods currently employed by data scientists, such as supervised and unsupervised methods, to people with little to no understanding of the field. Supervised machine learning describes an instance where the inputs along with the outputs are known. We know the beginning and the end of the story, and the challenge is to find a function (a storyteller, if you will) which best approximates the output in a generalizable fashion.
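To make the "find a function" idea concrete, here is a minimal sketch of supervised learning in plain Python: a straight line fitted to known input/output pairs by ordinary least squares, then used to predict an unseen input. The data and the names `fit_line` and `predict` are made up for illustration, and a line is only one of many function families one might choose.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept to the pairs (xs, ys) by least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Apply a fitted (slope, intercept) pair to a new input."""
    slope, intercept = model
    return slope * x + intercept


# Known inputs and outputs (the "beginning and end of the story").
train_x = [1.0, 2.0, 3.0, 4.0]
train_y = [3.0, 5.0, 7.0, 9.0]   # generated from y = 2x + 1

model = fit_line(train_x, train_y)
unseen = predict(model, 10.0)    # generalization to an input not in the training set
```

The "generalizable" part is the last line: the fitted function is judged by how well it does on inputs it never saw during fitting.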
Class-prior Estimation for Learning from Positive and Unlabeled Data
Plessis, Marthinus C. du, Niu, Gang, Sugiyama, Masashi
We consider the problem of estimating the class prior in an unlabeled dataset. Under the assumption that an additional labeled dataset is available, the class prior can be estimated by fitting a mixture of class-wise data distributions to the unlabeled data distribution. However, in practice, such an additional labeled dataset is often not available. In this paper, we show that, with additional samples coming only from the positive class, the class prior of the unlabeled dataset can be estimated correctly. Our key idea is to use properly penalized divergences for model fitting to cancel the error caused by the absence of negative samples. We further show that the use of the penalized $L_1$-distance gives a computationally efficient algorithm with an analytic solution. The consistency, stability, and estimation error are theoretically analyzed. Finally, we experimentally demonstrate the usefulness of the proposed method.
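The baseline mentioned at the start of the abstract, fitting a mixture of class-wise distributions to the unlabeled distribution, can be sketched as follows. This is not the paper's penalized-divergence method for positive-only data: it assumes that both class-wise distributions are known as histograms over the same bins and uses a plain squared distance, which has a closed-form minimizer. The function name and the histogram values are hypothetical.

```python
def estimate_class_prior(p_pos, p_neg, p_unlabeled):
    """Return the prior pi in [0, 1] minimizing
    sum_i (pi * p_pos[i] + (1 - pi) * p_neg[i] - p_unlabeled[i]) ** 2.
    """
    diff = [a - b for a, b in zip(p_pos, p_neg)]          # p_pos - p_neg
    resid = [u - b for u, b in zip(p_unlabeled, p_neg)]   # p_unlabeled - p_neg
    denom = sum(d * d for d in diff)
    if denom == 0.0:
        raise ValueError("class-wise distributions are identical; prior is not identifiable")
    pi = sum(d * r for d, r in zip(diff, resid)) / denom
    return min(1.0, max(0.0, pi))                          # clip to a valid probability


# Hypothetical class-wise histograms over three bins.
p_pos = [0.1, 0.2, 0.7]
p_neg = [0.6, 0.3, 0.1]

# Build an unlabeled histogram from a known prior so the estimate can be checked.
true_prior = 0.3
p_unlabeled = [true_prior * a + (1 - true_prior) * b for a, b in zip(p_pos, p_neg)]

estimated_prior = estimate_class_prior(p_pos, p_neg, p_unlabeled)
```

When `p_neg` is unavailable, as in the setting the paper addresses, this direct fit is no longer possible, which is the gap the penalized divergences are meant to close.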
NYU Using NVIDIA DGX-1 to Push Boundaries of AI - NVIDIA Blog
New York University's Center for Data Science is at the cutting edge of fields with revolutionary implications such as machine learning, natural language processing, computer vision and intelligent machines. Because computing speed is critical to accelerating experimentation and advancing research, the center's Computational Intelligence, Learning, Vision and Robotics (CILVR) lab recently acquired an NVIDIA DGX-1 AI supercomputer to fuel this work like never before. The CILVR lab has "unsupervised learning" as its focus. The lab's faculty, research scientists and graduate students are developing techniques that allow machines to learn from raw, unlabeled data by, for example, observing video, looking at images or listening to speech. These techniques are then applied to computer vision applications like self-driving cars that can understand the environment around them, medical image analysis that can detect tumors or disease earlier and more accurately than traditional methods, and natural language processing that can translate languages, answer questions or hold a dialogue with people. "The DGX-1 is going to be used in just about every research project we have here," said Yann LeCun, founding director of the NYU Center for Data Science and a pioneer in the field of AI. "The students here can't wait to get their hands on it."
Semi-Supervised Learning with Generative Adversarial Networks
We extend Generative Adversarial Networks (GANs) to the semi-supervised context by forcing the discriminator network to output class labels. We train a generative model G and a discriminator D on a dataset with inputs belonging to one of N classes. At training time, D is made to predict which of N+1 classes the input belongs to, where an extra class is added to correspond to the outputs of G. We show that this method can be used to create a more data-efficient classifier and that it allows for generating higher quality samples than a regular GAN.
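The extra-class idea can be sketched as a discriminator loss in plain Python. Here D is assumed to emit one logit per real class plus one final logit for "generated", and the loss is the sum of two cross-entropy terms: one for a labeled real input and one for an output of G, whose target is the extra class. The function names and the example logits are hypothetical, and the sketch leaves out the generator's loss, unlabeled real inputs, and the networks themselves.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, target):
    """Negative log-probability that softmax(logits) assigns to index `target`."""
    return -math.log(softmax(logits)[target])


def discriminator_loss(real_logits, real_label, fake_logits, num_classes):
    """Loss for one labeled real input and one generated input.

    Indices 0..num_classes-1 are the real classes; index num_classes is the
    extra class reserved for outputs of G (an assumed layout).
    """
    if len(real_logits) != num_classes + 1 or len(fake_logits) != num_classes + 1:
        raise ValueError("D must output num_classes + 1 logits")
    fake_class = num_classes
    return cross_entropy(real_logits, real_label) + cross_entropy(fake_logits, fake_class)


N = 3
# An undecided discriminator: every class, including "generated", is equally likely.
undecided = discriminator_loss([0.0] * 4, 1, [0.0] * 4, N)
# A confident discriminator: real input scored as class 1, G's output as the extra class.
confident = discriminator_loss([0.0, 5.0, 0.0, 0.0], 1, [0.0, 0.0, 0.0, 5.0], N)
```

Since the same softmax head handles both the class decision and the real-versus-generated decision, the classifier is trained as a by-product of the adversarial game.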
Convex Formulation for Kernel PCA and its Use in Semi-Supervised Learning
Alaíz, Carlos M., Fanuel, Michaël, Suykens, Johan A. K.
In this paper, Kernel PCA is reinterpreted as the solution to a convex optimization problem. Actually, there is a constrained convex problem for each principal component, so that the constraints guarantee that the principal component is indeed a solution, and not a mere saddle point. Although these insights do not imply any algorithmic improvement, they can be used to further understand the method, formulate possible extensions and properly address them. As an example, a new convex optimization problem for semi-supervised classification is proposed, which seems particularly well-suited whenever the number of known labels is small. Our formulation resembles a Least Squares SVM problem with a regularization parameter multiplied by a negative sign, combined with a variational principle for Kernel PCA. Our primal optimization principle for semi-supervised learning is solved in terms of the Lagrange multipliers. Numerical experiments in several classification tasks illustrate the performance of the proposed model in problems with only a few labeled data.
Sebastian Raschka: Learning scikit-learn - An Introduction to Machine Learning in Python
PyData Chicago 2016. This tutorial provides you with a comprehensive introduction to machine learning in Python using the popular scikit-learn library. We will learn how to tackle common problems in predictive modeling and clustering analysis that arise in real-world business and research applications. We will also implement certain algorithms from scratch to internalize their inner workings. This tutorial will teach you the basics of scikit-learn. We will learn how to leverage powerful algorithms from the two main domains of machine learning: supervised and unsupervised learning. In this talk, I will give you a brief overview of the basic concepts of classification and regression analysis, and of how to build powerful predictive models from labeled data.
NYU Advances Robotics with Nvidia DGX-1 Deep Learning Supercomputer - insideHPC
In this video, NYU researchers describe their plans to advance deep learning with their new Nvidia DGX-1 AI supercomputer. New York University's Center for Data Science is at the cutting edge of fields with revolutionary implications such as machine learning, natural language processing, computer vision and intelligent machines. Because computing speed is critical to accelerating experimentation and advancing research, the center's Computational Intelligence, Learning, Vision and Robotics (CILVR) lab recently acquired a DGX-1 to fuel this work like never before. The CILVR lab has "unsupervised learning" as its focus. The lab's faculty, research scientists and graduate students are developing techniques that allow machines to learn from raw, unlabeled data by, for example, observing video, looking at images or listening to speech.