Unsupervised or Indirectly Supervised Learning
Causal Discovery as Semi-Supervised Learning
Oates, Chris J., Mukherjee, Sach
In this short report, we discuss an approach to estimating causal graphs in which indicators of causal influence between variables are treated as labels in a machine learning formulation. Available data on the variables of interest are used as "inputs" to estimate the labels. We frame the problem as one of semi-supervised learning: available interventional data or background knowledge provide labels on some edges in the graph and the remaining edges are treated as unlabelled objects. To illustrate the key ideas, we consider a simple approach to feature construction (rooted in bivariate kernel density estimation) and embed this within a semi-supervised manifold framework. Results on yeast knockout data demonstrate that the proposed approach can identify causal relationships as validated by unseen interventional experiments. An advantage of the formulation we propose is that by reframing causal discovery as semi-supervised learning, it allows a range of data-driven approaches to be brought to bear on causal discovery, without demanding specification of full probability models or explicit models of underlying mechanisms.
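As an illustrative sketch of this formulation (not the authors' implementation), one can featurise each ordered variable pair with a bivariate kernel density estimate evaluated on a grid and then propagate the few known edge labels with an off-the-shelf semi-supervised learner. The data, grid size, and the use of scikit-learn's LabelSpreading below are all illustrative assumptions.

```python
# A minimal sketch: bivariate-KDE features per ordered variable pair,
# then semi-supervised label spreading from a few "known" edges.
import numpy as np
from scipy.stats import gaussian_kde
from sklearn.semi_supervised import LabelSpreading

rng = np.random.default_rng(0)
n_samples, n_vars = 200, 6
X = rng.standard_normal((n_samples, n_vars))
X[:, 1] += 0.8 * X[:, 0]          # toy causal edge 0 -> 1

grid = np.mgrid[-3:3:10j, -3:3:10j].reshape(2, -1)  # 10x10 evaluation grid

pairs, feats = [], []
for i in range(n_vars):
    for j in range(n_vars):
        if i == j:
            continue
        kde = gaussian_kde(np.vstack([X[:, i], X[:, j]]))
        feats.append(kde(grid))   # flattened density surface as the feature
        pairs.append((i, j))
feats = np.array(feats)

# Labels: 1 = causal, 0 = non-causal, -1 = unknown (unlabelled edges)
y = -np.ones(len(pairs), dtype=int)
y[pairs.index((0, 1))] = 1        # e.g. known from an intervention
y[pairs.index((2, 3))] = 0

model = LabelSpreading(kernel="rbf", gamma=1.0).fit(feats, y)
print(dict(zip(pairs, model.transduction_)))
```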
Graph-based semi-supervised learning for relational networks
We address the problem of semi-supervised learning in relational networks, networks in which nodes are entities and links are the relationships or interactions between them. Typically this problem is conflated with graph-based semi-supervised learning (GSSL), because both problems represent the data as a graph and predict the missing class labels of nodes. However, not all graphs are created equal. In GSSL a graph is constructed, often from independent data, based on similarity; as such, edges tend to connect instances with the same class label. Relational networks, however, can be more heterogeneous, and edges do not always indicate similarity. For instance, instead of links being more likely to connect nodes with the same class label, they may occur more frequently between nodes with different class labels (link-heterogeneity). Alternatively, nodes with the same class label may not have the same type of connectivity across the whole network (class-heterogeneity): e.g., in a network of sexual interactions we may observe links between opposite genders in some parts of the graph and links between the same genders in others. Performing classification in networks with different types of heterogeneity is a hard problem that is made harder still when we do not know a priori the type or level of heterogeneity. Here we present two scalable approaches for graph-based semi-supervised learning in the more general case of relational networks. We demonstrate these approaches on synthetic and real-world networks that display different link patterns within and between classes. Compared to state-of-the-art approaches, ours give better classification performance without prior knowledge of how classes interact. In particular, our two-step label propagation algorithm gives consistently good accuracy and runs on networks of over 1.6 million nodes and 30 million edges in around 12 seconds.
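The paper's heterogeneity-aware algorithms are not reproduced here; the sketch below shows only the baseline they generalize, standard homophily-assuming label propagation, with a toy graph and labels invented for illustration.

```python
# Standard label propagation: iteratively average neighbour beliefs,
# clamping the known labels after each step.
import numpy as np

A = np.array([[0, 1, 1, 0, 0],
              [1, 0, 1, 0, 0],
              [1, 1, 0, 1, 0],
              [0, 0, 1, 0, 1],
              [0, 0, 0, 1, 0]], dtype=float)   # toy adjacency matrix
labels = {0: 0, 4: 1}                          # the few known node labels
n, k = A.shape[0], 2

F = np.zeros((n, k))
for node, c in labels.items():
    F[node, c] = 1.0

P = A / A.sum(axis=1, keepdims=True)           # row-normalised transitions
for _ in range(50):
    F = P @ F                                  # average neighbour beliefs
    for node, c in labels.items():             # clamp the known labels
        F[node] = 0.0
        F[node, c] = 1.0

print(F.argmax(axis=1))                        # predicted class per node
```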
Generalizable Features From Unsupervised Learning
Mirza, Mehdi, Courville, Aaron, Bengio, Yoshua
Humans learn a predictive model of the world and use this model to reason about future events and the consequences of actions. In contrast to most machine predictors, humans exhibit an impressive ability to generalize to unseen scenarios and reason intelligently in these settings. One important aspect of this ability is physical intuition (Lake et al., 2016). In this work, we explore the potential of unsupervised learning to find features that promote better generalization to settings outside the supervised training distribution. Our task is predicting the stability of towers of square blocks. We demonstrate that an unsupervised model, trained to predict future frames of a video sequence of stable and unstable block configurations, can yield features that support extrapolating stability prediction to block configurations outside the training set distribution.
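A minimal sketch of the two-stage idea (not the paper's video architecture): train an unsupervised next-frame predictor, then reuse its hidden representation as features for a small supervised classifier. The data, network sizes, and "stability" label below are toy assumptions.

```python
# Stage 1: unsupervised next-frame prediction; stage 2: reuse the hidden
# layer as a feature extractor for a supervised task with few labels.
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
frames_t = rng.standard_normal((500, 64))                  # flattened frames
frames_t1 = frames_t @ rng.standard_normal((64, 64)) * 0.1  # toy "next" frames

predictor = MLPRegressor(hidden_layer_sizes=(32,), activation="relu",
                         max_iter=500, random_state=0).fit(frames_t, frames_t1)

def features(x):
    # Hidden-layer activations of the unsupervised predictor
    return np.maximum(0, x @ predictor.coefs_[0] + predictor.intercepts_[0])

stable = (frames_t[:, 0] > 0).astype(int)                  # toy stability label
clf = LogisticRegression(max_iter=1000).fit(features(frames_t[:100]), stable[:100])
print(clf.score(features(frames_t[100:]), stable[100:]))
```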
Semi-Supervised Learning with the Deep Rendering Mixture Model
Nguyen, Tan, Liu, Wanjia, Perez, Ethan, Baraniuk, Richard G., Patel, Ankit B.
Semi-supervised learning algorithms reduce the high cost of acquiring labeled training data by using both labeled and unlabeled data during learning. Deep Convolutional Networks (DCNs) have achieved great success in supervised tasks and as such have been widely employed in semi-supervised learning. In this paper we leverage the recently developed Deep Rendering Mixture Model (DRMM), a probabilistic generative model that models latent nuisance variation and whose inference algorithm yields DCNs. We develop an EM algorithm for the DRMM to learn from both labeled and unlabeled data. Guided by the theory of the DRMM, we introduce a novel non-negativity constraint and a variational inference term. We report state-of-the-art performance on MNIST and SVHN and competitive results on CIFAR10. We also probe deeper into how a DRMM trained in a semi-supervised setting represents latent nuisance variation using synthetically rendered images. Taken together, our work provides a unified framework for supervised, unsupervised, and semi-supervised learning.
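The DRMM itself is not reproduced here; as a simpler stand-in, the sketch below runs semi-supervised EM on a one-dimensional Gaussian mixture, illustrating how labeled points clamp their responsibilities in the E-step while unlabeled points receive soft assignments. All data and initial values are toy assumptions.

```python
# Semi-supervised EM on a two-component 1-D Gaussian mixture.
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(-2, 1, 100), rng.normal(2, 1, 100)])
y = np.full(200, -1)                  # -1 marks unlabelled points
y[:5] = 0
y[100:105] = 1                        # a few labels per class

mu, sigma, pi = np.array([-1.0, 1.0]), np.array([1.0, 1.0]), np.array([0.5, 0.5])
for _ in range(50):
    # E-step: posterior responsibilities, clamped where the label is known
    r = pi * norm.pdf(x[:, None], mu, sigma)
    r /= r.sum(axis=1, keepdims=True)
    for k in (0, 1):
        r[y == k] = np.eye(2)[k]
    # M-step: weighted maximum-likelihood updates
    nk = r.sum(axis=0)
    mu = (r * x[:, None]).sum(axis=0) / nk
    sigma = np.sqrt((r * (x[:, None] - mu) ** 2).sum(axis=0) / nk)
    pi = nk / len(x)
print(mu)                             # converges near the true means (-2, 2)
```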
Spatial contrasting for deep unsupervised learning
Hoffer, Elad, Hubara, Itay, Ailon, Nir
Convolutional networks have established themselves over the last few years as the best-performing models for various visual tasks. They are, however, best suited to supervised learning from large amounts of labeled data. Previous attempts have been made to use unlabeled data to improve model performance by applying unsupervised techniques, but these attempts require different architectures and training methods. In this work we present a novel approach for unsupervised training of convolutional networks that is based on contrasting between spatial regions within images. This criterion can be employed within conventional neural networks and trained using standard techniques such as SGD and back-propagation, thus complementing supervised methods.
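A minimal sketch of a spatial-contrasting criterion under simplifying assumptions: features of two patches from the same image should score higher than features of patches from different images. The feature extractor below is a stub standing in for the paper's convolutional network.

```python
# Contrastive criterion over spatial regions: pull together patches from
# the same image, push apart patches from different images.
import numpy as np

rng = np.random.default_rng(0)

def embed(patch):                       # stub for CNN feature extraction
    return patch.mean(axis=(0, 1))

def contrast_loss(anchor, positive, negative):
    # Softmax-style loss over feature similarities (dot products)
    s_pos = embed(anchor) @ embed(positive)
    s_neg = embed(anchor) @ embed(negative)
    return -np.log(np.exp(s_pos) / (np.exp(s_pos) + np.exp(s_neg)))

img_a, img_b = rng.random((32, 32, 3)), rng.random((32, 32, 3))
loss = contrast_loss(img_a[:16, :16], img_a[16:, 16:], img_b[:16, :16])
print(float(loss))
```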
Machine Learning: Supervision Optional
Machine learning is a subfield of computer science and artificial intelligence that "gives computers the ability to learn without being explicitly programmed" (a definition commonly attributed to Arthur Samuel). Although the statistical techniques which underpin machine learning have existed for decades, recent technological developments, such as the availability and affordability of cloud computing and the ability to store and manipulate big data, have accelerated its adoption. This essay explores the most popular methods currently employed by data scientists, such as supervised and unsupervised learning, for readers with little to no background in the field. Supervised machine learning describes the setting where both the inputs and the outputs are known. We know the beginning and the end of the story, and the challenge is to find a function (a storyteller, if you will) which best approximates the output in a generalizable fashion.
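A toy illustration of this "known inputs and outputs" setting, using scikit-learn (the dataset and model choices are arbitrary):

```python
# Fit a function to labelled examples, then check how well it generalises.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)                          # inputs and outputs
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
model = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)  # the "storyteller"
print(model.score(X_te, y_te))                             # generalisation check
```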
Class-prior Estimation for Learning from Positive and Unlabeled Data
Plessis, Marthinus C. du, Niu, Gang, Sugiyama, Masashi
We consider the problem of estimating the class prior in an unlabeled dataset. Under the assumption that an additional labeled dataset is available, the class prior can be estimated by fitting a mixture of class-wise data distributions to the unlabeled data distribution. However, in practice, such an additional labeled dataset is often not available. In this paper, we show that, with additional samples coming only from the positive class, the class prior of the unlabeled dataset can be estimated correctly. Our key idea is to use properly penalized divergences for model fitting to cancel the error caused by the absence of negative samples. We further show that the use of the penalized $L_1$-distance gives a computationally efficient algorithm with an analytic solution. The consistency, stability, and estimation error are theoretically analyzed. Finally, we experimentally demonstrate the usefulness of the proposed method.
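The paper's penalized-divergence estimator is not reproduced here; for contrast, the sketch below implements the classical estimator of Elkan and Noto (2008), which trains a classifier to separate the positive sample from the unlabeled sample and reads the class prior off its average scores. The data are synthetic and the setting is simplified.

```python
# Elkan & Noto-style class-prior estimation from positive and unlabeled data.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
true_prior = 0.3
pos = rng.normal(2, 1, (300, 1))                       # labelled positives
unl = np.where(rng.random((1000, 1)) < true_prior,
               rng.normal(2, 1, (1000, 1)),
               rng.normal(-2, 1, (1000, 1)))           # unlabelled mixture

X = np.vstack([pos, unl])
s = np.r_[np.ones(len(pos)), np.zeros(len(unl))]       # "labelled?" indicator
g = LogisticRegression().fit(X, s)

c = g.predict_proba(pos)[:, 1].mean()                  # P(labelled | positive)
# Mean score on the unlabelled data is approximately c * prior, so:
prior_hat = g.predict_proba(unl)[:, 1].mean() / c
print(prior_hat)                                       # close to 0.3
```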
Semi-Supervised Learning with Generative Adversarial Networks
We extend Generative Adversarial Networks (GANs) to the semi-supervised context by forcing the discriminator network to output class labels. We train a generative model G and a discriminator D on a dataset with inputs belonging to one of N classes. At training time, D is made to predict which of N+1 classes the input belongs to, where an extra class is added to correspond to the outputs of G. We show that this method can be used to create a more data-efficient classifier and that it allows for generating higher quality samples than a regular GAN.
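A minimal sketch of the (N+1)-class discriminator objective, assuming PyTorch; the networks, batch sizes, and generator loss below are illustrative stubs rather than the paper's exact training recipe.

```python
# Discriminator predicts N real classes plus one extra "generated" class.
import torch
import torch.nn.functional as F

N = 10
D = torch.nn.Sequential(torch.nn.Linear(784, 256), torch.nn.ReLU(),
                        torch.nn.Linear(256, N + 1))   # class N is "fake"
G = torch.nn.Sequential(torch.nn.Linear(64, 784))      # toy generator

real_x = torch.randn(32, 784)                          # labelled real batch
real_y = torch.randint(0, N, (32,))
fake_x = G(torch.randn(32, 64)).detach()
fake_y = torch.full((32,), N)                          # the extra class

# Discriminator loss: classify real data correctly, flag samples from G
d_loss = F.cross_entropy(D(real_x), real_y) + \
         F.cross_entropy(D(fake_x), fake_y)

# Generator loss: make samples look like any of the N real classes
p = F.softmax(D(G(torch.randn(32, 64))), dim=1)
g_loss = -torch.log(1.0 - p[:, N] + 1e-8).mean()
# (optimiser steps omitted)
```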
Convex Formulation for Kernel PCA and its Use in Semi-Supervised Learning
Alaíz, Carlos M., Fanuel, Michaël, Suykens, Johan A. K.
In this paper, Kernel PCA is reinterpreted as the solution to a convex optimization problem. Specifically, there is a constrained convex problem for each principal component, such that the constraints guarantee that the principal component is indeed a solution, and not a mere saddle point. Although these insights do not imply any algorithmic improvement, they can be used to further understand the method, formulate possible extensions and properly address them. As an example, a new convex optimization problem for semi-supervised classification is proposed, which seems particularly well-suited whenever the number of known labels is small. Our formulation resembles a Least Squares SVM problem with a regularization parameter multiplied by a negative sign, combined with a variational principle for Kernel PCA. Our primal optimization principle for semi-supervised learning is solved in terms of the Lagrange multipliers. Numerical experiments on several classification tasks illustrate the performance of the proposed model in problems with only a few labeled data points.
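For reference, the standard eigendecomposition view of Kernel PCA, which the paper recasts component-by-component as a constrained convex problem, can be sketched as follows (toy data; eigenvector normalization omitted):

```python
# Kernel PCA via eigendecomposition of the centred kernel matrix.
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((100, 3))

# RBF kernel matrix
sq = ((X[:, None] - X[None]) ** 2).sum(-1)
K = np.exp(-sq / 2.0)

# Centre in feature space: K_c = H K H with H = I - (1/n) 11^T
n = len(X)
H = np.eye(n) - np.ones((n, n)) / n
Kc = H @ K @ H

vals, vecs = np.linalg.eigh(Kc)               # ascending eigenvalues
components = vecs[:, ::-1][:, :2]             # top-2 principal directions
scores = Kc @ components                      # projections of the data
print(scores.shape)
```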
Learning scikit-learn: An Introduction to Machine Learning in Python
Raschka, Sebastian
PyData Chicago 2016. This tutorial provides a comprehensive introduction to machine learning in Python using the popular scikit-learn library. We will learn how to tackle common problems in predictive modeling and clustering analysis that arise in real-world business and research applications, and we will implement certain algorithms from scratch as well, to internalize their inner workings. The tutorial covers the basics of scikit-learn and how to leverage powerful algorithms from the two main domains of machine learning, supervised and unsupervised learning, along with a brief overview of the basic concepts of classification and regression analysis and how to build powerful predictive models from labeled data.
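A taste of the two domains the tutorial covers, in scikit-learn (the dataset and model choices are illustrative):

```python
# Supervised and unsupervised learning side by side.
from sklearn.datasets import load_digits
from sklearn.cluster import KMeans
from sklearn.neighbors import KNeighborsClassifier

X, y = load_digits(return_X_y=True)

# Supervised: learn a mapping from labelled images to their digit labels
clf = KNeighborsClassifier().fit(X[:1000], y[:1000])
print(clf.score(X[1000:], y[1000:]))

# Unsupervised: group the same images into clusters without any labels
km = KMeans(n_clusters=10, n_init=10, random_state=0).fit(X)
print(km.labels_[:10])
```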