Unsupervised or Indirectly Supervised Learning
Every Node Counts: Self-Ensembling Graph Convolutional Networks for Semi-Supervised Learning
Luo, Yawei, Guan, Tao, Yu, Junqing, Liu, Ping, Yang, Yi
Graph convolutional network (GCN) provides a powerful means for graph-based semi-supervised tasks. However, as a localized first-order approximation of spectral graph convolution, the classic GCN can not take full advantage of unlabeled data, especially when the unlabeled node is far from labeled ones. To capitalize on the information from unlabeled nodes to boost the training for GCN, we propose a novel framework named Self-Ensembling GCN (SEGCN), which marries GCN with Mean Teacher - another powerful model in semi-supervised learning. SEGCN contains a student model and a teacher model. As a student, it not only learns to correctly classify the labeled nodes, but also tries to be consistent with the teacher on unlabeled nodes in more challenging situations, such as a high dropout rate and graph collapse. As a teacher, it averages the student model weights and generates more accurate predictions to lead the student. In such a mutual-promoting process, both labeled and unlabeled samples can be fully utilized for backpropagating effective gradients to train GCN. In three article classification tasks, i.e. Citeseer, Cora and Pubmed, we validate that the proposed method matches the state of the arts in the classification accuracy.
Alternate Estimation of a Classifier and the Class-Prior from Positive and Unlabeled Data
Kato, Masahiro, Xu, Liyuan, Niu, Gang, Sugiyama, Masashi
We consider the problem of learning a binary classifier only from positive data and unlabeled data (PU learning). This problem arises in various practical situations, such as information retrieval and outlier detection (Elkan and Noto, 2008; Ward et al., 2009; Scott and Blanchard, 2009; Blanchard et al., 2010; Li et al., 2009; Nguyen et al., 2011). One of the theoretical milestones of PU learning is Elkan and Noto (2008) and there are subsequent researches called unbiased PU learning (du Plessis and Sugiyama, 2014; du Plessis et al., 2015), where the classification risk is estimated in an unbiased manner only from PU data. We consider the case-control scenario (Ward et al., 2009; Elkan and Noto, 2008), where positive data are obtained separately from unlabeled data and unlabeled data is sampled from the whole population. Under this setting, the true class-prior ฯ p(y 1) in unlabeled data is needed for the formulation of unbiased PU learning.
GANs for Medical Image Analysis
Kazeminia, Salome, Baur, Christoph, Kuijper, Arjan, van Ginneken, Bram, Navab, Nassir, Albarqouni, Shadi, Mukhopadhyay, Anirban
Generative Adversarial Networks (GANs) and their extensions have carved open many exciting ways to tackle well known and challenging medical image analysis problems such as medical image denoising, reconstruction, segmentation, data simulation, detection or classification. Furthermore, their ability to synthesize images at unprecedented levels of realism also gives hope that the chronic scarcity of labeled data in the medical field can be resolved with the help of these generative models. In this review paper, a broad overview of recent literature on GANs for medical applications is given, the shortcomings and opportunities of the proposed methods are thoroughly discussed and potential future work is elaborated. A total of 63 papers published until end of July 2018 are reviewed. For quick access, the papers and important details such as the underlying method, datasets and performance are summarized in tables.
Global Bigdata Conference
One must know about the various techniques to make predictions in machine learning by Spark. Supervised learning is to direct the data towards a specific label by training a certain set of unlabelled dataset. It is used to classify data- for example spam filtering or image recognition. Unsupervised learning is used for clustering data based on certain similar features in the set of unlabelled data. This is used to predict purchase patterns of customers on sites like Amazon and also for applications on social networking sites.
Beyond the Selected Completely At Random Assumption for Learning from Positive and Unlabeled Data
Most positive and unlabeled data is subject to selection biases. The labeled examples can, for example, be selected from the positive set because they are easier to obtain or more obviously positive. This paper investigates how learning can be enabled in this setting. We propose and theoretically analyze an empirical-risk-based method for incorporating the labeling mechanism. Additionally, we investigate under which assumptions learning is possible when the labeling mechanism is not fully understood and propose a practical method to enable this. Our empirical analysis supports the theoretical results and shows that taking into account the possibility of a selection bias, even when the labeling mechanism is unknown, improves the trained classifiers.
ClusterGAN : Latent Space Clustering in Generative Adversarial Networks
Mukherjee, Sudipto, Asnani, Himanshu, Lin, Eugene, Kannan, Sreeram
Generative Adversarial networks (GANs) have obtained remarkable success in many unsupervised learning tasks and unarguably, clustering is an important unsupervised learning problem. While one can potentially exploit the latent-space back-projection in GANs to cluster, we demonstrate that the cluster structure is not retained in the GAN latent space. In this paper, we propose ClusterGAN as a new mechanism for clustering using GANs. By sampling latent variables from a mixture of one-hot encoded variables and continuous latent variables, coupled with an inverse network (which projects the data to the latent space) trained jointly with a clustering specific loss, we are able to achieve clustering in the latent space. Our results show a remarkable phenomenon that GANs can preserve latent space interpolation across categories, even though the discriminator is never exposed to such vectors. We compare our results with various clustering baselines and demonstrate superior performance on both synthetic and real datasets.
Sample Complexity of Nonparametric Semi-Supervised Learning
Dan, Chen, Leqi, Liu, Aragam, Bryon, Ravikumar, Pradeep, Xing, Eric P.
We study the sample complexity of semi-supervised learning (SSL) and introduce new assumptions based on the mismatch between a mixture model learned from unlabeled data and the true mixture model induced by the (unknown) class conditional distributions. Under these assumptions, we establish an $\Omega(K\log K)$ labeled sample complexity bound without imposing parametric assumptions, where $K$ is the number of classes. Our results suggest that even in nonparametric settings it is possible to learn a near-optimal classifier using only a few labeled samples. Unlike previous theoretical work which focuses on binary classification, we consider general multiclass classification ($K>2$), which requires solving a difficult permutation learning problem. This permutation defines a classifier whose classification error is controlled by the Wasserstein distance between mixing measures, and we provide finite-sample results characterizing the behaviour of the excess risk of this classifier. Finally, we describe three algorithms for computing these estimators based on a connection to bipartite graph matching, and perform experiments to illustrate the superiority of the MLE over the majority vote estimator.
Wasserstein Soft Label Propagation on Hypergraphs: Algorithm and Generalization Error Bounds
Gao, Tingran, Asoodeh, Shahab, Huang, Yi, Evans, James
Inspired by recent interests of developing machine learning and data mining algorithms on hypergraphs, we investigate in this paper the semi-supervised learning algorithm of propagating "soft labels" (e.g. Furthermore, in a PAC learning framework, we provide generalization error bounds for propagating one-dimensional distributions on graphs and hypergraphs using 2-Wasserstein distance, by establishing the algorithmic stability of the proposed semi-supervised learning algorithm. These theoretical results also shed new lights upon deeper understandings of the Wasserstein propagation on graphs. Recent decades have witnessed a growing interest in developing machine learning and data mining algorithms on hypergraphs [ZHS07], [JM18], [BP09], [LR15], [LM17], [HSJR13], [HZY15]. As a natural generalization of graphs, a hypergraph is a combinatorial structure consisting of vertices and hyperedges, where each hyperedge is allowed to connect any number of vertices.
Semi-supervised Learning on Graphs with Generative Adversarial Nets
Ding, Ming, Tang, Jie, Zhang, Jie
We investigate how generative adversarial nets (GANs) can help semi-supervised learning on graphs. We first provide insights on working principles of adversarial learning over graphs and then present GraphSGAN, a novel approach to semi-supervised learning on graphs. In GraphSGAN, generator and classifier networks play a novel competitive game. At equilibrium, generator generates fake samples in low-density areas between subgraphs. In order to discriminate fake samples from the real, classifier implicitly takes the density property of subgraph into consideration. An efficient adversarial learning algorithm has been developed to improve traditional normalized graph Laplacian regularization with a theoretical guarantee. Experimental results on several different genres of datasets show that the proposed GraphSGAN significantly outperforms several state-of-the-art methods. GraphSGAN can be also trained using mini-batch, thus enjoys the scalability advantage.
Graph reduction by local variation
How can we reduce the size of a graph without significantly altering its basic properties? We approach the graph reduction problem from the perspective of restricted similarity, a modification of a well-known measure for graph approximation. Our choice is motivated by the observation that restricted similarity implies strong spectral guarantees and can be used to prove statements about certain unsupervised learning problems. The paper then focuses on coarsening, a popular type of graph reduction. We derive sufficient conditions for a small graph to approximate a larger one in the sense of restricted similarity. Our theoretical findings give rise to a novel quasi-linear algorithm. Compared to both standard and advanced graph reduction methods, the proposed algorithm finds coarse graphs of improved quality -often by a large margin- without sacrificing speed.