Unsupervised or Indirectly Supervised Learning


Adversarially Robust Generalization Just Requires More Unlabeled Data

arXiv.org Machine Learning

Neural network robustness has recently been highlighted by the existence of adversarial examples. Many previous works show that the learned networks do not perform well on perturbed test data, and that significantly more labeled data is required to achieve adversarially robust generalization. In this paper, we theoretically and empirically show that with just more unlabeled data, we can learn a model with better adversarially robust generalization. The key insight behind our results is a risk decomposition theorem, in which the expected robust risk is separated into two parts: the stability part, which measures the prediction stability in the presence of perturbations, and the accuracy part, which evaluates the standard classification accuracy. As the stability part does not depend on any label information, we can optimize this part using unlabeled data. We further prove that for a specific Gaussian mixture problem illustrated by [35], adversarially robust generalization can be almost as easy as the standard generalization in supervised learning if a sufficiently large amount of unlabeled data is provided. Inspired by the theoretical findings, we propose a new algorithm called PASS that leverages unlabeled data during adversarial training. We show that in the transductive and semi-supervised settings, PASS achieves higher robust accuracy and defense success rate on the CIFAR-10 task.
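The split into a stability part and an accuracy part can be illustrated with a toy example. The sketch below is not the paper's theorem or the PASS algorithm; it assumes a hypothetical one-dimensional threshold classifier, an interval perturbation of radius `eps`, and made-up data, and it only checks the elementary union bound (robust error is at most standard error plus stability error). The point it shows is that the stability term never looks at a label, so it can also be measured on unlabeled points.

```python
def predict(x, threshold=0.0):
    """Hypothetical 1-D classifier: +1 at or above the threshold, -1 below."""
    return 1 if x >= threshold else -1

def is_unstable(x, eps, threshold=0.0):
    """True if some x' in [x - eps, x + eps] gets a different prediction.
    For a threshold classifier this happens exactly when the two
    endpoints of the interval are classified differently."""
    return predict(x - eps, threshold) != predict(x + eps, threshold)

def standard_error(labeled, threshold=0.0):
    """Accuracy part: fraction of labeled points that are misclassified."""
    return sum(predict(x, threshold) != y for x, y in labeled) / len(labeled)

def stability_error(points, eps, threshold=0.0):
    """Stability part: fraction of points whose prediction can be flipped.
    Uses inputs only, no labels."""
    return sum(is_unstable(x, eps, threshold) for x in points) / len(points)

def robust_error(labeled, eps, threshold=0.0):
    """Fraction of labeled points that are wrong or can be made wrong."""
    return sum(
        predict(x, threshold) != y or is_unstable(x, eps, threshold)
        for x, y in labeled
    ) / len(labeled)

# Made-up data for illustration only.
labeled = [(-2.0, -1), (-0.3, -1), (0.2, 1), (1.5, 1), (0.4, -1)]
unlabeled = [-3.0, -0.1, 0.05, 0.9, 2.2, -0.45]
eps = 0.5

labeled_inputs = [x for x, _ in labeled]
bound = standard_error(labeled) + stability_error(labeled_inputs, eps)

print("standard error:", standard_error(labeled))
print("stability error on unlabeled data:", stability_error(unlabeled, eps))
print("robust error:", robust_error(labeled, eps), "<= bound:", bound)
```

In this toy setting the stability term on the unlabeled pool is an estimate of the same quantity that appears in the bound, which is the sense in which extra unlabeled data can help with the robust part of the risk.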


Encoder-Powered Generative Adversarial Networks

arXiv.org Machine Learning

We present an encoder-powered generative adversarial network (EncGAN) that is able to learn both the multi-manifold structure and the abstract features of data. Unlike the conventional decoder-based GANs, EncGAN uses an encoder to model the manifold structure and invert the encoder to generate data. This unique scheme enables the proposed model to exclude discrete features from the smooth structure modeling and learn multi-manifold data without being hindered by the disconnections. Also, as EncGAN requires a single latent space to carry the information for all the manifolds, it builds abstract features shared among the manifolds in the latent space. For an efficient computation, we formulate EncGAN using a simple regularizer, and mathematically prove its validity. We also experimentally demonstrate that EncGAN successfully learns the multi-manifold structure and the abstract features of MNIST, 3D-chair and UT-Zap50k datasets. Our analysis shows that the learned abstract features are disentangled and make a good style-transfer even when the source data is off the trained distribution.


Top 5 Programming Languages For Machine Learning

#artificialintelligence

Machine learning has been defined by Andrew Ng, a computer scientist at Stanford University, as "the science of getting computers to act without being explicitly programmed." It was first conceived in the 1950s, but experienced limited progress until around the turn of the 21st century. Since then, machine learning has been a driving force behind a number of innovations, most notably artificial intelligence. Machine learning can be broken down into several categories, including supervised, unsupervised, semi-supervised and reinforcement learning. While supervised learning relies on labeled input data in order to infer its relationships with output results, unsupervised learning detects patterns among unlabeled input data. Semi-supervised learning employs a combination of both methods, and reinforcement learning motivates programs to repeat or elaborate on processes with desirable outcomes while avoiding errors.


r/MachineLearning - [D] CycleGAN implementation just learning identity mapping

#artificialintelligence

Hi, don't know where else to ask but I just don't know what else I could try out with my code. I'm trying to reimplement CycleGAN in a Jupyter notebook and (for me) the code looks good, but somehow my generators just learn to map an input to itself (so what I put into it comes out at the other end). What's odd is that the GAN loss is going up, which is probably why the generators don't learn anything meaningful other than the identity mapping. I also got the feeling that my discriminators just learn to distinguish fake from real images, but nothing about horses or zebras. I would be so happy if somebody could give me a hint.
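For context on why an identity mapping is a tempting solution, here is a minimal sketch of the three terms that usually make up a CycleGAN generator objective in one direction (least-squares adversarial, cycle-consistency, identity). It is not the poster's code: the toy "images" are plain lists of floats, `reject_all` is a hypothetical discriminator, and the weights `lambda_cyc=10.0` and `lambda_id=5.0` are commonly used values assumed here. It shows that an identity generator drives the cycle and identity terms to exactly zero, so only the adversarial term pushes the generator away from copying its input.

```python
def l1(a, b):
    """Mean absolute difference between two equally sized toy images."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)

def generator_loss_terms(G, F, D_Y, x, y, lambda_cyc=10.0, lambda_id=5.0):
    """Loss terms for generator G (domain X -> Y); F maps Y -> X.
    Returns (adversarial, weighted cycle, weighted identity)."""
    fake_y = G(x)
    adversarial = (D_Y(fake_y) - 1.0) ** 2   # G wants D_Y to output 1 ("real")
    cycle = l1(F(fake_y), x)                 # x -> G -> F should give back x
    identity = l1(G(y), y)                   # a real y should pass through G unchanged
    return adversarial, lambda_cyc * cycle, lambda_id * identity

def identity_map(img):
    """Degenerate generator that copies its input."""
    return list(img)

def reject_all(img):
    """Hypothetical discriminator that scores every input as fake (0.0)."""
    return 0.0

horse = [0.1, 0.5, 0.9, 0.3]   # made-up sample from domain X
zebra = [0.8, 0.2, 0.7, 0.4]   # made-up sample from domain Y

terms = generator_loss_terms(identity_map, identity_map, reject_all, horse, zebra)
print("adversarial, cycle, identity:", terms)
```

If the reconstruction terms dominate the adversarial term in a real run, that is one possible reason for the behaviour described, though the post does not contain enough detail to confirm it.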


What is AI? - In a simple way

#artificialintelligence

The simplest way to discuss AI is from the perspective of humans. We know that humans are the most intellectual creatures in this world. So, it is better to compare Artificial Intelligence with Human Intelligence to get a clear vision of AI. AI, a wide branch of Computer Science, is used to create intelligent machines that can recognize human speech, detect objects, solve problems and learn like humans. Humans can write and read text data in any language.


NVIDIA Blog: Supervised Vs. Unsupervised Learning

#artificialintelligence

There are a few different ways to build IKEA furniture. Each will, ideally, lead to a completed couch or chair. But depending on the details, one approach will make more sense than the others. Getting the hang of it? Toss the manual aside and go solo.


Semi-Supervised Learning, Causality and the Conditional Cluster Assumption

arXiv.org Machine Learning

While the success of semi-supervised learning (SSL) is still not fully understood, Schölkopf et al. (2012) have established a link to the principle of independent causal mechanisms. They conclude that SSL should be impossible when predicting a target variable from its causes, but possible when predicting it from its effects. Since both these cases are somewhat restrictive, we extend their work by considering classification using cause and effect features at the same time, such as predicting a disease from both risk factors and symptoms. While standard SSL exploits information contained in the marginal distribution of the inputs (to improve our estimate of the conditional distribution of target given inputs), we argue that in our more general setting we can use information in the conditional of effect features given causal features. We explore how this insight generalizes the previous understanding, and how it relates to and can be exploited for SSL.


When can unlabeled data improve the learning rate?

arXiv.org Machine Learning

In semi-supervised classification, one is given access both to labeled and unlabeled data. As unlabeled data is typically cheaper to acquire than labeled data, this setup becomes advantageous as soon as one can exploit the unlabeled data in order to produce a better classifier than with labeled data alone. However, the conditions under which such an improvement is possible are not fully understood yet. Our analysis focuses on improvements in the minimax learning rate in terms of the number of labeled examples (with the number of unlabeled examples being allowed to depend on the number of labeled ones). We argue that for such improvements to be realistic and indisputable, certain specific conditions should be satisfied and previous analyses have failed to meet those conditions. We then demonstrate examples where these conditions can be met, in particular showing rate changes from $1/\sqrt{\ell}$ to $e^{-c\ell}$ and from $1/\sqrt{\ell}$ to $1/\ell$. These results improve our understanding of what is and isn't possible in semi-supervised learning.
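To make the rate changes above concrete, the sketch below counts how many labeled examples $\ell$ each rate needs before it drops to a target error. It is only arithmetic on the three rate expressions: the constant `c = 1.0`, the target `0.01`, and the omission of all other constants are assumptions made for illustration, not values from the paper.

```python
import math

def slow_rate(l):
    """1 / sqrt(l), the baseline rate."""
    return 1.0 / math.sqrt(l)

def fast_rate(l):
    """1 / l."""
    return 1.0 / l

def exponential_rate(l, c=1.0):
    """exp(-c * l); c = 1.0 is an assumed constant."""
    return math.exp(-c * l)

def labeled_needed(rate, target, max_l=10**7):
    """Smallest number of labeled examples l with rate(l) <= target."""
    l = 1
    while rate(l) > target:
        l += 1
        if l > max_l:
            raise ValueError("target not reached within max_l examples")
    return l

target = 0.01
needed = {
    "1/sqrt(l)": labeled_needed(slow_rate, target),
    "1/l": labeled_needed(fast_rate, target),
    "exp(-c*l)": labeled_needed(exponential_rate, target),
}
for name, count in needed.items():
    print(f"{name}: {count} labeled examples to reach error {target}")
```

Under these assumptions the three rates need 10000, 100, and 5 labeled examples respectively, which is why a change of rate matters far more than a change of constant.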


Local Label Propagation for Large-Scale Semi-Supervised Learning

arXiv.org Artificial Intelligence

A significant issue in training deep neural networks to solve supervised learning tasks is the need for large numbers of labelled datapoints. The goal of semi-supervised learning is to leverage ubiquitous unlabelled data, together with small quantities of labelled data, to achieve high task performance. Though substantial recent progress has been made in developing semi-supervised algorithms that are effective for comparatively small datasets, many of these techniques do not scale readily to the large (unlabelled) datasets characteristic of real-world applications. In this paper we introduce a novel approach to scalable semi-supervised learning, called Local Label Propagation (LLP). Extending ideas from recent work on unsupervised embedding learning, LLP first embeds datapoints, labelled and otherwise, in a common latent space using a deep neural network. It then propagates pseudolabels from known to unknown datapoints in a manner that depends on the local geometry of the embedding, taking into account both inter-point distance and local data density as a weighting on propagation likelihood. The parameters of the deep embedding are then trained to simultaneously maximize pseudolabel categorization performance as well as a metric of the clustering of datapoints within each pseudolabel group, iteratively alternating stages of network training and label propagation. We illustrate the utility of the LLP method on the ImageNet dataset, achieving results that outperform previous state-of-the-art scalable semi-supervised learning algorithms by large margins, consistently across a wide variety of training regimes. We also show that the feature representation learned with LLP transfers well to scene recognition in the Places 205 dataset.
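The propagation step can be sketched on a fixed toy embedding. This is not the LLP implementation: there is no neural network or training loop, the points are made-up 2-D coordinates, and the weighting (Gaussian similarity multiplied by an unnormalised kernel density estimate at the labelled neighbour) is an assumption chosen for illustration. How the paper actually combines distance and density may differ.

```python
import math
from collections import defaultdict

def dist(a, b):
    """Euclidean distance between two embedded points."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

def local_density(point, all_points, bandwidth=1.0):
    """Unnormalised Gaussian kernel density estimate around `point`."""
    return sum(math.exp(-(dist(point, q) / bandwidth) ** 2) for q in all_points)

def propagate(labeled, unlabeled, k=3, bandwidth=1.0):
    """Assign a pseudolabel and a confidence to each unlabeled point by a
    weighted vote among its k nearest labeled neighbours."""
    all_points = [p for p, _ in labeled] + list(unlabeled)
    density = [local_density(p, all_points, bandwidth) for p, _ in labeled]
    results = []
    for u in unlabeled:
        nearest = sorted(range(len(labeled)),
                         key=lambda i: dist(u, labeled[i][0]))[:k]
        votes = defaultdict(float)
        for i in nearest:
            point, label = labeled[i]
            similarity = math.exp(-(dist(u, point) / bandwidth) ** 2)
            # Assumed weighting: closer neighbours in denser regions count more.
            votes[label] += similarity * density[i]
        best = max(votes, key=votes.get)
        results.append((best, votes[best] / sum(votes.values())))
    return results

# Made-up embedding with two well-separated groups.
labeled = [((0.0, 0.0), "a"), ((0.2, 0.1), "a"),
           ((3.0, 3.0), "b"), ((3.1, 2.9), "b")]
unlabeled = [(0.1, 0.3), (2.8, 3.2), (0.4, -0.2)]

pseudolabels = propagate(labeled, unlabeled)
for point, (label, confidence) in zip(unlabeled, pseudolabels):
    print(point, "->", label, f"(confidence {confidence:.3f})")
```

In the method described above, pseudolabels like these would feed back into training the embedding, and the two stages would alternate; the sketch covers only a single propagation pass.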


A Flexible Generative Framework for Graph-based Semi-supervised Learning

arXiv.org Machine Learning

We consider a family of problems that are concerned with making predictions for the majority of unlabeled, graph-structured data samples based on a small proportion of labeled examples. Relational information among the data samples, often encoded in the graph or network structure, is shown to be helpful for these semi-supervised learning tasks. Conventional graph-based regularization methods and recent graph neural networks do not fully leverage the interrelations between the features, the graph, and the labels. We propose a flexible generative framework for graph-based semi-supervised learning, which approaches the joint distribution of the node features, labels, and the graph structure. Borrowing insights from random graph models in the network science literature, this joint distribution can be instantiated using various distribution families. For the inference of missing labels, we exploit recent advances in scalable variational inference techniques to approximate the Bayesian posterior. We conduct thorough experiments on benchmark datasets for graph-based semi-supervised learning. Results show that the proposed methods outperform state-of-the-art models under most settings.