 Unsupervised or Indirectly Supervised Learning


Choosing a Machine Learning Model

#artificialintelligence

Ever wonder how we can apply machine learning algorithms to a problem in order to analyze, visualize, discover trends and find correlations within data? In this article, I'm going to discuss common steps for setting up a machine learning model as well as approaches to selecting the right model for your data. This article was inspired by common interview questions about how I approach a data science problem and why I choose a particular model. Machine learning tasks can be classified as supervised learning, unsupervised learning, semi-supervised learning or reinforcement learning. This article doesn't focus on the last two; however, I'll give some idea of what they are.


Image Difficulty Curriculum for Generative Adversarial Networks (CuGAN)

#artificialintelligence

Despite the significant advances in recent years, Generative Adversarial Networks (GANs) are still notoriously hard to train. In this paper, we propose three novel curriculum learning strategies for training GANs. All strategies are first based on ranking the training images by their difficulty scores, which are estimated by a state-of-the-art image difficulty predictor. Our first strategy is to divide images into gradually more difficult batches. Our second strategy introduces a novel curriculum loss function for the discriminator that takes into account the difficulty scores of the real images.
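As a rough illustration of the first strategy, the sketch below ranks items by a difficulty score and splits them into contiguous batches from easiest to hardest. The function name `curriculum_batches`, the toy scores, and the choice of non-overlapping, roughly equal-sized batches are assumptions made for this example; the abstract does not specify how the batches are formed, and the scores here stand in for the output of an image difficulty predictor.

```python
from typing import List, Sequence


def curriculum_batches(
    items: Sequence[str], difficulty: Sequence[float], num_batches: int
) -> List[List[str]]:
    """Rank items from easiest to hardest and split them into num_batches groups.

    Hypothetical helper: the batching rule (contiguous, near-equal sizes)
    is an assumption, not the paper's exact procedure.
    """
    if len(items) != len(difficulty):
        raise ValueError("items and difficulty must have the same length")
    if num_batches < 1:
        raise ValueError("num_batches must be at least 1")

    # Indices sorted by ascending difficulty score (easy first).
    order = sorted(range(len(items)), key=lambda i: difficulty[i])
    ranked = [items[i] for i in order]

    # Integer boundaries give exactly num_batches contiguous slices.
    n = len(ranked)
    return [
        ranked[k * n // num_batches : (k + 1) * n // num_batches]
        for k in range(num_batches)
    ]


# Toy data: file names with made-up difficulty scores.
images = ["a.png", "b.png", "c.png", "d.png", "e.png"]
scores = [0.9, 0.1, 0.5, 0.3, 0.7]
batches = curriculum_batches(images, scores, num_batches=2)
```

A training loop would then consume `batches` in order, so the model sees the easier images before the harder ones.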


Papers With Code : Billion-scale semi-supervised learning for image classification

#artificialintelligence

This paper presents a study of semi-supervised learning with large convolutional networks. We propose a pipeline, based on a teacher/student paradigm, that leverages a large collection of unlabelled images (up to 1 billion)... Our main goal is to improve the performance for a given target architecture, like ResNet-50 or ResNext. We provide an extensive analysis of the success factors of our approach, which leads us to formulate some recommendations to produce high-accuracy models for image classification with semi-supervised learning. As a result, our approach brings important gains to standard architectures for image, video and fine-grained classification. For instance, by leveraging one billion unlabelled images, our learned vanilla ResNet-50 achieves 81.2% top-1 accuracy on the ImageNet benchmark.
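The teacher/student idea can be sketched on toy data: a teacher is fit on the labeled examples, it assigns pseudo-labels to the unlabeled examples, and a student is then fit on both. Everything in this sketch is an assumption for illustration: a one-dimensional nearest-centroid classifier stands in for the convolutional networks, the helper names `fit_centroids` and `predict` are hypothetical, and the paper's actual pipeline and its example-selection rules are not reproduced here.

```python
from collections import defaultdict
from typing import Dict, List


def fit_centroids(points: List[float], labels: List[str]) -> Dict[str, float]:
    """Fit a nearest-centroid model: one mean value per class."""
    sums: Dict[str, float] = defaultdict(float)
    counts: Dict[str, int] = defaultdict(int)
    for x, y in zip(points, labels):
        sums[y] += x
        counts[y] += 1
    return {y: sums[y] / counts[y] for y in sums}


def predict(centroids: Dict[str, float], x: float) -> str:
    """Return the class whose centroid is closest to x."""
    return min(centroids, key=lambda y: abs(x - centroids[y]))


# Small labeled set and a larger unlabeled set (toy 1-D features).
labeled_x = [0.0, 1.0, 9.0, 10.0]
labeled_y = ["a", "a", "b", "b"]
unlabeled_x = [0.5, 2.0, 8.0, 9.5, 4.0]

# Step 1: train the teacher on labeled data only.
teacher = fit_centroids(labeled_x, labeled_y)

# Step 2: the teacher pseudo-labels the unlabeled data.
pseudo_y = [predict(teacher, x) for x in unlabeled_x]

# Step 3: train the student on pseudo-labeled plus labeled data.
student = fit_centroids(unlabeled_x + labeled_x, pseudo_y + labeled_y)
```

In this toy run the student's centroids shift toward the unlabeled points, which is the sense in which the unlabeled collection influences the final model.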


Unsupervised Learning

#artificialintelligence

Consider the problem of splitting photos into one of two categories: cat and dog. First, imagine you are taking a supervised approach to this problem. With a supervised learning algorithm, the agent will be given photos of various dogs and cats as well as labels for each image. The labels will either be "cat" or "dog". As the agent trains, it will learn what features distinguish dogs from cats.


Billion-scale semi-supervised learning for state-of-the-art image and video classification

#artificialintelligence

Accurate image and video classification is important for a wide range of computer vision applications, from identifying harmful content, to making products more accessible to the visually impaired, to helping people more easily buy and sell things on products like Marketplace. Facebook AI is developing alternative ways to train our AI systems so that we can do more with less labeled training data overall, and also deliver accurate results even when large, high-quality labeled data sets are simply not available. Today, we are sharing details on a versatile new model training technique that delivers state-of-the-art accuracy for image and video classification systems. This approach, which we call semi-weak supervision, is a new way to combine the merits of two different training methods: semi-supervised learning and weakly supervised learning. It opens the door to creating more accurate, efficient production classification models by using a teacher-student model training paradigm and billion-scale weakly supervised data sets.


Semi-supervised Learning using Adversarial Training with Good and Bad Samples

arXiv.org Machine Learning

In this work, we investigate semi-supervised learning (SSL) for image classification using adversarial training. Previous results have illustrated that generative adversarial networks (GANs) can be used for multiple purposes. Triple-GAN, which aims to jointly optimize model components by incorporating three players, generates suitable image-label pairs to compensate for the lack of labeled data in SSL with improved benchmark performance. Conversely, Bad (or complementary) GAN optimizes generation to produce complementary data-label pairs and force a classifier's decision boundary to lie between data manifolds. Although it generally outperforms Triple-GAN, Bad GAN is highly sensitive to the amount of labeled data used for training. Unifying these two approaches, we present unified-GAN (UGAN), a novel framework that enables a classifier to simultaneously learn from both good and bad samples through adversarial training. We perform extensive experiments on various datasets and demonstrate that UGAN: 1) achieves state-of-the-art performance among other deep generative models, and 2) is robust to variations in the amount of labeled data used for training.


Learning Classifiers on Positive and Unlabeled Data with Policy Gradient

arXiv.org Machine Learning

Existing algorithms aiming to learn a binary classifier from positive (P) and unlabeled (U) data require estimating the class prior or label noise ahead of building a classification model. However, the estimation and classifier learning are normally conducted in a pipeline instead of being jointly optimized. In this paper, we propose to alternately train the two steps using reinforcement learning. Our proposal adopts a policy network to adaptively make assumptions on the labels of unlabeled data, while a classifier is built upon the output of the policy network and provides rewards to learn a better policy. The dynamic and interactive training between the policy maker and the classifier can exploit the unlabeled data in a more effective manner and yield a significant improvement in terms of classification performance. Furthermore, we present two different approaches to represent the actions taken by the policy. The first approach considers continuous actions as soft labels, while the other uses discrete actions as hard assignment of labels for unlabeled examples. We validate the effectiveness of the proposed method on two public benchmark datasets as well as one e-commerce dataset. The results show that the proposed method is able to consistently outperform state-of-the-art methods in various settings.

PU learning refers to the problem of learning from a dataset where only a subset of examples are positively labeled and the rest are not annotated at all. It is a critical task due to its prevalence in various real-world applications [1], [2], [3]. In many common situations only positive data are available; for instance, an e-commerce website may only record users who have clicked on advertisements or purchased items. Meanwhile, it is not possible to simply assume that unlabeled instances are negative.


Quantitative stability of optimal transport maps and linearization of the 2-Wasserstein space

arXiv.org Machine Learning

This work studies an explicit embedding of the set of probability measures into a Hilbert space, defined using optimal transport maps from a reference probability density. This embedding linearizes to some extent the 2-Wasserstein space, and enables the direct use of generic supervised and unsupervised learning algorithms on measure data. Our main result is that the embedding is (bi-)Hölder continuous, when the reference density is uniform over a convex set, and can be equivalently phrased as a dimension-independent Hölder-stability result for optimal transport maps.

1. Introduction

Numerous problems involve the comparison of point clouds, i.e. sets of points that lie in a metric space and for which the spatial distribution is of interest. Seeing the point clouds as discrete probability measures in a metric space, it is natural to compare them using Wasserstein distances defined by the optimal transport theory [37]. These distances have indeed been successfully used in a variety of applications in machine learning [11, 3, 25, 23, 19, 1] and in statistics [39, 12, 8, 35]. In the discrete setting, many efficient algorithms have been proposed to compute or approximate the Wasserstein distances, such as Sinkhorn-Knopp and auction algorithms - see [34] and references therein.
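The embedding idea is easiest to see in one dimension, where it is exact. With the uniform density on [0, 1] as reference, the optimal transport map to an empirical measure with n equally weighted points is a step function taking the sorted sample values, so the L2 distance between two such maps equals the 2-Wasserstein distance between the measures. The sketch below covers only this special case with equal sample sizes; the names `embed` and `l2_distance` are hypothetical, and the general multi-dimensional setting studied in the paper is not covered.

```python
import math
from typing import List


def embed(sample: List[float]) -> List[float]:
    """Represent the 1-D transport map from Uniform[0, 1] to the sample.

    The map is piecewise constant on n intervals of length 1/n, so the
    sorted values scaled by 1/sqrt(n) preserve its L2 norm.
    """
    n = len(sample)
    return [x / math.sqrt(n) for x in sorted(sample)]


def l2_distance(u: List[float], v: List[float]) -> float:
    """Euclidean distance between two embedded vectors of equal length."""
    if len(u) != len(v):
        raise ValueError("this sketch assumes equal sample sizes")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


# Two point clouds on the real line; nu is mu shifted by 1.
mu = [3.0, 1.0, 2.0]
nu = [2.0, 4.0, 3.0]

# Distance in the embedding space; in this 1-D case it equals W2(mu, nu).
w2 = l2_distance(embed(mu), embed(nu))
```

Because `nu` is a translation of `mu` by 1, the distance comes out as 1 up to floating-point error. Once the measures are embedded as plain vectors, generic algorithms such as k-means or linear regression can be applied to them directly.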


Unsupervised learning of landmarks by Descriptor Vector Exchange

#artificialintelligence

Equivariance to random image transformations is an effective method to learn landmarks of object categories, such as the eyes and the nose in faces, without manual supervision. However, this method does not explicitly guarantee that the learned landmarks are consistent with changes between different instances of the same object, such as different facial identities. In this paper, we develop a new perspective on the equivariance approach by noting that dense landmark detectors can be interpreted as local image descriptors equipped with invariance to intra-category variations. We then propose a direct method to enforce such an invariance in the standard equivariant loss. We do so by exchanging descriptor vectors between images of different object instances prior to matching them geometrically.


Residual Encoder-Decoder Network for Deep Subspace Clustering

arXiv.org Machine Learning

Subspace clustering aims to cluster unlabeled data that lies in a union of low-dimensional linear subspaces. Deep subspace clustering approaches based on auto-encoders have become very popular for solving subspace clustering problems. However, the training of current deep methods converges slowly, making them much less efficient than traditional approaches. We propose a Residual Encoder-Decoder network for deep Subspace Clustering (RED-SC), which symmetrically links convolutional and deconvolutional layers with skip-layer connections, with which the training converges much faster. We use a self-expressive layer to generate more accurate linear representation coefficients through different latent representations from multiple latent spaces. Experiments show the superiority of RED-SC in training efficiency and clustering accuracy. Moreover, we are the first to apply a residual encoder-decoder to unsupervised learning tasks.