Unsupervised or Indirectly Supervised Learning
Semi-supervised Learning with Contrastive Predicative Coding
Wang, Jiaxing, Zheng, Yin, Chen, Xiaoshuang, Huang, Junzhou, Cheng, Jian
Semi-supervised learning (SSL) provides a powerful framework for leveraging unlabeled data when labels are limited or expensive to obtain. SSL algorithms based on deep neural networks have recently proven successful on standard benchmark tasks. However, many of them have thus far been inflexible, inefficient or non-scalable. This paper explores the recently developed contrastive predictive coding technique to improve the discriminative power of deep learning models when a large portion of the labels is absent. Two models, cpc-SSL and a class-conditional variant (ccpc-SSL), are presented. They effectively exploit the unlabeled data by extracting shared information between different parts of the (high-dimensional) data. The proposed approaches are inductive and scale well to very large datasets like ImageNet, making them good candidates for real-world large-scale applications.
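As a rough illustration of the contrastive objective that contrastive predictive coding builds on, the sketch below computes an InfoNCE-style loss with NumPy. It is not the cpc-SSL or ccpc-SSL model from the paper: the function name `info_nce_loss`, the dot-product scoring and the toy vectors are all assumptions made for this example.

```python
import numpy as np

def info_nce_loss(context, targets):
    """InfoNCE-style loss for a batch of (context, target) pairs.

    context, targets: arrays of shape (batch, dim). Row i of `targets` is the
    positive for row i of `context`; every other row acts as a negative.
    """
    scores = context @ targets.T                          # (batch, batch) similarities
    scores = scores - scores.max(axis=1, keepdims=True)   # stabilise the softmax
    log_softmax = scores - np.log(np.exp(scores).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_softmax))                 # positives are on the diagonal

# Hypothetical toy data: contexts aligned with their targets vs. random contexts.
rng = np.random.default_rng(0)
targets = rng.normal(size=(8, 16))
aligned_loss = info_nce_loss(5.0 * targets, targets)
random_loss = info_nce_loss(rng.normal(size=(8, 16)), targets)
```

The loss is small when each context scores its own target above the other targets in the batch, and it equals the log of the batch size when all scores are equal.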
D$\textbf{S}^3$L: Deep Self-Semi-Supervised Learning for Image Recognition
Tsai, Tsung Wei, Li, Chongxuan, Zhu, Jun
Despite the recent progress in deep semi-supervised learning (Semi-SL), the amount of labels still plays a dominant role. The success of self-supervised learning (Self-SL) hints at a promising direction for exploiting the vast unlabeled data by leveraging an additional set of deterministic labels. In this paper, we propose Deep Self-Semi-Supervised learning (D$\textbf{S}^3$L), a flexible multi-task framework with shared parameters that integrates the rotation task in Self-SL with the consistency-based methods in deep Semi-SL. Our method is easy to implement and is complementary to all consistency-based approaches. The experiments demonstrate that our method significantly improves over the published state-of-the-art methods on several standard benchmarks, especially when fewer labels are available.
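The "deterministic labels" of a rotation task can be generated without any annotation, as the following sketch suggests. It only shows how rotated inputs and their rotation labels might be built; the helper name `make_rotation_batch` and the four-way 0/90/180/270 degree scheme are assumptions for illustration, and the way the paper combines this task with a consistency loss is not shown.

```python
import numpy as np

def make_rotation_batch(images):
    """Return rotated copies of `images` and the rotation index of each copy.

    images: array of shape (n, height, width) holding square images.
    Labels 0, 1, 2, 3 stand for rotations of 0, 90, 180 and 270 degrees.
    """
    rotated, labels = [], []
    for k in range(4):
        # Rotate every image in the batch k times by 90 degrees.
        rotated.append(np.rot90(images, k=k, axes=(1, 2)))
        labels.append(np.full(len(images), k))
    return np.concatenate(rotated), np.concatenate(labels)

# Hypothetical toy batch of two 3x3 "images".
images = np.arange(2 * 3 * 3).reshape(2, 3, 3)
x_rot, y_rot = make_rotation_batch(images)
```

A classifier head trained to predict `y_rot` from `x_rot` could then share its feature extractor with the main semi-supervised model.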
Evaluation Metrics for Unsupervised Learning Algorithms
Palacio-Niño, Julio-Omar, Berzal, Fernando
Alternatively, a similarity function might also be used. Machine learning techniques are usually classified into supervised and unsupervised techniques. Supervised machine learning starts from prior knowledge of the desired result in the form of labeled data sets, which allows to guide the training process, whereas unsupervised machine learning works directly on unlabeled data. In the absence of labels to orient the learning process, these labels must be "discovered" by the learning algorithm.
1) Scale Invariance: The first of Kleinberg's axioms states that f(d) = f(α · d) for any distance function d and any scaling factor α > 0. [3] This simple axiom indicates that a clustering algorithm
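Scale invariance can be checked numerically for a concrete clustering function. The sketch below uses single-linkage clustering stopped at k clusters, chosen here only as an example of a function f that satisfies the axiom. The helper names `single_linkage` and `scaled`, the toy points and the factor 7.5 are assumptions and do not come from the paper.

```python
from itertools import combinations

def single_linkage(points, k, dist):
    """Merge the two closest clusters until only k clusters remain."""
    clusters = [frozenset([i]) for i in range(len(points))]
    while len(clusters) > k:
        # Single linkage: cluster distance is the smallest pairwise distance.
        a, b = min(
            combinations(clusters, 2),
            key=lambda pair: min(dist(points[i], points[j])
                                 for i in pair[0] for j in pair[1]),
        )
        clusters = [c for c in clusters if c not in (a, b)] + [a | b]
    return set(clusters)

def d(x, y):
    """Distance function on the real line."""
    return abs(x - y)

def scaled(dist, alpha):
    """Return the distance function alpha * dist."""
    return lambda x, y: alpha * dist(x, y)

points = [0, 1, 2, 50, 51, 90]
partition = single_linkage(points, k=3, dist=d)
partition_scaled = single_linkage(points, k=3, dist=scaled(d, 7.5))
```

Both calls return the same partition of the point indices, which is what f(d) = f(α · d) requires.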
The Journey is the Reward: Unsupervised Learning of Influential Trajectories
Binas, Jonathan, Ozair, Sherjil, Bengio, Yoshua
Unsupervised exploration and representation learning become increasingly important when learning in diverse and sparse environments. The information-theoretic principle of empowerment formalizes an unsupervised exploration objective through an agent trying to maximize its influence on the future states of its environment. Previous approaches carry certain limitations in that they either do not employ closed-loop feedback or do not have an internal state. As a consequence, a privileged final state is taken as an influence measure, rather than the full trajectory. We provide a model-free method which takes into account the whole trajectory while still offering the benefits of option-based approaches. We successfully apply our approach to settings with large action spaces, where discovery of meaningful action sequences is particularly difficult.
Semi-Supervised Learning with Scarce Annotations
Rebuffi, Sylvestre-Alvise, Ehrhardt, Sebastien, Han, Kai, Vedaldi, Andrea, Zisserman, Andrew
While semi-supervised learning (SSL) algorithms provide an efficient way to make use of both labelled and unlabelled data, they generally struggle when the number of annotated samples is very small. In this work, we consider the problem of SSL multi-class classification with very few labelled instances. We introduce two key ideas. The first is a simple but effective one: we leverage the power of transfer learning among different tasks and self-supervision to initialize a good representation of the data without making use of any label. The second idea is a new algorithm for SSL that can effectively exploit such a pre-trained representation. The algorithm works by alternating two phases, one fitting the labelled points and one fitting the unlabelled ones, with carefully controlled information flow between them. The benefits are greatly reduced overfitting to the labelled data and the avoidance of issues with balancing labelled and unlabelled losses during training. We show empirically that this method can successfully train competitive models with as few as 10 labelled data points per class. More generally, we show that the idea of bootstrapping features using self-supervised learning always improves SSL on standard benchmarks. We show that our algorithm works increasingly well compared to other methods when refining from other tasks or datasets.
This eye does not exist
Since I had zero experience with generative adversarial networks, I thought I should document some problems I had to overcome. Quoting Wikipedia: "A generative adversarial network (GAN) is a class of machine learning systems. This technique can generate photographs that look at least superficially authentic to human observers, having many realistic characteristics. It is a form of unsupervised learning." I won't give an introduction to how a GAN works, since there is plenty of material online with far better insights than the ones I could give.
Semi-Supervised Learning by Augmented Distribution Alignment
Wang, Qin, Li, Wen, Van Gool, Luc
In this work, we propose a simple yet effective semi-supervised learning approach called Augmented Distribution Alignment. We reveal that an essential sampling bias exists in semi-supervised learning due to the limited number of labeled samples, which often leads to a considerable empirical distribution mismatch between labeled data and unlabeled data. To this end, we propose to align the empirical distributions of labeled and unlabeled data to alleviate the bias. On the one hand, we adopt an adversarial training strategy to minimize the distribution distance between labeled and unlabeled data, as inspired by domain adaptation works. On the other hand, to deal with the small sample size of the labeled data, we also propose a simple interpolation strategy to generate pseudo training samples. These two strategies can be easily integrated into existing deep neural networks. We demonstrate the effectiveness of our proposed approach on the benchmark SVHN and CIFAR10 datasets, on which we achieve new state-of-the-art error rates of $3.54\%$ and $10.09\%$, respectively. Our code will be available at https://github.com/qinenergy/adanet.
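To give a feel for what an interpolation strategy for pseudo training samples might look like, here is a generic mixup-style convex combination of two samples and their label vectors. This is an assumption about the general idea only: the abstract does not specify the scheme, and the function name `interpolate_samples`, the mixing weight `lam` and the toy vectors are hypothetical.

```python
import numpy as np

def interpolate_samples(x_a, y_a, x_b, y_b, lam):
    """Convex combination of two samples and their label vectors (0 <= lam <= 1)."""
    x_mix = lam * x_a + (1.0 - lam) * x_b
    y_mix = lam * y_a + (1.0 - lam) * y_b
    return x_mix, y_mix

# Hypothetical toy inputs with one-hot (or pseudo-label) vectors.
x_a, y_a = np.array([1.0, 0.0]), np.array([1.0, 0.0])   # sample assigned to class 0
x_b, y_b = np.array([0.0, 1.0]), np.array([0.0, 1.0])   # sample assigned to class 1
x_mix, y_mix = interpolate_samples(x_a, y_a, x_b, y_b, lam=0.25)
```

Because the weights sum to one, the interpolated label vector remains a valid probability distribution over classes.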
Hands-On Unsupervised Learning Using Python: How to Build Applied Machine Learning Solutions from Unlabeled Data
Patel, Ankur A.
Most of the successful commercial applications to date--in areas such as computer vision, speech recognition, machine translation, and natural language processing--have involved supervised learning, taking advantage of labeled datasets. However, most of the world's data is unlabeled. In this book, we will cover the field of unsupervised learning (which is a branch of machine learning used to find hidden patterns) and learn the underlying structure in unlabeled data. According to many industry experts, such as Yann LeCun, the Director of AI Research at Facebook and a professor at NYU, unsupervised learning is the next frontier in AI and may hold the key to AGI. For this and many other reasons, unsupervised learning is one of the trendiest topics in AI today.
Unsupervised Learning through Temporal Smoothing and Entropy Maximization
This paper proposes a method for machine learning from unlabeled data in the form of a time-series. The mapping that is learned is shown to extract slowly evolving information that would be useful for control applications, while efficiently filtering out unwanted, higher-frequency noise. The method consists of training a feedforward artificial neural network with backpropagation using two opposing objectives. The first of these is to minimize the squared changes in activations between time steps of each unit in the network. This "temporal smoothing" has the effect of correlating inputs that occur close in time with outputs that are close in the L2-norm. The second objective is to maximize the log determinant of the covariance matrix of activations in each layer of the network. This objective ensures that information from each layer is passed through to the next. This second objective acts as a balance to the first, which on its own would result in a network with all input weights equal to zero.
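The two opposing objectives can be written down compactly for a single layer's activations, as in the NumPy sketch below. It evaluates the objectives on fixed activation sequences and does not train a network. The function names, the `weight` that balances the two terms and the toy signals are assumptions for this illustration.

```python
import numpy as np

def temporal_smoothing_loss(acts):
    """Mean squared change of each unit's activation between time steps.

    acts: array of shape (time, units).
    """
    return np.mean(np.diff(acts, axis=0) ** 2)

def log_det_covariance(acts):
    """Log determinant of the covariance matrix of the activations."""
    cov = np.atleast_2d(np.cov(acts, rowvar=False))
    sign, logdet = np.linalg.slogdet(cov)
    return logdet if sign > 0 else -np.inf

def combined_objective(acts, weight=1.0):
    """Quantity to minimise: smoothness penalty minus weighted log-det term."""
    return temporal_smoothing_loss(acts) - weight * log_det_covariance(acts)

t = np.linspace(0.0, 2.0 * np.pi, 200)
rng = np.random.default_rng(0)
slow = np.stack([np.sin(t), np.cos(t)], axis=1)         # slowly varying activations
noisy = slow + rng.normal(scale=0.5, size=slow.shape)   # same signal plus fast noise
collapsed = np.zeros_like(slow)                         # what all-zero weights would produce
```

The collapsed activations have zero smoothing loss but a log determinant of minus infinity, so the combined objective rules them out. This is the balancing effect the abstract describes.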
Adversarial Variational Embedding for Robust Semi-supervised Learning
Zhang, Xiang, Yao, Lina, Yuan, Feng
Semi-supervised learning is sought for leveraging unlabelled data when labelled data is difficult or expensive to acquire. Deep generative models (e.g., the Variational Autoencoder (VAE)) and semi-supervised Generative Adversarial Networks (GANs) have recently shown promising performance in semi-supervised classification owing to their excellent discriminative representation ability. However, the latent code learned by the traditional VAE is not exclusive (repeatable) for a specific input sample, which prevents it from achieving excellent classification performance. In particular, the learned latent representation depends on a non-exclusive component which is stochastically sampled from the prior distribution. Moreover, semi-supervised GAN models generate data from a pre-defined distribution (e.g., Gaussian noise) which is independent of the input data distribution; this may obstruct convergence and makes the distribution of the generated data difficult to control. To address the aforementioned issues, we propose a novel Adversarial Variational Embedding (AVAE) framework for robust and effective semi-supervised learning that leverages both the advantage of the GAN as a high-quality generative model and the VAE as a posterior distribution learner. The proposed approach first produces an exclusive latent code with a model which we call VAE++ and, meanwhile, provides a meaningful prior distribution for the generator of the GAN. The proposed approach is evaluated on four different real-world applications, and we show that our method outperforms the state-of-the-art models, which confirms that the combination of VAE++ and GAN can provide significant improvements in semi-supervised classification.