Unsupervised or Indirectly Supervised Learning
Discriminative Semi-Supervised Dictionary Learning with Entropy Regularization for Pattern Classification
Yang, Meng (Shenzhen University) | Chen, Lin (Shenzhen University)
Dictionary learning has played an important role in the success of sparse representation, triggering rapid development of both unsupervised and supervised dictionary learning methods. In most practical applications, however, labeled training samples are quite limited, while abundant unlabeled training samples are relatively easy to acquire. Semi-supervised dictionary learning, which aims to effectively exploit the discrimination in unlabeled training data, has therefore attracted much attention from researchers. Although various regularizations have been introduced in prevailing semi-supervised dictionary learning methods, how to design an effective unified model of dictionary learning and unlabeled-data class estimation, and how to fully exploit the discrimination in both labeled and unlabeled data, remain open questions. In this paper, we propose a novel discriminative semi-supervised dictionary learning model (DSSDL) by introducing a discriminative representation, a coding of unlabeled data identical to the coding used for testing data in the final classification, and an entropy regularization term. The coding strategy for unlabeled data not only avoids the effect of incorrect class estimation, but also ensures that the learned discrimination is well exploited in the final classification. The introduced entropy regularization avoids overemphasizing uncertain class estimates for unlabeled samples. In addition to the discrimination enhanced in the learned dictionary by the discriminative representation, an extended dictionary is used mainly to explore the discrimination embedded in the unlabeled data. Extensive experiments on face recognition, digit recognition and texture classification show the effectiveness of the proposed method.
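As a rough illustration of where such an entropy term can sit in a dictionary-learning objective, the display below is a minimal, generic formulation; the symbols (dictionary D, codes A, class-membership weights P, trade-off parameters λ, γ, and the discriminative term Ω) are illustrative assumptions, not the exact DSSDL objective.

$$
\min_{D,\,A,\,P}\ \|X - DA\|_F^2 \;+\; \lambda\,\Omega(A, P)\;-\;\gamma\,H(P),
\qquad H(P) = -\sum_{j}\sum_{c} p_{jc}\log p_{jc},\quad p_{jc}\ge 0,\ \sum_{c} p_{jc}=1 .
$$

Minimizing the $-\gamma H(P)$ term (i.e., keeping the entropy of the estimated class weights high) keeps the class estimates of genuinely ambiguous unlabeled samples soft, which is the "avoid overemphasizing uncertain classes" effect described in the abstract.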
Learning Safe Prediction for Semi-Supervised Regression
Li, Yu-Feng (Nanjing University) | Zha, Han-Wen (University of California, Santa Barbara) | Zhou, Zhi-Hua (Nanjing University)
Semi-supervised learning (SSL) concerns how to improve performance through the use of unlabeled data. Recent studies indicate that the use of unlabeled data can even deteriorate performance. Although some proposals have been developed to alleviate this fundamental challenge for semi-supervised classification, efforts on semi-supervised regression (SSR) remain limited. In this work we consider learning a safe prediction from multiple semi-supervised regressors, one that is not worse than a direct supervised learner trained on the labeled data alone. We cast this as a geometric projection problem and give an efficient algorithm. Furthermore, we show that the proposal is provably safe and achieves the maximal performance gain if the ground-truth label assignment can be realized by a convex linear combination of the base regressors. This provides insight into safe SSR. Experimental results on a broad range of datasets validate the effectiveness of our proposal.
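One way to read the geometric-projection formulation is sketched below: project the direct supervised learner's predictions on the unlabeled points onto the convex hull of the base semi-supervised regressors' predictions. This is a simplified reading with assumed names (F for the matrix of base predictions, y0 for the supervised predictions), not the authors' exact algorithm.

```python
import numpy as np

def project_to_simplex(v):
    """Euclidean projection of v onto the probability simplex."""
    u = np.sort(v)[::-1]
    css = np.cumsum(u)
    rho = np.nonzero(u * np.arange(1, len(v) + 1) > (css - 1.0))[0][-1]
    theta = (css[rho] - 1.0) / (rho + 1.0)
    return np.maximum(v - theta, 0.0)

def safe_prediction(F, y0, n_steps=500):
    """Project y0 onto the convex hull of the columns of F.

    F  : (n_unlabeled, n_regressors) predictions of the base SSR learners.
    y0 : (n_unlabeled,) predictions of the purely supervised learner.
    Returns the combined prediction F @ alpha with alpha on the simplex.
    """
    n_regressors = F.shape[1]
    alpha = np.full(n_regressors, 1.0 / n_regressors)
    # Step size 1/L for the gradient of ||F a - y0||^2 (L = 2 * sigma_max(F)^2).
    step = 1.0 / (2.0 * np.linalg.norm(F, 2) ** 2 + 1e-12)
    for _ in range(n_steps):
        grad = 2.0 * F.T @ (F @ alpha - y0)
        alpha = project_to_simplex(alpha - step * grad)
    return F @ alpha, alpha
```

Under the abstract's condition that the ground truth lies in the convex hull of the base regressors, a projection of this form cannot be farther from the truth than y0 itself, which is the intuition behind the safety guarantee.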
Generative Models and Model Criticism via Optimized Maximum Mean Discrepancy
Sutherland, Dougal J., Tung, Hsiao-Yu, Strathmann, Heiko, De, Soumyajit, Ramdas, Aaditya, Smola, Alex, Gretton, Arthur
We propose a method to optimize the representation and distinguishability of samples from two probability distributions, by maximizing the estimated power of a statistical test based on the maximum mean discrepancy (MMD). This optimized MMD is applied to the setting of unsupervised learning by generative adversarial networks (GAN), in which a model attempts to generate realistic samples, and a discriminator attempts to tell these apart from data samples. In this context, the MMD may be used in two roles: first, as a discriminator, either directly on the samples, or on features of the samples. Second, the MMD can be used to evaluate the performance of a generative model, by testing the model's samples against a reference data set. In the latter role, the optimized MMD is particularly helpful, as it gives an interpretable indication of how the model and data distributions differ, even in cases where individual model samples are not easily distinguished either by eye or by classifier.
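For reference, the quantity being optimized is the kernel MMD; below is a minimal NumPy sketch of the standard unbiased MMD² estimator with a Gaussian kernel. The paper's contribution is choosing the kernel (its bandwidth, or the features it acts on) to maximize the estimated test power; the fixed bandwidth here is an assumption for illustration, not the optimized choice.

```python
import numpy as np

def gaussian_kernel(A, B, bandwidth):
    # Pairwise squared Euclidean distances, then RBF kernel values.
    sq = (A**2).sum(1)[:, None] + (B**2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-sq / (2.0 * bandwidth**2))

def mmd2_unbiased(X, Y, bandwidth):
    """Unbiased estimate of MMD^2 between samples X (m, d) and Y (n, d)."""
    m, n = len(X), len(Y)
    Kxx = gaussian_kernel(X, X, bandwidth)
    Kyy = gaussian_kernel(Y, Y, bandwidth)
    Kxy = gaussian_kernel(X, Y, bandwidth)
    # Drop diagonal terms of the within-sample kernels for unbiasedness.
    term_xx = (Kxx.sum() - np.trace(Kxx)) / (m * (m - 1))
    term_yy = (Kyy.sum() - np.trace(Kyy)) / (n * (n - 1))
    return term_xx + term_yy - 2.0 * Kxy.mean()
```

In practice the bandwidth is often initialized with the median pairwise distance (the median heuristic) and then, as in the paper, adjusted to maximize an estimate of the test power.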
[R] Face Aging with Conditional Generative Adversarial Networks • /r/MachineLearning
The main novel tactic here is the use of a pre-built facial recognition/embedding network (FaceNet) to help determine the distance between the reference image and possible reconstruction vectors (the "Identity Preserving" approximation) before doing the age modifications and sample generation. I'm curious if there's a way to generate larger samples, though (in theory this could be combined with StackGAN or something, I guess?), since that seems like a requisite for a lot of the practical applications.
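To make the "Identity Preserving" step concrete, the sketch below simply ranks candidate reconstructions by their distance to the reference image in an embedding space. Here `embed` is a placeholder for any pretrained face-embedding model (FaceNet-style), not a specific library call, and the ranking is an assumed simplification of the optimization actually used.

```python
import numpy as np

def rank_by_identity(embed, reference_img, candidate_imgs):
    """Rank candidate reconstructions by embedding distance to the reference.

    `embed` is assumed to map an image to a fixed-size identity vector;
    smaller distance = better identity preservation.
    """
    ref = embed(reference_img)
    dists = [np.linalg.norm(embed(img) - ref) for img in candidate_imgs]
    order = np.argsort(dists)  # indices of candidates, best match first
    return order, dists
```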
Truncated Variational EM for Semi-Supervised Neural Simpletrons
Inference and learning for probabilistic generative networks are often very challenging and typically prevent scaling to networks as large as those used for deep discriminative approaches. To obtain efficiently trainable, large-scale and well-performing generative networks for semi-supervised learning, we combine two recent developments: a neural network reformulation of hierarchical Poisson mixtures (Neural Simpletrons), and a novel truncated variational EM approach (TV-EM). TV-EM provides theoretical guarantees for learning in generative networks, and its application to Neural Simpletrons results in particularly compact, yet approximately optimal, modifications of the learning equations. Applied to standard benchmarks, we empirically find that learning converges in fewer EM iterations, that the complexity per EM iteration is reduced, and that final likelihood values are higher on average. For classification on data sets with few labels, these learning improvements result in consistently lower error rates compared to applications without truncation. Experiments on the MNIST data set allow comparison to standard and state-of-the-art models in the semi-supervised setting. Further experiments on the NIST SD19 data set show the scalability of the approach when a large amount of additional unlabeled data is available.
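To make the truncation idea concrete, here is a minimal sketch of a truncated E-step for a flat (non-hierarchical) Poisson mixture: each data point keeps only its K most probable components in the posterior. This shows the core idea in its simplest form, with assumed variable names; it is not the TV-EM construction or the Neural Simpletron architecture of the paper.

```python
import numpy as np

def truncated_em_poisson_mixture(X, n_components, K=3, n_iter=50, seed=0):
    """Truncated EM for a flat Poisson mixture over count data X of shape (N, D)."""
    rng = np.random.default_rng(seed)
    N, D = X.shape
    # Initialize Poisson rates from randomly chosen data points.
    rates = X[rng.choice(N, size=n_components, replace=False)].astype(float) + 1e-3
    log_pi = np.full(n_components, -np.log(n_components))

    for _ in range(n_iter):
        # E-step: Poisson log-likelihood per component (x! terms dropped,
        # since they are constant across components).
        log_lik = X @ np.log(rates).T - rates.sum(axis=1)          # (N, C)
        log_post = log_lik + log_pi
        # Truncated E-step: keep only the K best components per data point.
        keep = np.argsort(log_post, axis=1)[:, -K:]
        mask = np.zeros_like(log_post, dtype=bool)
        np.put_along_axis(mask, keep, True, axis=1)
        log_post = np.where(mask, log_post, -np.inf)
        log_post = log_post - np.logaddexp.reduce(log_post, axis=1, keepdims=True)
        resp = np.exp(log_post)                                     # responsibilities

        # M-step: standard weighted updates for mixing weights and rates.
        Nk = resp.sum(axis=0) + 1e-12
        log_pi = np.log(Nk / N)
        rates = (resp.T @ X) / Nk[:, None] + 1e-6
    return rates, np.exp(log_pi)
```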
Estimating the class prior and posterior from noisy positives and unlabeled data
Jain, Shantanu, White, Martha, Radivojac, Predrag
We develop a classification algorithm for estimating posterior distributions from positive-unlabeled data that is robust to noise in the positive labels and effective for high-dimensional data. In recent years, several algorithms have been proposed to learn from positive-unlabeled data; however, many of these contributions remain theoretical and perform poorly on real high-dimensional data, which is typically contaminated with noise. We build on this previous work to develop two practical classification algorithms that explicitly model the noise in the positive labels and utilize univariate transforms built on discriminative classifiers. We prove that these univariate transforms preserve the class prior, enabling estimation in the univariate space and avoiding kernel density estimation for high-dimensional data. The theoretical development and the parametric and nonparametric algorithms proposed here constitute an important step towards widespread use of robust classification algorithms for positive-unlabeled data.
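As a point of reference for the "univariate transform" idea, the sketch below reduces positive-unlabeled data to a one-dimensional classifier score and reads a class-prior estimate off those scores in the classic Elkan-Noto style (assuming selected-completely-at-random, noise-free positive labels). It illustrates the general recipe only; the paper's noise-robust parametric and nonparametric estimators are different.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def estimate_class_prior_pu(X_pos, X_unlabeled, seed=0):
    """Estimate p(y=1) from positive and unlabeled data via a univariate score."""
    X = np.vstack([X_pos, X_unlabeled])
    s = np.concatenate([np.ones(len(X_pos)), np.zeros(len(X_unlabeled))])

    # Univariate transform: a discriminative classifier's score g(x) ~ p(s=1 | x).
    clf = LogisticRegression(max_iter=1000, random_state=seed).fit(X, s)
    g = clf.predict_proba(X)[:, 1]

    # Label frequency c = p(s=1 | y=1), estimated on the labeled positives.
    c = g[: len(X_pos)].mean()
    # Class prior pi = E[g(x)] / c, clipped to a valid probability.
    return float(np.clip(g.mean() / c, 0.0, 1.0))
```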
Multi-step learning and underlying structure in statistical models
In multi-step learning, where a final learning task is accomplished via a sequence of intermediate learning tasks, the intuition is that successive steps or levels transform the initial data into representations more and more "suited" to the final learning task. A related principle arises in transfer learning, where Baxter (2000) proposed a theoretical framework to study how learning multiple tasks transforms the inductive bias of a learner. The most widespread multi-step learning approach is semi-supervised learning (SSL) with two steps: unsupervised, then supervised. Several authors (Castelli-Cover, 1996; Balcan-Blum, 2005; Niyogi, 2008; Ben-David et al., 2008; Urner et al., 2011) have analyzed SSL, with Balcan-Blum (2005) proposing a version of the PAC learning framework augmented by a "compatibility function" to link the concept class and the unlabeled data distribution. We propose to analyze SSL and other multi-step learning approaches, much in the spirit of Baxter's framework, by defining a learning problem generatively as a joint statistical model on $X \times Y$. This determines in a natural way the class of conditional distributions that are possible with each marginal, and amounts to an abstract form of compatibility function. It also allows us to analyze both discrete and non-discrete settings. As a tool for our analysis, we define a notion of $\gamma$-uniform shattering for statistical models. We use this to give conditions on the marginal and conditional models which imply an advantage for multi-step learning approaches. In particular, we recover a more general version of a result of Poggio et al. (2012): under mild hypotheses, a multi-step approach which learns features invariant under successive factors of a finite group of invariances has sample complexity requirements that are additive rather than multiplicative in the size of the subgroups.
A Non-generative Framework and Convex Relaxations for Unsupervised Learning
We give a novel formal theoretical framework for unsupervised learning with two distinctive characteristics. First, it does not assume any generative model and is based on a worst-case performance metric. Second, it is comparative: performance is measured with respect to a given hypothesis class. This allows us to avoid known computational hardness results and lets improper algorithms based on convex relaxations come into play. We show how several families of unsupervised learning models, which were previously only analyzed under probabilistic assumptions and are otherwise provably intractable, can be efficiently learned in our framework by convex optimization.
Regularization With Stochastic Transformations and Perturbations for Deep Semi-Supervised Learning
Sajjadi, Mehdi, Javanmardi, Mehran, Tasdizen, Tolga
Effective convolutional neural networks are trained on large sets of labeled data. However, creating large labeled datasets is a very costly and time-consuming task. Semi-supervised learning uses unlabeled data to train a model with higher accuracy when there is a limited set of labeled data available. In this paper, we consider the problem of semi-supervised learning with convolutional neural networks. Techniques such as randomized data augmentation, dropout and random max-pooling provide better generalization and stability for classifiers that are trained using gradient descent. Multiple passes of an individual sample through the network might lead to different predictions due to the non-deterministic behavior of these techniques. We propose an unsupervised loss function that takes advantage of the stochastic nature of these methods and minimizes the difference between the predictions of multiple passes of a training sample through the network. We evaluate the proposed method on several benchmark datasets.
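A minimal sketch of the stability part of such an unsupervised loss, assuming the per-pass softmax outputs have already been collected into one array; the paper's full objective also includes a mutual-exclusivity term and the usual supervised loss on the labeled subset.

```python
import numpy as np

def stability_loss(predictions):
    """Unsupervised stability loss over multiple stochastic passes.

    predictions: array of shape (n_passes, n_samples, n_classes) holding the
    softmax outputs of the same mini-batch pushed through the network several
    times under random augmentation / dropout / random pooling. The loss sums
    squared differences over every pair of passes, and is zero only when all
    passes agree on every sample.
    """
    n_passes = predictions.shape[0]
    loss = 0.0
    for i in range(n_passes):
        for j in range(i + 1, n_passes):
            loss += np.sum((predictions[i] - predictions[j]) ** 2)
    return loss
```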