Unsupervised or Indirectly Supervised Learning
Learning with Local and Global Consistency
Zhou, Dengyong, Bousquet, Olivier, Lal, Thomas N., Weston, Jason, Schölkopf, Bernhard
We consider the general problem of learning from labeled and unlabeled data, which is often called semi-supervised learning or transductive inference. A principled approach to semi-supervised learning is to design a classifying function which is sufficiently smooth with respect to the intrinsic structure collectively revealed by known labeled and unlabeled points. We present a simple algorithm to obtain such a smooth solution. Our method yields encouraging experimental results on a number of classification problems and demonstrates effective use of unlabeled data.
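The smoothness-based approach described in this abstract is commonly realized as an iterative label-spreading scheme over a similarity graph. The sketch below is an illustrative reconstruction, not the paper's reference implementation: it assumes an RBF affinity, the symmetrically normalized similarity matrix, and the fixed-point iteration F ← αSF + (1−α)Y.

```python
import numpy as np

def label_spreading(X, y, alpha=0.99, sigma=1.0, n_iter=100):
    """Iterative label spreading: F <- alpha * S @ F + (1 - alpha) * Y.

    X: (n, d) array of points; y: length-n integer labels with -1
    marking unlabeled points. Returns a predicted label per point.
    """
    n = X.shape[0]
    # RBF affinity matrix with zero diagonal
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    W = np.exp(-d2 / (2 * sigma ** 2))
    np.fill_diagonal(W, 0.0)
    # Symmetric normalization S = D^{-1/2} W D^{-1/2}
    d = W.sum(axis=1)
    S = W / np.sqrt(np.outer(d, d))
    # One-hot label matrix Y; unlabeled points get all-zero rows
    classes = np.unique(y[y >= 0])
    Y = np.zeros((n, classes.size))
    for j, c in enumerate(classes):
        Y[y == c, j] = 1.0
    # Spread labels along the graph while anchoring to the seeds
    F = Y.copy()
    for _ in range(n_iter):
        F = alpha * (S @ F) + (1 - alpha) * Y
    return classes[F.argmax(axis=1)]
```

With only one labeled seed per cluster, the iteration propagates those labels to every point in the same dense region, which is the "local and global consistency" behavior the abstract refers to.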
Cluster Kernels for Semi-Supervised Learning
Chapelle, Olivier, Weston, Jason, Schölkopf, Bernhard
One of the first semi-supervised algorithms [1] was applied to web page classification. This is a typical example where the number of unlabeled examples can be made as large as possible since there are billions of web pages, but labeling is expensive since it requires human intervention. Since then, there has been a lot of interest in this paradigm in the machine learning community; an extensive review of existing techniques can be found in [10]. It has been shown experimentally that under certain conditions, the decision function can be estimated more accurately, yielding lower generalization error [1, 4, 6]. However, in a discriminative framework, it is not obvious to determine how unlabeled data or even the perfect knowledge of the input distribution P(x) can help in the estimation of the decision function.
Adaptation and Unsupervised Learning
Dayan, Peter, Sahani, Maneesh, Deback, Gregoire
Adaptation is a ubiquitous neural and psychological phenomenon, with a wealth of instantiations and implications. Although a basic form of plasticity, it has, bar some notable exceptions, attracted computational theory of only one main variety. In this paper, we study adaptation from the perspective of factor analysis, a paradigmatic technique of unsupervised learning. We use factor analysis to reinterpret a standard view of adaptation, and apply our new model to some recent data on adaptation in the domain of face discrimination.
Unsupervised Learning of Human Motion Models
Song, Yang, Goncalves, Luis, Perona, Pietro
This paper presents an unsupervised learning algorithm that can derive the probabilistic dependence structure of parts of an object (a moving human body in our examples) automatically from unlabeled data. The distinguishing feature of this work is that it is based on unlabeled data, i.e., the training features include both useful foreground parts and background clutter, and the correspondence between the parts and detected features is unknown. We use decomposable triangulated graphs to depict the probabilistic independence of parts, but the unsupervised technique is not limited to this type of graph. In the new approach, the labeling of the data (part assignments) is treated as a set of hidden variables and the EM algorithm is applied. A greedy algorithm is developed to select parts and to search for the optimal structure based on the differential entropy of these variables. The success of our algorithm is demonstrated by applying it to generate models of human motion automatically from unlabeled real image sequences.
Probabilistic principles in unsupervised learning of visual structure: human data and a model
Edelman, Shimon, Hiles, Benjamin P., Yang, Hwajin, Intrator, Nathan
To find out how the representations of structured visual objects depend on the co-occurrence statistics of their constituents, we exposed subjects to a set of composite images with tight control exerted over (1) the conditional probabilities of the constituent fragments, and (2) the value of Barlow's criterion of "suspicious coincidence" (the ratio of joint probability to the product of marginals). We then compared the part verification response times for various probe/target combinations before and after the exposure. For composite probes, the speedup was much larger for targets that contained pairs of fragments perfectly predictive of each other, compared to those that did not. This effect was modulated by the significance of their co-occurrence as estimated by Barlow's criterion. For lone-fragment probes, the speedup in all conditions was generally lower than for composites. These results shed light on the brain's strategies for unsupervised acquisition of structural information in vision.
Semi-supervised MarginBoost
d'Alché-Buc, Florence, Grandvalet, Yves, Ambroise, Christophe
In many discrimination problems a large amount of data is available but only a few examples are labeled. This provides a strong motivation to improve or develop methods for semi-supervised learning. In this paper, boosting is generalized to this task within the optimization framework of MarginBoost. We extend the margin definition to unlabeled data and develop the gradient descent algorithm that corresponds to the resulting margin cost function. This meta-learning scheme can be applied to any base classifier able to benefit from unlabeled data. We propose here to apply it to mixture models trained with an Expectation-Maximization algorithm. Promising results are presented on benchmarks with different rates of labeled data.
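The central idea in this abstract, extending the margin to unlabeled points, can be sketched as follows: with no label y available, the margin y·f(x) is replaced by the confidence |f(x)| of the current prediction. The snippet below is a hedged illustration of such a margin cost, assuming an exponential cost function; it is not the paper's exact formulation.

```python
import math

def margin_cost(f_values, labels, cost=lambda m: math.exp(-m)):
    """Average margin cost over labeled and unlabeled points.

    f_values: real-valued classifier outputs f(x).
    labels: +1 / -1 for labeled points, None for unlabeled points,
    whose margin y*f(x) is replaced by |f(x)| (one way of extending
    the margin to unlabeled data).
    """
    total = 0.0
    for f, y in zip(f_values, labels):
        m = y * f if y is not None else abs(f)
        total += cost(m)
    return total / len(f_values)
```

A boosting round would then take a gradient step to decrease this cost, rewarding confident predictions on unlabeled points as well as correct ones on labeled points.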