Supervising Unsupervised Learning
We introduce a framework to transfer knowledge acquired from a repository of (heterogeneous) supervised datasets to new unsupervised datasets. Our perspective avoids the subjectivity inherent in unsupervised learning by reducing it to supervised learning, and provides a principled way to evaluate unsupervised algorithms. We demonstrate the versatility of our framework via rigorous agnostic bounds on a variety of unsupervised problems. In the context of clustering, our approach helps choose the number of clusters and the clustering algorithm, remove outliers, and provably circumvent Kleinberg's impossibility result. Experiments across hundreds of problems demonstrate improvements in performance on unsupervised data with simple algorithms, even though our problems come from heterogeneous domains. Additionally, our framework lets us leverage deep networks to learn common features across many small datasets and perform zero-shot learning.
Reviews: Supervising Unsupervised Learning
By considering a probability distribution over a family of supervised datasets, the authors reduce several unsupervised tasks, such as selecting a clustering algorithm from a finite family or choosing the number of clusters, to a supervised learning problem that maps features of the input dataset to the desired output. For instance, to select the number of clusters, they learn a regression function across the family of datasets that returns a "correct" number of clusters. The submission seems technically sound. The authors support the claim that agnostic learning is possible with a theoretical analysis of two specific settings: choosing an algorithm from a finite family of algorithms, and choosing an algorithm from a family of single-linkage algorithms. Their framework also offers an alternative to the Scale-Invariance property introduced by Kleinberg (2003): the training datasets establish a scale, which yields the Meta-Scale-Invariance property. The authors then show that, with this version of Scale-Invariance, it is possible to learn a clustering algorithm that is also Consistent and Rich (as defined by Kleinberg (2003)).
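To make the finite-family setting concrete, here is a minimal sketch of how a labeled repository could pick a clustering algorithm by empirical risk minimization. It is an illustration under stated assumptions, not the authors' implementation: the 1-D toy data, the candidate threshold grid, the use of one minus the Rand index as the loss, and all function names (`single_linkage`, `rand_index`, `select_threshold`) are hypothetical choices made for this example.

```python
from itertools import combinations
from statistics import mean


def single_linkage(points, threshold):
    """Link any two 1-D points at distance <= threshold and return one
    cluster label per point (the connected components of that graph)."""
    parent = list(range(len(points)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(points)), 2):
        if abs(points[i] - points[j]) <= threshold:
            parent[find(i)] = find(j)
    return [find(i) for i in range(len(points))]


def rand_index(labels_a, labels_b):
    """Fraction of point pairs on which two clusterings agree
    (assumes at least two points)."""
    pairs = list(combinations(range(len(labels_a)), 2))
    agree = sum(
        (labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
        for i, j in pairs
    )
    return agree / len(pairs)


def select_threshold(training_sets, candidates):
    """Empirical risk minimization over a finite family of algorithms,
    here single-linkage clusterers indexed by their distance threshold."""
    def avg_loss(t):
        return mean(
            1.0 - rand_index(single_linkage(x, t), y) for x, y in training_sets
        )
    return min(candidates, key=avg_loss)


# Hypothetical labeled repository: (points, ground-truth cluster labels).
training_sets = [
    ([0.0, 0.4, 0.9, 6.0, 6.5], [0, 0, 0, 1, 1]),
    ([2.0, 2.3, 10.0, 10.6, 18.0], [0, 0, 1, 1, 2]),
    ([1.0, 1.5, 1.9, 7.5, 8.0, 8.2], [0, 0, 0, 1, 1, 1]),
]
candidates = [0.1, 1.0, 20.0]  # assumed finite family of thresholds

best = select_threshold(training_sets, candidates)

# Apply the selected algorithm to a new dataset that has no labels.
new_unlabeled = [3.0, 3.6, 9.0, 9.3, 15.0]
predicted = single_linkage(new_unlabeled, best)
print(best, predicted)
```

In this toy setup the labeled datasets are what fix the distance scale at which points count as belonging together, which is the intuition behind letting the training datasets establish a scale. The number of clusters on the new dataset then follows from the selected threshold rather than being chosen by hand.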
Supervising Unsupervised Learning with Evolutionary Algorithm in Deep Neural Network
We propose a method that uses an evolutionary algorithm to control the results of unsupervised gradient-descent learning in a deep neural network. To perform crossover between models trained without supervision, the algorithm evaluates the pointwise fitness of individual nodes in the neural network. Labeled training data is randomly sampled, and the breeding process selects nodes by calculating the degree of their consistency across different sets of sampled data. In this way, the evolutionary process supervises the unsupervised training. We also introduce a modified Restricted Boltzmann Machine that contains a repulsive force among nodes in the network, which helps isolate nodes from each other and avoids accidental degeneration of nodes during the evolutionary process. These new methods are applied to a document classification problem and yield better accuracy than a traditional fully supervised classifier implemented with a linear regression algorithm.