Unsupervised or Indirectly Supervised Learning
Muffled Semi-Supervised Learning
Balsubramani, Akshay, Freund, Yoav
We explore a novel approach to semi-supervised learning. It runs contrary to common practice in that the unlabeled examples serve to "muffle," rather than enhance, the guidance provided by the labeled examples. We provide several variants of the basic algorithm and show experimentally that they can achieve significantly higher AUC than boosted trees, random forests, and logistic regression when unlabeled examples are available.
Unsupervised Machine Learning Could Help Us Solve the Unsolvable
In contrast, unsupervised learning systems freely analyze 'patterns' in unlabeled data, with no corresponding error or reward linked to a conclusion. This works with 'unlabeled data' and is similar to 'associative' or 'discovery' learning in humans, something that we do very well (and often take for granted). For example, when an unsupervised system is asked to sort or arrange fruits based on raw observations, the system might 'choose' to arrange the fruit based on recognition of color, placing strawberries and cherries in the 'red' category; or, the system might sort based on observed sizes, grouping pears, apples, and oranges in a 'medium-sized' fruit category. This latter method is commonly known as 'clustering' and is the accepted approach these systems use to categorize information. Unsupervised learning is a stepping stone, a means to another end such as categorization or finding potential correlations or solutions unable to be spotted by humans or supervised learning systems alone.
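To make the fruit example concrete, here is a toy clustering sketch (the features and their values are hypothetical). With two features, k-means will group the fruit by whichever axis separates the data more cleanly, which is exactly the "system chooses" behavior described above:

```python
import numpy as np
from sklearn.cluster import KMeans

# Hypothetical fruit observations: [redness 0-1, size in cm].
fruits = {"strawberry": [0.9, 3], "cherry": [0.95, 2],
          "pear": [0.2, 10], "apple": [0.6, 8], "orange": [0.3, 9]}
X = np.array(list(fruits.values()))

# No labels are given: the algorithm groups by whatever structure
# dominates the chosen features (here, color vs. size).
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
for name, cluster in zip(fruits, km.labels_):
    print(name, "-> cluster", cluster)
```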
Question regarding convergence proof for Generative Adversarial Networks • /r/MachineLearning
While reading the Generative Adversarial Networks paper I encountered the following statement in section 4.2: "Proof. Consider $V(G, D) = U(p_g, D)$ as a function of $p_g$ as done in the above criterion. Note that $U(p_g, D)$ is convex in $p_g$." As far as I understand, $p_g$ is a probability distribution, which in this case means that $U(p_g, D)$ is a function of many variables (the parameters of both $p_g$ and $D$ combined). To say anything about the convexity of $U(p_g, D)$, we need to calculate the Hessian of $U(p_g, D)$, don't we?
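For what it's worth, a sketch of why no Hessian is needed there (my reading of the paper, assuming the standard GAN value function): the convexity claim is about $U$ as a functional of the distribution $p_g$ itself, not of the generator's parameters. Writing out the criterion,

```latex
U(p_g, D) = \mathbb{E}_{x \sim p_{\mathrm{data}}}\left[\log D(x)\right]
          + \mathbb{E}_{x \sim p_g}\left[\log\big(1 - D(x)\big)\right]
          = \int p_{\mathrm{data}}(x)\,\log D(x)\,dx
          + \int p_g(x)\,\log\big(1 - D(x)\big)\,dx
```

the second term is an integral of a fixed function against $p_g$, so for each fixed $D$ the map $p_g \mapsto U(p_g, D)$ is affine, hence convex; and $\sup_D U(p_g, D)$, a pointwise supremum of affine functions, is convex in $p_g$ as well. No Hessian over network parameters is involved.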
Large Scale Distributed Semi-Supervised Learning Using Streaming Approximation
Traditional graph-based semi-supervised learning (SSL) approaches, even though widely applied, are not suited for massive data and large label scenarios since they scale linearly with the number of edges $|E|$ and distinct labels $m$. To deal with the large label size problem, recent works propose sketch-based methods to approximate the distribution on labels per node thereby achieving a space reduction from $O(m)$ to $O(\log m)$, under certain conditions. In this paper, we present a novel streaming graph-based SSL approximation that captures the sparsity of the label distribution and ensures the algorithm propagates labels accurately, and further reduces the space complexity per node to $O(1)$. We also provide a distributed version of the algorithm that scales well to large data sizes. Experiments on real-world datasets demonstrate that the new method achieves better performance than existing state-of-the-art algorithms with significant reduction in memory footprint. We also study different graph construction mechanisms for natural language applications and propose a robust graph augmentation strategy trained using state-of-the-art unsupervised deep learning architectures that yields further significant quality gains.
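For context, a minimal sketch of the dense label propagation baseline that such sketch-based methods compress: each node carries a full $m$-dimensional label distribution, which is exactly the $O(m)$ per-node cost the paper reduces. This is an illustrative re-implementation of the classic algorithm, not the paper's code:

```python
import numpy as np

def label_propagate(adj, seed_labels, m, iters=10):
    """Classic graph label propagation with dense O(m) label
    distributions per node. `adj` maps node -> [(neighbor, weight)];
    `seed_labels` maps labeled nodes to a label id in [0, m)."""
    dist = {v: np.zeros(m) for v in adj}
    for v, y in seed_labels.items():
        dist[v][y] = 1.0
    for _ in range(iters):
        new = {}
        for v in adj:
            if v in seed_labels:          # labeled nodes stay clamped
                new[v] = dist[v]
                continue
            agg = np.zeros(m)
            for u, w in adj[v]:
                agg += w * dist[u]        # weighted neighbor average
            total = agg.sum()
            new[v] = agg / total if total > 0 else agg
        dist = new
    return dist
```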
What Is Unsupervised Learning?
The one point that I want to emphasize here is that the adjective "unsupervised" does not mean that these algorithms run by themselves without human supervision. It simply indicates the absence of a desired or ideal output corresponding to each input. An analyst (or a data scientist) who is training an unsupervised learning model has to exercise the same kind of modeling discipline, and can exert a similar degree of control over the resulting output by configuring model parameters, as one who is training a supervised model. While supervised algorithms derive a mapping function from x to y so as to accurately estimate the y's corresponding to new x's, unsupervised algorithms employ predefined distance/similarity functions to map the distribution of input x's.
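As an illustration of that last point, a small sketch in which the analyst's choices (the distance function, the linkage, the number of clusters) play the role that labeled outputs play in supervised learning; the data here is hypothetical:

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import pdist

# Hypothetical unlabeled inputs x: no ideal output y is given.
rng = np.random.default_rng(0)
x = np.vstack([rng.normal(0, 1, (50, 4)),      # one latent group
               rng.normal(5, 1, (50, 4))])     # another latent group

# The "supervision" here is the analyst's choice of a predefined
# distance function, a linkage rule, and a cut level -- not labels.
dists = pdist(x, metric="euclidean")
tree = linkage(dists, method="average")
groups = fcluster(tree, t=2, criterion="maxclust")  # analyst picks k=2
```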
Unsupervised and Semi-supervised Learning with Categorical Generative Adversarial Networks
In this paper we present a method for learning a discriminative classifier from unlabeled or partially labeled data. Our approach is based on an objective function that trades off mutual information between observed examples and their predicted categorical class distribution against robustness of the classifier to an adversarial generative model. The resulting algorithm can either be interpreted as a natural generalization of the generative adversarial networks (GAN) framework or as an extension of the regularized information maximization (RIM) framework to robust classification against an optimal adversary. We empirically evaluate our method - which we dub categorical generative adversarial networks (or CatGAN) - on synthetic data as well as on challenging image classification tasks, demonstrating the robustness of the learned classifiers. We further qualitatively assess the fidelity of samples generated by the adversarial generator that is learned alongside the discriminative classifier, and identify links between the CatGAN objective and discriminative clustering algorithms (such as RIM).
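To unpack the first ingredient: the mutual information between inputs and predicted classes decomposes into a marginal-entropy term minus an expected conditional-entropy term. A minimal numpy sketch of that decomposition (my paraphrase of the objective's building blocks, not the authors' code):

```python
import numpy as np

def entropy(p, axis=-1, eps=1e-12):
    """Shannon entropy of (batched) categorical distributions."""
    return -np.sum(p * np.log(p + eps), axis=axis)

def mutual_information(probs):
    """probs: (batch, K) softmax outputs p(y|x) of the classifier.
    I(x; y) = H[p(y)] - E_x H[p(y|x)]: high when the classifier is
    confident per example (low conditional entropy) but spreads its
    predictions across classes overall (high marginal entropy)."""
    marginal = probs.mean(axis=0)           # batch estimate of p(y)
    return entropy(marginal) - entropy(probs).mean()
```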
Unsupervised learning with Geospatial & Location data, best place to start? • /r/MachineLearning
Unsupervised learning with Geospatial & Location data, best place to start? Say I have a dataset of thousands of events and their location & time. I want to predict the location of future events. You may want to look into spatial-temporal modelling. As far as I know, there is more statistical modelling than machine learning in this field.
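Building on that pointer, one common statistical starting point is to treat past events as samples from a space-time intensity, fit a kernel density estimate, and score candidate locations at a future time. A minimal sketch; the coordinates, bandwidth, and time scaling are all hypothetical modeling choices:

```python
import numpy as np
from sklearn.neighbors import KernelDensity

# Hypothetical events: rows of (latitude, longitude, hour-of-week).
events = np.array([[37.77, -122.42, 18.5],
                   [37.76, -122.41, 19.0],
                   [37.80, -122.44, 2.0]])

# Rescale so one unit of time is comparable to one unit of space;
# the scaling factor is a modeling choice, not a given.
scale = np.array([1.0, 1.0, 0.01])
kde = KernelDensity(bandwidth=0.05).fit(events * scale)

# Score candidate locations at a future time to rank likely spots.
candidates = np.array([[37.78, -122.43, 18.0]]) * scale
print(kde.score_samples(candidates))  # log-density: higher = more likely
```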
Towards Safe Semi-Supervised Learning for Multivariate Performance Measures
Li, Yu-Feng (Nanjing University) | Kwok, James T. (Hong Kong University of Science and Technology) | Zhou, Zhi-Hua (Nanjing University, China)
Semi-supervised learning (SSL) is an important research problem in machine learning. While it is usually expected that the use of unlabeled data can improve performance, in many cases SSL is outperformed by supervised learning using only labeled data. Hence the construction of performance-safe SSL methods has become a key issue in SSL research. Motivated by the need for various performance measures in practical tasks, we propose in this paper the UMVP (safe semi-sUpervised learning for MultiVariate Performance measure) method. The proposed method integrates multiple semi-supervised learners and maximizes the worst-case performance gain to derive the final prediction. The overall problem is formulated as a maximin optimization. In order to solve the resultant difficult maximin optimization, this paper shows that when the performance measure is the Top-$k$ precision, $F_\beta$ score, or AUC, a minimax convex relaxation of the maximin optimization can be solved efficiently. Experimental results show that the proposed method can effectively improve the safeness of SSL under multiple multivariate performance measures.
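To make the maximin shape of the problem concrete, a toy sketch: combine several learners' scores with convex weights and maximize the worst-case gain over candidate ground truths, here taken (as a crude simplification of the paper's setup) to be the base learners' own sign patterns, with a linear margin surrogate standing in for the actual multivariate measures:

```python
import numpy as np
from scipy.optimize import minimize

def worst_case_gain(w, preds, baseline):
    """preds: (k, n) real-valued scores from k semi-supervised
    learners; baseline: (n,) scores from the supervised-only model.
    Gain = margin agreement of the combination minus that of the
    baseline, under each candidate ground truth."""
    combo = w @ preds
    gains = [np.mean(t * combo) - np.mean(t * baseline)
             for t in np.sign(preds)]
    return min(gains)

def maximin_combine(preds, baseline):
    """Maximin over convex weights on the base learners: a toy
    continuous relaxation, not the paper's minimax convex relaxation."""
    k = preds.shape[0]
    res = minimize(lambda w: -worst_case_gain(w, preds, baseline),
                   x0=np.full(k, 1.0 / k),
                   bounds=[(0.0, 1.0)] * k,
                   constraints=({"type": "eq",
                                 "fun": lambda w: np.sum(w) - 1.0},))
    return res.x @ preds
```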
Unsupervised Learning of HTNs in Complex Adversarial Domains
Leece, Michael A. (University of California, Santa Cruz)
While Hierarchical Task Networks are frequently cited as flexible and powerful planning models, they are often ignored because of the intensive labor they demand of experts and programmers, who must create and refine the model by hand. While recent work has begun to address this issue by learning aspects of an HTN model from demonstration, or even the whole framework, the focus so far has been on simple domains, which lack many of the challenges faced in the real world such as imperfect information and real-time environments. I plan to extend this work using the domain of real-time strategy (RTS) games, which have gained recent popularity as a challenging and complex domain for AI research.
Large-Scale Graph-Based Semi-Supervised Learning via Tree Laplacian Solver
Zhang, Yan-Ming (Institute of Automation, Chinese Academy of Sciences) | Zhang, Xu-Yao (Institute of Automation, Chinese Academy of Sciences) | Yuan, Xiao-Tong (Nanjing University of Information Science and Technology) | Liu, Cheng-Lin (Institute of Automation, Chinese Academy of Sciences)
Graph-based semi-supervised learning is one of the most popular and successful semi-supervised learning methods. Typically, it predicts the labels of unlabeled data by minimizing a quadratic objective induced by the graph, which is unfortunately a procedure of polynomial complexity in the sample size $n$. In this paper, we address this scalability issue by proposing a method that approximately solves the quadratic objective in nearly linear time. The method consists of two steps: it first approximates the graph by a minimum spanning tree, and then solves the tree-induced quadratic objective function in $O(n)$ time, which is the main contribution of this work. Extensive experiments show a significant scalability improvement over existing scalable semi-supervised learning methods.
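A rough sketch of the abstract's two steps using scipy, with two caveats: a generic sparse solve stands in for the paper's specialized $O(n)$ tree solver, and the input is assumed to be a distance matrix whose tree edges are converted to similarities afterwards (a common, but here hypothetical, choice):

```python
import numpy as np
from scipy.sparse import csr_matrix, diags
from scipy.sparse.csgraph import minimum_spanning_tree
from scipy.sparse.linalg import spsolve

def tree_ssl(dist_matrix, labeled_idx, labels, reg=1.0):
    """Step 1: approximate the graph by its minimum spanning tree.
    Step 2: solve the tree-induced quadratic objective
        min_f  f^T L f + reg * sum_{i labeled} (f_i - y_i)^2
    (a generic sparse solver here; the paper does this in O(n))."""
    n = dist_matrix.shape[0]
    mst = minimum_spanning_tree(csr_matrix(dist_matrix))
    mst.data = np.exp(-mst.data)          # distances -> edge similarities
    w = mst + mst.T                       # symmetrize tree edges
    lap = diags(np.asarray(w.sum(axis=1)).ravel()) - w  # tree Laplacian L
    y = np.zeros(n)
    y[labeled_idx] = labels               # e.g. +/-1 seed labels
    mask = np.zeros(n)
    mask[labeled_idx] = 1.0
    A = (lap + reg * diags(mask)).tocsr()
    return spsolve(A, reg * y)            # predicted scores f for all nodes
```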