 Unsupervised or Indirectly Supervised Learning


Semi-Supervised AUC Optimization Without Guessing Labels of Unlabeled Data

AAAI Conferences

Semi-supervised learning, which aims to construct learners that automatically exploit the large amount of unlabeled data in addition to the limited labeled data, has been widely applied in many real-world applications. AUC is a well-known performance measure for a learner, and directly optimizing AUC may result in better prediction performance. Thus, semi-supervised AUC optimization has drawn much attention. Existing semi-supervised AUC optimization methods exploit unlabeled data by explicitly or implicitly estimating the possible labels of the unlabeled data based on various distributional assumptions. However, these assumptions may be violated in many real-world applications, and estimating labels based on a violated assumption may lead to poor performance. In this paper, we argue that, in semi-supervised AUC optimization, it is unnecessary to guess the possible labels of the unlabeled data or the prior probability based on any distributional assumptions. We analytically show that the AUC risk can be estimated without bias by simply treating the unlabeled data as both positive and negative. Based on this finding, two semi-supervised AUC optimization methods named Samult and Sampura are proposed. Experimental results indicate that the proposed methods outperform the existing methods.
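As a rough illustration of the quantity being optimized, the sketch below computes the empirical AUC as the fraction of correctly ranked (positive, negative) pairs. It is not the paper's Samult or Sampura procedure; the function name `empirical_auc` and the half-credit convention for ties are assumptions made for this example.

```python
import numpy as np


def empirical_auc(pos_scores, neg_scores):
    """Fraction of (positive, negative) pairs ranked correctly.

    Tied pairs receive half credit (an assumed, though common, convention).
    """
    pos = np.asarray(pos_scores, dtype=float)[:, None]  # shape (n_pos, 1)
    neg = np.asarray(neg_scores, dtype=float)[None, :]  # shape (1, n_neg)
    # Broadcasting compares every positive score with every negative score.
    return float(np.mean((pos > neg) + 0.5 * (pos == neg)))
```

If the "negative" side is replaced by an unlabeled sample, the result is exactly the size-weighted average of the AUC against the unlabeled positives and the AUC against the unlabeled negatives. That decomposition gives some intuition for why unlabeled points can stand in for a class without guessing their individual labels.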


Interpretable Graph-Based Semi-Supervised Learning via Flows

AAAI Conferences

In this paper, we consider the interpretability of the foundational Laplacian-based semi-supervised learning approaches on graphs. We introduce a novel flow-based learning framework that subsumes the foundational approaches and additionally provides a detailed, transparent, and easily understood expression of the learning process in terms of graph flows. As a result, one can visualize and interactively explore the precise subgraph along which the information from labeled nodes flows to an unlabeled node of interest. Surprisingly, the proposed framework does not trade accuracy for interpretability; in fact, it leads to improved prediction accuracy, which is supported both by theoretical considerations and empirical results. The flow-based framework guarantees the maximum principle by construction and can handle directed graphs in an out-of-the-box manner.
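For context, one of the classical Laplacian-based approaches is the harmonic-function solution, in which each unlabeled node takes the weighted average of its neighbors' values. The sketch below shows that baseline only, not the paper's flow-based framework. The function name `harmonic_labels` is hypothetical, and the code assumes a symmetric, non-negative weight matrix in which every connected component contains at least one labeled node.

```python
import numpy as np


def harmonic_labels(W, labeled_idx, labeled_values):
    """Harmonic-function label propagation on an undirected weighted graph.

    Solves L_uu f_u = W_ul f_l, where L = D - W is the graph Laplacian,
    u indexes unlabeled nodes and l indexes labeled nodes.
    """
    W = np.asarray(W, dtype=float)
    n = W.shape[0]
    labeled_idx = np.asarray(labeled_idx, dtype=int)
    labeled_values = np.asarray(labeled_values, dtype=float)
    unlabeled_idx = np.setdiff1d(np.arange(n), labeled_idx)

    L = np.diag(W.sum(axis=1)) - W  # unnormalized graph Laplacian
    L_uu = L[np.ix_(unlabeled_idx, unlabeled_idx)]
    W_ul = W[np.ix_(unlabeled_idx, labeled_idx)]

    f = np.zeros(n)
    f[labeled_idx] = labeled_values  # labeled nodes keep their given values
    f[unlabeled_idx] = np.linalg.solve(L_uu, W_ul @ labeled_values)
    return f
```

On a four-node path graph with the two end nodes labeled 0 and 1, the interior nodes come out as 1/3 and 2/3. Every unlabeled value lies between the smallest and largest labeled value, which is the maximum principle the abstract refers to.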


Consistent and Specific Multi-View Subspace Clustering

AAAI Conferences

Multi-view clustering has attracted intensive attention due to the effectiveness of exploiting multiple views of data. However, most existing multi-view clustering methods only aim to explore the consistency or enhance the diversity of different views. In this paper, we propose a novel multi-view subspace clustering method (CSMSC), where consistency and specificity are jointly exploited for subspace representation learning. We formulate the multi-view self-representation property using a shared consistent representation and a set of specific representations, which better fits real-world datasets. Specifically, consistency models the common properties among all views, while specificity captures the inherent difference in each view. In addition, to optimize the non-convex problem, we introduce a convex relaxation and develop an alternating optimization algorithm to recover the corresponding data representations. Experimental evaluations on four benchmark datasets demonstrate that the proposed approach achieves better performance than several state-of-the-art methods.


Algorithms for Generalized Topic Modeling

AAAI Conferences

Recently there has been significant activity in developing algorithms with provable guarantees for topic modeling. In this work we consider a broad generalization of the traditional topic modeling framework, where we no longer assume that words are drawn i.i.d. and instead view a topic as a complex distribution over sequences of paragraphs. Since one could not hope to even represent such a distribution in general (even if paragraphs are given using some natural feature representation), we aim instead to directly learn a predictor that, given a new document, accurately predicts its topic mixture, without learning the distributions explicitly. We present several natural conditions under which one can do this from unlabeled data only, and give efficient algorithms to do so, also discussing issues such as noise tolerance and sample complexity. More generally, our model can be viewed as a generalization of the multi-view or co-training setting in machine learning.


Model-Free Iterative Temporal Appliance Discovery for Unsupervised Electricity Disaggregation

AAAI Conferences

Electricity disaggregation identifies individual appliances from one or more aggregate data streams and has immense potential to reduce residential and commercial electrical waste. Since supervised learning methods rely on meticulously labeled training samples that are expensive to obtain, unsupervised methods show the most promise for widespread application. However, unsupervised learning methods previously applied to electricity disaggregation suffer from critical limitations. This paper introduces the concept of iterative appliance discovery, a novel unsupervised disaggregation method that progressively identifies the "easiest to find" or "most likely" appliances first. Once these simpler appliances have been identified, the computational complexity of the search space can be significantly reduced, enabling iterative discovery to identify more complex appliances. We test iterative appliance discovery against an existing competitive unsupervised method using two publicly available datasets. Results using different sampling rates show that iterative discovery has faster runtimes and produces better accuracy. Furthermore, iterative discovery does not require prior knowledge of appliance characteristics and demonstrates unprecedented scalability to identify long, overlapped sequences that other unsupervised learning algorithms cannot.


Contrastive Training for Models of Information Cascades

AAAI Conferences

This paper proposes a model of information cascades as directed spanning trees (DSTs) over observed documents. In addition, we propose a contrastive training procedure that exploits partial temporal ordering of node infections in lieu of labeled training links. This combination of model and unsupervised training makes it possible to improve on models that use infection times alone and to exploit arbitrary features of the nodes and of the text content of messages in information cascades. With only basic node and time lag features similar to previous models, the DST model achieves performance with unsupervised training comparable to strong baselines on a blog network inference task. Unsupervised training with additional content features achieves significantly better results, reaching half the accuracy of a fully supervised model.


Activation Maximization Generative Adversarial Nets

arXiv.org Artificial Intelligence

Class labels have been empirically shown to be useful in improving the sample quality of generative adversarial nets (GANs). In this paper, we mathematically study the properties of the current variants of GANs that make use of class label information. With class-aware gradient and cross-entropy decomposition, we reveal how class labels and associated losses influence GAN training. Based on that, we propose Activation Maximization Generative Adversarial Networks (AM-GAN) as an advanced solution. Comprehensive experiments have been conducted to validate our analysis and evaluate the effectiveness of our solution, where AM-GAN outperforms other strong baselines and achieves a state-of-the-art Inception Score (8.91) on CIFAR-10. In addition, we demonstrate that, with the Inception ImageNet classifier, Inception Score mainly tracks the diversity of the generator, and there is no reliable evidence that it reflects the true sample quality. We thus propose a new metric, called AM Score, to provide a more accurate estimate of the sample quality. Our proposed model also outperforms the baseline methods on the new metric.
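For readers unfamiliar with the metric, the Inception Score is conventionally defined as the exponential of the average KL divergence between each sample's predicted class distribution and the marginal class distribution. The sketch below computes that quantity from a matrix of class probabilities. The function name `inception_score` and the `eps` smoothing term are assumptions for this example, and the probabilities are taken as given rather than produced by an actual Inception network. The AM Score is not sketched because the abstract does not define it.

```python
import numpy as np


def inception_score(probs, eps=1e-12):
    """exp(mean_x KL(p(y|x) || p(y))) for a (num_samples, num_classes) matrix.

    Each row of `probs` is assumed to be a valid class distribution p(y|x).
    `eps` only guards against log(0); it is not part of the definition.
    """
    p_yx = np.asarray(probs, dtype=float)
    p_y = p_yx.mean(axis=0, keepdims=True)  # marginal class distribution
    kl = np.sum(p_yx * (np.log(p_yx + eps) - np.log(p_y + eps)), axis=1)
    return float(np.exp(kl.mean()))
```

The score equals 1 when every sample receives the same prediction and approaches the number of classes when predictions are confident and evenly spread across classes. This is consistent with the abstract's observation that the score is sensitive to generator diversity.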


A Dozen Times Artificial Intelligence Startled The World

@machinelearnbot

Generative Adversarial Networks (GANs) are some of the most fascinating ways to "teach" computers to do human tasks. We've always heard that competition can boost performance, but now GANs are taking "learning from Competition" to an industrial scale. Generative Adversarial Networks are defined by AI entities (Neural Networks) that compete with each other to get better at their respective tasks. Imagine a Malware bot competing against a Security bot, each relentlessly trying to execute its own objective (e.g. First coined by Ian Goodfellow from the University of Montreal, GANs have recently shown us the power of "Unsupervised Learning" due to their widespread success.


Today's Deep Dive: Innovative Unsupervised Learning in AI

#artificialintelligence

Categorically, artificial intelligence (AI) can appear to be an odd juxtaposition of order and disorder -- we direct the AI with algorithms, yet the system produces new insights seemingly magically. Most of the well-known applications of machine learning and computational AI involve supervised learning. The modeler amasses a vast set of existing data (e.g., financial transactions, internet photographs, or the texts of tweets) and a base-level "ground truth" outcome that is already known, perhaps in retrospect or by expensive human investigation. Equipped with any number of computational algorithms, the scientist becomes the "supervisor" whose code trains the model to reproduce, in the lab, the known outcomes with a low probability of error. The models are then deployed to live a happy life scoring credit risk and fraud likelihood, finding pictures of Chihuahuas and muffins, or flagging insulting tweets.


Decoder from seq2seq for Generative Adversarial Networks

#artificialintelligence

I am currently researching free text generation using Generative Adversarial Networks. Before somebody tells me that GANs don't work well with discrete data (at least when they are trained with gradient descent), it's true, and I know :D. Still, I am getting some results and I would like to continue this line of study :D. I am struggling a little bit with the architecture of the generator. I am using TensorFlow for this purpose. My generator consists of a decoder network (the last part of a seq2seq), like the one in the image below.