 Statistical Learning


G-Optimal Design with Laplacian Regularization

AAAI Conferences

In many real-world applications, labeled data are usually expensive to get, while there may be a large amount of unlabeled data. To reduce the labeling cost, active learning attempts to discover the most informative data points for labeling. Recently, Optimal Experimental Design (OED) techniques have attracted an increasing amount of attention. OED is concerned with designing experiments that minimize the variances of a parameterized model. Typical design criteria include D-, A-, and E-optimality. However, all these criteria are based on an ordinary linear regression model, which aims to minimize the empirical error, whereas the geometrical structure of the data space is not well respected. In this paper, we propose a novel optimal experimental design approach for active learning, called Laplacian G-Optimal Design (LapGOD), which considers both discriminating and geometrical structures. By using Laplacian Regularized Least Squares, which incorporates manifold regularization into linear regression, our proposed algorithm selects those data points that minimize the maximum variance of the predicted values on the data manifold. We also extend our algorithm to the nonlinear case by using the kernel trick. The experimental results on various image databases have shown that our proposed LapGOD active learning algorithm can significantly enhance the classification accuracy if the selected data points are used as training data.
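The manifold regularization term mentioned in the abstract above can be illustrated with a small sketch. It is not the LapGOD algorithm itself: it only builds a graph Laplacian L = D - W and evaluates the smoothness penalty f^T L f, which is small when predictions vary little between neighboring points. The unweighted k-nearest-neighbor graph, the toy points, and the function names are assumptions made for illustration.

```python
def knn_graph(points, k):
    """Symmetric 0/1 adjacency matrix linking each point to its k nearest neighbors."""
    n = len(points)
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        # Squared Euclidean distance from point i to every other point.
        dists = sorted(
            (sum((a - b) ** 2 for a, b in zip(points[i], points[j])), j)
            for j in range(n) if j != i
        )
        for _, j in dists[:k]:
            W[i][j] = W[j][i] = 1.0
    return W


def laplacian(W):
    """Graph Laplacian L = D - W, where D holds the row sums of W on its diagonal."""
    n = len(W)
    return [[(sum(W[i]) if i == j else 0.0) - W[i][j] for j in range(n)]
            for i in range(n)]


def smoothness(L, f):
    """Quadratic form f^T L f, equal to the sum over edges of (f_i - f_j)^2."""
    n = len(f)
    return sum(f[i] * L[i][j] * f[j] for i in range(n) for j in range(n))


# Two well-separated toy clusters of three points each (hypothetical data).
points = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
L = laplacian(knn_graph(points, k=2))

smooth_f = [1, 1, 1, -1, -1, -1]   # constant within each cluster
rough_f = [1, -1, 1, -1, 1, -1]    # changes sign between neighbors

penalty_smooth = smoothness(L, smooth_f)
penalty_rough = smoothness(L, rough_f)
```

A labeling that respects the cluster structure incurs no penalty here, while one that flips between neighbors is penalized. Adding such a term to a least-squares objective is the general idea behind Laplacian regularization.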


Dirichlet Process Mixtures of Generalized Linear Models

arXiv.org Machine Learning

We propose Dirichlet Process mixtures of Generalized Linear Models (DP-GLM), a new method of nonparametric regression that accommodates continuous and categorical inputs, and responses that can be modeled by a generalized linear model. We prove conditions for the asymptotic unbiasedness of the DP-GLM regression mean function estimate. We also give examples for when those conditions hold, including models for compactly supported continuous distributions and a model with continuous covariates and categorical response. We empirically analyze the properties of the DP-GLM and why it provides better results than existing Dirichlet process mixture regression models. We evaluate DP-GLM on several data sets, comparing it to modern methods of nonparametric regression like CART, Bayesian trees and Gaussian processes. Compared to existing techniques, the DP-GLM provides a single model (and corresponding inference algorithms) that performs well in many regression settings.


Gaussian Mixture Modeling with Gaussian Process Latent Variable Models

arXiv.org Machine Learning

Density modeling is notoriously difficult for high dimensional data. One approach to the problem is to search for a lower dimensional manifold which captures the main characteristics of the data. Recently, the Gaussian Process Latent Variable Model (GPLVM) has successfully been used to find low dimensional manifolds in a variety of complex data. The GPLVM consists of a set of points in a low dimensional latent space, and a stochastic map to the observed space. We show how it can be interpreted as a density model in the observed space. However, the GPLVM is not trained as a density model and therefore yields bad density estimates. We propose a new training strategy and obtain improved generalisation performance and better density estimates in comparative evaluations on several benchmark data sets.


Maximum Causal Entropy Correlated Equilibria for Markov Games

AAAI Conferences

In this work, we present maximum causal entropy correlated equilibria, a new solution concept that we apply to Markov games. This contribution extends the existing solution concept of maximum entropy correlated equilibria for normal-form games to settings with elements of dynamic interaction with a stochastic environment by employing the recently developed principle of maximum causal entropy. This solution concept is justified for two purposes: as a mechanism for prescribing actions, it reveals the least additional information about the agents' motives possible; and as a predictive estimator of actions for a group of agents assumed to behave according to an unknown correlated equilibrium, it has the fewest additional assumptions and minimizes worst-case action prediction log-loss. Importantly, equilibria for this solution concept are guaranteed to be unique and Markovian, enabling efficient algorithms for finding them.


Envisioning a Robust, Scalable Metacognitive Architecture Built on Dimensionality Reduction

AAAI Conferences

One major challenge of implementing a metacognitive architecture lies in its scalability and flexibility. We postulate that the difference between a reasoner and a metareasoner need not extend beyond what inputs they take, and we envision a network made of many instances of a few types of simple but powerful reasoning units to serve both roles. In this paper, we present a vision and motivation for such a framework with reusable, robust, and scalable components. This framework, called Scruffy Metacognition, is built on a symbolic representation that lends itself to processing using dimensionality reduction and principal component analysis. We discuss the components of such a system and how they work together for metacognitive reasoning. Additionally, we discuss evaluative tasks for our system focusing on social agent role-playing and object classification.


Learning to Extract Quality Discourse in Online Communities

AAAI Conferences

Collaborative filtering systems have been developed to manage information overload and improve discourse in online communities. In such systems, users rank content provided by other users on its validity or usefulness within their particular context. The goal is that "good" content will rise to prominence and "bad" content will fade into obscurity. These filtering mechanisms are not well-understood and have known weaknesses. For example, they depend on the presence of a large crowd to rate content, but such a crowd may not be present. Additionally, the community's decisions determine which voices will reach a large audience and which will be silenced, but it is not known if these decisions represent "the wisdom of crowds" or a "censoring mob." Our approach uses statistical machine learning to predict community ratings. By extracting features that replicate the community's verdict, we can better understand collaborative filtering, improve the way the community uses the ratings of its members, and design agents that augment community decision-making. Slashdot is an example of such a community where peers will rate each other's comments based on their relevance to the post. This work extracts a wide variety of features from the Slashdot metadata and posts' linguistic contents to identify features that can predict the community rating. We find that author reputation, use of pronouns, and author sentiment are salient. We achieve 76% accuracy predicting community ratings as good, neutral, or bad.


Relational Learning for Collective Classification of Entities in Images

AAAI Conferences

We consider the problem of discrete multi-label entity classification in images. We argue that the framework of Markov Logic can provide a unified, well-grounded mechanism to incorporate arbitrary logical relationships between entities to improve classification in images, and thus generalizes much of the recent work on exploiting local and global context in object recognition and scene understanding. Furthermore, we show that Markov Logic can provide a powerful new set of contexts that can relate entities across images in a database for joint classification of all entities in a test set simultaneously. We relate this collective classification of images to graph-based semi-supervised learning approaches, and show that Markov Logic can effectively provide a method to unify context-related work with semi-supervised approaches in a way that neither technique could easily do on its own. Finally, we show the efficacy of these techniques on a face recognition task on three datasets, showing that adding contextual relations dramatically improves accuracy over semi-supervised learning approaches alone.


Reducing the Dimensionality of Data Streams using Common Sense

AAAI Conferences

Increasingly, we need to computationally understand real-time streams of information in places such as news feeds, speech streams, and social networks. We present Streaming AnalogySpace, an efficient technique that discovers correlations in and makes predictions about sparse natural-language data that arrives in a real-time stream. AnalogySpace is a noise-resistant PCA-based inference technique designed for use with collaboratively collected common sense knowledge and semantic networks. Streaming AnalogySpace advances this work by computing it incrementally using CCIPCA, and keeping a dense cache of recently-used features to efficiently represent a sparse and open domain. We show that Streaming AnalogySpace converges to the results of standard AnalogySpace, and verify this by evaluating its accuracy empirically on common-sense predictions against standard AnalogySpace.
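The incremental computation referred to above can be sketched with the basic CCIPCA (candid covariance-free incremental PCA) update for the leading component. This is a minimal illustration under stated assumptions, not the Streaming AnalogySpace implementation: it assumes zero-mean samples, omits CCIPCA's amnesic parameter, estimates only the first component, and does not model the sparse features or the dense cache described in the abstract. The function name and the synthetic stream are hypothetical.

```python
import math
import random


def ccipca_first_component(samples):
    """Estimate the leading principal direction from a stream of zero-mean samples.

    Update for sample u_n (n >= 2), with no amnesic parameter:
        v_n = ((n - 1) / n) * v_{n-1} + (1 / n) * u_n * (u_n . v_{n-1}) / ||v_{n-1}||
    The norm of v approximates the leading eigenvalue, its direction the eigenvector.
    """
    v = None
    for n, u in enumerate(samples, start=1):
        if v is None:
            v = list(u)  # initialize with the first sample
            continue
        norm = math.sqrt(sum(x * x for x in v))
        proj = sum(a * b for a, b in zip(u, v)) / norm
        v = [((n - 1) / n) * vi + (1 / n) * proj * ui for vi, ui in zip(v, u)]
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v], norm


# Synthetic stream: most variance lies along the direction (1, 1) / sqrt(2).
rng = random.Random(0)
s = 1 / math.sqrt(2)
samples = []
for _ in range(500):
    t = rng.gauss(0.0, 3.0)
    samples.append((t * s + rng.gauss(0.0, 0.1), t * s + rng.gauss(0.0, 0.1)))

direction, eigenvalue = ccipca_first_component(samples)
# Absolute cosine between the estimate and the true direction (sign is arbitrary).
alignment = abs(direction[0] * s + direction[1] * s)
```

Each sample is processed once and no covariance matrix is ever formed, which is what makes this style of update suitable for streams.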


Clustering Stability: An Overview

arXiv.org Machine Learning

A popular method for selecting the number of clusters is based on stability arguments: one chooses the number of clusters such that the corresponding clustering results are "most stable". In recent years, a series of papers has analyzed the behavior of this method from a theoretical point of view. However, the results are very technical and difficult to interpret for non-experts. In this paper we give a high-level overview of the existing literature on clustering stability. In addition to presenting the results in a slightly informal but accessible way, we relate them to each other and discuss their different implications.
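One common way to make the stability idea above concrete is to cluster two random subsamples, extend both clusterings to the full data, and measure how often they agree. The sketch below is only one possible protocol and is not taken from the overview: the 1-D k-means with quantile initialization, the Rand-index agreement score, the subsample fraction, and the toy data are all assumptions chosen to keep the example short.

```python
import random


def kmeans_1d(points, k, iters=50):
    """Lloyd's algorithm on 1-D data, with initial centers taken at quantiles."""
    s = sorted(points)
    centers = [s[(2 * i + 1) * len(s) // (2 * k)] for i in range(k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[assign(p, centers)].append(p)
        # Keep the old center when a cluster receives no points.
        centers = [sum(g) / len(g) if g else centers[i] for i, g in enumerate(groups)]
    return centers


def assign(p, centers):
    """Index of the center nearest to p."""
    return min(range(len(centers)), key=lambda c: abs(p - centers[c]))


def pair_agreement(labels_a, labels_b):
    """Rand index: share of point pairs grouped the same way by both labelings."""
    n = len(labels_a)
    agree = total = 0
    for i in range(n):
        for j in range(i + 1, n):
            agree += (labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
            total += 1
    return agree / total


def stability(points, k, rng, rounds=10, frac=0.8):
    """Mean agreement between clusterings fitted on pairs of random subsamples."""
    m = int(frac * len(points))
    scores = []
    for _ in range(rounds):
        centers_a = kmeans_1d(rng.sample(points, m), k)
        centers_b = kmeans_1d(rng.sample(points, m), k)
        labels_a = [assign(p, centers_a) for p in points]
        labels_b = [assign(p, centers_b) for p in points]
        scores.append(pair_agreement(labels_a, labels_b))
    return sum(scores) / len(scores)


# Toy data: three well-separated groups of 30 points each (hypothetical).
rng = random.Random(0)
data = [rng.gauss(mu, 0.5) for mu in (0.0, 10.0, 20.0) for _ in range(30)]
scores = {k: stability(data, k, rng) for k in (2, 3, 4, 5)}
```

On data like this, k = 3 would be expected to score at or very near 1, since both subsamples recover the same three groups. Scores for other values of k depend on how ties and splits happen to fall.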


Application of Data Mining to Network Intrusion Detection: Classifier Selection Model

arXiv.org Artificial Intelligence

As network attacks have increased in number and severity over the past few years, the intrusion detection system (IDS) is increasingly becoming a critical component in securing the network. Due to large volumes of security audit data as well as the complex and dynamic properties of intrusion behaviors, optimizing the performance of IDS has become an important open problem that is receiving more and more attention from the research community. The uncertainty over whether certain algorithms perform better for certain attack classes constitutes the motivation for the work reported herein. In this paper, we evaluate the performance of a comprehensive set of classifier algorithms using the KDD99 dataset. Based on the evaluation results, the best algorithms for each attack category are chosen and two classifier algorithm selection models are proposed. The simulation result comparison indicates that noticeable performance improvement and real-time intrusion detection can be achieved as we apply the proposed models to detect different kinds of network attacks.