Goto

Collaborating Authors

 Learning Graphical Models


Large Margin Boltzmann Machines

AAAI Conferences

Boltzmann Machines are a powerful class of undirected graphical models. Originally proposed as artificial neural networks, they can be regarded as a type of Markov Random Field in which the connection weights between nodes are symmetric and learned from data. They are also closely related to recent models such as Markov logic networks and Conditional Random Fields. A major challenge for Boltzmann machines (as well as other graphical models) is speeding up learning for large-scale problems. The heart of the problem lies in efficiently and effectively approximating the partition function. In this paper, we propose a new efficient learning algorithm for Boltzmann machines that allows them to be applied to problems with large numbers of random variables. We introduce a new large-margin variational approximation to the partition function that allows Boltzmann machines to be trained using a support vector machine (SVM) style learning algorithm. For discriminative learning tasks, these large margin Boltzmann machines provide an alternative approach to structural SVMs. We show that these machines have low sample complexity and derive a generalization bound. Our results demonstrate that on multi-label classification problems, large margin Boltzmann machines achieve orders of magnitude faster performance than structural SVMs and also outperform structural SVMs on problems with large numbers of labels.


Probabilistic Models for Concurrent Chatting Activity Recognition

AAAI Conferences

Recognition of chatting activities in social interactions is useful for constructing human social networks. However, the existence of multiple people involved in multiple dialogues presents special challenges. To model the conversational dynamics of concurrent chatting behaviors, this paper advocates Factorial Conditional Random Fields (FCRFs) as a model to accommodate co-temporal relationships among multiple activity states. In addition, to avoid the use of inefficient Loopy Belief Propagation (LBP) algorithm, we propose using Iterative Classification Algorithm (ICA) as the inference method for FCRFs. We designed experiments to compare our FCRFs model with two dynamic probabilistic models, Parallel Condition Random Fields (PCRFs) and Hidden Markov Models (HMMs), in learning and decoding based on auditory data. The experimental results show that FCRFs outperform PCRFs and HMM-like models. We also discover that FCRFs using the ICA inference approach not only improves the recognition accuracy but also takes significantly less time than the LBP inference method.


Inverse Reinforcement Learning in Partially Observable Environments

AAAI Conferences

Inverse reinforcement learning (IRL) is the problem of recovering the underlying reward function from the behaviour of an expert. Most of the existing algorithms for IRL assume that the expert's environment is modeled as a Markov decision process (MDP), although they should be able to handle partially observable settings in order to widen the applicability to more realistic scenarios. In this paper, we present an extension of the classical IRL algorithm by Ng and Russell to partially observable environments. We discuss technical issues and challenges, and present the experimental results on some of the benchmark partially observable domains.


Bayesian Extreme Components Analysis

AAAI Conferences

Extreme Components Analysis (XCA) is a statistical method based on a single eigenvalue decomposition to recover the optimal combination of principal and minor components in the data. Unfortunately, minor components are notoriously sensitive to overfitting when the number of data items is small relative to the number of attributes. We present a Bayesian extension of XCA by introducing a conjugate prior for the parameters of the XCA model. This Bayesian-XCA is shown to outperform plain vanilla XCA as well as Bayesian-PCA and XCA based on a frequentist correction to the sample spectrum. Moreover, we show that minor components are only picked when they represent genuine constraints in the data, even for very small sample sizes. An extension to mixtures of Bayesian XCA models is also explored.


Exponential Family Hybrid Semi-Supervised Learning

AAAI Conferences

We present an approach to semi-supervised learning based on an exponential family characterization. Our approach generalizes previous work on coupled priors for hybrid generative/discriminative models. Our model is more flexible and natural than previous approaches. Experimental results on several data sets show that our approach also performs better in practice. 


Evaluating Abductive Hypotheses using an EM Algorithm on BDDs

AAAI Conferences

Abductive inference is an important AI reasoning technique to find explanations of observations, and has recently been applied to scientific discovery.  To find best hypotheses among many logically possible hypotheses, we need to evaluate hypotheses obtained from the process of hypothesis generation.  We propose an abductive inference architecture combined with an EM algorithm working on binary decision diagrams (BDDs).  This work opens a way of applying BDDs to compress multiple hypotheses and to select most probable ones from them.  An implemented system has been applied to inference of inhibition in metabolic pathways in the domain of systems biology.


A New Bayesian Approach to Multiple Intermittent Fault Diagnosis

AAAI Conferences

Logic reasoning approaches to fault diagnosis account for the fact that a component c j may fail intermittently by introducing a parameter g j that expresses the probability the component exhibits correct behavior. This component parameter g j , in conjunction with a priori fault probability, is usedin a Bayesian framework to compute the posterior fault candidate probabilities. Usually, information on g j is not known a priori. While proper estimation of g j can have a great impact on the diagnostic accuracy, at present, only approximations have been proposed. We present a novel framework, BARINEL, that computes exact estimations of g j as integral part of the posterior candidate probability computation. BARINEL’s diagnostic performance is evaluated for both synthetic and real software systems. Our results show that our approach is superior to approaches based on classical persistent fault models as well as previously proposed intermittent fault models.


Canadian Traveler Problem with Remote Sensing

AAAI Conferences

The Canadian Traveler Problem (CTP) is a navigation problem where a graph is initially known,  but some edges may be blocked with a known probability.  The task is to minimize travel effort of reaching the goal. We generalize CTP to allow for remote sensing actions, now requiring minimization  of the sum of the travel cost and the remote sensing cost. Finding optimal policies for both versions is intractable. We provide optimal solutions for special case graphs. We then develop a framework that utilizes heuristics to determine when and where to sense the environment in order to minimize total costs. Several such heuristics, based on the expected total cost are introduced.  Empirical evaluations show the benefits of our heuristics and support some of the theoretical results.


Eliciting Honest Reputation Feedback in a Markov Setting

AAAI Conferences

Recently, online reputation mechanisms have been proposed that reward agents for honest feedback about products and services with fixed quality. Many real-world settings, however, are inherently dynamic. As an example, consider a web service that wishes to publish the expected download speed of a file mirrored on different server sites. In contrast to the models of Miller, Resnick and Zeckhauser and of Jurca and Faltings, the quality of the service (e. g., a server’s available bandwidth) changes over time and future agents are solely interested in the present quality levels. We show that hidden Markov models (HMM) provide natural generalizations of these static models and design a payment scheme that elicits honest reports from the agents after they have experienced the quality of the service.


Investigations of Continual Computation

AAAI Conferences

Autonomous agents that sense, reason, and act in real-world environments for extended periods often need to solve streams of incoming problems. Traditionally, effort is applied only to problems that have already arrived and have been noted. We examine continual computation methods that allow agents to ideally allocate time to solving current as well as potential future problems under uncertainty. We first review prior work on continual computation. Then, we present new directions and results, including the consideration of shared subtasks and multiple tasks. We present results on the computational complexity of the continual-computation problem and provide approximations for arbitrary models of computational performance. Finally, we review special formulations for addressing uncertainty about the best algorithm to apply, learning about performance, and considering costs associated with delayed use of results.