Goto

Collaborating Authors

 Undirected Networks


From Monte Carlo to Las Vegas: Improving Restricted Boltzmann Machine Training Through Stopping Sets

AAAI Conferences

We propose a Las Vegas transformation of Markov Chain Monte Carlo (MCMC) estimators of Restricted Boltzmann Machines (RBMs). We denote our approach Markov Chain Las Vegas (MCLV). MCLV gives statistical guarantees in exchange for random running times. MCLV uses a stopping set built from the training data and has maximum number of Markov chain steps K (referred as MCLV-K). We present a MCLV-K gradient estimator (LVS-K) for RBMs and explore the correspondence and differences between LVS-K and Contrastive Divergence (CD-K), with LVS-K significantly outperforming CD-K training RBMs over the MNIST dataset, indicating MCLV to be a promising direction in learning generative models.


MDP-Based Cost Sensitive Classification Using Decision Trees

AAAI Conferences

In classification, an algorithm learns to classify a given instance based on a set of observed attribute values. In many real world cases testing the value of an attribute incurs a cost. Furthermore, there can also be a cost associated with the misclassification of an instance. Cost sensitive classification attempts to minimize the expected cost of classification, by deciding after each observed attribute value, which attribute to measure next. In this paper we suggest Markov Decision Processes as a modeling tool for cost sensitive classification. We construct standard decision trees over all attribute subsets, and the leaves of these trees become the state space of our MDP. At each phase we decide on the next attribute to measure, balancing the cost of the measurement and the classification accuracy. We compare our approach to a set of previous approaches, showing our approach to work better for a range of misclassification costs.


Learning Predictive State Representations From Non-Uniform Sampling

AAAI Conferences

Predictive state representations (PSR) have emerged as a powerful method for modelling partially observable environments. PSR learning algorithms can build models for predicting all observable variables, or predicting only some of them conditioned on others (e.g., actions or exogenous variables). In the latter case, which we call conditional modelling, the accuracy of different estimates of the conditional probabilities for a fixed dataset can vary significantly, due to the limited sampling of certain conditions. This can have negative consequences on the PSR parameter estimation process, which are not taken into account by the current state-of-the-art PSR spectral learning algorithms. In this paper, we examine closely conditional modelling within the PSR framework. We first establish a new positive but surprisingly non-trivial result: a conditional model can never be larger than the complete model. Then, we address the core shortcoming of existing PSR spectral learning methods for conditional models by incorporating an additional step in the process, which can be seen as a type of matrix denoising. We further refine this objective by adding penalty terms for violations of the system dynamics matrix structure, which improves the PSR predictive performance. Empirical evaluations on both synthetic and real datasets highlight the advantages of the proposed approach.


Norm Conflict Resolution in Stochastic Domains

AAAI Conferences

Artificial agents will need to be aware of human moral and social norms, and able to use them in decision-making. In particular, artificial agents will need a principled approach to managing conflicting norms, which are common in human social interactions. Existing logic-based approaches suffer from normative explosion and are typically designed for deterministic environments; reward-based approaches lack principled ways of determining which normative alternatives exist in a given environment. We propose a hybrid approach, using Linear Temporal Logic (LTL) representations in Markov Decision Processes (MDPs), that manages norm conflicts in a systematic manner while accommodating domain stochasticity. We provide a proof-of-concept implementation in a simulated vacuum cleaning domain.


Efficiency and Safety in Autonomous Vehicles Through Planning With Uncertainty

AAAI Conferences

Autonomous vehicles are quickly becoming an important part of human society for transportation, monitoring, agriculture, and other applications. In these applications, there is a fundamental tradeoff between safety and efficiency that is especially salient when the autonomous vehicles interact directly with humans. A key to maintaining safety without sacrificing efficiency is dealing with uncertainty properly so that robots can be assertive when it is appropriate and careful in dangerous situations. The research that will be presented in my thesis uses the partially observable Markov decision process framework to approach this challenge, exploring several applications and proposing a new solution approach that is able to handle continuous action and observation spaces, a qualitative improvement over current methods.


Learning Mixtures of MLNs

AAAI Conferences

Weight learning is a challenging problem in Markov Logic Networks (MLNs) due to the large size of the ground propositional probabilistic graphical model that underlies the first-order representation of MLNs. Though more sophisticated weight learning methods that use lifted inference have been proposed, such methods can typically scale up only in the absence of evidence, namely in generative weight learning. In discriminative learning, where the evidence typically destroys symmetries, existing approaches are lacking in scalability. In this paper, we propose a novel, intuitive approach for learning MLNs discriminatively by utilizing approximate symmetries. Specifically, we reduce the size of the training database by clustering approximately symmetric atoms together and selecting a representative atom from each cluster. However, each choice made from the clusters induces a different distribution, increasing the uncertainty in our learned model. To reduce this uncertainty, we learn a finite mixture model by stacking the different distributions, where the parameters of the model are learned using an EM approach. Our results on several benchmarks show that our approach is much more scalable and accurate as compared to existing state-of-the-art MLN learning methods.


Lifted Generalized Dual Decomposition

AAAI Conferences

Many real-world problems, such as Markov Logic Networks (MLNs) with evidence, can be represented as a highly symmetric graphical model perturbed by additional potentials. In these models, variational inference approaches that exploit exact model symmetries are often forced to ground the entire problem, while methods that exploit approximate symmetries (such as by constructing an over-symmetric approximate model) offer no guarantees on solution quality. In this paper, we present a method based on a lifted variant of the generalized dual decomposition (GenDD) for marginal MAP inference which provides a principled way to exploit symmetric sub-structures in a graphical model. We develop a coarse-to-fine inference procedure that provides any-time upper bounds on the objective. The upper bound property of GenDD provides a principled way to guide the refinement process, providing good any-time performance and eventually arriving at the ground optimal solution.


Knowledge-Based Policies for Qualitative Decentralized POMDPs

AAAI Conferences

Qualitative Decentralized Partially Observable Markov Decision Problems (QDec-POMDPs) constitute a very general class of decision problems. They involve multiple agents, decentralized execution, sequential decision, partial observability, and uncertainty. Typically, joint policies, which prescribe to each agent an action to take depending on its full history of (local) actions and observations, are huge, which makes it difficult to store them onboard, at execution time, and also hampers the computation of joint plans. We propose and investigate a new representation for joint policies in QDec-POMDPs, which we call Multi-Agent Knowledge-Based Programs (MAKBPs), and which uses epistemic logic for compactly representing conditions on histories. Contrary to standard representations, executing an MAKBP requires reasoning at execution time, but we show that MAKBPs can be exponentially more succinct than any reactive representation.


Planning and Learning for Decentralized MDPs With Event Driven Rewards

AAAI Conferences

Decentralized (PO)MDPs provide a rigorous framework for sequential multiagent decision making under uncertainty. However, their high computational complexity limits the practical impact. To address scalability and real-world impact, we focus on settings where a large number of agents primarily interact through complex joint-rewards that depend on their entire histories of states and actions. Such history-based rewards encapsulate the notion of events or tasks such that the team reward is given only when the joint-task is completed. Algorithmically, we contribute---1) A nonlinear programming (NLP) formulation for such event-based planning model; 2) A probabilistic inference based approach that scales much better than NLP solvers for a large number of agents; 3) A policy gradient based multiagent reinforcement learning approach that scales well even for exponential state-spaces. Our inference and RL-based advances enable us to solve a large real-world multiagent coverage problem modeling schedule coordination of agents in a real urban subway network where other approaches fail to scale.


Multi-attention Recurrent Network for Human Communication Comprehension

AAAI Conferences

Human face-to-face communication is a complex multimodal signal. We use words (language modality), gestures (vision modality) and changes in tone (acoustic modality) to convey our intentions. Humans easily process and understand face-to-face communication, however, comprehending this form of communication remains a significant challenge for Artificial Intelligence (AI). AI must understand each modality and the interactions between them that shape the communication. In this paper, we present a novel neural architecture for understanding human communication called the Multi-attention Recurrent Network (MARN). The main strength of our model comes from discovering interactions between modalities through time using a neural component called the Multi-attention Block (MAB) and storing them in the hybrid memory of a recurrent component called the Long-short Term Hybrid Memory (LSTHM). We perform extensive comparisons on six publicly available datasets for multimodal sentiment analysis, speaker trait recognition and emotion recognition. MARN shows state- of-the-art results performance in all the datasets.