 Undirected Networks


Risk-Variant Policy Switching to Exceed Reward Thresholds

AAAI Conferences

This paper presents a decision-theoretic planning approach for probabilistic environments where the agent's goal is to win, which we model as maximizing the probability of being above a given reward threshold. In competitive domains, second is as good as last, and it is often desirable to take risks if one is in danger of losing, even if the risk does not pay off very often. Our algorithm maximizes the probability of being above a particular reward threshold by dynamically switching between a suite of policies, each of which encodes a different level of risk. This method does not explicitly encode time or reward into the state space, and decides when to switch between policies during each execution step. We compare a risk-neutral policy to switching among different risk-sensitive policies, and show that our approach improves the agent's probability of winning.
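
The benefit of switching to a riskier policy when behind can be seen in a toy problem. The sketch below is not the paper's algorithm: the two one-step policies, the horizon, the threshold, and the switching rule are all assumptions chosen for illustration. The rule also looks at the remaining time and accumulated reward directly, which is only a stand-in for the execution-time decision the abstract describes. The code computes the exact probability of ending strictly above the threshold.

```python
from functools import lru_cache

# Hypothetical one-step policies: tuples of (probability, reward).
SAFE = ((1.0, 1),)             # expected reward 1.0
RISKY = ((0.3, 3), (0.7, 0))   # expected reward 0.9, higher variance
HORIZON = 5                    # assumed number of decision steps
THRESHOLD = 6                  # assumed threshold; "win" means total > 6


def risk_neutral(steps_left, total):
    """Always follow the policy with the higher expected reward."""
    return SAFE


def switching(steps_left, total):
    """Play safe only while playing safe to the end still exceeds the threshold."""
    if total + steps_left * 1 > THRESHOLD:
        return SAFE
    return RISKY


def win_probability(chooser, horizon=HORIZON, threshold=THRESHOLD):
    """Exact probability that the final total reward is above the threshold."""
    @lru_cache(maxsize=None)
    def go(steps_left, total):
        if steps_left == 0:
            return 1.0 if total > threshold else 0.0
        outcomes = chooser(steps_left, total)
        return sum(p * go(steps_left - 1, total + r) for p, r in outcomes)

    return go(horizon, 0)


if __name__ == "__main__":
    print("risk-neutral:", win_probability(risk_neutral))
    print("switching:   ", win_probability(switching))
```

In this toy setting the risk-neutral choice can never exceed the threshold, because five safe steps yield a total of 5, while the switching rule wins whenever enough risky steps pay off.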


Gradient Computation In Linear-Chain Conditional Random Fields Using The Entropy Message Passing Algorithm

arXiv.org Artificial Intelligence

The paper proposes a numerically stable recursive algorithm for the exact computation of the linear-chain conditional random field gradient. It operates as a forward algorithm over the log-domain expectation semiring and is designed to improve memory efficiency on long observation sequences. Unlike the traditional algorithm based on the forward-backward recursions, the memory complexity of our algorithm does not depend on the sequence length. Experiments on real data show that it can be useful for problems involving long sequences.
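
The idea of a forward-only pass over an expectation semiring can be sketched as follows. This is a simplified illustration, not the paper's algorithm: it works in the ordinary probability domain rather than the log domain (so it lacks the numerical stability the abstract claims), it tracks a single scalar feature, and the names `init`, `psi`, and `feat` are assumptions. Each state carries a pair (p, r), where p is the total weight of prefixes ending in that state and r is that weight multiplied by the accumulated feature value. The ratio of the summed r to the summed p is the expected feature count, which is the model-expectation term of the gradient of the log-partition function for that feature's weight.

```python
import itertools


def expected_feature_forward(init, psi, feat):
    """Forward pass over (p, r) pairs; memory is O(number of states).

    init[j]       : weight of starting in state j
    psi[t][i][j]  : weight of moving from state i to state j at step t
    feat[t][i][j] : feature value fired by that transition
    Returns (Z, expected total feature value).
    """
    p = list(init)
    r = [0.0] * len(init)
    for psi_t, feat_t in zip(psi, feat):
        k = len(p)
        new_p = [0.0] * k
        new_r = [0.0] * k
        for i in range(k):
            for j in range(k):
                w = psi_t[i][j]
                # Semiring product (p, r) * (w, w * f) = (p*w, r*w + p*w*f),
                # accumulated with componentwise addition.
                new_p[j] += p[i] * w
                new_r[j] += r[i] * w + p[i] * w * feat_t[i][j]
        p, r = new_p, new_r
    z = sum(p)
    return z, sum(r) / z


def expected_feature_brute_force(init, psi, feat):
    """Reference computation that enumerates every state sequence."""
    k = len(init)
    z = 0.0
    weighted_total = 0.0
    for path in itertools.product(range(k), repeat=len(psi) + 1):
        weight = init[path[0]]
        value = 0.0
        for t in range(len(psi)):
            i, j = path[t], path[t + 1]
            weight *= psi[t][i][j]
            value += feat[t][i][j]
        z += weight
        weighted_total += weight * value
    return z, weighted_total / z
```

Only the current vector of pairs is stored, so the memory used does not grow with the sequence length; the forward-backward approach would instead keep one forward vector per position.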


Approximate Dynamic Programming By Minimizing Distributionally Robust Bounds

arXiv.org Machine Learning

Large Markov decision processes (MDPs) are common in reinforcement learning and operations research and are often solved by approximate dynamic programming (ADP). Many ADP algorithms have been developed and studied, often with impressive empirical performance. However, because many ADP methods must be carefully tuned to work well and offer insufficient theoretical guarantees, it is important to develop new methods that have both good theoretical guarantees and empirical performance. Approximate linear programming (ALP)--an ADP method--has been developed with the goal of achieving convergence and good theoretical guarantees (de Farias & van Roy, 2003). Approximate bilinear programming (ABP) improves on the theoretical properties of ALP at the cost of additional computational complexity (Petrik & Zilberstein, 2009, 2011).


Focused Grounding for Markov Logic Networks

AAAI Conferences

Markov logic networks have been successfully applied to many problems in AI. However, the computational complexity of the inference procedures has limited their application. Previous work in lifted inference, lazy inference and cutting plane inference has identified cases where the entire ground network need not be constructed. These approaches are specific to particular inference procedures, and apply well only to certain classes of problems. We introduce a method of focused grounding that can use either general purpose or domain specific heuristics to produce only the most relevant ground formulas. Though a solution to the focused grounding is not, in general, a solution to the complete grounding, we show empirically that the smaller search space of a focused grounding makes it easier to locate a good solution. We evaluate focused grounding on two diverse domains, joint entity resolution and abductive plan recognition. We show improved results and decreased computation cost for the entity resolution domain relative to a complete grounding. Focused grounding in abductive plan recognition produces state of the art results in a domain where complete grounding proved intractable.


Maritime Threat Detection Using Probabilistic Graphical Models

AAAI Conferences

Maritime threat detection is a challenging problem because maritime environments can involve a complex combination of concurrent vessel activities, and only a small fraction of these may be irregular, suspicious, or threatening. Previous work on this task has been limited to analyses of single vessels using simple rule-based models that alert watchstanders when a proximity threshold is breached. We claim that Probabilistic Graphical Models (PGMs) can be used to more effectively model complex maritime situations. In this paper, we study the performance of PGMs for detecting (small boat) maritime attacks. We describe three types of PGMs that vary in their representational expressiveness and evaluate them on a threat recognition task using track data obtained from force protection naval exercises involving unmanned sea surface vehicles. We found that the best-performing PGMs can outperform the deployed rule-based approach on these tasks, though some PGMs require substantial engineering and are computationally expensive.


Model-based Utility Functions

arXiv.org Artificial Intelligence

Orseau and Ring, as well as Dewey, have recently described problems, including self-delusion, with the behavior of agents using various definitions of utility functions. An agent's utility function is defined in terms of the agent's history of interactions with its environment. This paper argues, via two examples, that the behavior problems can be avoided by formulating the utility function in two steps: 1) inferring a model of the environment from interactions, and 2) computing utility as a function of the environment model. Basing a utility function on a model that the agent must learn implies that the utility function must initially be expressed in terms of specifications to be matched to structures in the learned model. These specifications constitute prior assumptions about the environment so this approach will not work with arbitrary environments. But the approach should work for agents designed by humans to act in the physical world. The paper also addresses the issue of self-modifying agents and shows that if provided with the possibility to modify their utility functions agents will not choose to do so, under some usual assumptions.


Counting Belief Propagation

arXiv.org Artificial Intelligence

A major benefit of graphical models is that most knowledge is captured in the model structure. Many models, however, produce inference problems with many symmetries that are not reflected in the graphical structure and hence cannot be exploited by efficient inference techniques such as belief propagation (BP). In this paper, we present a new and simple BP algorithm, called counting BP, that exploits such additional symmetries. Starting from a given factor graph, counting BP first constructs a compressed factor graph of clusternodes and clusterfactors, corresponding to sets of nodes and factors that are indistinguishable given the evidence. Then it runs a modified BP algorithm on the compressed graph that is equivalent to running BP on the original factor graph. Our experiments show that counting BP is applicable to a variety of important AI tasks such as (dynamic) relational models and Boolean model counting, and that significant efficiency gains are obtainable, often by orders of magnitude.


Correlated Non-Parametric Latent Feature Models

arXiv.org Machine Learning

We are often interested in explaining data through a set of hidden factors or features. When the number of hidden features is unknown, the Indian Buffet Process (IBP) is a nonparametric latent feature model that does not bound the number of active features in a dataset. However, the IBP assumes that all latent features are uncorrelated, making it inadequate for many real-world problems. We introduce a framework for correlated nonparametric feature models, generalising the IBP. We use this framework to generate several specific models and demonstrate applications on real-world datasets.
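
For readers unfamiliar with the IBP, the standard (uncorrelated) generative process can be sketched as below. This illustrates only the baseline model that the abstract generalises, not the correlated framework itself; the function names and the use of Knuth's method for Poisson sampling are choices made for this sketch. Row n takes each existing feature k with probability m_k / n, where m_k is the number of earlier rows using it, and then adds Poisson(alpha / n) new features, so the number of columns is not fixed in advance.

```python
import math
import random


def sample_poisson(rate, rng):
    """Knuth's multiplication method; adequate for the small rates used here."""
    limit = math.exp(-rate)
    count = 0
    product = rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


def sample_ibp(num_rows, alpha, seed=0):
    """Draw a binary feature matrix from the Indian Buffet Process."""
    rng = random.Random(seed)
    rows = []     # rows[n] is the set of feature indices active in row n
    counts = []   # counts[k] is the number of rows that use feature k
    for n in range(1, num_rows + 1):
        # Reuse each existing feature with probability m_k / n.
        active = {k for k, m in enumerate(counts) if rng.random() < m / n}
        for k in active:
            counts[k] += 1
        # Introduce Poisson(alpha / n) brand-new features.
        for _ in range(sample_poisson(alpha / n, rng)):
            active.add(len(counts))
            counts.append(1)
        rows.append(active)
    num_features = len(counts)
    return [[1 if k in row else 0 for k in range(num_features)] for row in rows]
```

Because each feature is reused independently of the others given its popularity, this process cannot express correlations between features, which is the limitation the abstract points to.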


New inference strategies for solving Markov Decision Processes using reversible jump MCMC

arXiv.org Machine Learning

In this paper we build on previous work which uses inference techniques, in particular Markov Chain Monte Carlo (MCMC) methods, to solve parameterized control problems. We propose a number of modifications in order to make this approach more practical in general, higher-dimensional spaces. We first introduce a new target distribution which is able to incorporate more reward information from sampled trajectories. We also show how to break strong correlations between the policy parameters and sampled trajectories in order to sample more freely. Finally, we show how to incorporate these techniques in a principled manner to obtain estimates of the optimal policy.


Multiple Source Adaptation and the Rényi Divergence

arXiv.org Machine Learning

This paper presents a novel theoretical study of the general problem of multiple source adaptation using the notion of Rényi divergence. Our results build on our previous work [12], but significantly broaden the scope of that work in several directions. We extend previous multiple source loss guarantees based on distribution weighted combinations to arbitrary target distributions P, not necessarily mixtures of the source distributions, analyze both known and unknown target distribution cases, and prove a lower bound. We further extend our bounds to deal with the case where the learner receives an approximate distribution for each source instead of the exact one, and show that similar loss guarantees can be achieved depending on the divergence between the approximate and true distributions. We also analyze the case where the labeling functions of the source domains are somewhat different. Finally, we report the results of experiments with both an artificial data set and a sentiment analysis task, showing the performance benefits of the distribution weighted combinations and the quality of our bounds based on the Rényi divergence.