 Learning Graphical Models


A new Hedging algorithm and its application to inferring latent random variables

arXiv.org Artificial Intelligence

We present a new online learning algorithm for cumulative discounted gain. This learning algorithm does not use exponential weights on the experts. Instead, it uses a weighting scheme that depends on the regret of the master algorithm relative to the experts. In particular, experts whose discounted cumulative gain is smaller (worse) than that of the master algorithm receive zero weight. We also sketch how a regret-based algorithm can be used as an alternative to Bayesian averaging in the context of inferring latent random variables.
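The weighting idea in this abstract can be illustrated with a minimal sketch. The abstract does not give the paper's actual weighting function, so the one below is an assumption: each expert is weighted by the positive part of its regret, which reproduces only the stated property that experts behind the master get zero weight. The names `regret_weights` and `run_hedging`, the discount factor, and the uniform fallback are all hypothetical.

```python
def regret_weights(expert_gains, master_gain):
    """Weight each expert by the positive part of its regret (assumed scheme).

    Regret here is the expert's discounted cumulative gain minus the master's.
    Experts that did worse than the master get zero weight.
    """
    positive = [max(g - master_gain, 0.0) for g in expert_gains]
    total = sum(positive)
    if total == 0.0:
        # No expert is ahead of the master: fall back to uniform weights
        # (an assumption of this sketch, not taken from the abstract).
        return [1.0 / len(expert_gains)] * len(expert_gains)
    return [p / total for p in positive]


def run_hedging(gain_rounds, discount=0.9):
    """Run the sketch over a sequence of per-round expert gains.

    Returns the final weights and the master's discounted cumulative gain.
    """
    n = len(gain_rounds[0])
    expert_gains = [0.0] * n
    master_gain = 0.0
    weights = [1.0 / n] * n
    for gains in gain_rounds:
        # The master earns the weighted average of the experts' gains.
        round_gain = sum(w * g for w, g in zip(weights, gains))
        master_gain = discount * master_gain + round_gain
        expert_gains = [discount * c + g for c, g in zip(expert_gains, gains)]
        weights = regret_weights(expert_gains, master_gain)
    return weights, master_gain
```

Unlike exponential weights, which keep every expert at a strictly positive weight, this scheme can drop an expert entirely as soon as it falls behind the master.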


Refining the Execution of Abstract Actions with Learned Action Models

Journal of Artificial Intelligence Research

Robots reason about abstract actions, such as "go to position `l'", in order to decide what to do or to generate plans for their intended course of action. The use of abstract actions enables robots to employ small action libraries, which reduces the search space for decision making. When executing the actions, however, the robot must tailor the abstract actions to the specific task and situation context at hand. In this article we propose a novel robot action execution system that learns success and performance models for possible specializations of abstract actions. At execution time, the robot uses these models to optimize the execution of abstract actions for the respective task contexts. The robot can thus use abstract actions for efficient reasoning, without compromising the performance of action execution. We show the impact of our action execution model in three robotic domains and on two kinds of action execution problems: (1) the instantiation of free action parameters to optimize the expected performance of action sequences; (2) the automatic introduction of additional subgoals to make action sequences more reliable.


Adaptive Stochastic Resource Control: A Machine Learning Approach

Journal of Artificial Intelligence Research

The paper investigates stochastic resource allocation problems with scarce, reusable resources and non-preemptive, time-dependent, interconnected tasks. This approach is a natural generalization of several standard resource management problems, such as scheduling and transportation problems. First, reactive solutions are considered and defined as control policies of suitably reformulated Markov decision processes (MDPs). We argue that this reformulation has several favorable properties: it has finite state and action spaces and it is aperiodic, hence all policies are proper and the space of control policies can be safely restricted. Next, approximate dynamic programming (ADP) methods, such as fitted Q-learning, are suggested for computing an efficient control policy. In order to compactly maintain the cost-to-go function, two representations are studied: hash tables and support vector regression (SVR), in particular nu-SVRs. Several additional improvements, such as the application of limited-lookahead rollout algorithms in the initial phases, action space decomposition, task clustering and distributed sampling, are also investigated. Finally, experimental results on both benchmark and industry-related data are presented.


A Bayesian Approach to Network Modularity

arXiv.org Machine Learning

We present an efficient, principled, and interpretable technique for inferring module assignments and for identifying the optimal number of modules in a given network. We show how several existing methods for finding modules can be described as variant, special, or limiting cases of our work, and how the method overcomes the resolution limit problem, accurately recovering the true number of modules. Our approach is based on Bayesian methods for model selection which have been used with success for almost a century, implemented using a variational technique developed only in the past decade. We apply the technique to synthetic and real networks and outline how the method naturally allows selection among competing models.


Conditioning Probabilistic Databases

arXiv.org Artificial Intelligence

Past research on probabilistic databases has studied the problem of answering queries on a static database. Application scenarios of probabilistic databases, however, often involve the conditioning of a database using additional information in the form of new evidence. The conditioning problem is thus to transform a probabilistic database of priors into a posterior probabilistic database which is materialized for subsequent query processing or further refinement. It turns out that the conditioning problem is closely related to the problem of computing exact tuple confidence values. It is known that exact confidence computation is an NP-hard problem. This has led researchers to consider approximation techniques for confidence computation. However, neither conditioning nor exact confidence computation can be solved using such techniques. In this paper we present efficient techniques for both problems. We study several problem decomposition methods and heuristics that are based on the most successful search techniques from constraint satisfaction, such as the Davis-Putnam algorithm. We complement this with a thorough experimental evaluation of the proposed algorithms. Our experiments show that our exact algorithms scale well to realistic database sizes and can in some scenarios compete with the most efficient previous approximation algorithms.
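To make the conditioning problem concrete, here is a small brute-force sketch. It assumes a tuple-independent database, which the abstract does not state, and it enumerates all possible worlds. That is exponential in the number of tuples and is not the decomposition technique the paper proposes. The function names and the example evidence are hypothetical.

```python
from itertools import product


def possible_worlds(tuple_probs):
    """Yield (world, prior probability) for independent uncertain tuples."""
    names = list(tuple_probs)
    for bits in product([False, True], repeat=len(names)):
        world = frozenset(n for n, b in zip(names, bits) if b)
        prob = 1.0
        for n, b in zip(names, bits):
            prob *= tuple_probs[n] if b else 1.0 - tuple_probs[n]
        yield world, prob


def condition(tuple_probs, evidence):
    """Keep the worlds that satisfy the evidence and renormalize them."""
    worlds = [(w, p) for w, p in possible_worlds(tuple_probs) if evidence(w)]
    total = sum(p for _, p in worlds)
    if total == 0.0:
        raise ValueError("evidence has zero prior probability")
    return [(w, p / total) for w, p in worlds]


def confidence(worlds, name):
    """Exact confidence of a tuple: total probability of worlds containing it."""
    return sum(p for w, p in worlds if name in w)


# Two tuples with prior confidence 0.5 each; evidence: at least one is present.
priors = {"t1": 0.5, "t2": 0.5}
posterior = condition(priors, lambda world: len(world) >= 1)
t1_confidence = confidence(posterior, "t1")
```

The posterior is again a distribution over worlds, so it can be queried or conditioned further. In this example the evidence rules out the empty world, and the confidence of `t1` rises from 1/2 to 2/3.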


The end of Sleeping Beauty's nightmare

arXiv.org Artificial Intelligence

The way a rational agent changes her belief in certain propositions/hypotheses in the light of new evidence lies at the heart of Bayesian inference. The basic natural assumption, as summarized in van Fraassen's Reflection Principle ([1984]), would be that in the absence of new evidence the belief should not change. Yet, there are examples that are claimed to violate this assumption. The apparent paradox presented by such examples, if not settled, would demonstrate the inconsistency and/or incompleteness of the Bayesian approach, and without eliminating this inconsistency, the approach cannot be regarded as scientific. The Sleeping Beauty Problem is just such an example. The existing attempts to solve the problem fall into three categories. The first two share the view that new evidence is absent, but differ about the conclusion of whether Sleeping Beauty should change her belief or not, and why. The third category is characterized by the view that, after all, new evidence (although hidden from the initial view) is involved. My solution is radically different and does not fall into any of these categories. I deflate the paradox by arguing that the two different degrees of belief presented in the Sleeping Beauty Problem are in fact beliefs in two different propositions, i.e. there is no need to explain the (un)change of belief.


Optimal and Approximate Q-value Functions for Decentralized POMDPs

Journal of Artificial Intelligence Research

Decision-theoretic planning is a popular approach to sequential decision making problems, because it treats uncertainty in sensing and acting in a principled way. In single-agent frameworks like MDPs and POMDPs, planning can be carried out by resorting to Q-value functions: an optimal Q-value function Q* is computed in a recursive manner by dynamic programming, and then an optimal policy is extracted from Q*. In this paper we study whether similar Q-value functions can be defined for decentralized POMDP models (Dec-POMDPs), and how policies can be extracted from such value functions. We define two forms of the optimal Q-value function for Dec-POMDPs: one that gives a normative description as the Q-value function of an optimal pure joint policy and another one that is sequentially rational and thus gives a recipe for computation. This computation, however, is infeasible for all but the smallest problems. Therefore, we analyze various approximate Q-value functions that allow for efficient computation. We describe how they relate, and we prove that they all provide an upper bound to the optimal Q-value function Q*. Finally, unifying some previous approaches for solving Dec-POMDPs, we describe a family of algorithms for extracting policies from such Q-value functions, and perform an experimental evaluation on existing test problems, including a new firefighting benchmark problem.
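The single-agent procedure the abstract refers to, computing Q* by dynamic programming and then extracting a greedy policy, can be sketched for a small finite MDP as follows. This covers only the MDP case, not the Dec-POMDP value functions studied in the paper. The data layout (dictionaries keyed by state-action pairs), the discount factor, and the fixed number of sweeps are assumptions of the sketch.

```python
def q_value_iteration(states, actions, transition, reward, gamma=0.95, sweeps=200):
    """Approximate Q* by repeated Bellman backups.

    transition[(s, a)] maps each successor state to its probability;
    reward[(s, a)] is the immediate reward for taking a in s.
    """
    q = {(s, a): 0.0 for s in states for a in actions}
    for _ in range(sweeps):
        # Each sweep builds a new table from the previous one.
        q = {
            (s, a): reward[(s, a)] + gamma * sum(
                p * max(q[(s2, a2)] for a2 in actions)
                for s2, p in transition[(s, a)].items()
            )
            for s in states
            for a in actions
        }
    return q


def greedy_policy(q, states, actions):
    """Extract a policy by picking the highest-valued action in each state."""
    return {s: max(actions, key=lambda a: q[(s, a)]) for s in states}


# Toy MDP: "stay" keeps the state, "move" switches it; only staying in "b" pays.
states = ["a", "b"]
actions = ["stay", "move"]
transition = {
    ("a", "stay"): {"a": 1.0}, ("a", "move"): {"b": 1.0},
    ("b", "stay"): {"b": 1.0}, ("b", "move"): {"a": 1.0},
}
reward = {("a", "stay"): 0.0, ("a", "move"): 0.0,
          ("b", "stay"): 1.0, ("b", "move"): 0.0}
q_star = q_value_iteration(states, actions, transition, reward, gamma=0.5)
policy = greedy_policy(q_star, states, actions)
```

In the decentralized setting each agent acts on its own observations, so a single table over states and joint actions is not enough. This is what motivates the paper's separate definitions of an optimal Q-value function for Dec-POMDPs.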


Intuitive visualization of the intelligence for the run-down of terrorist wire-pullers

arXiv.org Artificial Intelligence

The investigation of a terrorist attack is a time-critical task. The investigators have a limited time window to diagnose the organizational background of the terrorists, to run down and arrest the wire-pullers, and to take action to prevent or eradicate the terrorist attack. An intuitive interface for visualizing the intelligence data set stimulates the investigators' experience and knowledge, and aids them in decision-making for immediately effective action. This paper presents a computational method to analyze the intelligence data set on the collective actions of the perpetrators of the attack, and to visualize it in the form of a social network diagram which predicts the positions where the wire-pullers conceal themselves.


Communication-Based Decomposition Mechanisms for Decentralized MDPs

Journal of Artificial Intelligence Research

Multi-agent planning in stochastic environments can be framed formally as a decentralized Markov decision problem. Many real-life distributed problems that arise in manufacturing, multi-robot coordination and information gathering scenarios can be formalized using this framework. However, finding the optimal solution in the general case is hard, limiting the applicability of recently developed algorithms. This paper provides a practical approach for solving decentralized control problems when communication among the decision makers is possible, but costly. We develop the notion of a communication-based mechanism that allows us to decompose a decentralized MDP into multiple single-agent problems. In this framework, referred to as decentralized semi-Markov decision process with direct communication (Dec-SMDP-Com), agents operate separately between communications. We show that finding an optimal mechanism is equivalent to optimally solving a Dec-SMDP-Com. We also provide a heuristic search algorithm that converges on the optimal decomposition. Restricting the decomposition to some specific types of local behaviors significantly reduces the complexity of planning. In particular, we present a polynomial-time algorithm for the case in which individual agents perform goal-oriented behaviors between communications. The paper concludes with an additional tractable algorithm that enables the introduction of human knowledge, thereby reducing the overall problem to finding the best time to communicate. Empirical results show that these approaches provide good approximate solutions.


Causal models have no complete axiomatic characterization

arXiv.org Artificial Intelligence

Markov networks and Bayesian networks are effective graphical representations of the dependencies embedded in probabilistic models. It is well known that independencies captured by Markov networks (called graph-isomorphs) have a finite axiomatic characterization. This paper, however, shows that independencies captured by Bayesian networks (called causal models) cannot be axiomatized even by countably many Horn or disjunctive clauses. This is because a sub-independency model of a causal model may not be causal, while graph-isomorphs are closed under sub-models.