Genre
Incremental Model-based Learners With Formal Learning-Time Guarantees
Strehl, Alexander L., Li, Lihong, Littman, Michael L.
Model-based learning algorithms have been shown to use experience efficiently when learning to solve Markov Decision Processes (MDPs) with finite state and action spaces. However, their high computational cost due to repeatedly solving an internal model inhibits their use in large-scale problems. We propose a method based on real-time dynamic programming (RTDP) to speed up two model-based algorithms, RMAX and MBIE (model-based interval estimation), resulting in computationally much faster algorithms with little loss compared to existing bounds. Specifically, our two new learning algorithms, RTDP-RMAX and RTDP-IE, have considerably smaller computational demands than RMAX and MBIE. We develop a general theoretical framework that allows us to prove that both are efficient learners in a PAC (probably approximately correct) sense. We also present an experimental evaluation of these new algorithms that helps quantify the tradeoff between computational and experience demands.
Recognizing Activities and Spatial Context Using Wearable Sensors
Subramanya, Amarnag, Raj, Alvin, Bilmes, Jeff A., Fox, Dieter
We introduce a new dynamic model with the capability of recognizing both activities that an individual is performing as well as where that ndividual is located. Our model is novel in that it utilizes a dynamic graphical model to jointly estimate both activity and spatial context over time based on the simultaneous use of asynchronous observations consisting of GPS measurements, and measurements from a small mountable sensor board. Joint inference is quite desirable as it has the ability to improve accuracy of the model. A key goal, however, in designing our overall system is to be able to perform accurate inference decisions while minimizing the amount of hardware an individual must wear. This minimization leads to greater comfort and flexibility, decreased power requirements and therefore increased battery life, and reduced cost. We show results indicating that our joint measurement model outperforms measurements from either the sensor board or GPS alone, using two types of probabilistic inference procedures, namely particle filtering and pruned exact inference.
Axiomatic Foundations for a Class of Generalized Expected Utility: Algebraic Expected Utility
Expected Utility: Algebraic Expected Utility In this paper, we provide two axiomatizations of algebraic expected utility, which is a particular generalized expected utility, in a von Neumann-Morgenstern setting, i.e. uncertainty representation is supposed to be given and here to be described by a plausibility measure valued on a semiring, which could be partially ordered. We show that axioms identical to those for expected utility entail that preferences are represented by an algebraic expected utility. This algebraic approach allows many previous propositions (expected utility, binary possibilistic utility,...) to be unified in a same general framework and proves that the obtained utility enjoys the same nice features as expected utility: linearity, dynamic consistency, autoduality of the underlying uncertainty measure, autoduality of the decision criterion and possibility of modeling decision maker's attitude toward uncertainty.
A Non-Parametric Bayesian Method for Inferring Hidden Causes
Wood, Frank, Griffiths, Thomas, Ghahramani, Zoubin
We present a non-parametric Bayesian approach to structure learning with hidden causes. Previous Bayesian treatments of this problem define a prior over the number of hidden causes and use algorithms such as reversible jump Markov chain Monte Carlo to move between solutions. In contrast, we assume that the number of hidden causes is unbounded, but only a finite number influence observable variables. This makes it possible to use a Gibbs sampler to approximate the distribution over causal structures. We evaluate the performance of both approaches in discovering hidden causes in simulated data, and use our non-parametric approach to discover hidden causes in a real medical dataset.
On the Number of Samples Needed to Learn the Correct Structure of a Bayesian Network
Zuk, Or, Margel, Shiri, Domany, Eytan
Bayesian Networks (BNs) are useful tools giving a natural and compact representation of joint probability distributions. In many applications one needs to learn a Bayesian Network (BN) from data. In this context, it is important to understand the number of samples needed in order to guarantee a successful learning. Previous work have studied BNs sample complexity, yet it mainly focused on the requirement that the learned distribution will be close to the original distribution which generated the data. In this work, we study a different aspect of the learning, namely the number of samples needed in order to learn the correct structure of the network. We give both asymptotic results, valid in the large sample limit, and experimental results, demonstrating the learning behavior for feasible sample sizes. We show that structure learning is a more difficult task, compared to approximating the correct distribution, in the sense that it requires a much larger number of samples, regardless of the computational power available for the learner.
Stratified Analysis of `Probabilities of Causation'
This paper derives new bounds for the probabilities of causation defined by Pearl (2000), namely, the probability that one observed event was a necessary (or sufficient, or both) cause of another. Tian and Pearl (2000a, 2000b) showed how to bound these probabilities using information from experimental and observational studies,with minimal assumptions about the data-generating process. We derive narrower bounds using covariates measurements that might be available in the studies. In addition, we provide identifiable case under no-prevention assumption and discuss the covariate selection problem from the viewpoint of estimation accuracy. These results provides more accurate information for public policy, legal determination of responsibility and personal decision making.
Reasoning about Uncertainty in Metric Spaces
We set up a model for reasoning about metric spaces with belief theoretic measures. The uncertainty in these spaces stems from both probability and metric. To represent both aspect of uncertainty, we choose an expected distance function as a measure of uncertainty. A formal logical system is constructed for the reasoning about expected distance. Soundness and completeness are shown for this logic. For reasoning on product metric space with uncertainty, a new metric is defined and shown to have good properties.
Belief Update in CLG Bayesian Networks With Lazy Propagation
In recent years Bayesian networks (BNs) with a mixture of continuous and discrete variables have received an increasing level of attention. We present an architecture for exact belief update in Conditional Linear Gaussian BNs (CLG BNs). The architecture is an extension of lazy propagation using operations of Lauritzen & Jensen [6] and Cow-ell [2]. By decomposing clique and separator potentials into sets of factors, the proposed architecture takes advantage of independence and irrelevance properties induced by the structure of the graph and the evidence. The resulting benefits are illustrated by examples. Results of a preliminary empirical performance evaluation indicate a significant potential of the proposed architecture.
A theoretical study of Y structures for causal discovery
Mani, Subramani, Spirtes, Peter L., Cooper, Gregory F.
There are several existing algorithms that under appropriate assumptions can reliably identify a subset of the underlying causal relationships from observational data. This paper introduces the first computationally feasible score-based algorithm that can reliably identify causal relationships in the large sample limit for discrete models, while allowing for the possibility that there are unobserved common causes. In doing so, the algorithm does not ever need to assign scores to causal structures with unobserved common causes. The algorithm is based on the identification of so called Y substructures within Bayesian network structures that can be learned from observational data. An example of a Y substructure is A -> C, B -> C, C -> D. After providing background on causal discovery, the paper proves the conditions under which the algorithm is reliable in the large sample limit.
A compact, hierarchical Q-function decomposition
Marthi, Bhaskara, Russell, Stuart, Andre, David
Previous work in hierarchical reinforcement learning has faced a dilemma: either ignore the values of different possible exit states from a subroutine, thereby risking suboptimal behavior, or represent those values explicitly thereby incurring a possibly large representation cost because exit values refer to nonlocal aspects of the world (i.e., all subsequent rewards). This paper shows that, in many cases, one can avoid both of these problems. The solution is based on recursively decomposing the exit value function in terms of Q-functions at higher levels of the hierarchy. This leads to an intuitively appealing runtime architecture in which a parent subroutine passes to its child a value function on the exit states and the child reasons about how its choices affect the exit value. We also identify structural conditions on the value function and transition distributions that allow much more concise representations of exit state distributions, leading to further state abstraction. In essence, the only variables whose exit values need be considered are those that the parent cares about and the child affects. We demonstrate the utility of our algorithms on a series of increasingly complex environments.